A free, comprehensive guide to evaluating site structures
Why run a tree test?
As we saw in Chapter 2, there are two common motivations:
- To baseline an existing tree, discovering where the problems are and establishing a base score.
- To try out some new trees that we’ve come up with, looking for problems and comparing them to each other (and to the baseline tree, if any).
Baselining an existing tree
If we are testing an existing IA (e.g. the structure of a current website that we’re about to revise), we’re obviously interested in finding out which parts of the current structure work well and which don’t.
Most of the time, we will already have an idea of where some of the problems are. It might be from other usability testing we’ve done, from web analytics, from user feedback, from our own gut feelings, or (most commonly) from some mixture of all of these.
When testing an existing IA, then, we’re likely to be looking for:
- How well the suspected problem areas perform, and
- Which other (unsuspected) areas perform particularly well or poorly
Testing revised trees
If we’re revising a site structure, we will generally be looking for:
- How well the revised parts of the new structure perform, and
- Which other areas perform particularly well or poorly, especially areas that may be indirectly affected by the revisions we made.
Trying out new trees
If we’re creating structures for a new website, we may not have much existing research to inform our IA work. In this case, the main value of tree testing is being able to evaluate one or more structures early in the design process, before the website exists even in beta form.
Whether it’s architecture or brand design or vacuum cleaners, the best designers agree on one thing – generate lots of ideas early, then cheerfully discard the ones that don’t work out.
The same is true with site structures. Early in the design phase, we should think up several different ways of structuring our site. Yes, we will probably have a favorite, but our favorite may not be the best solution for our users. If we create some true alternatives and test them against each other, we’re more likely to produce a better structure.
So, whether we’re testing revised structures or new ones, a main goal of our testing should be to compare multiple candidate structures and determine which performs best.
In our experience, what often happens is that tree A performs best overall, but parts of tree B do better than their counterparts in A. The natural next step is to create a hybrid (tree C), which usually ends up testing better than either A or B. If we only created tree A, how would we ever get to C, the better structure?
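The comparison described above can be sketched in code. This is a minimal, hypothetical example (all names and numbers are invented, not from the guide): given per-section task success rates for trees A and B, we pick the better-performing tree for each section to decide which parts a hybrid tree C should borrow.

```python
# Hypothetical task success rates (fraction of participants who found the
# correct answer), grouped by the top-level section each task targets.
results = {
    "Products": {"tree_a": 0.82, "tree_b": 0.64},
    "Support":  {"tree_a": 0.55, "tree_b": 0.78},
    "About us": {"tree_a": 0.71, "tree_b": 0.69},
}

def pick_winner(scores: dict) -> str:
    """Return the tree whose tasks scored higher for this section."""
    return max(scores, key=scores.get)

# Tree C borrows each section from whichever tree handled it better.
hybrid = {section: pick_winner(scores) for section, scores in results.items()}
for section, winner in hybrid.items():
    print(f"{section}: borrow from {winner}")
```

In practice the decision is rarely this mechanical (sections interact, and label changes ripple across the tree), but tabulating results this way makes the A-vs-B pattern easy to see.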
Testing groupings
For most designers, the main reason to run a tree test is to determine if their main grouping scheme works well.
For example, if we decide to go with a task-based scheme (e.g. installing a product, using it, getting support, uninstalling it, etc.), we want to know if our task-based headings help our users find the page they’re looking for.
If we’re testing several grouping schemes against each other (the recommended approach), we want to find out if one scheme is clearly better than the others. If several schemes work equally well, then we can choose based on other criteria (such as how much effort it will take to rework our content to fit a scheme).
We can also flip our schemes and test which variation works better. For example:
- We could create our first tree to use audiences as our top-level headings (e.g. teachers, students, parents, etc.) and tasks as our second level (choosing a school, enrolling, choosing courses, etc.).
- We could create a second tree that flips this scheme, so that our top level is tasks and our second level is audiences.
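The flip described in these bullets is mechanical, so it can be sketched directly. This is a hypothetical illustration (the headings are taken from the bullets above; the dict layout is an assumption for the sake of the example): a two-level tree as a nested structure, and a helper that swaps its levels.

```python
# Candidate tree 1: audiences at the top level, tasks at the second level.
audience_first = {
    "Teachers": ["Choosing a school", "Enrolling", "Choosing courses"],
    "Students": ["Choosing a school", "Enrolling", "Choosing courses"],
    "Parents":  ["Choosing a school", "Enrolling", "Choosing courses"],
}

def flip(tree: dict) -> dict:
    """Swap the two levels: second-level items become top-level headings."""
    flipped = {}
    for top, children in tree.items():
        for child in children:
            flipped.setdefault(child, []).append(top)
    return flipped

# Candidate tree 2: tasks at the top level, audiences beneath them.
task_first = flip(audience_first)
print(task_first["Enrolling"])  # ['Teachers', 'Students', 'Parents']
```

Generating the flipped variant this way keeps the two candidates consistent, so a tree test compares the grouping schemes themselves rather than accidental wording differences.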
For more on grouping schemes, see Chapter 5 - Creating trees.
Testing labels
Another big reason to run a tree test is to test the terms we use. Will our users understand what “contingency planning” means? Will they be able to distinguish between “products” and “solutions”?
Labeling is a critical element of information architecture, and it’s often hard to get right. We must consider:
- The terms we use internally (the organization’s jargon) vs. the terms that users understand and prefer.
For example, we may say “Business Development”, but our customers call it “Sales”. Which term should we use on our website?
- The conflicting terms that our various audiences use.
Doctors may be looking for “deep-vein thrombosis”, but you and I would probably look for “blood clot”.
- The many synonyms we can choose from.
Languages are rich, and there are often several words that we could use. Which works best for our website visitors?
If we test alternative terms against each other, we’ll usually find that one works substantially better than the others. That’s a clear win.
However, we may also find that two terms work equally well, in which case we can decide based on other factors. For example, a consumer-review site in New Zealand considered renaming their “Electronics” section to “Technology”, and this became the subject of prolonged internal arguments about whether users would understand the new term properly. When they ran tree tests, they made sure to include tasks that targeted these alternative terms in their trees. The result was a 51/49 split; both terms worked well, so they could use either depending on other factors.
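A quick way to check whether a split like 51/49 reflects a real difference (rather than noise) is a standard two-proportion z-test. The sketch below uses invented sample sizes, not data from the New Zealand study; it simply shows the arithmetic.

```python
import math

def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Two-proportion z-test: is the difference in success rates between
    two label variants larger than chance alone would explain?"""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Invented numbers: 51 of 100 participants succeeded with one label,
# 49 of 100 with the other -- the kind of split described above.
z, p = two_proportion_z(51, 100, 49, 100)
print(f"z = {z:.2f}, p = {p:.2f}")  # large p => no meaningful difference
```

With samples this size, a 51/49 split yields a p-value far above any conventional threshold, which supports the conclusion in the anecdote: either label works, so the choice can rest on other factors.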
Sharing and documenting issues and goals
When we tree test, there are usually several problems we want to investigate, and every team’s list is different. To run a good study, we need to be clear about what we’re trying to find out, which means discussing it and writing it down.
To do this, we typically run a short workshop (1 to 1.5 hours) to:
- make a list of the problems we're trying to solve, and rank them
- make a list of our specific goals for this study (some of which will spring directly from our list of problems).
We record this list in a shared, public place (a project whiteboard, an online spreadsheet, etc.) so we can keep it handy when designing our tree tests.