It’s not just the tree that may need revising after we get our first round of results. Sometimes the tasks themselves need rework.
Fixing misunderstood tasks
Even though we work hard to write clear and unambiguous tasks, the results sometimes make it plain that a task was not clearly understood (or was clearly misunderstood) by our participants. Because most tree testing is done unmoderated, participants can’t ask for clarification when they have trouble understanding a task.
For example, when we helped a bank run a tree test on their self-service website, we included this task:
How would you get automatically notified when your cheque-account balance goes below $100?
In the results, we saw that:
- Some participants went to the Account Status section to see if there were any alerts.
- Other participants went to the Preferences section to set up a low-balance alert.
When we re-read the task, we saw that it could have been interpreted as either getting the alert or setting it up. (We actually meant the latter.)
If we’re doing another round of testing, we should revise these murky tasks so we can get higher-quality results. This makes before-and-after comparisons harder, but comparisons are less important than clear primary findings about the tree.
In the example above, we clarified the task wording so that we could judge the results better:
How would you set up an automatic notification for when your balance goes below $100?
Tasks with very high success rates
We should also reconsider tasks that (almost) everyone got right. It’s great that participants easily found what they were looking for (after all, that’s our Big Goal), but first let’s make sure those tasks weren’t “gimmes”. Did we give away the answer by careless word matching, or by phrasing the question in the same browsing sequence that the participant would follow in the tree? (For more on these pitfalls, see Writing a good task in Chapter 7.)
Even if the task passes these tests, we still may want to replace it in the next round. Why? Because we won’t learn much from it in later testing. It works, and we should move on.
This is especially true if we find that another part of the tree is either not performing well or is not being tested enough; it may be better to replace our “golden” task with one that tells us more about what needs improving.
Note that there are two major downsides to replacing high-scoring tasks in a later round of testing:
- The overall success rate may go down.
If we replace a task that scored 90%, its replacement is unlikely to score that high, so our average score will go down. This makes the results harder to explain to the project team and management.
- Before/after comparisons will be harder to make.
When we keep the tasks the same between tests, we can make apples-to-apples comparisons. If we start replacing tasks, we can no longer make broad comparisons across tests.
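The first downside is easy to see with some quick arithmetic. Here's a minimal sketch using hypothetical per-task success rates (the numbers are illustrative, not from a real study):

```python
# Hypothetical success rates (%) for 5 tasks in round 1 of a tree test.
round1 = [90, 75, 60, 80, 55]

# Round 2: the 90% "golden" task is swapped for a new task probing a
# weaker part of the tree; assume the new task scores 50%.
round2 = [50, 75, 60, 80, 55]

avg1 = sum(round1) / len(round1)
avg2 = sum(round2) / len(round2)

print(f"Round 1 average: {avg1:.0f}%")  # 72%
print(f"Round 2 average: {avg2:.0f}%")  # 64%
```

Even if the tree itself improved between rounds, the headline average drops simply because we re-aimed a task at a harder part of the tree, and that's the nuance we'd need to explain to stakeholders.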
If these factors are important to our study, we may want to avoid replacing high-scoring tasks. If they're not so important (that is, if our need for specific answers to specific questions outweighs these broader considerations), then replacing high-scoring tasks becomes a way of re-focusing the study on the parts of the tree that need more testing.
Updating correct answers accordingly
If we revise tasks, we must also remember to check the correct answers for those tasks. Our revisions may have changed what we should accept as correct. This is especially true if our revisions were not just minor rewording, but actually changed the meaning or purpose of the task itself.
For more, see Identifying correct answers in Chapter 7.