Identifying correct answers


Tree testing is not just about seeing where people go to find things. It’s about seeing if they go to the right place, where the real site would give them the content they’re looking for.

At some point, then, we need to decide where the right place is. And if we’re using tree-testing software, we need to tell it too, so it can tally up the results for us.


Multiple answers

When it comes to correct answers, the most important thing to remember is that there are often several correct answers for a given task.

When we initially set up the tasks and tick off their correct answers, we’ll spot at least a few tasks that have more than one “hit” in our tree.

We may also miss a few correct answers. After all, some trees are large, a given task can sometimes be interpreted more than one way, and our prep time is often limited.

The good news is that it’s OK to miss a few up front. That’s because our participants will probably find more correct answers during the test. When we analyze the results, we may see that some of these “wrong” answers really would lead to the right content. Once we confirm these are right, we can then mark them correct and recalculate our results.

Suppose, for example, that we've marked two correct answers for the task below:


When the test is over and we're analyzing the results, we notice that many participants chose Beginner tips as their answer. When we think about it more, we realize that this is a reasonable answer as well, so we go back and mark it correct and recalculate our results.
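
The recalculation described above can be sketched in a few lines of Python. This is only an illustration: the topic paths (other than Beginner tips), the responses, and the `success_rate` helper are all hypothetical, and most tree-testing tools do this tallying for us.

```python
def success_rate(responses, correct_answers):
    """Share of responses whose chosen topic is in the set of correct answers."""
    if not responses:
        return 0.0
    hits = sum(1 for answer in responses if answer in correct_answers)
    return hits / len(responses)


# Hypothetical data: each entry is the topic one participant chose for the task.
responses = [
    "Learning to ride > Courses",
    "Learning to ride > Beginner tips",
    "Learning to ride > Beginner tips",
    "Buying a bike > Sizing",
    "Learning to ride > Courses",
]

# The two answers we marked correct before launch (hypothetical paths).
correct = {"Learning to ride > Courses", "Learning to ride > Videos"}
before = success_rate(responses, correct)

# After reviewing the results, we accept Beginner tips and recalculate.
correct.add("Learning to ride > Beginner tips")
after = success_rate(responses, correct)

print(f"before: {before:.0%}, after: {after:.0%}")  # before: 40%, after: 80%
```

In this made-up data, accepting one overlooked answer doubles the task's success rate, which is why it pays to review popular “wrong” answers before reporting scores.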


Intermediate vs. leaf nodes

Occasionally, we may create a task where the answer could be any of the subtopics under a topic.

In the example below, all topics under Learning to ride could be considered correct:


Sometimes, this happens because our task is not specific enough (see Writing a good task earlier in this chapter).

If, however, we decide it’s OK for the entire section to be correct, we’ll need to do one of the following:

  • Mark each subtopic correct, or

  • Mark the parent topic correct

Depending on the tree-testing software we’re using, we may not be allowed to mark a parent (intermediate) topic as correct – only leaf nodes (at the end of a branch).

Even if the software allows us, we recommend NOT marking intermediate topics as correct answers.

There are a few reasons for this:

  • We want to see the full path that the participant takes through the tree, and we want to make sure that, even at the lowest level, they can choose the correct answer. If we let them choose at a higher level, we don’t find this out.

  • Often, not all answers in the subtree are correct, so marking the parent as correct is sloppy.

  • Only allowing leaf nodes to be correct simplifies both the participant experience and the later analysis.

If we’re using software that restricts us to choosing leaf nodes only, and we really want to make an entire section correct, there are two alternatives:

  • Mark each subtopic as correct, or

  • Delete the subtopics and mark the parent topic (now no longer a parent) as correct. This may work if we don’t need those former subtopics for other tasks in our study.
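
If we keep our tree in a script or spreadsheet export, the “mark each subtopic as correct” option can be sketched as below. The nested-dictionary representation, the topic names other than Learning to ride, and the helper names `leaf_paths` and `subtree` are assumptions made for illustration, not features of any particular tool.

```python
def leaf_paths(tree, prefix=()):
    """Yield the full path to every leaf (a topic with no subtopics)."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


def subtree(tree, path):
    """Return the children of the topic found by following `path` from the root."""
    node = tree
    for name in path:
        node = node[name]
    return node


# Hypothetical tree: each topic maps to a dict of its subtopics ({} means leaf).
tree = {
    "Learning to ride": {
        "Beginner tips": {},
        "Courses for kids": {},
        "Courses for adults": {},
    },
    "Commuting by bike": {
        "Skills training": {},
        "Route planner": {},
    },
}

# Instead of marking the parent correct, list every leaf beneath it.
parent = ("Learning to ride",)
correct = [" > ".join(p) for p in leaf_paths(subtree(tree, parent), parent)]
print(correct)
```

Reviewing the generated list by eye is still worthwhile, because not every leaf in a section is necessarily a correct answer for the task.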

How correct is correct?

Some answers are clearly correct; many more are clearly wrong. But what about those answers that are kinda-sorta-OK-but-not-really-the-one-we-intended correct?

Using our bike example above, suppose that we have this task, with these answers initially marked as correct:


But when we get the results back, we find that many participants chose Commuting by bike > Skills training as their answer. It's not the answer we intended because we don't normally think of kids as commuters. On the one hand, kids can ride their bikes to school, and the skills training course does accept children. On the other hand, the course assumes they already know how to ride a bike. Hmmm. How do we decide if this answer is correct? In general, how do we decide which borderline answers to accept?

This will be a personal call, of course, but we need to draw the line somewhere. Here’s where we’ve drawn it in our projects:

  • In the real site, if the borderline topic would give them enough content to be helpful (in our opinion), we mark it as correct.

  • If the borderline topic will feature a prominent cross-link to the “right” topic, we mark it as correct.
    (By “prominent”, we mean that nobody going to that page could miss it. We don’t mean a “see-also” link parked at the right or bottom of the page, where it could be missed.)

  • Otherwise, we mark it as wrong.
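
One way to apply these criteria consistently is to record our judgment for each borderline answer and let a single rule decide. The sketch below is a hypothetical aid, not part of any tool; the class, the field names, and the example judgments are assumptions, and the two yes/no judgments themselves remain a personal call.

```python
from dataclasses import dataclass


@dataclass
class BorderlineAnswer:
    path: str
    enough_content: bool        # our judgment: would the real page be helpful?
    prominent_cross_link: bool  # would nobody miss the link to the intended topic?


def accept(answer: BorderlineAnswer) -> bool:
    """Mark correct if either criterion holds; otherwise mark wrong."""
    return answer.enough_content or answer.prominent_cross_link


# Hypothetical judgments, recorded once so every task is scored the same way.
candidates = [
    BorderlineAnswer("Commuting by bike > Skills training", False, True),
    BorderlineAnswer("Buying a bike > Kids' bikes", False, False),
]

for candidate in candidates:
    verdict = "correct" if accept(candidate) else "wrong"
    print(f"{candidate.path}: {verdict}")
```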

Whatever criteria we decide on, we need to be consistent across all our tasks and (more importantly) across all our tests. If we aren’t consistent, we won’t be able to compare scores reliably across different trees.


Correct now vs. correct later

Normally, we mark our correct answers just before we launch our test. However, it’s common for participants to reveal more potentially correct answers while the study is running.

For more on this, see Cleaning the data in Chapter 12.


Changing the tree or tasks after marking answers

In a perfect world, we would get our tree right the first time, or at least get our tasks right the first time.

The whole point of tree-testing, however, is that we know our initial creations will need revising, so we test to find out what needs to change.

When we inevitably make changes to our tree or our tasks, we need to check how this affects the correct answers we previously identified. There are three common scenarios to watch for:

  • The correct answers may have moved in the tree.
    This is easy to fix - just update the correct answers in the tree.

  • The correct answers may have been removed altogether.
    We know of at least one test where revisions to the tree left one task without any correct answers. While it was instructive to see where people (vainly) went, it's definitely not something we want to do on purpose.

  • The revisions may have created new correct answers.
    Whether we're revising the tree, the tasks, or both, it's not unusual for new answers to "emerge" from our changes. It's best to identify these before we run the test, but if we don't, our participants will usually spot them for us.
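
The first two scenarios lend themselves to a mechanical check before launch. The sketch below assumes we can export the revised tree as a set of topic paths and each task's correct answers as a list; the function name `check_answers` and all of the data are hypothetical.

```python
def check_answers(tasks, tree_paths):
    """Compare each task's correct answers against the revised tree.

    Returns (missing, orphaned):
      missing  - task -> answers that no longer exist at that path
      orphaned - tasks left with no valid correct answer at all
    """
    missing = {}
    orphaned = []
    for task, answers in tasks.items():
        gone = [a for a in answers if a not in tree_paths]
        if gone:
            missing[task] = gone
        if len(gone) == len(answers):
            orphaned.append(task)
    return missing, orphaned


# Hypothetical correct answers, marked against the previous version of the tree.
tasks = {
    "Teach a child to ride": [
        "Learning to ride > Courses for kids",
        "Learning to ride > Beginner tips",
    ],
    "Plan a route to work": ["Commuting by bike > Route planner"],
}

# Hypothetical revised tree: one topic moved, one removed.
revised_paths = {
    "Learning to ride > Beginner tips",
    "Getting started > Courses for kids",
    "Commuting by bike > Skills training",
}

missing, orphaned = check_answers(tasks, revised_paths)
print("Answers to update:", missing)
print("Tasks with no correct answer:", orphaned)
```

A check like this only flags answers that moved or disappeared. Spotting new correct answers created by a revision still takes a human read-through of the tree.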

