How many tasks overall?

There is no single definitive answer to how many tasks we should include in a tree test. The number will vary depending on several factors, including:

The size of our tree
Larger trees typically need more tasks to ensure adequate coverage.
The complexity and variability of the tree
If we have many sections that reuse the same general structure (e.g. several product subtrees that have very similar layouts), then we may not need to test each one.
Our confidence in the tree
If we’re just making minor changes to a tree that has tested well before, we may not need to present many tasks. On the other hand, if it’s a completely redesigned, unconventional, or contentious tree design, we’ll definitely want to “put it through the wringer” by presenting a larger number of tasks.

As a starting point, though, the following numbers are typical of the tree tests that we run:

Size	# of items	# of tasks (overall)
Small tree	1-250	8-12
Medium tree	250-500	10-20
Large tree	500+	15-25

If we have FEWER tasks than this, we need to check coverage of the tree (described in Mapping tasks to the tree later in this chapter). There may be important parts of the tree that we’re missing.

If we have MORE tasks than this:

We should check that each task is answering a specific question we have about the site structure. If it isn’t pulling its weight, or if other tasks already address that question, consider removing it.
Remember that each task takes time to write (up front) and time to analyze (later), and that we’ll need a bigger participant pool to get all those questions answered (see below).
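
As a rough illustration, the starting-point table above can be written as a small lookup that flags when a task list falls outside the typical range. This is only a sketch: the function names are hypothetical, and because the table's ranges share their boundary values (250 and 500), we assume here that a boundary value belongs to the smaller category.

```python
def suggested_task_range(item_count):
    """Return (min_tasks, max_tasks) from the starting-point table.

    Assumption: trees of exactly 250 or 500 items are placed in the
    smaller category, since the table's ranges overlap at those values.
    """
    if item_count < 1:
        raise ValueError("a tree needs at least one item")
    if item_count <= 250:
        return (8, 12)    # small tree
    if item_count <= 500:
        return (10, 20)   # medium tree
    return (15, 25)       # large tree


def check_task_count(item_count, task_count):
    """Compare a planned number of tasks against the typical range."""
    low, high = suggested_task_range(item_count)
    if task_count < low:
        return "fewer than typical: check coverage of the tree"
    if task_count > high:
        return "more than typical: check that each task earns its place"
    return "within the typical range"


print(check_task_count(300, 5))
```

The ranges are a starting point rather than a rule, so a result outside the range is a prompt to review the task list, not an error.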

How many tasks per participant?

Give each participant 8-10 tasks. More is risky.

No matter how many tasks we create overall, there is a (relatively low) limit to how many tasks we should ask each participant to do. This limit stems from two factors:

The time and effort required by each participant
The learning effect of browsing the same tree repeatedly

Participant effort

If we ask a participant to do 8-10 tasks in a tree test, that typically works very well: they try a small number of tasks (each one a bit different), they see the tree a few times (but not too often), and they finish in 5 minutes or so, so they’re not tired, bored, or grumpy at the end. They get a short, challenging exercise that is different from (and more fun than) a traditional survey, and they leave happy (and probably willing to say yes to the next study we ask them to do).

That’s fine if we’re testing a small tree, where it’s easy to whittle our task list down to 10 or so. But suppose we’re testing a big tree and we’ve decided that we need 25 tasks to get adequate coverage of the parts that need validating.

If we make each participant do 25 tasks, neither we nor they will be happy, for several reasons:

When we invite them to the tree test, we’ll have to tell them it will take 15-20 minutes, not 5. That may drastically cut down the number of respondents we get.
After about 10 tasks, they may start to get tired of the exercise, and not try as hard, perhaps skipping more tasks, perhaps picking answers that will just get the test over with sooner. In any case, task fatigue will likely change their behavior, which puts our results in question.
Fatigue and boredom also mean they may be less likely to take part in future studies. This is a particularly important consideration if we have a small pool of participants to begin with.
The learning effect becomes a real problem (see below).

The learning effect

The other problem with giving each participant a lot of tasks is the “learning effect”: as they browse the tree for each successive task, they start learning the structure.

This is not a bad thing in itself. After all, users visiting a website for more than a few minutes will likely learn the overall navigation to some extent.

However, in tree testing, we’re asking them to find things over and over, without anything else to look at except the structure, so they come to know it better than they would when they visit the real site. (Most users don’t arrive at a site with 10 successive things to look for.)

While some learning is inevitable, we can do two things to minimize its effect on our results:

Randomize the task order
By putting the tasks in a random order per participant, we make sure that earlier tasks don't "help" later tasks across all participants. See Randomizing the order of tasks later in this chapter.
Limit the number of tasks per participant
If we present each participant with 25 tasks, they’re going to know the tree very well by the end, and this will unrealistically improve their performance on the later tasks. Limiting each participant to fewer tasks (say, 8-10) reduces the learning effect to something closer to what they would learn using the real website.
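
The two measures above can be combined in one step: draw a random subset of the tasks for each participant and present it in random order. The sketch below shows one way this might work. It is not how any particular tree-testing tool implements it, and the function name and the default of 10 tasks are assumptions for illustration.

```python
import random


def tasks_for_participant(all_tasks, per_participant=10, rng=None):
    """Pick a random subset of tasks, in random order, for one participant.

    random.sample returns unique items in selection order, so the
    result is both a random subset and a random ordering.
    """
    rng = rng or random.Random()
    # Never ask for more tasks than exist.
    count = min(per_participant, len(all_tasks))
    return rng.sample(all_tasks, count)


# Hypothetical study: 25 tasks overall, 10 shown to each participant.
all_tasks = [f"Task {n}" for n in range(1, 26)]
for participant in ("P1", "P2", "P3"):
    print(participant, tasks_for_participant(all_tasks))
```

With purely random selection, each task gets roughly (not exactly) the same number of responses, so it is worth checking the per-task counts as results come in.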

Setting the number of tasks for each participant

To let us test a large number of tasks overall without overburdening participants, most tree-testing tools let us specify how many of the tasks are shown to each person.

For example, suppose we have a medium-sized tree with 20 tasks that cover it to our satisfaction. That’s too many tasks to give each participant – they’ll get bored or tired, and their tree “learning” will pollute our results. They will be happier (and we’ll get sounder results) if we give them 8-10 tasks each.

This does mean that we’ll need to recruit more participants to get the same number of responses per task. In the example above, if we had 20 tasks overall and showed 10 to each participant, we would need about 100 participants to get 50 responses per task. For more on this, see How many participants? in Chapter 9.
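
The arithmetic behind that example can be sketched as follows. The function name is hypothetical, and the calculation assumes that tasks are spread evenly across participants; with random assignment some tasks will get slightly fewer responses than others, so the result is best read as a minimum.

```python
import math


def participants_needed(total_tasks, tasks_per_participant, responses_per_task):
    """Estimate how many participants are needed for a target number
    of responses per task, assuming tasks are distributed evenly."""
    if tasks_per_participant > total_tasks:
        raise ValueError("cannot show more tasks than exist")
    # Total responses required across all tasks.
    total_responses = total_tasks * responses_per_task
    # Each participant contributes one response per task they see.
    return math.ceil(total_responses / tasks_per_participant)


print(participants_needed(20, 10, 50))  # 100, matching the example above
```

Showing fewer tasks to each person raises the number quickly: with the same 20 tasks and 50 responses per task, 8 tasks per participant would call for 125 participants.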

