The whole point of running a tree test is to find out which parts of the tree work well and (more urgently) which parts don’t, for the most common and critical activities that our users do on the site.
We largely determine this by:
- examining the results of individual tasks
- looking for patterns among tasks
- comparing tasks between different trees we're testing
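To make the first step concrete, here is a minimal sketch of computing per-task success and directness rates from raw results. The record format and field names are hypothetical, not from any particular tree-testing tool; "success" means the participant chose a correct destination, and "direct" means they got there without backtracking.

```python
from collections import defaultdict

# Hypothetical raw results: one record per participant per task.
results = [
    {"task": "Find store hours", "success": True,  "direct": True},
    {"task": "Find store hours", "success": True,  "direct": False},
    {"task": "Find store hours", "success": False, "direct": False},
    {"task": "Return an item",   "success": True,  "direct": True},
    {"task": "Return an item",   "success": False, "direct": False},
]

def task_scores(records):
    """Per-task success and directness rates, as whole percentages."""
    totals = defaultdict(lambda: {"n": 0, "success": 0, "direct": 0})
    for r in records:
        t = totals[r["task"]]
        t["n"] += 1
        t["success"] += r["success"]
        t["direct"] += r["direct"]
    return {
        task: {
            "success_rate": round(100 * t["success"] / t["n"]),
            "directness": round(100 * t["direct"] / t["n"]),
        }
        for task, t in totals.items()
    }

for task, scores in task_scores(results).items():
    print(f"{task}: {scores['success_rate']}% success, "
          f"{scores['directness']}% direct")
```

Laying the tasks out side by side like this is also what makes the second and third steps possible: once every task has comparable numbers, patterns across tasks (and across competing trees) become easy to spot.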
In this chapter, we'll look at:
- The most important metric for a task
- Inspecting high- and low-success tasks
- Whether participants agreed on where to start, or scattered
- Directness scores, and backing up from incorrect vs. correct paths
- Task times vs. click times, and why participants took longer
- Looking for patterns in skipped tasks
- Whether participants were sure of their answers, or doubtful
- When to ignore strange results from a few participants
- The most common patterns to look for
Next: Analyzing by branches