Which Usability Test Do I Need?

25 August 2022 | User Research

Reading time: 8 minutes

There are two main categories of usability testing: summative usability testing (summarizing results) and formative usability testing (shaping the design). Which type of test you need depends on what you want to find out. Let’s take a look at both.

Table of contents:

Summative Tests: Is our product efficient?
Results of summative tests are often numbers:
Requirements for Summative Tests
Formative Tests: How do users experience the product/design?
Gaining additional information with the method of thinking aloud.
Unmoderated formative tests

Summative Tests: Is our product efficient?

In summative tests, the focus is usually on statistical metrics. These tests are generally about efficiency: they measure whether the designed solution is efficient.

Typical questions answered include whether the design meets a certain standard or criterion. These can be questions related to the time required to complete a task:

How long do test participants take to complete task X?
Can participants complete task X in (for example) under a minute?

These answers are particularly relevant in industrial contexts, such as medicine, and wherever devices (cars, aeroplanes) are controlled: everywhere reaction times are important. Summative tests are also used in benchmarking, that is, to make comparisons. Typical questions here might be:

Does our product perform better than the competition’s product?
Does Design A perform better than Design B?
Do more people click on the button in Design A rather than in Design B?

A/B testing is a typical example of this kind of test.
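As a rough illustration of how an A/B comparison such as the button question above could be evaluated, the following Python sketch applies a two-proportion z-test. The choice of test, the function name and the click counts are assumptions made for this example and are not taken from a real study; depending on the data, a statistician may prefer a different method.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for the difference between two click rates."""
    rate_a = clicks_a / n_a
    rate_b = clicks_b / n_b
    # Pooled rate under the null hypothesis that both designs perform equally.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: 120 of 1000 visitors clicked in Design A,
# 100 of 1000 clicked in Design B.
z, p_value = two_proportion_z_test(120, 1000, 100, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value would suggest that the difference in click rates is unlikely to be due to chance alone; it still says nothing about why users prefer one design.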

Results of summative tests are often numbers:

40% of our users completed task X in less than 30 seconds.
Design A has a 40% higher error rate than Design B.
In Design A, 20% more people click on the button than in Design B.

The results answer “how much” or “how long” – but usually do not clarify the “why” behind a specific behaviour. It is called a “summative” test because its goal is to summarize the results.
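Figures like the ones listed above are simple to derive from raw test data. The following Python sketch shows one possible way; the function names, task times and error rates are hypothetical and serve only as an illustration.

```python
def share_under(times_in_seconds, limit_seconds):
    """Fraction of participants who finished in less than limit_seconds."""
    fast = sum(1 for t in times_in_seconds if t < limit_seconds)
    return fast / len(times_in_seconds)


def relative_increase(rate_a, rate_b):
    """How much higher rate_a is than rate_b, relative to rate_b."""
    return (rate_a - rate_b) / rate_b


# Hypothetical task times (in seconds) for ten participants.
task_times = [12, 25, 28, 31, 45, 29, 60, 18, 33, 40]
print(f"{share_under(task_times, 30):.0%} finished in under 30 seconds")

# Hypothetical error rates: 14% in Design A, 10% in Design B.
print(f"Design A error rate is {relative_increase(0.14, 0.10):.0%} higher")
```

Note that "40% higher" here is a relative difference (14% versus 10%), not a difference of 40 percentage points; it is worth stating which one is meant when reporting results.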

Requirements for Summative Tests

A finished product or a fully functional prototype is generally required when conducting a summative test. The product must “work” for the test to support a valid statement about efficiency.

Summative tests also require roughly 20–30 participants, depending on the statistical methods used. This means that someone familiar with statistics and with the requirements of the respective statistical methods is needed.
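To give a feeling for why sample size matters, the following Python sketch computes a 95% Wilson score interval for a task completion rate. The Wilson interval is one common choice for small samples and is an assumption of this example, as are the function name and the numbers (20 of 25 participants completing the task).

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical result: 20 of 25 participants completed task X.
low, high = wilson_interval(20, 25)
print(f"Observed 80%, plausible range roughly {low:.0%} to {high:.0%}")
```

Even with 25 participants, the plausible range around an observed 80% spans about 30 percentage points, which is why summative claims need enough participants and suitable statistical methods.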

Formative Tests: How do users experience the product/design?

Formative tests are more common within UX design processes, as they can be part of an iterative design process. The goal of formative tests is to find out something about the experience and behaviour of the participants, for example, if they find something confusing or strange. Unlike summative tests, these tests often answer the “why” behind a specific behaviour.

Formative tests answer questions such as:

How do people experience our design?
Where do they get stuck, and why?
What are the biggest design problems/challenges that we should address next?

Formative tests can help identify potential for optimisation, so they should ideally be run early in the design process, for example with click-through prototypes, to identify and fix problems quickly.

The result of such a test is usually qualitative rather than quantitative, for example: “Participants had difficulty completing task X because the buttons labelled OK/Cancel are confusing.”

Conduct formative tests when the goal is to uncover problems and identify further UX potential. They help shape the design of a product or service, hence the name “formative”. Because there are no statistical requirements, 7–10 test participants are enough to uncover problems that can then be fixed.

Gaining additional information with the method of thinking aloud.

Experience shows that users mainly abort tasks because they feel they cannot navigate well or are poorly informed. Formative, moderated usability tests and the method of “thinking aloud” (Ericsson & Simon, 1984) help determine this. In “thinking aloud,” participants provide a constant commentary on their thinking processes while interacting with the system. The aim is to gain information on the cognitive processes of the participants during the interaction: What is going through their minds? What questions do they have right now? In what knowledge structures do they classify the presented information? What irritates or confuses them?

Limitations of the Thinking Aloud Method

Participants can only think aloud what is conscious. However, some (in fact, very many) thought processes run below the threshold of consciousness and cannot be verbalized (Wilson, 1994).

For this reason, we recommend moderated tests when conducting formative usability tests. Moderated usability tests are an excellent tool to find the WHY behind aborted processes and poor conversion rates.

Reacting to subtle behavioural cues in moderated formative tests

In a moderated test, the moderating usability experts interact with the test subjects in real time. The usability experts sit in on the test, either remotely via video conference or on-site, and guide test subjects through the process. This is impossible in unmoderated usability tests because there is no real-time interaction with the test subjects. Through continuous observation during a moderated usability test, we can identify and note subtle behavioural cues such as facial expressions (squinting, raised eyebrows, etc.). Regardless of what is “thought aloud” (that is, verbalized), these cues often indicate that users feel uncertain but do not express it because they are not conscious of it. We therefore return to these points later in the conversation to investigate more deeply whether there was a misunderstanding or whether uncertainty prevailed. When we return to these points, we can ask the test subjects to repeat tasks or ask targeted questions, and we often gain additional valuable information.

Unmoderated formative tests

Unmoderated formative usability tests make targeted questioning impossible because participants complete the sessions by themselves: the test subjects usually conduct the test remotely from home using special online tools. These sessions are recorded as audio and video so that usability experts can view and evaluate them afterwards. There is no real-time interaction with the test subjects. Nevertheless, it is possible to include questions in the study. They can be displayed after each task (e.g., “How difficult did you find that?”) or at the end of the session. However, these questions are usually standardized and the same for all test subjects. In unmoderated sessions, there is no possibility to ask detailed questions that relate specifically to the behaviour of an individual participant.

A further disadvantage is that participants think aloud less in unmoderated sessions, simply because no one is there to remind them. We have observed that participants in unmoderated sessions become quieter over time, which is unfortunate because we then cannot know what they are thinking while completing a task.

In addition, participants may drop out, skip tasks, or be less motivated to complete the tasks. We rarely find out why a participant drops out. Did the technology not work? Did they lose interest? Were they interrupted, or was the task too difficult? As a result, some sessions may not be usable. In moderated tests, the social pressure of direct observation creates more motivation to perform the tasks or engage with them.

The lack of detailed follow-up questions is the primary disadvantage of unmoderated tests, especially if they are conducted in an early design phase. The primary reason teams decide to run unmoderated tests is the perception that they save time. While it is true that moderators do not spend time interacting with participants one-on-one, the test results, in our opinion, come with a significant loss of knowledge. Additionally, an unmoderated usability test requires as much planning as a moderated test, if not more. For teams interested in conducting unmoderated tests despite these disadvantages, we recommend running them exclusively for functional systems such as live websites. Non-functional systems such as clickable prototypes could raise too many questions. Personally, we prefer a moderated session over an unmoderated session, as moderated sessions typically provide more insights.

The decision about which type of usability test to conduct (summative or formative, moderated or unmoderated) depends on what you want to find out and how. Summative tests provide information about the efficiency of a functional prototype or a finished product. Formative tests help identify problem areas and UX potential. They are helpful during the early stages of the design process or with finished products. In formative tests, moderation offers the great advantage of specific follow-up questions and thus significantly increases the chances of obtaining detailed insights for improving and optimizing the UX of the system.

Literature

Ericsson, K. A., & Simon, H. A. (1984). Protocol analysis: Verbal reports as data (p. 426). The MIT Press.
Wilson, T. D. (1994). The Proper Protocol: Validity and Completeness of Verbal Reports. Psychological Science, 5(5), 249–252. https://doi.org/10.1111/j.1467-9280.1994.tb00621.x

Illustration

Web illustrations by Storyset
