Get hands-on practice in all the key areas of UX and prepare for the BCS Foundation Certificate.
Because surveys usually involve hundreds of respondents, many design teams value the findings from a survey more highly than the results from small-sample usability tests, user interviews and field visits. But the results of most web surveys are biased by coverage error and non-response error. This means surveys, like most qualitative data in user research, should be triangulated with other sources of data.
In August this year, some researchers asked people to examine products available on Amazon. The researchers wanted to answer a very specific question: how do people use review and rating information when choosing products?
In one experiment, the researchers asked participants to choose which of two phone cases they would buy. Both phone cases had the same (mediocre) average rating but one of the cases had many more reviews than the other.
In this experiment, the participants should choose the product with fewer reviews. This is because, with just a few reviews, the average rating is more likely to be a statistical glitch. The participants should not choose the product with a large number of ratings, because that just makes it more certain that the product really is poor quality.
It turned out that participants did exactly what they shouldn't. People chose the product that had more reviews over the product that had fewer reviews. The researchers reasoned that this is because people use the number of reviews as an indicator of a product's popularity — and their behaviour is then influenced by the crowd. Interestingly, the authors titled their paper The Love of Large Numbers.
I see this love of large numbers frequently in the field of user research. It's one of the reasons people are more likely to believe the results of a survey of 10,000 people than a usability test of 5. Surely, having such a large sample size must make the data more robust and reliable? In fact, as Caroline Jarrett has pointed out, asking one person the right question is better than asking 10,000 people the wrong question.
The problem lies with two obvious, and two less obvious, sources of error in web surveys.
Most researchers are aware of the importance of asking the right questions (I wrote more about this here). Many surveys are ruined because the user doesn't understand the question, or understands it differently to the intention of the researcher, or doesn't have an answer to the question (and there's not an 'other' or 'not applicable' option). I also come across surveys where the question asked by the researcher doesn't match the purpose of the survey, and therefore the validity of the survey is in question.
Sampling error is a second obvious source of error in surveys. When we sample, we select a proportion of people from the total population and we hope that their views are representative of the whole population.
As an example, imagine we have one million customers and we want to find out their opinion on paying for software by subscription versus owning the software outright. Rather than ask all one million, we can take a sample. Remarkably, with a sample size of just 384, we can achieve a margin of error of just 5% (at the conventional 95% confidence level). So if 60% of our sample say they would prefer to own software outright rather than pay for it on subscription, we can be 95% confident that the actual number would be somewhere between 55% and 65%. (You can read more about sample sizes here.)
You reduce sampling error by increasing the size of your sample. In the previous example, if we increase our sample size to 1066 then our margin of error would be 3%. Ultimately, when your sample is the same size as the whole population (i.e. you have a census) then you no longer have any sampling error.
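To see where numbers like 384 and 1066 come from, here is a minimal Python sketch of the standard margin-of-error calculation for a proportion. It assumes a 95% confidence level (z = 1.96), the most conservative case of a 50/50 split (p = 0.5) and a finite population correction. None of these assumptions is spelled out above, and the function names are hypothetical, chosen purely for illustration.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level (an assumption, not stated in the article)


def required_sample_size(population, margin, p=0.5, z=Z_95):
    """Sample size needed for a given margin of error, with finite population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2  # estimate for an infinitely large population
    return n0 / (1 + (n0 - 1) / population)    # shrink it for a finite population


def margin_of_error(sample_size, population, p=0.5, z=Z_95):
    """Margin of error for a simple random sample of the given size."""
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    fpc = math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error * fpc


population = 1_000_000
print(round(required_sample_size(population, 0.05)))  # sample size for a 5% margin
print(round(required_sample_size(population, 0.03)))  # sample size for a 3% margin
print(margin_of_error(population, population))        # a census has no sampling error
```

Under these assumptions the first two lines print 384 and 1066, and the last prints 0.0. Keep in mind that the formula only describes sampling error for a random sample; it says nothing about the other sources of error discussed in this article.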
Most people who commission web surveys know about sampling error. And because it's quite common to get thousands of people responding to a survey, people often think that any errors in their data are small.
Sadly, this isn't the case. This is because sampling is valid only if the sample you take is random. 'Random' means that everybody in your population has an equal likelihood of being in the sample. For example, you might sample every 1000th person in your population of one million and continue to pester them until they respond.
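As a rough illustration of what 'equal likelihood' means in practice, the sketch below draws a sample in two ways: a simple random sample, and an 'every 1000th person' sample with a randomly chosen starting point. The customer IDs, function names and seed values are all hypothetical; a real customer list would come from your own database.

```python
import random


def simple_random_sample(population_ids, sample_size, seed=None):
    """Pick sample_size members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population_ids, sample_size)


def systematic_sample(population_size, step, seed=None):
    """Pick every step-th member, starting from a random position within the first step."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return list(range(start, population_size, step))


customer_ids = range(1_000_000)  # hypothetical customer IDs 0..999,999

random_sample = simple_random_sample(customer_ids, 1000, seed=42)
every_1000th = systematic_sample(1_000_000, 1000, seed=42)

print(len(random_sample), len(every_1000th))
```

Both approaches select 1000 people, and in both every customer has the same 1-in-1000 chance of being selected. What the code cannot do is make those 1000 people respond, which is where the next two types of error come in.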
To understand what prevents us getting a truly random sample, we need to examine two further types of error that are less well understood: coverage error and non-response error.
Coverage error occurs when the research method you have chosen excludes certain people. For example, carrying out a telephone survey with respondents sampled from a directory of landline telephone numbers will exclude people who use a mobile phone (as well as people who don't have a phone at all).
Similarly, a web survey will exclude people who don't have access to the Internet such as lower socio-economic groups, the homeless and people who just don't use digital channels (currently around 10% of the population in the UK and 13% of the population in the US).
But what if you are interested only in sampling from people who visit your web site today: surely you won't suffer from coverage error then? That depends how you go about asking. Typically, web sites give users a cookie with each survey invitation to prevent the same user being repeatedly bothered by survey invites. Think of this in the context of a web site that provides information about train times. Frequent travellers are likely to use the website more than people who travel infrequently. This means many of the frequent travellers will have been given a cookie and so taken out of the sampling frame. In contrast, all of the infrequent travellers who are using the site for the first time will be included in the sampling frame. This has created a coverage error by skewing the sampling frame toward less frequent travellers.
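The cookie effect is easier to see with some made-up numbers. In the sketch below, every figure is a hypothetical assumption rather than data from a real site: 80% of today's visitors are frequent travellers who have each visited 20 times before, 20% are first-time visitors, and each earlier visit carried a 10% chance of a survey invitation (and therefore a cookie).

```python
INVITE_PROBABILITY = 0.10  # hypothetical chance of being invited (and cookied) on any one visit

# group name -> (share of today's visitors, number of earlier visits); all values hypothetical
groups = {
    "frequent": (0.80, 20),
    "infrequent": (0.20, 0),
}


def chance_of_no_cookie(prior_visits, invite_probability=INVITE_PROBABILITY):
    """Probability that a visitor was never invited on any earlier visit."""
    return (1 - invite_probability) ** prior_visits


# Share of today's visitors in each group who are still eligible for an invitation
eligible = {
    name: share * chance_of_no_cookie(visits)
    for name, (share, visits) in groups.items()
}
total_eligible = sum(eligible.values())
frame_share = {name: value / total_eligible for name, value in eligible.items()}

for name, (visitor_share, _) in groups.items():
    print(f"{name}: {visitor_share:.0%} of visitors, {frame_share[name]:.0%} of sampling frame")
```

With these assumed numbers, frequent travellers make up 80% of visitors but only about a third of the people who can still be invited. The exact figures depend entirely on the assumptions, but the direction of the skew does not.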
To understand non-response error, imagine we devised a survey to measure people's experience with the Internet. Imagine that the survey invitation appears as a pop-up when people land on a web page. It might be the case that advanced users of the Internet have installed software to block pop-ups. And even those advanced users who allow pop-ups will be more familiar than novice users with finding the 'no thanks' link (often rendered in a small, low-contrast font). In contrast, novice users of the Internet may think they need to accept the pop-up to proceed. Factors like these bias the sample because experienced Internet users will be less likely to take part.
This isn't coverage error because both advanced and novice users are equally likely to be in the sampling 'frame'. This is non-response error: non-respondents (advanced Internet users) are different from respondents in a way that matters to the research question.
Non-response error is a serious source of error with web surveys. This is because researchers tend to blast the survey to everyone as it's so easy to do: a sample size of one million doesn't cost any more than a sample size of 1000. But imagine you send the survey to one million people and find that 10,000 (1%) respond (I'm being generous here: you're more likely to get a 0.1% response rate from a pop-up survey). Although the sampling error may be small, the fact that you have such a high non-response rate (99%) is a serious potential source of bias. Those people who responded may not be representative of the total population: they may like taking surveys or they may feel well disposed to your brand or they may just have been sucked in by a pop-up dark pattern.
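Here is a small worked example of how a tiny sampling error can sit alongside a large non-response bias. The numbers are hypothetical assumptions chosen to match the 1% response rate above: half of the one million customers are satisfied with the brand, and satisfied customers are three times as likely to respond (1.5%) as dissatisfied ones (0.5%).

```python
import math

population = 1_000_000
true_share_satisfied = 0.50          # hypothetical true value in the population
response_rate_satisfied = 0.015      # hypothetical: satisfied customers respond more often
response_rate_dissatisfied = 0.005   # hypothetical: dissatisfied customers respond less often

# Expected number of responses from each group
satisfied_responses = population * true_share_satisfied * response_rate_satisfied
dissatisfied_responses = population * (1 - true_share_satisfied) * response_rate_dissatisfied
total_responses = satisfied_responses + dissatisfied_responses

overall_response_rate = total_responses / population
observed_share = satisfied_responses / total_responses

# Sampling margin of error at 95% confidence, treating respondents as a random sample
sampling_margin = 1.96 * math.sqrt(observed_share * (1 - observed_share) / total_responses)
bias = observed_share - true_share_satisfied

print(f"responses: {total_responses:.0f} ({overall_response_rate:.1%} response rate)")
print(f"survey says {observed_share:.0%} satisfied, plus or minus {sampling_margin:.1%}")
print(f"true value is {true_share_satisfied:.0%}, so the bias is {bias:.0%}")
```

Under these assumptions the survey reports that 75% of customers are satisfied with a margin of error of less than one percentage point, when the true figure is 50%. The large sample makes the wrong answer look precise.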
There are a number of ways you can control these two, less obvious, sources of bias.
First, you should start creating proper samples. Rather than ask everyone to take part and hope for a good response rate, sample around 1,500 people from your customer base and aim for a response rate of around 70% (this will give you a sample size close to 1,066, the magic number that gives a 3% margin of error for a population of one million). It's true that even with a 70% response rate you could still suffer from non-response error, but it's also true that a 70% response rate from a sample is better than a 10% response rate from everyone. Remember that the key is to select from your population randomly.
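The arithmetic behind 'sample around 1,500 people' can be sketched in a couple of lines. The function name is hypothetical, and the 70% response rate is the target suggested above rather than something you can count on.

```python
import math


def invitations_needed(target_completes, expected_response_rate):
    """Number of people to invite so that the expected number of completes meets the target."""
    return math.ceil(target_completes / expected_response_rate)


print(invitations_needed(1066, 0.70))  # invitations needed for 1,066 completes at 70%
print(round(1500 * 0.70))              # expected completes from 1,500 invitations at 70%
```

At a 70% response rate you would need to invite 1523 people to expect 1,066 completes, and inviting 1,500 gives you about 1050, which is close enough for most purposes.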
Second, control coverage error by making sure that you give everyone in your population an equal likelihood of being asked to take part. This means not using cookies to exclude frequent visitors.
Third, to control non-response error, look at ways of encouraging more of your sample to take part in your survey.
If this sounds like a lot of hard work, you're right. Doing a good survey is about more than buying a subscription to a survey tool and mass mailing your customer base.
But there's an alternative.
For many questions in user research, we're happy with a fair degree of error. It's not always necessary to have results that are statistically significant. Many times we just want to know which way the wind is blowing. So an easy alternative is to accept that your sample isn't representative of your total population. Then use your survey as an indicator that you can triangulate with other user research data, like field visits, user interviews and usability tests.
This approach has an added benefit. It will help you avoid thinking that your survey results are somehow more credible because you have a larger sample.
Thanks to Philip Hodgson, David Hamill and Caroline Jarrett for comments on an earlier draft of this article.
Dr. David Travis (@userfocus) has been carrying out ethnographic field research and running product usability tests since 1989. He has published three books on user experience including Think Like a UX Researcher. If you like his articles, you might enjoy his free online user experience course.
copyright © Userfocus 2021.