As a general rule, the more people you include in any kind of test, the better your sample is likely to represent the population as a whole. This applies to questionnaires, polls and observations, and also to where people look when they view adverts, packaging, homepages, landing pages, emails, etc. However, in the real world of budgetary and time constraints, you want to test with as few people as possible whilst still being confident that your results are reliable enough to base decisions on. There are ways to determine the appropriate numbers to achieve this (we won’t get into the statistical theory here), but it is important to note that the minimum number of people depends on what you are measuring.
Traditional market research often uses samples of 100 or more people. At Think Eyetracking we often test 30 people. Why is this? Broadly, it is because what we are measuring is entirely different. Traditional market research usually asks the consumer’s conscious mind for its opinion or attitude; at Think we interrogate what the subconscious mind ‘does’ (rather than what the conscious mind thinks it does!) by recording fixations and saccades. Fixations and saccades are ‘things’ that can be counted, rather than abstract concepts such as opinions and attitudes; because we are counting ‘things’, we can often use smaller samples than traditional market research and still achieve statistically reliable results.
As a demonstration, here is part of a more formal Monte Carlo simulation we did:
We asked 150 people to look through the pages of a magazine as they normally would; the pages contained a mix of advertisements and editorials. The Dove advertisement below was included.
We then created four heatmaps from four different, randomly assigned groups of 30 drawn from the total sample of 150.
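The random grouping step can be sketched in a few lines of Python. This is only an illustration, not the analysis we actually ran: the `dwell_share` values are simulated stand-ins for real eyetracking data (a hypothetical "share of viewing time spent on one area of the advert" per participant), and the helper name `split_into_groups` is our own invention for this example.

```python
import random


def split_into_groups(participant_ids, group_size, n_groups, seed=0):
    """Randomly assign participants to disjoint groups of equal size."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return [shuffled[i * group_size:(i + 1) * group_size] for i in range(n_groups)]


# ASSUMPTION: simulated data, not real measurements. Each of 150 participants
# gets a made-up share of viewing time (0.0 to 1.0) spent on one area of interest.
data_rng = random.Random(42)
dwell_share = {
    pid: min(1.0, max(0.0, data_rng.gauss(0.35, 0.10)))
    for pid in range(150)
}

# Four randomly assigned groups of 30 from the total sample of 150.
groups = split_into_groups(dwell_share.keys(), group_size=30, n_groups=4, seed=1)

# Compare each group's average with the average over all 150 participants.
group_means = [sum(dwell_share[p] for p in g) / len(g) for g in groups]
overall_mean = sum(dwell_share.values()) / len(dwell_share)

for number, mean in enumerate(group_means, start=1):
    print(f"group {number}: mean dwell share = {mean:.3f}")
print(f"all 150:  mean dwell share = {overall_mean:.3f}")
```

If the group averages sit close to each other and to the full-sample average, that suggests that any one group of 30 would have told much the same story for that measure.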
In these heatmaps you can see that each sample of 30 produced a similar pattern of visual behavior. And when you run the numbers, these are essentially the same too.
So that’s a fairly simple explanation, but let’s look a little closer at what’s involved. There are three key elements that can affect the necessary sample size:
- A) the complexity of the stimuli being shown
- B) the task being set and
- C) the variability in the sample
A stimulus with limited complexity and a directive task both reduce the need for a larger sample. For example, look at figures 1 and 2 below.
It is easy to see that with a less complex stimulus, as in figure 1, there will be less variability in people’s visual behavior, and so a smaller sample is needed to find common patterns than would be necessary for figure 2. Similarly, if the task for figure 2 were to find the two black squares, there would be more common behavior than if the task were simply to look at the image, and so a smaller sample could be used to find common visual behavior.
We have been running commercial eyetracking studies for over six years now (2002 to 2008 at time of writing), and during this time we have become very good at managing elements A through C (stimulus complexity, task, and sample variability) to ensure that a sample of 30 can be enough to provide statistically significant and robust results.
That said, sometimes it is necessary to test with samples greater than 30. Unsurprisingly, this is when the differences between the things we are testing are very subtle, or when the image is very visually complex. In these cases we can advise on an appropriate sample size.
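To give a rough feel for why subtle differences and high variability push the numbers up, here is a minimal sketch using the standard textbook normal-approximation formula for comparing the means of two groups. It is not the method we use to advise clients, and the function name `per_group_sample_size` and all input values are hypothetical, chosen purely for illustration.

```python
import math
from statistics import NormalDist


def per_group_sample_size(std_dev, min_difference, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a difference in means.

    Textbook normal approximation for a two-sided, two-group comparison:
        n = 2 * ((z_{1 - alpha/2} + z_{power}) * std_dev / min_difference) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * std_dev / min_difference) ** 2
    return math.ceil(n)


# ASSUMPTION: illustrative numbers only (e.g. share of viewing time on an area).
clear_difference = per_group_sample_size(std_dev=0.10, min_difference=0.08)
subtle_difference = per_group_sample_size(std_dev=0.10, min_difference=0.04)
more_variable = per_group_sample_size(std_dev=0.20, min_difference=0.08)

print("clear difference, low variability: ", clear_difference)
print("subtle difference, low variability:", subtle_difference)
print("clear difference, high variability:", more_variable)
```

Because the required number grows with the square of the variability divided by the difference, halving the difference you want to detect, or doubling the variability, roughly quadruples the sample needed.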
Posted By Lizzie Maughan and Robert Stevens.