Data Saturation – numbers left out in the rain, or something else?

Data saturation

Data saturation is a term used in research to indicate that no new information is expected to be added that will enhance or change the findings of a study. Data saturation is important to achieve. It is reached when there is enough information to replicate the study, when the ability to obtain additional new information has been attained, and when further coding (identification of themes) is no longer feasible.

Yet data saturation is often considered a neglected concept, largely because it is hard to define. What counts as data saturation for one researcher is not nearly enough for another.

There are two ways in which data saturation plays itself out in research:

1. Data saturation in sampling

When a researcher chooses respondents for a study (conducts ‘sampling’), they may do so using ‘theoretical sampling’. This means they will continue adding new units to the sample until the study has reached a saturation point; that is, until no new data are produced through inclusion and analysis of new units. Theoretical sampling is an approach to acquiring respondents for research that is related to an approach called ‘grounded theory’ and is characterised by the fact that the collection of data is controlled by the emerging theory. The researcher has to constantly look for new units and data, and justify the theoretical purpose for which each additional group is included in the study. This type of approach to sampling is uncommon because a fixed budget usually determines the design of the study and its sampling parameters in advance.

Researchers often struggle to estimate how many interviews will be required to reach data saturation and, again, the number is often dictated by project budgets. When deciding on a study design, researchers should aim for one that is explicit about how data saturation is reached. To best achieve data saturation, care should be taken to sample a cross section of the populations of interest, so that a full range of views is likely to be heard.
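One way to make a saturation rule explicit is to write it down as a simple stopping rule. The sketch below is only an illustration and is not a standard from the research literature: it assumes each interview has already been coded into a set of themes, and it declares saturation once a chosen number of consecutive interviews adds no new themes. The function name, the `stop_after` threshold of two and the example themes are all hypothetical.

```python
def interviews_until_saturation(coded_interviews, stop_after=2):
    """Return the 1-based number of the interview at which saturation is
    declared, or None if the stopping rule is never met.

    coded_interviews: themes (codes) identified in each interview, in the
                      order the interviews were conducted.
    stop_after:       assumed threshold; how many consecutive interviews
                      with no new themes count as saturation.
    """
    seen_themes = set()
    interviews_without_new = 0

    for number, themes in enumerate(coded_interviews, start=1):
        new_themes = set(themes) - seen_themes
        seen_themes |= new_themes

        # Reset the run whenever an interview contributes something new.
        interviews_without_new = 0 if new_themes else interviews_without_new + 1

        if interviews_without_new >= stop_after:
            return number

    return None


# Hypothetical coded interviews, purely for illustration.
interviews = [
    {"price", "trust"},
    {"trust", "convenience"},
    {"price"},
    {"convenience", "trust"},
    {"service"},
]

print(interviews_until_saturation(interviews))
```

With these example themes, the third and fourth interviews add nothing new, so the rule is met at the fourth interview. Note that the fifth interview would have raised a new theme, which shows how sensitive such a rule is to the threshold chosen.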

2. Data saturation in qualitative interviews

In-depth interviews and focus groups are two commonly used methods of qualitative research. They each involve the search for depth of meaning, unlike a quantitative survey, which tends to focus on closed-ended questions such as yes/no or rating scales. A focus group or in-depth interview is an exploratory form of research. It is open ended and less formally structured than a survey. The interviewer needs to investigate the topic of interest with the respondent until there is nothing left to add. This may be done by using questions at the end of the interview such as ‘Anything else?’ or ‘Do I need to know anything other than what I have asked you?’ This is done to ensure that saturation has been achieved; that there is nothing else to add to the topic of interest.

Failure to reach data saturation in qualitative research has an impact on the quality of the research and compromises the validity of the content. However, there is no one-size-fits-all approach to obtaining data saturation. There are data collection methods that are more likely to reach data saturation than others, although these methods are highly dependent on the study design.

Unfortunately, data saturation can really only be known after the fact, once qualitative interviews have been conducted and the data have been analysed. Yet market research is typically planned, justified and costed ahead of time. So, in reality, achieving data saturation must be a combination of sensible sampling, good research design, well-designed research tools, and the reality of the commercial parameters of the project.

Otherwise, your findings may as well be left out in the rain.
