A Guide to Stratified Sampling (Stratification)
Kimberly Surico | 09/13/22 | 5 min read

When a researcher wants to study a large population, it is usually more feasible to select a small group to represent the whole. To ensure that this small group, known as a sample, really does represent the entire population, the researcher can use any of a number of sampling techniques.

One of these techniques is stratified sampling, which is popular for its accuracy in studying populations composed of diverse subgroups.

What Is Stratified Sampling (Stratification)?

Stratified sampling, also referred to as stratification, is a method used to select samples from a large target population based on the distinct subgroups contained within it. A subgroup is known as a stratum (pl. strata) and it is made up of individuals with shared characteristics. These include race, age, gender, occupation, etc.

There are many applications of stratification in government, academia, healthcare, and commerce. In business, it can be applied to studies aimed at optimizing service delivery, developing a target customer base, and making strategic investments.

Stratification divides a population into strata.

The researcher starts by defining the population they want to study based on its general characteristics. Then they stratify it, i.e., divide it into subgroups according to specific attributes.

The process is as follows: define the population being studied (or draw a large initial sample from it); identify the distinguishing attributes and divide it into subgroups based on those qualities; randomly select a smaller sample from each subgroup; study these smaller samples collectively.

To ensure samples are truly representative of the population, researchers use a formula so that the size of each stratum in the sample is proportionate to the size of that stratum in the population.

For instance, the current human population ratio of males to females is roughly 1:1. Hypothetically, if you were conducting a universal study on a gender-based subject, you might start by selecting a large sample of 1,000,000 people from a target location – half of each gender.

But this may still be too many people for one study. You may decide that you can still get an accurate analysis from just 200,000 people. The next step is to choose 200,000 people from the large sample you have. So, how do you choose?

Use the formula (target sample size / population size) x stratum size for each stratum. Here, each stratum holds 500,000 people, so the calculation is (200,000 / 1,000,000) x 500,000 = 100,000. Hence, you pick 100,000 males and the same number of females for your stratified sample. What you find from this can be used to draw conclusions about the entire population.

The same can just as easily be done in cases where the ratio is not simply 1:1. (Read on to see how we break down a business case example.)

This is referred to as proportionate stratification.
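The formula can be sketched in a few lines of Python. This is a minimal illustration, not part of any particular tool: the function name `proportionate_allocation` is made up for this example, and the rule for handing out leftover units when the shares do not divide evenly (largest remainder first) is an assumption, since the formula itself does not say how to round.

```python
def proportionate_allocation(stratum_sizes, target_sample_size):
    """Apply (target sample size / population size) x stratum size to each stratum.

    stratum_sizes maps a stratum name to its size in the population.
    Rounding rule (an assumption for this sketch): floor every share, then
    give the leftover units to the strata with the largest remainders.
    """
    population_size = sum(stratum_sizes.values())
    allocation = {}
    remainders = {}
    for name, size in stratum_sizes.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        whole, remainder = divmod(target_sample_size * size, population_size)
        allocation[name] = whole
        remainders[name] = remainder

    leftover = target_sample_size - sum(allocation.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[name] += 1
    return allocation


# The gender example from above: 1,000,000 people, half of each gender.
gender_strata = {"male": 500_000, "female": 500_000}
print(proportionate_allocation(gender_strata, 200_000))
# {'male': 100000, 'female': 100000}
```

When every share is a whole number, as it is here, the rounding step does nothing and the result is exactly what the formula gives.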

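Stratification can also be disproportionate, where the shares of the strata in the sample do not match the population distribution. This is typically done to give small strata enough members to analyze, and the results then have to be weighted before they can describe the population as a whole. Proportionate stratification is generally the more straightforward approach because the sample mirrors the population directly.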

When Should You Use Stratified Sampling?

Stratification is an effective method of sampling because it gives the researcher a more accurate understanding of a population that contains distinct subpopulations. This is where the approach is best suited.

Conversely, it is ineffective if the population cannot be divided into strata. Similarly, if there are too many subgroups, classifying subjects may be so tedious and error-prone that it cancels out the advantages of the process.

So, use stratification when you have subgroups within the population under study in order to get a better overall representation of the reality.

Alternatively, you can use simple random sampling, which is quicker but may be less accurate in certain situations. In our example above, a researcher taking this approach would simply have selected 200,000 people from the general public and studied those.

This could be unrepresentative because the sample may, purely by chance, contain more of one gender than the other.
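To see why, here is a small sketch of simple random sampling using Python's standard `random` module. The population is scaled down to 1,000 people for speed, and the function name `simple_random_sample_counts` is hypothetical; the point is only that nothing in the procedure fixes how many members of each gender end up in the sample.

```python
import random
from collections import Counter


def simple_random_sample_counts(population, sample_size, seed=None):
    """Draw a simple random sample and count how many members of each group it holds.

    population is a list of group labels, one per person. No strata are
    used, so the split between groups is left to chance.
    """
    rng = random.Random(seed)
    return Counter(rng.sample(population, sample_size))


# Scaled-down version of the example: an even 1:1 population.
population = ["male"] * 500 + ["female"] * 500
print(simple_random_sample_counts(population, 200))
```

Running this several times will usually give slightly different splits, whereas a proportionate stratified sample would hold exactly 100 of each gender every time.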

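Another sampling technique is cluster sampling, which divides the population into clusters and then studies a random selection of entire clusters. For the results to hold, each cluster should be reasonably representative of the population.

For instance, you can divide the population into clusters of 200,000 people and randomly choose one of them to study in full. Like stratification, this relies on grouping the population, but it samples whole groups rather than drawing members from every group.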
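As a rough sketch, cluster sampling amounts to picking whole groups at random and keeping every member of the chosen groups. The function name `cluster_sample` and the region names below are invented for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters whole clusters and keep all of their members.

    clusters maps a cluster name to the list of its members.
    """
    rng = random.Random(seed)
    # sorted() gives a list of names in a fixed order, so a seed is reproducible.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


# Hypothetical clusters: five regions of four people each.
regions = {
    f"region_{i}": [f"person_{i}_{j}" for j in range(4)]
    for i in range(5)
}
picked = cluster_sample(regions, 2)
print(sorted(picked))
```

Note the contrast with stratification: here some groups are left out entirely, while a stratified sample takes some members from every group.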

There is also systematic sampling, where subjects are selected at a fixed interval from an ordered list, usually beginning at a random starting point, like, say, handing out a survey to every tenth person. In our example, we could compile a database of 1,000,000 people and enter every fifth person into the study to reach our target sample size of 200,000.
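The every-fifth-person rule is easy to express in code. The sketch below assumes the database is an ordered sequence and picks the starting position at random; the function name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(population, step, seed=None):
    """Take every step-th member of an ordered sequence, from a random start."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random starting position in 0 .. step - 1
    return population[start::step]


# A database of 1,000,000 people, represented here by their row numbers.
database = range(1_000_000)
sample = systematic_sample(database, 5)
print(len(sample))  # 200000
```

Because 1,000,000 divides evenly by 5, the sample size is 200,000 whichever starting position is drawn.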

All these methods have a common feature: They rely on random selection under predefined rules, so that every member of the population has a known, non-zero chance of being part of the sample. This is known as probability sampling.

Stratification is a probability sampling technique.

How To Create a Stratified Sample

Let’s say you have a pet business with customers across the United States and you want to know the issues pet owners have so your business can address them. You soon realize that more than 90 million families own at least one pet.

To select your sample, you use NetBase Quid® to discover conversations about pets on social media. There are a ton of these conversations, so you keep filtering until you are left with conversations among pet owners in the U.S. only.

With further filtering on the platform, you find that 5 million individual users have talked about their pets over the past eight months. Perfect.

NetBase Quid® can give you a 10,000 foot view of the landscape easy-peasy by analyzing these conversations for prevailing themes and sentiments. But maybe you want a closer look through direct feedback on a survey.

You can’t survey 5 million people, so you decide to select a sample of 5,000 pet owners. But pugs aren’t the only pets you’re interested in. Your sample therefore needs to represent every type of pet owner that is relevant to your business.

You conduct further external secondary research and find that in the U.S. dogs are the absolute favorite, owned by 60% of pet owners. Next come cats at 20%, then fish at 12%, and lastly, among your target animals, birds at 8%. (These numbers are only for this illustration, not facts.)

Proportionate stratification requires that your sample reflects this distribution. Thus, you select 3,000 dog owners (60% of 5,000), 1,000 cat owners (20%), 600 fish owners (12%), and 400 bird owners (8%), for a total of 5,000 people.

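You have a large pool to choose from, so ensure that each member of the study belongs to only one class/stratum. Someone who owns both a dog and a cat, for example, should be counted in just one of those groups.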
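Once each person carries exactly one stratum label, drawing the sample is a matter of grouping by label and picking at random within each group. The sketch below is a minimal illustration under stated assumptions: the user IDs are invented, the pool is scaled down to 10,000 people, and `stratified_sample` is a hypothetical helper, not a NetBase Quid® feature.

```python
import random
from collections import defaultdict


def stratified_sample(records, sample_size, seed=None):
    """Draw a proportionate stratified sample.

    records is a list of (member_id, stratum) pairs, one pair per member,
    so nobody can land in two strata. Each stratum's share is rounded
    down, so the total may fall slightly short of sample_size when the
    shares are not whole numbers.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member_id, stratum in records:
        strata[stratum].append(member_id)

    total = len(records)
    sample = {}
    for stratum, members in strata.items():
        k = sample_size * len(members) // total
        sample[stratum] = rng.sample(members, k)  # random pick within the stratum
    return sample


# Hypothetical pool of 10,000 users with the illustrative 60/20/12/8 split.
pool = (
    [(f"dog_owner_{i}", "dog") for i in range(6_000)]
    + [(f"cat_owner_{i}", "cat") for i in range(2_000)]
    + [(f"fish_owner_{i}", "fish") for i in range(1_200)]
    + [(f"bird_owner_{i}", "bird") for i in range(800)]
)
sample = stratified_sample(pool, 5_000)
print({stratum: len(members) for stratum, members in sample.items()})
# {'dog': 3000, 'cat': 1000, 'fish': 600, 'bird': 400}
```

The counts match the allocation worked out above, while the specific people chosen within each stratum change from run to run unless a seed is set.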

You can then administer your survey through your preferred software. As the feedback rolls in, you can use NetBase Quid®’s Intelligence Connector to integrate your survey software into your data analysis ecosystem.

Now you can observe what your target customer base wants with greater confidence in the accuracy of your study, because stratification has allowed you to represent the population of pet owners in the U.S. in its true proportions.

Advantages and Disadvantages of Stratification

By requiring that strata represent the entire population in proportion, stratification takes into account the key features of the area under study. It is therefore the go-to strategy for studying populations with varied attributes.

However, stratification works under certain conditions which cannot easily be met in all studies. If subgroups cannot be formed, it falls apart. If each member cannot fit into only one subgroup, it falls apart.

Also, if there is no secondary information to set the foundation for the strata sizes, the researcher has to find and classify all the members of a population. This may often be impossible.

All in all, stratification has its place in modern market research, especially aided by advanced technology. See how we were able to easily find 5 million people for our study in the example? That is the power of AI-based consumer and market intelligence technology.

You can have this same power in your hands so you don’t miss out on the benefits of a great sampling technique on account of insufficient or inaccessible data. Reach out for a demo today and we will show you what you have been missing.
