CEO’s Perspective to Statistical Analysis
Sampling Techniques

This is the second in a series of white papers published by KINDUZ Consulting on Statistical Analysis. This paper is an introduction to population and samples, why we use samples, and the various sampling strategies that can be applied.



Context for the Reader

You are the Global Chief Executive of KINDUZ Corp., a global organization with revenue of USD 10 billion. The vision of KINDUZ is “Creating Universal Prosperity”. In line with its vision, KINDUZ has established itself in multiple industries, including Pharmaceuticals, Oil and Gas, IT/ITES, Automotive Manufacturing, and Cement.

You and your entire team believe that Creating Universal Prosperity is achieved by aligning organizational goals with the individual goals of stakeholders (Employees, Stockholders, Customers, and Societies).

Approach of the Whitepaper

This whitepaper is one in a series of whitepapers published by KINDUZ to give CEOs a deeper understanding of tools and techniques and how to apply them, so that they can deliver sustainable outcomes.

The learning in the whitepaper is structured around cases, where the concepts and their applications are explained through the case description, analysis, and solutions.

Executive Summary

CEOs take important decisions about their organizations based on the questions they ask, the data provided in response to these questions, and the analysis of this data. Typically, the data used for analysis is a small portion of the entire data. The entire data set is called the population, and the small portion selected is called the sample.

It is important to understand why we sometimes use samples instead of the population, and having decided to use samples, what sampling technique will give us the most representative picture.

“To understand God’s thoughts,
one must study statistics… the measure of his purpose.”

– Florence Nightingale

This white paper first enables a CEO to understand the differences and similarities between a population and a sample. It then discusses the following sampling techniques, which are categorized as probability sampling:

  • Simple random sampling
  • Stratified sampling
  • Systematic sampling
  • Cluster sampling (single stage and two stage)
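The four techniques listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the method used in the white paper's cases; the function names, parameters, and the `seed` argument are assumptions introduced here for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every unit has the same chance of selection; drawn without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start, then every k-th unit, where k = len(population) // n."""
    rng = random.Random(seed)
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Split units into strata, then draw a simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Single stage: keep every unit in the randomly chosen clusters.
    Two stage: additionally draw a simple random sample inside each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    if n_per_cluster is None:
        return {c: list(clusters[c]) for c in chosen}
    return {c: rng.sample(clusters[c], n_per_cluster) for c in chosen}
```

Note the structural difference between the last two functions: stratified sampling draws from every group, whereas cluster sampling draws only from a random subset of groups.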

In this paper, we focus exclusively on probability sampling techniques. In non-probability sampling, some units have no chance, or an unknown chance, of being selected, so the data could be biased. It is therefore not apt for the holistic decision making process of a CEO.

For example, KINDUZ Pharma had to recall an entire batch of medicines because of an assay stability failure. The team at KINDUZ Pharma wants to study why the batch failed on assay stability. If the team collects random samples from all batches (both those that passed and those that failed assay stability) to study the impact of process variables on assay stability, this is probability sampling. If the team collects data samples only from the batch that failed on assay stability, this is non-probability sampling. Samples collected from the failed batch alone are not representative of the entire population (all batches produced in a specified time period) and will not enable the KINDUZ Pharma team to holistically find the causes of the assay stability failure.
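The difference can be made concrete with a small simulation. The sketch below is purely illustrative: the batch records, the `drying_hours` process variable, and the rule that long drying causes failure are all invented assumptions, not data from KINDUZ Pharma. It also uses several failed batches rather than the single recalled batch.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 200 batches, each with one process variable and an outcome.
# The rule "drying for 9 hours or more leads to failure" is invented for illustration.
batches = []
for batch_id in range(200):
    drying_hours = rng.uniform(4.0, 10.0)
    batches.append({"id": batch_id,
                    "drying_hours": drying_hours,
                    "passed": drying_hours < 9.0})

def mean_drying_hours(sample, passed):
    """Average drying time of the batches in `sample` with the given outcome."""
    values = [b["drying_hours"] for b in sample if b["passed"] == passed]
    return sum(values) / len(values) if values else None

# Probability sampling: every batch, passed or failed, can be drawn.
probability_sample = rng.sample(batches, 60)

# Non-probability sampling: only failed batches are examined.
failed_only_sample = [b for b in batches if not b["passed"]]

print("Random sample, passed:", mean_drying_hours(probability_sample, passed=True))
print("Random sample, failed:", mean_drying_hours(probability_sample, passed=False))
print("Failed only,   passed:", mean_drying_hours(failed_only_sample, passed=True))
```

The last line prints `None`: with no passing batches in the sample, there is nothing to compare the failed batches against, so the effect of the process variable cannot be seen.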

This paper provides five cases that explain population, samples and probability sampling techniques. The situations in which these sampling techniques should be used are summarized in the table below:

Sampling Technique    Situation to use in

Simple Random         • High risk of data manipulation
                      • Speed of sampling more important than precision

Stratified            • Population needs to be divided into homogenous
                        subpopulations to identify the specific problem area
                      • When the population shows periodicity

Systematic            • Population does not show any patterns
                      • Low risk of data manipulation

Cluster               • When dealing with a very large population where a
                        large sample size is required (relative to population)
                        and not all members of the population are equally
                        accessible
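The table's guidance on periodicity can be illustrated with a short sketch. The daily output figures and the weekly cycle below are hypothetical, chosen only to show how a systematic sample can lock onto a repeating pattern while a sample stratified by position in the cycle does not.

```python
import random

# Hypothetical daily output for 52 weeks: 100 units on six days, 40 on the seventh.
WEEKS = 52
population = [40 if day % 7 == 6 else 100 for day in range(WEEKS * 7)]
true_mean = sum(population) / len(population)

def systematic_mean(values, step, start):
    """Mean of every `step`-th value, beginning at index `start`."""
    picked = values[start::step]
    return sum(picked) / len(picked)

def stratified_mean(values, period, n_per_stratum, rng):
    """Stratify by position in the cycle, sample each stratum, then average
    the stratum means (the strata here are all the same size)."""
    stratum_means = []
    for position in range(period):
        stratum = values[position::period]
        picked = rng.sample(stratum, n_per_stratum)
        stratum_means.append(sum(picked) / len(picked))
    return sum(stratum_means) / len(stratum_means)

rng = random.Random(0)
print(round(true_mean, 2))                                # 91.43
print(systematic_mean(population, step=7, start=0))       # 100.0
print(systematic_mean(population, step=7, start=6))       # 40.0
print(round(stratified_mean(population, 7, 4, rng), 2))   # 91.43
```

Because the sampling interval matches the weekly cycle, the systematic sample sees only one day of the week and lands far from the true mean in either direction. The stratified estimate matches exactly here only because the invented data has no variation within each day of the week; with real data it would be close rather than identical.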

Based on these cases, here is a list of Do’s and Don’ts for you to leverage:

Do's:
  • Check whether your baselines are based on population data or sample data
  • Always ask about the sampling technique used and the sample size when presented with data and/or its analysis
  • Ensure the right sampling technique is used and the right sample size is chosen before undertaking any analysis or making any decision
  • Collect samples at the lowest granularity to minimize the risk of sampling error
Don'ts:
  • Assume that all sampling techniques have the same precision
  • Accept simple random sampling as the default practice
  • Confuse stratified sampling with cluster sampling (this is a common mistake)