Sampling is the statistical process of selecting units from a population of interest so that, by studying the sample, one can generalize the results back to the population from which the units were chosen. It is a basic part of statistical practice, concerned with selecting individual observations that are meant to provide information about a target population and to support statistical inference.
Each observation measures one or more properties of the target entity, such as color, weight, age, gender, or location, so that the entities can be distinguished from one another. There are two basic approaches to sampling: probability sampling and nonprobability sampling.
Both approaches share a set of common steps: define the target population to be examined; set the sampling frame, which is the set of events or items that can be measured; specify the sampling method to be used; decide the sample size, which can range from a few units to millions; implement the sampling plan; carry out the sampling and data collection; and finally review the sampling process (Deming, 1984). This paper discusses both types of sampling.
Probability Sampling and Applications
Probability sampling is any method that uses some form of random selection. It rests on the principle that every unit in the target population has a known chance of being selected; in the simplest designs that chance is equal for all units. Informal ways of making a random selection include drawing a name written on a slip of paper from a hat or drawing the shortest straw. In both cases, each slip of paper or each straw has an equal probability of being selected.
Today, computers are used to make the required random selection. Some basic terms used in probability sampling are: N is the number of items in the sampling frame; n is the number of items in the sample; NC is the number of possible combinations (subsets) of n items that can be drawn from the N items in the sampling frame; and f, calculated as the ratio n/N, is the sampling fraction.
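The terms above can be made concrete with a small calculation. The sketch below is only illustrative: the function name `sampling_terms` and the frame and sample sizes are assumptions chosen for the example, not values from the cited sources.

```python
from math import comb

def sampling_terms(N, n):
    """Return (number of possible samples, sampling fraction) for a frame
    of N units and a sample of n units."""
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and N")
    possible_samples = comb(N, n)  # subsets of size n that can be drawn from N
    sampling_fraction = n / N      # f = n / N
    return possible_samples, sampling_fraction

# Hypothetical frame of 1,000 units and a sample of 50 units.
count, f = sampling_terms(1000, 50)
print(f"sampling fraction f = {f:.1%}")
print(f"the number of possible samples has {len(str(count))} digits")
```

Even for a modest frame, the number of possible samples is very large, which is why the selection is left to a random number generator rather than enumerated.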
Probability sampling includes several methods: simple random sampling, systematic sampling, stratified sampling, and cluster sampling, which is also called multistage sampling. These are briefly explained below, together with examples of marketing applications in which they are used (Levy, 2008).
Simple random sampling
In this method, every possible sample of a given size drawn from the sample frame has an equal probability of being chosen. The sample frame is not subdivided or partitioned. The objective is simple: select n units from a sample frame of N so that each possible combination of n units has the same chance of being selected. This method suits simple selection procedures such as picking the winner of a lottery or jackpot. At small events, tickets are usually sold with a running number printed on both the ticket and the counterfoil stub that is retained by the customer.
Typically, all the tickets are placed in a box, a guest is asked to pull one out, and the number printed on it is declared the winner. Microsoft Excel has a function, =RAND(), that returns a random number; filling a column with it and sorting the rows by that column gives a random selection. This type of sampling is crude, and its results may not be representative enough for detailed market research and similar activities (Statistics Canada, 2008).
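The ticket draw described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical event with 500 numbered tickets; the function name `simple_random_sample` and the seed are introduced here only for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each subset equally likely."""
    rng = random.Random(seed)         # a seed makes the draw repeatable
    return rng.sample(list(frame), n)  # sampling without replacement

# Hypothetical raffle: tickets numbered 1 to 500, three winners drawn.
tickets = range(1, 501)
winners = simple_random_sample(tickets, 3, seed=42)
print(winners)
```

Sampling without replacement mirrors the physical draw: once a ticket has been pulled from the box it cannot be pulled again.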
Systematic sampling
In this method, units are selected at a fixed interval. An example would be to select every 20th name from a telephone directory, which would be called 1-in-20 sampling. The procedure is to number the units in the sample frame from 1 to N and then decide on the sample size n. The interval size k is then N/n. Next, a random integer between 1 and k is selected as the starting point.
Once this is done, every kth unit from that starting point is taken. The method assumes that the list is randomly ordered with respect to the property being measured, and it can be subject to errors of periodicity. If the list has a periodic pattern whose period coincides with the sampling interval, the sample will be biased. This method could be used, for example, to study the shopping habits of every 10th visitor to a shopping mall (Statistics Canada, 2008).
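The interval procedure described above can be written out as a short sketch. The function name `systematic_sample`, the use of integer division for k when N is not a multiple of n, and the directory of 1,000 entries are assumptions made for illustration.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th unit after a random start between 1 and k."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and N")
    k = N // n                      # sampling interval (assumed: rounded down)
    rng = random.Random(seed)
    start = rng.randint(1, k)       # random 1-based starting position
    positions = range(start, N + 1, k)
    # Trim to n units in case N is not an exact multiple of n.
    return [frame[p - 1] for p in positions][:n]

# Hypothetical directory of 1,000 entries, sample of 50, so k = 20.
directory = list(range(1, 1001))
sample = systematic_sample(directory, 50, seed=1)
print(sample[:5])
```

Only the starting point is random; every later unit is fixed by it, which is why a periodic pattern in the list can bias the result.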
Stratified sampling
This method is used when the sample frame can be organized into different categories. The sample frame is split into these categories, which are called strata. Individual samples are then selected from each stratum by simple random or systematic sampling, and together they form the stratified sample. This type of sampling is more refined, as it allows specific groups in the target population to be properly considered, and it improves efficiency by giving more control over how the sample is composed.
The study can be made more accurate by varying the sample size across strata: where a stratum shows more variation, its sample size can be increased so that it is proportional to the stratum's standard deviation. For example, if one wanted to know the buying patterns of people grouped by the first letter of their surnames, most surnames would probably begin with letters such as A, B, and C, and relatively few with letters such as X and Z. With this method, proportional sample sizes can be allocated to X and Z as well as to A and B so that the study is representative (Trochim, 2006).
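A minimal sketch of stratified selection follows. It assumes allocation proportional to stratum size, which is only one possible rule; allocating in proportion to each stratum's standard deviation, as mentioned above, would change the `size` calculation. The function name `stratified_sample` and the surname counts are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n, seed=None):
    """Split units into strata, then draw a simple random sample from each
    stratum in proportion to its share of the frame."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    total = len(units)
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding means the sizes may not sum
        # exactly to n. Every stratum gets at least one unit.
        size = max(1, round(n * len(members) / total))
        size = min(size, len(members))
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical frame: (first letter of surname, customer id).
customers = (
    [("A", i) for i in range(600)]
    + [("B", i) for i in range(300)]
    + [("Z", i) for i in range(100)]
)
by_letter = stratified_sample(customers, lambda c: c[0], n=50, seed=7)
print({letter: len(chosen) for letter, chosen in by_letter.items()})
```

Guaranteeing at least one unit per stratum keeps rare groups, such as surnames beginning with Z, from dropping out of the sample entirely.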
Cluster and Multistage method
In this method, the sample is organized into clusters, for example people from a specific region or from a certain period. The process has multiple stages: in the first stage, areas are chosen as a sample of clusters, and in the next stage further samples are chosen from within them. Some skill is involved in organizing the clusters so that the sample frame does not wrongly exclude or include marginal groups. Careful cluster formation helps to limit the variability that usually occurs.
This method is useful when large sample frames are used, as it reduces the cost of sampling. For example, to poll which candidate people prefer in a US presidential election, it is not possible to go to every area of the US and ask people whom they would vote for. Instead, a few cities are chosen that cover different kinds of areas, such as low-income, high-income, white, black, or other minority areas, and people within them are then selected at random or by some other process and asked their opinion. The results might indicate that candidate X is preferred among blue-collar workers, that candidate Y is more popular among white voters, and so on (Trochim, 2006).
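The two stages described above can be sketched as follows. The city names, voter identifiers, and the function name `two_stage_cluster_sample` are all hypothetical, and both stages are assumed here to use simple random selection.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units within each
    chosen cluster at random."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1
    return {
        name: rng.sample(clusters[name],
                         min(n_per_cluster, len(clusters[name])))  # stage 2
        for name in chosen
    }

# Hypothetical frame: six cities, each with 200 registered voters.
cities = {
    f"city_{c}": [f"voter_{c}_{i}" for i in range(200)]
    for c in "ABCDEF"
}
poll = two_stage_cluster_sample(cities, n_clusters=2, n_per_cluster=25, seed=3)
print({city: len(voters) for city, voters in poll.items()})
```

Interviewers only need to travel to the chosen cities, which is where the cost saving comes from.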
Non Probability Sampling and Applications
In non-probability sampling, samples are not selected at random, so probability theory cannot be applied to them. There is usually a specific purpose behind the study and the sample selection, and because the method does not rely on probability theory it is possible to select samples that are either favorable or unfavorable to the objective, which may introduce bias. This method is used in special circumstances in social research or technology studies where random sampling is not sensible.
For example, the efficacy of a particular drug for a disease would be best assessed by selecting well-known doctors and health professionals and obtaining their views, rather than by large-scale sampling of non-medical people who are not qualified to judge the medical aspects of a drug. There are two main methods: accidental and purposive (Stern, 2004).
Accidental sampling
In this method, opinions are sought from the general public in an accidental or haphazard way, and there is no means of controlling the sample. Examples are opinion polls on issues such as abortion, the Iraq war, and the WTO. Passers-by or viewers are approached or encouraged to give their opinions on a subject, and the replies are then analyzed to frame the results.
In some TV shows and reality shows, viewers are asked to call or send an SMS to make a choice or give their opinion on the topic being discussed, and the results are then presented as the general opinion. Such methods tend to be biased, as only people who feel strongly for or against an issue may choose to respond. The sample is not representative of the whole population, and the results are frequently skewed (Stern, 2004).
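The bias described above can be illustrated with a small simulation. Everything in it is an assumption made for the sake of the example: the five-point opinion scale, the response rates, and the function names are hypothetical and are not drawn from any real poll.

```python
import random

def call_in_poll(population, response_rate, seed=None):
    """Keep only the people who choose to respond; the chance of responding
    depends on how strongly they feel (absolute value of their opinion)."""
    rng = random.Random(seed)
    return [op for op in population if rng.random() < response_rate[abs(op)]]

def share_strong(opinions):
    """Fraction of opinions at either extreme of the scale."""
    return sum(1 for op in opinions if abs(op) == 2) / len(opinions)

# Hypothetical viewers with opinions from -2 (strongly against)
# to +2 (strongly for), spread evenly across the scale.
rng = random.Random(0)
population = [rng.choice([-2, -1, 0, 1, 2]) for _ in range(10_000)]

# Assumed response rates: people with strong views call in far more often.
response_rate = {0: 0.02, 1: 0.05, 2: 0.30}
respondents = call_in_poll(population, response_rate, seed=1)

print(f"strong views in population:  {share_strong(population):.0%}")
print(f"strong views among callers:  {share_strong(respondents):.0%}")
```

Under these assumptions, strong opinions make up a much larger share of the callers than of the viewers as a whole, which is the skew the text refers to.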
Purposive sampling
In this method, the sample is selected with a certain intent. The sample size may be left to the discretion of the researcher, who has been given guidelines on whom to pick. Examples are shopping malls, where volunteers seek out specific types of people for market research. Pharma companies or companies that deal in diet foods may pick people who are obese and ask them what they eat. Beauty and cosmetics companies may approach white or black women of a certain age group to find their preferences for cosmetics. As an inducement to participate, respondents may be given gifts of samples, which in turn act as advertisements.
This method is very commonly and effectively used by market research companies that want to target specific consumer groups, and errors due to random sampling do not arise. The method has several subcategories: Modal Instance Sampling, Expert Sampling, Quota Sampling, Heterogeneity Sampling, and Snowball Sampling. These are briefly explained below, together with examples of marketing applications in which they are used (Cochran, 1977).
In modal instance sampling, the mode, that is, the typical or most commonly occurring person in the sample frame, is identified and responses from this person are sought. The replies are not an expert view, but they may reflect the general or majority view of the population. A typical voter in a Black-dominated area might be assumed to have a school-level education, work in a blue-collar job, have a few children and a medium-cost car, and so on. A typical voter in a higher-income suburb might be assumed to have a college degree, have a high-paying white-collar job or own a business, drive an expensive car, and so on. This method creates stereotypes, and it is severely criticized because the modal profile may never represent any actual person (Cochran, 1977).
In Expert Sampling, a group of experts is formed and their opinions and views about a certain product or issue are sought. This method allows researchers to use the opinions of experts who know their field and have experience in it, and their views can be considered more reliable than the opinions of typical voters. For example, the effects of a certain drug to counter seizures can be supported or disproved by asking the opinions of a group of doctors who work in this field. However, this method is criticized because the very framing of the expert group is open to manipulation: there is a tendency to gather experts who are for or against an issue, and experts can also be wrong (Cochran, 1977).
In Quota Sampling, respondents are selected non-randomly until a pre-decided quota is filled. The quota can be proportional or non-proportional. In the proportional type, the share of each group in the population is known, so if a population is 60% men and 40% women, the sample is selected in the same 60:40 ratio. This matching is not followed in non-proportional sampling. The method is used, for example, to find preferences for food products in a store or for furnishings. The survey should be directed at products or issues that affect both men and women. A survey on a product such as shaving cream would not be effective if women were included, and one on a product such as lipstick would make little sense if men were included (Cochran, 1977).
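A proportional quota can be sketched as follows. The function names `proportional_quotas` and `quota_sample`, the stream of shoppers, and the rule of accepting people in arrival order are assumptions made for the example; in practice the interviewer decides whom to approach.

```python
def proportional_quotas(shares, n):
    """Turn population shares (e.g. 0.6 / 0.4) into quota counts for n people."""
    return {group: round(n * share) for group, share in shares.items()}

def quota_sample(stream, group_of, quotas):
    """Accept people in the order they arrive until every quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return selected

# Population is 60% men and 40% women; a sample of 10 gives a 6:4 quota.
quotas = proportional_quotas({"men": 0.6, "women": 0.4}, n=10)

# Hypothetical stream of shoppers leaving a store.
shoppers = [{"id": i, "sex": "men" if i % 2 else "women"} for i in range(100)]
sample = quota_sample(shoppers, lambda p: p["sex"], quotas)
print(quotas, len(sample))
```

No random draw is involved: whoever arrives first and fits an open quota is taken, which is what makes this a non-probability method.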
In Heterogeneity Sampling, no attempt is made to preserve any proportion; the aim is to capture a broad spectrum of responses. An example would be a survey on college courses or sports events that interest both boys and girls (Cochran, 1977).
In Snowball Sampling, people who meet certain criteria are selected and then asked to recommend others who meet the same criteria. This creates a snowball effect, and soon one has a wide sample of people who meet the criteria. For example, researchers may identify a person who plays basketball and ask him to find others who also play the game (Cochran, 1977).
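The referral chain in snowball sampling can be sketched as a breadth-first walk over who recommends whom. The names, the `referrals` mapping, and the function `snowball_sample` are hypothetical and introduced only for illustration.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from seed respondents and follow their referrals, wave by wave,
    until max_size people have been recruited or no referrals remain."""
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    selected = []
    while queue and len(selected) < max_size:
        person = queue.popleft()
        selected.append(person)
        for other in referrals.get(person, []):
            if other not in seen:       # never recruit the same person twice
                seen.add(other)
                queue.append(other)
    return selected

# Hypothetical basketball players and the fellow players they recommend.
referrals = {
    "ann": ["bob", "cy"],
    "bob": ["ann", "dee"],
    "cy": [],
    "dee": ["eve"],
}
players = snowball_sample(["ann"], referrals, max_size=4)
print(players)
```

Because everyone in the sample is reached through the first respondent's contacts, the result reflects that person's social circle rather than the population at large.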
Data Collection Methods
A number of methods are used for data collection: face-to-face interviews, Computer-Assisted Personal Interviewing, telephone interviews, Computer-Assisted Telephone Interviewing, mail surveys, hand-delivered questionnaires, Electronic Data Reporting, direct observation, and so on. Each has specific issues and characteristics, which are briefly explained below (Levy, 2008).
- Face-to-face: Interviewers visit a person at home or in the office or they may meet them in shopping malls and get their views and preference about certain products or services.
- Computer-Assisted Personal Interviewing: In this method, the interviewer may use a computer or a laptop to note down the responses of the respondents. The information is stored in a database but it is expensive and time-consuming and people may refuse to participate.
- Telephone Interview: Respondents are called up and asked to give answers to a few questions and answers are noted by the researcher.
- Computer-Assisted Telephone Interviewing: This is a type of telephone interview in which the researcher enters information into a computer as the respondent gives replies.
- Mail survey: Questionnaires are mailed to people along with a return envelope, and respondents need to complete the questionnaire and send it back. The response rate is lower and data quality is not very reliable.
- Hand-delivered questionnaire: The instrument is hand-delivered to the respondent, who then fills it out and either gives it back or mails it. This method yields more responses, as the respondent is induced to complete the instrument.
- Electronic Data Reporting: These are electronic forms stored on websites, and people are asked to fill them out. This is a cheaper way, since the costs of printing on paper and hiring researchers are avoided.
Sampling for Organisations and Market Research
Sampling is an important tool that market researchers often use to perform market surveys and obtain feedback from people. Both probability and non-probability sampling methods are used, though non-probability sampling is used when specific target groups have to be examined. Probability sampling is used when, for example, a company wants to know the preferences of shoppers in large stores visited by thousands of people, where it would not be possible to interview each one.
In such a case, researchers may decide to interview every 50th person who leaves the mall, or use stratified sampling to group customers according to ethnic origin, age, gender, and other characteristics. This allows the marketing company to understand what people of a particular ethnic group or age like, what they prefer to buy, and so on (Stern, 2004).
However, probability sampling is not extensively used in market research, since the results are obtained from uncontrolled sample frames. When the survey is done, the subjects are selected randomly, and there is a high probability that the people selected are not potential customers. Consider the example of engine oil used in automobiles. Applying random sampling to a set of people would mean that, out of 100 people selected, some percentage do not own or drive cars, so their responses have no relevance.
By using non-probability methods, however, an organization can place researchers near petrol pumps and interview vehicle owners, obtaining a sample in which almost 100% of the people own or drive cars. The company can further refine the survey by using modal instance sampling or Expert Sampling, and this kind of survey would provide meaningful data for the company. Marketing companies can refine the results further still by selecting owners who drive certain brands of cars or cars of a certain power, or by selecting only female drivers, and so on (Stern, 2004).
Examples of Sampling methods for different organizations
Different types of organizations use different sampling methods, as outlined below (Deming, 1984):
- Large soap and cosmetics manufacturer: The organization can carry out purposive non-probability sampling of selected customers in a department store to find their preferences for soaps and cosmetics. This method ensures that target customers are selected and that the results are usable in the research.
- Gaming company: The company should use computer- and web-based survey instruments hosted on popular gaming sites. It is mainly people who play games who visit these sites, and their views would help gaming companies.
- Demographic patterns: Cluster and multistage methods are used for issues such as finding people’s views about certain events. They help researchers to create clusters so that specific demographic patterns emerge.
- Cosmetic and Toiletries Companies: Modal instance sampling is used when the views of non-expert users are needed. Products such as painkillers, creams, and lotions are best suited to this method, and companies learn what people feel about their products.
- Pharma companies: Expert sampling can be used here by asking groups of doctors for their views on certain medicines and their indications. This helps marketing companies to understand what doctors feel about their products.
Conclusion
The paper has discussed probability and non-probability sampling and analyzed various aspects of each type. The paper has also presented examples of where each type of sampling is used and the benefits and disadvantages of each type. Different methods used for data gathering have also been examined along with examples of what can be used where.
References
Cochran, William G. 1977. Sampling Techniques, 3rd Edition. Wiley Publications.
Deming, William Edwards. 1984. Some Theory of Sampling. Dover Publications.
Levy, Paul S. 2008. Sampling of Populations: Methods and Applications. Wiley Publications.
Statistics Canada. 2008. Data collection methods. Web.
Stern, R., and Coe, R. 2004. Good Statistical Practice for Natural Resources Research. CABI.
Trochim, William M. K. 2006. Statistical Terms in Sampling. Web.