
3.2: Sampling Methods

Learning Objectives

Upon completion of this section, you should be able to

  • Identify different sampling methods
  • Select appropriate sampling techniques
  • Identify sources of bias

Introduction

Suppose we are hired by a politician to determine the amount of support they have among the electorate should they decide to run for another term. What population should we study? Every person in the district? Not every person is eligible to vote, and regardless of how strongly someone likes or dislikes the candidate, they have little to do with the candidate being re-elected if they are not able to vote.

What about eligible voters in the district? That might be better, but if someone is eligible to vote but does not register by the deadline, they won’t have any say in the election either. What about registered voters? Many people are registered but choose not to vote. What about “likely voters?”

This is the criterion used in much political polling, but it is sometimes difficult to define a “likely voter.” Is it someone who voted in the last election? In the last general election? In the last presidential election? Should we consider someone who just turned 18 a “likely voter?” They weren’t eligible to vote in the past, so how do we judge the likelihood that they will vote in the next election?

In November 1998 former professional wrestler Jesse “The Body” Ventura was elected governor of Minnesota. Up until right before the election, most polls showed he had little chance of winning. There were several contributing factors to the polls not reflecting the actual intent of the electorate:

  • Ventura was running on a third-party ticket and most polling methods are better suited to a two-candidate race.
  • Many respondents to polls may have been embarrassed to tell pollsters that they were planning to vote for a professional wrestler.
  • The mere fact that the polls showed Ventura had little chance of winning might have prompted some people to vote for him in protest to send a message to the major-party candidates.

But one of the major contributing factors was that Ventura recruited a substantial amount of support from young people, particularly college students, who had never voted before and who registered specifically to vote in the gubernatorial election. The polls did not deem these young people likely voters (since in most cases young people have a lower rate of voter registration and a lower turnout rate for elections), and so the polling samples were subject to sampling bias: they omitted a portion of the electorate that was weighted in favor of the winning candidate.

Sampling bias

A sampling method is biased if not every member of the population has an equal likelihood of being in the sample.

In 2016 Reuters reported “pollsters and statisticians gave Hillary Clinton odds of between 75 and 99 percent of winning the U.S. presidential election.” In the end Clinton did win the popular vote, but not the election.

… president and vice president are not elected directly by citizens. Instead, they’re chosen by “electors” through a process called the Electoral College. From https://www.usa.gov/election

So again we ask: what happened? As in the Ventura case, the estimates of who was a likely voter were incorrect, and some other factors also changed the outcome. From the American Association for Public Opinion Research (AAPOR):

“there was a strong correlation between education and presidential vote in key states. Voters with higher education levels were more likely to support Clinton. Furthermore, recent studies are clear that people with more formal education are significantly more likely to participate in surveys than those with less education. Many polls – especially at the state level – did not adjust their weights to correct for the over-representation of college graduates in their surveys, and the result was over-estimation of support for Clinton.” From AAPOR Report: An Evaluation of 2016 Election Polls in the U.S.

Does this mean that polls cannot be used to help predict election results? The answer is no. The errors identified did not show any major cause for concern, as indicated at the annual conference of the American Association for Public Opinion Research (AAPOR).

Errors have happened enough in past elections to know that an upset was well within the realm of possibility in 2016. The Upshot model estimated that a polling misfire was about as likely as a baseball strikeout or a missed midrange field goal in football. It’s not pretty, but it happens and will happen again, and a team wouldn’t release a batter or a kicker because of a strikeout or a missed kick. From A 2016 Review: Why Key State Polls Were Wrong About Trump

What we need to do as consumers of data is understand that surveys and polls cannot be 100% accurate. The data they provide are invaluable, but they are also subject to error. In this section we will look at different forms of bias as well as different methods of conducting surveys. This will hopefully give you a better understanding of what makes for a good survey design and what to watch out for in the results.

Sampling Method

Identifying the population can be a difficult job, but how do we choose an appropriate sample from that population? Remember, although we would prefer to survey all members of the population, this is usually impractical unless the population is very small, so we choose a sample. There are many ways to sample a population, but there is one goal we need to keep in mind: we would like the sample to be representative of the population.

Returning to our hypothetical job as a political pollster, we would not anticipate very accurate results if we drew all of our samples from among the customers at a Starbucks, nor would we expect that a sample drawn entirely from the membership list of the local gym would provide a useful picture of district-wide support for a candidate.

One way to ensure that the sample has a reasonable chance of mirroring the population is to employ randomness. The most basic random method is simple random sampling.

Simple random sample

A simple random sample is one in which each member of the population has an equal chance (probability) of being chosen.

Example 1

Which of the following scenarios represent a Simple Random Sample from the population?

Scenario 1: From all likely voters in the state, put each of their names on a piece of paper, toss the slips into a (very large) hat and draw 1000 slips out of the hat.

Scenario 2: From all likely voters in the state, divide the voters by congressional district and for each district put each of their names on a piece of paper, toss the slips into a (very large) hat and draw 100 slips out of the hat.

Scenario 3: From all likely voters in the state, randomly select five different letters from the alphabet. For each letter identify likely voters with a last name starting with that letter and put each of their names on a piece of paper, toss the slips into a (very large) hat and draw 200 slips out of the hat.

Solution

Scenario 1: This would be a simple random sample as every person in the population would have the same chance of being selected. This of course assumes that the names would be mixed and there was no impediment to selecting a slip from anywhere in the hat.

Scenario 2: Congressional districts in a state are supposed to be as equal as possible (by law), but their population sizes can often differ by up to 10%, as the districts are only redrawn once every ten years. This would not be a simple random sample, as not everyone has an equal chance of being in the survey (somebody living in the smallest district in the state has a higher chance of being selected than somebody living in the largest district).

Scenario 3: Although it seems at first that everyone would have the same chance, once the letters are picked, a person's chance of being selected depends on which group they belong to if those groups are not the same size. If there are 10,000 people in the group for letter “Q” but 200,000 in the group for letter “S”, then anyone in group “S” will have a much lower chance of being in the survey. This would not be a simple random sample.
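The gap between the two groups in Scenario 3 can be made concrete with a little arithmetic. The sketch below assumes that 200 slips are drawn for each selected letter, which is one reading of the scenario, and uses the group sizes quoted in the solution; the variable names are illustrative only.

```python
# Assumption: 200 slips are drawn from the hat for each selected letter.
slips_drawn = 200

# Group sizes quoted in the solution to Scenario 3.
group_sizes = {"Q": 10_000, "S": 200_000}

# Chance that one particular person is drawn, given that their letter was picked.
chances = {letter: slips_drawn / size for letter, size in group_sizes.items()}

for letter, chance in chances.items():
    print(f"{letter}: {chance:.2%}")
# Q: 2.00%
# S: 0.10%
```

Under these assumptions a person in group “Q” is twenty times as likely to be drawn as a person in group “S”, which is why the method fails the equal-chance requirement.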

In practice, computers are better suited for this sort of endeavor than millions of slips of paper and extremely large headgear. Each person in the population is assigned a number and the computer then selects the individuals through a random process.
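As a rough illustration of the computer-based approach, here is a minimal Python sketch. The population size, the sample size, and the function name `simple_random_sample` are all hypothetical; the only library call used is `random.Random.sample` from the standard library, which draws without replacement.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct ID numbers from 1..population_size."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    # sample() draws without replacement, so every ID is equally likely
    # to be chosen and no ID can be chosen twice.
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical figures: 3,000,000 numbered likely voters, sample of 1000.
chosen_ids = simple_random_sample(3_000_000, 1000, seed=1)
print(len(chosen_ids), len(set(chosen_ids)))  # 1000 1000
```

Each chosen number would then be matched back to the person it was assigned to.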

It is always possible, however, that even a random sample might end up not being totally representative of the population. If we repeatedly take samples of 1000 people from among the population of likely voters in the state of Arizona, some of these samples might tend to have a slightly higher percentage of Democrats (or Republicans) than does the general population; some samples might include more older people and some samples might include more younger people; etc. In most cases, this sampling variability is not significant, but since we are including randomness there is a chance of a sample not being representative of the population.

Sampling variability

The natural variation of samples is called sampling variability.
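Sampling variability is easy to see in a small simulation. The sketch below assumes a very large population in which exactly 39% of people have some trait (a made-up figure), draws five separate samples of 1000, and reports the percentage with the trait in each sample. Treating each draw as independent is itself a simplifying assumption that is reasonable only when the population is much larger than the sample.

```python
import random

def sample_percentage(true_share, sample_size, rng):
    """Percent of one random sample that has the trait."""
    # Each draw has the trait with probability true_share.
    hits = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return 100 * hits / sample_size

rng = random.Random(2024)  # fixed seed so the run can be repeated
percentages = [sample_percentage(0.39, 1000, rng) for _ in range(5)]
print(percentages)
```

The five percentages will typically land near 39 without matching it exactly, and without matching each other.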

To help account for variability present in the population, pollsters might instead use a stratified sample, where each stratum (a subgroup of the population) is made up of individuals who share a characteristic, to ensure that each characteristic is proportionally represented in the sample.

Stratified sampling

In stratified sampling, a population is divided into a number of subgroups (or strata). These subgroups make up the whole population, and no person can belong to more than one subgroup (this is also called partitioning the population). Simple random samples are then taken from each subgroup, with sample sizes proportional to the size of the subgroup in the population.

In the next example we will look at how stratified random sampling works.

Example 2

Suppose that in a particular state, previous data indicated that the electorate was composed of 39% Democrats, 37% Republicans, and 24% Independents. If a pollster is going to take a sample of 1000 people from that population, determine the sample size for each of the strata identified (Democrat, Republican, and Independent).

Solution

In a sample of 1000 people we would want to see the sample break down into the same percentages as seen in the population. With stratified sampling this means 39% of the 1000 in the sample should be Democrats, 37% of the 1000 Republicans, and 24% of the 1000 Independents.

Number of people to select from Democrats: 1000*0.39 = 390.
Number of people to select from Republicans: 1000*0.37 = 370.
Number of people to select from Independents: 1000*0.24 = 240.
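The same proportional allocation can be written as a short Python sketch. The function name `proportional_allocation` is illustrative; the shares and sample size are the ones from this example.

```python
def proportional_allocation(shares, sample_size):
    """Sample size for each stratum, proportional to its population share."""
    # round() handles shares that do not divide the sample evenly; in that
    # case the rounded counts may not add up exactly to sample_size.
    return {group: round(sample_size * share) for group, share in shares.items()}

party_shares = {"Democrat": 0.39, "Republican": 0.37, "Independent": 0.24}
allocation = proportional_allocation(party_shares, 1000)
print(allocation)  # {'Democrat': 390, 'Republican': 370, 'Independent': 240}
```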

Stratified sampling can also be used to select a sample with people in desired age groups, a specified mix ratio of males and females, etc. By dividing the population into the groups (strata) we can ensure that individuals from those groups are included in the sample. What stratified sampling does not do is simplify the process, as it is still necessary to list the whole population for each stratum and conduct a simple random sample from each of those groups. It adds an extra layer to the process, as we need some way to identify which category each member of the population belongs to before doing the sampling.

Is there another approach that gives us some characteristics similar to stratified sampling? The answer is yes. One variation on this technique is called quota sampling.

In quota sampling the population is divided into groups (strata), but the method of sampling within each group is left open (it does not require simple random sampling), as is the size of each quota (it does not need to be proportional to the group's share of the population). If a poll lists quota sampling as its technique, there is room for concern, as the way the sample was collected from the strata and the quota sizes may not ensure that the sample is representative of the population. This type of sampling can save time and cost, as the researcher determines how to select the subjects from each stratum.

Quota sampling

In Quota sampling, a population is divided into a number of subgroups (or strata). A set number (quota) is defined to sample from each subgroup. The sampling method to pick the subjects is left to the researcher.

Example 3

For each scenario determine if the sampling method used was stratified or quota. For both scenarios assume the population for the survey has the following demographics: 15% of the population is under 18, 45% is between 18 and 40, 25% is between 41 and 65, and the remaining 15% is above 65. In both scenarios the sample size is 1000 people.

Scenario 1: A researcher calls people at random from each age group. They want 250 people from each of the identified age groups to be included in the sample.

Scenario 2: A researcher calls people at random from each age group. From the under 18 group 150 are called, from 18-40 a total of 450 are called, from 41-65 a total of 250 are called, and 150 are called who are above 65.

Solution

Scenario 1. This looks similar to stratified random sampling, but is actually Quota Sampling. The number sampled from within each age group is not proportional to the population. By sampling 250 from each age group we have each age group representing 25% of the entire sample (250/1000*100%=25%). This is not equal to the given percentage of the population.

Scenario 2. This is stratified random sampling. The number sampled from within each age group is proportional to the population. 15% of the population is under 18 and 150 is 15% of 1000, so for this age group (and each of the others) the sample size is proportional to the population size.

The above example illustrates one of the drawbacks of quota sampling: the proportion sampled from each stratum does not have to equal the population proportion. A good quota sample would try to keep those proportions equal if they are known.
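The proportionality check used in Example 3 can be sketched in Python. The helper `is_proportional` and its `tolerance` parameter are hypothetical names introduced for illustration. Note that the check covers only the group sizes: a sample also needs random selection within each group to count as stratified.

```python
def is_proportional(sample_counts, population_shares, tolerance=0.01):
    """True if every group's share of the sample matches its population share."""
    total = sum(sample_counts.values())
    return all(
        abs(sample_counts[group] / total - population_shares[group]) <= tolerance
        for group in population_shares
    )

# Demographics and sample sizes from Example 3.
population_shares = {"under 18": 0.15, "18-40": 0.45, "41-65": 0.25, "over 65": 0.15}
scenario_1 = {group: 250 for group in population_shares}
scenario_2 = {"under 18": 150, "18-40": 450, "41-65": 250, "over 65": 150}

print(is_proportional(scenario_1, population_shares))  # False
print(is_proportional(scenario_2, population_shares))  # True
```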

Another sampling method is cluster sampling, in which the population is divided into groups, and one or more groups are randomly selected to be in the sample. In this case the groups that are formed should be as identical as possible (look like a mini version of the actual population itself). Otherwise by random variation we may end up selecting a group unlike the population as a whole and our data would be possibly skewed.

Cluster sampling

In cluster sampling, the population is divided into subgroups (clusters) that are each similar to the population as a whole, and a set of subgroups are selected to be in the sample.

Example 4

If the college wanted to survey students, since students are already divided into classes, they could randomly select 10 classes and give the survey to all the students in those classes. If they do this, what sampling method would they be using?

Solution

This would be cluster sampling. The students are broken into groups (classes) and then we take a sample of those groups (classes) and then survey all the students in the group (classes) that are selected.

In the example above we see a potential issue with cluster sampling if steps are not taken to ensure the clusters created are similar to the population. In the case of randomly selecting 10 classes for the college, there is a chance of randomly picking ten 100-level courses (freshman-level classes) and having the results of the survey skewed.
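A minimal Python sketch of cluster sampling follows. The class names and rosters are invented for illustration, and `cluster_sample` is a hypothetical helper: it picks whole clusters at random and then includes every member of each chosen cluster.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and return every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # pick cluster names
    # Everyone in a chosen cluster is surveyed; nobody else is.
    return [member for name in chosen for member in clusters[name]]

# Made-up class rosters standing in for the college's classes.
classes = {
    "MAT 142": ["Ana", "Ben", "Cy"],
    "ENG 101": ["Dee", "Eli"],
    "BIO 181": ["Fay", "Gus", "Hal"],
    "HIS 102": ["Ivy", "Jo"],
}
surveyed = cluster_sample(classes, 2, seed=3)
print(surveyed)
```

Because whole classes are taken, the result depends heavily on which classes happen to be drawn, which is the risk described above.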

Another sampling method we often see is called systematic sampling.

Systematic sampling

In systematic sampling, every nth member of the population is selected to be in the sample.

This method requires us to order the population (like assigning them a number and going in a numeric order).
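Here is a minimal Python sketch of systematic sampling, assuming the population is already in some fixed order. The list `phone_book` and the function name are made up for illustration, and choosing the starting point at random is an assumption of this sketch rather than part of the definition above.

```python
import random

def systematic_sample(ordered_population, step, seed=None):
    """Take every step-th member, starting from a random early position."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random index among the first `step` members
    return ordered_population[start::step]

# Hypothetical ordered list of 1000 entries.
phone_book = [f"Person {i}" for i in range(1, 1001)]
sample = systematic_sample(phone_book, 100, seed=5)
print(len(sample))  # 10
```

Two people listed next to each other can never both be selected when the step is larger than 1.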

Example 5

Identify which sampling method is used (Simple Random Sample, Systematic, Stratified, Quota, or Cluster):

Scenario 1: To select a sample at the college the researcher assigns each student a number starting at 1 and ending at 4500 (the total number of students) and then randomly chooses 100 numbers (students) to be part of the survey.

Scenario 2: To select a sample, a pollster calls every 100th name in the phone book.

Scenario 3: A pollster divides Tucson into zip codes and randomly surveys 25 people from each zip code.

Solution

Scenario 1: This is a simple random sample. A random selection from the population where each person has the same chance of being selected.

Scenario 2: This is systematic sampling as we are somehow ordering the population and using a systematic approach to select every 100th person in that order.

Scenario 3: This is quota sampling, as the number selected from each group is not known to be proportional to the population.

Systematic sampling is not as random as a simple random sample (if your name is Albert Aardvark and your sister Alexis Aardvark is right after you in the phone book, there is no way you could both end up in the sample) but it can yield acceptable samples.

Perhaps the worst types of sampling methods are convenience samples and voluntary response samples.

Convenience sampling and voluntary response sampling

Convenience sampling is choosing a sample by selecting whoever is convenient.
Voluntary response sampling is allowing the members of the sample to volunteer.

These two types of methods are very easy to conduct, but are prone to issues with the results. Have you ever been sitting in a class when your instructor surveys the class by asking if everyone understood the last slide of a presentation? Do you believe the students' responses gave an accurate representation of their understanding each time that happened? Probably not, as some students may not want to speak up, and the ones who did speak up and said yes may not have represented the class as a whole.

Example 6

A pollster stands on a street corner and interviews the first 100 people who agree to speak to them. What sampling method was used?

Solution

This is a convenience sample. The pollster is choosing members of the population that are easy to access. This sampling method should be avoided.

Example 7

A website has a survey asking readers to give their opinion on a tax proposal.  What sampling method was used?

Solution

This is a self-selected sample, or voluntary response sample, in which respondents volunteer to participate. The reason this is not just a convenience sample is that the respondent had to put effort into participating, so we call it a voluntary response sampling method instead.

Usually voluntary response samples are skewed towards people who have a particularly strong opinion about the subject of the survey or who just have way too much time on their hands and enjoy taking surveys.

Try it Now 1

A study is done to determine the average tuition that ASU undergraduate students pay per semester. Each student in the following samples is asked how much tuition he or she paid for the Fall semester. What is the type of sampling in each scenario?

Scenario 1: A sample of 100 undergraduate ASU students is taken by organizing the students’ names by classification (freshman, sophomore, junior, or senior), and then selecting 25 students from each by assigning the students a number and randomly selecting 25 numbers.

Scenario 2: A random number generator is used to select a student from the alphabetical listing of all undergraduate students in the Fall semester. Starting with that student, every 50th student is chosen until 75 students are included in the sample.

Scenario 3: A completely random method is used to select 75 students. Each undergraduate student in the fall semester has the same chance of being chosen at any stage of the sampling process.

Scenario 4: The freshman, sophomore, junior, and senior years are numbered one, two, three, and four, respectively. A random number generator is used to pick two of those years. All students in those two years are in the sample.

Scenario 5: An administrative assistant is asked to stand in front of the library one Wednesday and to ask the first 100 undergraduate students he encounters what they paid for tuition for the Fall semester. Those 100 students are the sample.

Hint

Scenario 1: The population was first divided into different groups and then the selections were made. Was the number selected from each group proportional to the numbers in the population (would it be safe to assume so)?

Scenario 2: The 75 selected is not what you should focus on here. The consistent selection of every 50th student identifies the sampling method.

Scenario 3: Each member in the population has the same chance of being selected and the population was not broken into any groups first.

Scenario 4: The population was divided into four groups and everyone in the selected groups was included in the sample. They did not take samples from each of the groups.

Scenario 5: Did the assistant have to put much effort to get the sample?

Answer

Scenario 1: Quota Sampling (It is close to stratified random sampling, but it is probably not safe to assume that each class has the same number of students.)

Scenario 2: Systematic Sampling

Scenario 3: Simple Random Sample

Scenario 4: Cluster Sampling

Scenario 5: Convenience Sampling

Bias: How to mess things up before you start

There are a number of ways that a study can be ruined before you even start collecting data. The first we have already explored – sampling or selection bias, which is when the sample is not representative of the population. One example of this is voluntary response bias, which is bias introduced by only collecting data from those who volunteer to participate. This is not the only potential source of bias.

Sources of bias

Sampling bias – when the sample is not representative of the population
Voluntary response bias – the sampling bias that often occurs when the sample is volunteers
Self-interest study – bias that can occur when the researchers have an interest in the outcome
Response bias – when the responder gives inaccurate responses for any reason
Perceived lack of anonymity – when the responder fears giving an honest answer might negatively affect them
Loaded questions – when the question wording influences the responses
Non-response bias – when people refusing to participate in the study can influence the validity of the outcome

In the examples that follow identify the largest source of bias that is present from the details given.

Example 8

Consider a recent study which found that chewing gum may raise math grades in teenagers. This study was conducted by the Wrigley Science Institute, a branch of the Wrigley chewing gum company. What potential source of bias should we be concerned about?

Solution

This is an example of a self-interest study, one in which the researchers have a vested interest in the outcome of the study. While this does not necessarily mean that the study was biased, it certainly suggests that we should subject the study to extra scrutiny. Not all self-interest studies are biased; many important pharmaceutical studies, for example, are funded by the companies that produce the drug.

Example 9

A survey asks people “when was the last time you visited your doctor?” What potential source of bias should we be concerned about?

Solution

This might suffer from response bias, since many people might not remember exactly when they last saw a doctor and give inaccurate responses.

Sources of response bias may be innocent, such as bad memory, or as intentional as pressuring by the pollster. Response bias may also be present in very long surveys, as the respondent gets fatigued and puts in any easy answer just to move along as quickly as possible.

Example 10

A survey asks participants a question about their interactions with members of other races. What potential source of bias should we be concerned about?

Solution

Here, a perceived lack of anonymity could influence the outcome. The respondent might not want to be perceived as racist even if they are, and so might give an untruthful answer.

Example 11

An employer puts out a survey asking their employees if they have a drug abuse problem and need treatment help. What potential source of bias should we be concerned about?

Solution

Perceived lack of anonymity. Here, answering truthfully might have consequences; responses might not be accurate if the employees do not feel their responses are anonymous or fear retribution from their employer.

Example 12

A survey asks “do you support funding research of alternative energy sources to reduce our reliance on high-polluting fossil fuels?”  What potential source of bias should we be concerned about?

Solution

This is an example of a loaded or leading question – questions whose wording leads the respondent towards an answer. This question is leading the respondent with the negative environmental language in regards to fossil fuels.

Loaded questions can occur intentionally by pollsters with an agenda, or accidentally through poor question wording. Another concern is question order, where the order of questions changes the results. A psychology researcher, Norbert Schwarz, provides an example:

“My favorite finding is this: we did a study where we asked students, ‘How satisfied are you with your life? How often do you have a date?’ The two answers were not statistically related – you would conclude that there is no relationship between dating frequency and life satisfaction. But when we reversed the order and asked, ‘How often do you have a date? How satisfied are you with your life?’ the statistical relationship was a strong one. You would now conclude that there is nothing as important in a student’s life as dating frequency.”

Example 13

A telephone poll asks the question “Do you often have time to relax and read a book?”, and 50% of the people called refused to answer the survey. What potential source of bias should we be concerned about?

Solution

It is unlikely that the results will be representative of the entire population.  This is an example of non-response bias, introduced by people refusing to participate in a study or dropping out of an experiment.  When people refuse to participate, we can no longer be so certain that our sample is representative of the population.

Example 14

To determine how long it takes people to hit the brakes when an animal runs in front of their car, 100 college students are recruited and put through a simulator. What potential source of bias should we be concerned about?

Solution

This is a case of sampling bias as our intended population was “people” in general and we limited our sampling to just college students.

Try It Now 2

In each situation, identify a potential source of bias.

  1. A survey asks how many sexual partners a person has had in the last year.
  2. A radio station asks listeners to phone in their choice in a daily poll.
  3. A substitute teacher wants to know how students in the class did on their last test. The teacher asks the 10 students sitting in the front row to state their latest test score.
  4. High school students are asked if they have consumed alcohol in the last two weeks.
  5. The Beef Council releases a study stating that consuming red meat poses little cardiovascular risk.
  6. A poll asks “Do you support a new transportation tax, or would you prefer to see our public transportation system fall apart?”

Hint

Use the definitions to help guide you.

Sampling bias – when the sample is not representative of the population
Voluntary response bias – the sampling bias that often occurs when the sample is volunteers
Self-interest study – bias that can occur when the researchers have an interest in the outcome
Response bias – when the responder gives inaccurate responses for any reason
Perceived lack of anonymity – when the responder fears giving an honest answer might negatively affect them
Loaded questions – when the question wording influences the responses
Non-response bias – when people refusing to participate in the study can influence the validity of the outcome

Answer

  1. Response bias – historically, men are likely to over-report, and women are likely to under-report, on this question.
  2. Voluntary response bias – the sample is self-selected
  3. Sampling bias – the sample may not be representative of the whole class
  4. Perceived lack of anonymity
  5. Self-interest study
  6. Loaded question

Exercises


  1. Determine the type of sampling used (quota, simple random, stratified, systematic, cluster, or convenience).

    Scenario 1: A soccer coach selects six players from a group of boys aged eight to ten, seven players from a group of boys aged 11 to 12, and three players from a group of boys aged 13 to 14 to form a recreational soccer team.

    Scenario 2: A pollster randomly selects five high tech companies with 50 or more employees and interviews all human resource personnel in those five different high tech companies.

    Scenario 3: There are approximately 500 male and 500 female high school teachers in a school district. A high school educational researcher interviews 50 high school female teachers and 50 high school male teachers from within that district.

    Scenario 4: A medical researcher interviews every third cancer patient from a list of cancer patients at a local hospital.

    Scenario 5: A high school counselor uses a computer to generate 50 random numbers and then picks students whose names correspond to the numbers.

    Scenario 6: A student interviews classmates in her algebra class to determine how many electronic devices a student owns.

    Answer

    Scenario 1: Stratified
    Scenario 2: Cluster
    Scenario 3: Quota sampling (although close to stratified, we select quota sampling since we are not given information on how the 50 female/male teachers are selected)
    Scenario 4: Systematic
    Scenario 5: Simple random
    Scenario 6: Convenience
  2. Determine the type of sampling used (quota, simple random, stratified, systematic, cluster, or convenience).
    A high school principal polls 40 freshmen, 40 sophomores, 40 juniors, and 40 seniors regarding policy changes for after school activities. The freshman class is about 20% larger than the senior class while the sophomore and junior classes are equal and about 10% smaller than the freshman class.
    Answer

    Quota Sampling. Although this may look close to stratified random sampling, the difference here is that the sample sizes are all equal but the class levels do not have the same number of students. This means the freshman class, being the largest, is underrepresented in the poll relative to its share of the student body.

  3. In a study, the sample is chosen by writing everyone’s name on a playing card, shuffling the deck, then choosing the top 20 cards. What is the sampling method?
    Answer

    This is an example of a Simple Random Sample.

  4. In a study, the sample is chosen by separating all cars by size, and selecting 10 of each size grouping. What is the sampling method?
    Answer

    This is an example of Quota Sampling.

  5. Pima Community College (PCC) has approximately 14,000 part-time students (the population). We are interested in the average amount of money a part-time student spends on books in the fall term. Asking all 14,000 students is an almost impossible task. Suppose we take three different samples.

    First, we use convenience sampling and survey ten students from a first term organic chemistry class. Many of these students are taking first term calculus in addition to the organic chemistry class. The amount of money they spend on books is as follows:

    $128; $87; $173; $116; $130; $204; $147; $189; $93; $153

    The second sample is taken using a list of senior citizens who take P.E. classes and taking every fifth senior citizen on the list, for a total of ten senior citizens. They spend:

    $50; $40; $36; $15; $50; $100; $40; $53; $22; $22

    In the third sample we choose ten different part-time students from the disciplines of chemistry, business, English, psychology, sociology, history, nursing, physical education, art, and early childhood development. (We assume that these are the only disciplines in which part-time students at PCC are enrolled and that an equal number of part-time students are enrolled in each of the disciplines.) Each student is chosen using simple random sampling. The students spend the following amounts:

    $180; $50; $150; $85; $260; $75; $180; $200; $200; $150

    Do you think that any of these samples are representative of the entire part-time student population at PCC?

    Answer

    It is unlikely that any of the three samples is representative of the population as a whole. The first sample probably consists of science-oriented students. Besides the chemistry course, some of them are also taking first-term calculus. Books for these classes tend to be expensive. Most of these students are, more than likely, paying more than the average part-time student for their books. The second sample is a group of senior citizens who are, more than likely, taking courses for health and interest. The amount of money they spend on books is probably much less than the average part-time student. Both samples are biased. Also, in both cases, not all students have a chance to be in either sample. The third sample may be unbiased, but a larger sample would be recommended to increase the likelihood that the sample will be close to representative of the population. However, for a biased sampling technique, even a large sample runs the risk of not being representative of the population.

  6. Identify the most relevant source of bias in this situation: A survey asks the following: Should the mall prohibit loud and annoying rock music in clothing stores catering to teenagers?
    Answer

    This suffers from loaded or leading question bias.

  7. Identify the most relevant source of bias in this situation: To determine voter support for a downtown renovation project, a surveyor randomly questions people working in downtown businesses.
    Answer

    This suffers from sampling bias, because not all voters work in the downtown area.

  8. Identify the most relevant source of bias in this situation: A survey asks people to report their actual income and the income they reported on their IRS tax form.
    Answer

    This question would likely suffer from a perceived lack of anonymity. People may not wish to give a truthful answer.

  9. Identify the most relevant source of bias in this situation: A survey randomly calls people from the phone book and asks them to answer a long series of questions. 70% of the people called refused to answer the survey.
    Answer

    This may suffer from non-response bias, especially if people were told up front that it was a long series of questions. Even when a person does respond, there is a risk of response bias if the list of questions is extremely long.

  10. Identify the most relevant source of bias in this situation: A survey asks the following: Should the death penalty be permitted if innocent people might die?
    Answer

    This may suffer from loaded or leading question bias. The phrase at the end brings one of the arguments against the death penalty into the question and could influence a person’s response.

  11. Identify the most relevant source of bias in this situation: A study seeks to investigate whether a new pain medication is safe to market to the public. The researchers test it by randomly selecting 300 men from a set of volunteers.
    Answer

    This may suffer from voluntary response bias and sampling bias. The sampling bias arises because, if the intended population includes women, they are not represented in the sample.
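
As a rough check on Exercise 2, the class sizes can be written relative to the senior class and compared with the equal quotas of 40. This is only a sketch: it assumes that "about 10% less than the freshman class" means 0.9 times the freshman enrollment, and the variable names are illustrative rather than part of the exercise.

```python
# Relative class sizes, using the senior class as the unit (assumed reading of Exercise 2).
senior = 1.0
freshman = 1.2 * senior        # about 20% larger than the senior class
sophomore = 0.9 * freshman     # about 10% smaller than the freshman class (assumption)
junior = 0.9 * freshman

sizes = {
    "freshman": freshman,
    "sophomore": sophomore,
    "junior": junior,
    "senior": senior,
}
total = sum(sizes.values())
sample_size = 160              # 40 students polled from each of the four classes

# Students a proportional allocation would take from each class.
proportional = {name: sample_size * size / total for name, size in sizes.items()}

for name, count in proportional.items():
    print(f"{name}: proportional share {count:.1f}, quota used 40")
```

Under these assumptions a proportional allocation would take about 44 freshmen and about 37 seniors, so equal quotas of 40 slightly under-represent the freshman class and over-represent the senior class.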
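
The three samples in Exercise 5 can also be compared numerically. The sketch below computes the mean of each sample from the amounts listed in the exercise; the variable names are illustrative, and the means by themselves do not tell us the true population average.

```python
from statistics import mean

# Book spending in dollars, copied from the three samples in Exercise 5.
samples = {
    "chemistry_class": [128, 87, 173, 116, 130, 204, 147, 189, 93, 153],  # convenience sample
    "senior_citizens": [50, 40, 36, 15, 50, 100, 40, 53, 22, 22],         # every fifth name on a list
    "by_discipline": [180, 50, 150, 85, 260, 75, 180, 200, 200, 150],     # one student per discipline
}

# Mean amount spent in each sample.
sample_means = {name: mean(values) for name, values in samples.items()}

for name, value in sample_means.items():
    print(f"{name}: mean = ${value:.2f}")
```

The means come out to $142.00, $42.80, and $153.00. The large gap between the second sample and the other two illustrates how strongly the choice of who is sampled can shift the estimate.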


Attributions

This page contains modified content from David Lippman, “Math In Society, 2nd Edition.” Licensed under CC BY-SA 4.0.

This page contains modified content from “Collecting Data” by Foster et al., LibreTexts. Licensed under CC BY-NC-SA.

This page contains modified content from “OpenStax Introductory Statistics: 1.2 Data, Sampling, and Variation in Data and Sampling” by Barbara Illowsky, Susan Dean. Licensed under CC BY 4.0.

This page contains content by Robert Foth, Math Faculty, Pima Community College, 2021. Licensed under CC BY 4.0.

The study mentioned in Example 8, which suggested that chewing gum may raise math grades in teenagers, was reported by Reuters. https://www.reuters.com/article/us-gum-learning/chewing-gum-may-raise-math-grades-in-teens-idUSTRE53L79320090422. Retrieved 4/22/09.

License


Topics in Mathematics Copyright © by Robert Foth is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
