# Sampling and Sources of Bias

A **census** collects data from every member of the population, rather than from a sample of it.

Taking a census is difficult: it is costly and time-consuming, and some members of the population are hard to reach.

## Sampling Bias

There are multiple reasons for sampling bias:

**Non-response** bias can occur when only a small fraction of the people selected for the sample respond to a survey. Those who do respond may no longer be representative of the population.

**Voluntary response** bias occurs when the sample consists of people who volunteer to respond because they have strong opinions, so the responses are not representative of the population.

**Convenience samples** are made up of people who are easy to reach, so easily accessible people are over-represented relative to the population as a whole.

A sample can be large and still be biased, and that bias can seriously undermine the conclusions we draw from the sample.
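
To illustrate that sample size does not fix bias, here is a minimal simulation sketch. The population, the 30% "easy to reach" share, and the commute times are all made-up assumptions for illustration, not real data.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 commute times (minutes).
# Assumption: about 30% of people are "easy to reach" and have shorter commutes.
population = []
for _ in range(100_000):
    easy_to_reach = rng.random() < 0.3
    typical_commute = 15 if easy_to_reach else 40
    population.append((easy_to_reach, rng.gauss(typical_commute, 5)))

true_mean = statistics.fmean(minutes for _, minutes in population)

# Large convenience sample: 20,000 people, all from the easy-to-reach group.
easy_group = [minutes for easy, minutes in population if easy]
biased_sample = easy_group[:20_000]

# Small simple random sample: 200 people drawn from the whole population.
random_sample = [minutes for _, minutes in rng.sample(population, 200)]

# How far each sample mean lands from the population mean.
biased_error = abs(statistics.fmean(biased_sample) - true_mean)
random_error = abs(statistics.fmean(random_sample) - true_mean)

print(f"population mean: {true_mean:.1f}")
print(f"error of large convenience sample (n=20000): {biased_error:.1f}")
print(f"error of small random sample (n=200): {random_error:.1f}")
```

Under these assumptions, the small random sample should land much closer to the population mean than the large convenience sample, because the convenience sample only ever sees one part of the population.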

Almost all statistical methods are based on the notion of implied randomness.

## Common Sampling Techniques

**Simple Random Samples** randomly select cases from the entire population, so that every case has the same chance of being chosen and there is no implied connection between the cases selected.
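
A simple random sample can be sketched with Python's standard `random.sample`, which draws without replacement. The population of numbered IDs below is a hypothetical stand-in.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 cases identified by number.
population = list(range(1, 1001))

# Draw 50 distinct cases; every case is equally likely to be chosen.
srs = rng.sample(population, k=50)

print(sorted(srs))
```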

**Stratified Samples** are samples made up of random samples from non-overlapping subgroups. Each subgroup is called a **stratum** (plural, strata).
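
As a rough sketch of stratified sampling, the code below groups cases by stratum and then takes a simple random sample of equal size from each one. The helper `stratified_sample`, the student records, and the choice of equal sizes per stratum are illustrative assumptions; proportional allocation is another common choice.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Take a simple random sample of n_per_stratum units from every stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for label in sorted(strata):  # sorted only to make the order repeatable
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

rng = random.Random(7)

# Hypothetical population: 300 students, each in exactly one year group.
years = ("first-year", "second-year", "third-year")
students = [(f"student_{i}", years[i % 3]) for i in range(300)]

# Strata are the year groups; draw 10 students from each.
sample = stratified_sample(students, lambda s: s[1], 10, rng)

for year in years:
    print(year, sum(1 for _, y in sample if y == year))
```

Because every stratum contributes cases, no year group can be missed by chance, which is the main reason to stratify.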

**Cluster Samples** are samples where the researcher divides the population into groups called **clusters**. Clusters are created such that each one should have a similar makeup. Once the clusters are created, we take a simple random sample of clusters and include the cases from the selected clusters. This differs from a stratified sample, which draws cases from every stratum.
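
A minimal sketch of cluster sampling in its usual textbook form: randomly select a few clusters, then take every case in the selected clusters. The helper `cluster_sample` and the village and household names are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters clusters and return all of their units."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

rng = random.Random(3)

# Hypothetical population: 12 villages (clusters) of 20 households each.
clusters = {
    f"village_{c}": [f"village_{c}_household_{h}" for h in range(20)]
    for c in range(12)
}

# Select 3 villages at random and survey every household in them.
chosen, sample = cluster_sample(clusters, 3, rng)

print(chosen)
print(len(sample))
```

A multistage variant would draw a simple random sample within each chosen cluster instead of taking every unit.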

## Random Assignment vs Random Sampling

| | Random Assignment | No Random Assignment |
|---|---|---|
| Random Sampling | Causal and Generalizable (Ideal Experiment) | Not Causal but Generalizable (Most Observational Studies) |
| No Random Sampling | Causal but not Generalizable (Most Experiments) | Neither Causal nor Generalizable (Bad Observational Studies) |
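
The two kinds of randomness act at different steps of a study, which the sketch below tries to make concrete. The population, the study size of 100, and the even split into groups are assumptions chosen for illustration.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 people.
population = [f"person_{i}" for i in range(10_000)]

# Random sampling decides WHO is in the study.
# This is what supports generalizing results to the population.
participants = rng.sample(population, 100)

# Random assignment decides WHICH group each participant joins.
# This is what supports causal conclusions about the treatment.
shuffled = participants[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:50], shuffled[50:]

print(len(treatment), len(control))
```

A study that does both steps corresponds to the "Ideal Experiment" cell; skipping the first step gives "Most Experiments", and skipping the second gives "Most Observational Studies".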