
Sample Selection Bias

Definition

In statistics, sampling bias arises when a sample is collected in such a way that some members of the intended population are less likely to be included than others. The result is a biased sample: a non-random sample of a population in which not all individuals, or instances, were equally likely to have been selected. If this is not accounted for, results can be erroneously attributed to the phenomenon under study rather than to the method of sampling.

What is 'Sample Selection Bias'

Sample selection bias is a type of bias caused by choosing non-random data for statistical analysis. The bias exists due to a flaw in the sample selection process, where a subset of the data is systematically excluded due to a particular attribute. The exclusion of the subset can influence the statistical significance of the test, or produce distorted results.
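The effect of systematically excluding a subset can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is synthetic (normally distributed values), the cutoff of 40 stands in for the "particular attribute" that triggers exclusion, and the function name `simulate_exclusion` is hypothetical.

```python
import random
from statistics import mean


def simulate_exclusion(n=10_000, cutoff=40.0, sample_size=500, seed=0):
    """Compare a random sample with one that excludes low values.

    All numbers here (mean 50, std dev 15, cutoff 40) are assumptions
    chosen for illustration, not values taken from any real study.
    """
    rng = random.Random(seed)
    population = [rng.gauss(50.0, 15.0) for _ in range(n)]

    # Random sample: every member is equally likely to be selected.
    random_sample = rng.sample(population, sample_size)

    # Biased sample: members below the cutoff are systematically excluded
    # before sampling, so they can never be selected.
    eligible = [x for x in population if x >= cutoff]
    biased_sample = rng.sample(eligible, sample_size)

    return mean(population), mean(random_sample), mean(biased_sample)


if __name__ == "__main__":
    pop_mean, rand_mean, biased_mean = simulate_exclusion()
    print(f"population mean:    {pop_mean:.2f}")
    print(f"random-sample mean: {rand_mean:.2f}")
    print(f"biased-sample mean: {biased_mean:.2f}")
```

Under these assumptions the random sample's mean should land close to the population mean, while the biased sample's mean is pushed upward because the low values never had a chance of being selected.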

Explaining 'Sample Selection Bias'

Survivorship bias is a common type of sample selection bias. For example, when back-testing an investment strategy on a large group of stocks, it may be convenient to look for securities that have data for the entire sample period. If we were going to test the strategy against 15 years' worth of stock data, we might be inclined to look for stocks that have complete information for the entire 15-year period. However, eliminating stocks that stopped trading or left the market early would introduce a bias into our data sample. Because we would be including only stocks that lasted the full 15 years, and those are by definition the stocks that performed well enough to survive, our final results would be skewed toward better performance.
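The back-testing example can be sketched as a toy simulation. Everything in it is an assumption made for illustration: the return distribution, the rule that a stock is delisted once it falls below half its starting value, and the function name `simulate_backtest`. It is not a model of any real market.

```python
import random
from statistics import mean


def simulate_backtest(n_stocks=1000, years=15, delist_below=0.5, seed=42):
    """Compare average total return of all stocks vs. survivors only.

    Assumptions (hypothetical): annual returns are drawn from a normal
    distribution (mean 6%, std dev 30%, floored at -95%), and a stock
    stops trading once its value drops below `delist_below`.
    """
    rng = random.Random(seed)
    final_values = []
    survived = []
    for _ in range(n_stocks):
        value = 1.0
        alive = True
        for _ in range(years):
            annual_return = max(-0.95, rng.gauss(0.06, 0.30))
            value *= 1.0 + annual_return
            if value < delist_below:
                # The stock leaves the market; its history is incomplete.
                alive = False
                break
        final_values.append(value)
        survived.append(alive)

    # Full sample: includes stocks that stopped trading.
    all_return = mean(v - 1.0 for v in final_values)
    # Survivor-only sample: stocks with data for the entire period.
    survivor_return = mean(
        v - 1.0 for v, s in zip(final_values, survived) if s
    )
    return all_return, survivor_return, sum(survived)


if __name__ == "__main__":
    all_ret, surv_ret, n_surv = simulate_backtest()
    print(f"survivors: {n_surv} of 1000")
    print(f"mean total return, all stocks:     {all_ret:.2%}")
    print(f"mean total return, survivors only: {surv_ret:.2%}")
```

In this sketch every delisted stock ends below the delisting threshold and every survivor ends at or above it, so restricting the sample to survivors can only raise the average return. That is the upward skew the 15-year example describes.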


Q&A About Sample Selection Bias


What are some examples of selection biases that occur when collecting samples?

Examples include non-random sampling and convenience sampling.

What is sample selection bias?

Sample selection bias is a type of bias caused by choosing non-random data for statistical analysis. The bias exists due to a flaw in the sample selection process, where a subset of the data is systematically excluded due to a particular attribute.

How can you avoid survivorship biases?

You can avoid survivorship bias by including all securities that were available during the period being tested, not just those with complete information for the entire period.

What does selection bias cause?

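Selection bias arises when some members of a population are less likely to be included than others. It produces a biased sample, and if it is not accounted for, results can be erroneously attributed to the phenomenon under study rather than to the method of sampling.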

What is the phrase "selection bias" most often used to refer to?

It most often refers to the distortion of a statistical analysis that results from the method of collecting samples.

What does survivorship bias describe?

Survivorship bias describes the distortion that arises when an analysis includes only the subjects that lasted through the entire sample period. For example, when back-testing an investment strategy against 15 years' worth of stock data, it may be convenient to keep only stocks that have complete information for the entire 15-year period. Eliminating stocks that stopped trading or left the market early biases the sample, because the stocks that remain are the ones that performed well enough to survive.

Why should you avoid using incomplete datasets?

A dataset that leaves out securities that stopped trading introduces survivorship bias into your testing, which can lead to inaccurate conclusions about what does and does not work in your investment strategy.

How can you tell if there is selection bias in your study?

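A non-random sample is the main warning sign. If some members of the population were less likely to be selected than others, or a subset was systematically excluded, the sample may be biased.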
