# 22 Misuses of Statistics

There are three kinds of lies: lies, damned lies and statistics. ~ popularized by Mark Twain, who attributed it to Benjamin Disraeli
A misuse of statistics is a pattern of unsound statistical analysis. Such patterns arise from poor data quality, flawed statistical methods or faulty interpretation. Some misuse is deliberate, intended to persuade, influence or sell. Some results from honest mistakes of analysis that lead to poor decisions and failed strategies. The following are common misuses of statistics.

## Apples & Oranges

Comparing things that are not comparable or using unfair or impractical criteria of comparison.

## Cognitive Biases

Misinterpretations of numbers due to flawed logic such as cognitive biases.

## Correlation vs Causation

The invalid assumption that because two things are correlated, one causes the other.

## Data Dredging

Looking for patterns in data using brute force methods that try a large number of statistical models until matches are found. Data dredging has valid applications for exploratory data analysis. However, it is generally considered poor practice to draw conclusions using data dredging as it tends to find random patterns that are meaningless.
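As a rough illustration of why dredging finds meaningless patterns, the sketch below correlates 200 columns of pure noise with a noise target and counts how many pass a conventional 5% significance cutoff. The function names, the sample sizes and the critical value of about 0.361 (the approximate two-tailed cutoff for 30 observations) are assumptions chosen for the example.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def count_spurious_hits(n_candidates=200, n_obs=30, r_critical=0.361, seed=0):
    """Count noise variables whose correlation with a noise target looks significant."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_candidates):
        # Each candidate is unrelated to the target by construction.
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(candidate, target)) > r_critical:
            hits += 1
    return hits

hits = count_spurious_hits()
print(f"{hits} of 200 pure-noise variables look 'significant'")
```

Around 5% of the candidates are expected to pass the cutoff by chance alone, so trying enough variables all but guarantees a "finding".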

## Estimation Error

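Reporting an estimate as if it were exact, without the margin of error that comes from measuring a sample rather than an entire population.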
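One common way to state estimation error is a confidence interval. The following is a minimal sketch for a survey proportion using the normal approximation. The poll figures (520 of 1,000 respondents) and the function name `proportion_ci` are hypothetical.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical poll: 520 of 1,000 respondents prefer option A.
low, high = proportion_ci(520, 1000)
print(f"estimate: 52.0%, interval: {low:.1%} to {high:.1%}")
```

The interval runs from roughly 49% to 55%, so a headline of "52% prefer option A" hides the fact that the data are also consistent with no majority at all.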

## Garbage In Garbage Out

Low quality data produces low quality statistical analysis.

## Omitting A Controversy

Failing to mention controversial assumptions in your data. For example, representing the result of a particular IQ test as "intelligence."

## Out Of Context Data

Using data without understanding its context.

## Overcomplexity

Graphs and data visualizations that are too complex to be interpreted by your audience. This may prevent data from being challenged and validated.

## Overfitting

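Fitting a model so closely to a particular data set that it captures random noise rather than the underlying pattern. An overfit model matches the data it was built on but predicts new data poorly.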
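A small sketch of overfitting in the modeling sense, where a model fits its training data so closely that it captures noise and predicts new data poorly. It compares a straight-line fit with a "memorizer" that returns the outcome of the closest training point. The data (a line with slope 2 plus noise), the seed and all function names are assumptions made for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def mse(predict, xs, ys):
    """Mean squared error of a prediction function."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(rng, n):
    """Simulated data: y = 2x plus noise with standard deviation 1."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

rng = random.Random(3)
train_x, train_y = make_data(rng, 200)
test_x, test_y = make_data(rng, 2000)

a, b = fit_line(train_x, train_y)

def line(x):
    return a + b * x

def memorizer(x):
    """Overfit model: repeat the outcome of the nearest training point, noise included."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
memo_train, memo_test = mse(memorizer, train_x, train_y), mse(memorizer, test_x, test_y)
print(f"line:      train {line_train:.2f}, test {line_test:.2f}")
print(f"memorizer: train {memo_train:.2f}, test {memo_test:.2f}")
```

The memorizer has zero error on the data it was built from, which looks impressive, yet it does worse than the simple line on fresh data because it has learned the noise.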

## Prosecutor's Fallacy

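Confusing the probability of the evidence given innocence with the probability of innocence given the evidence. For example, a forensic match that occurs by chance in one in a million people does not mean there is a one in a million chance that the defendant is innocent. More broadly, the term is used for an invalid interpretation of a valid statistic.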
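The prosecutor's fallacy is most often described as confusing the chance of a match given innocence with the chance of innocence given a match. A worked calculation with Bayes' rule shows how far apart the two can be. The sketch assumes exactly one guilty person who always matches, a pool of possible suspects who are all equally likely beforehand, and no other evidence. The numbers and the function name are hypothetical.

```python
from fractions import Fraction

def posterior_innocence(match_prob_if_innocent, population):
    """P(innocent | match), assuming one guilty person in `population`
    who always matches and a uniform prior over everyone."""
    innocents = population - 1
    expected_innocent_matches = innocents * match_prob_if_innocent
    # One guilty match plus the expected number of coincidental matches.
    return expected_innocent_matches / (expected_innocent_matches + 1)

# A test that falsely matches one innocent person in a million,
# applied to a pool of 10,000,001 possible suspects.
p = posterior_innocence(Fraction(1, 1_000_000), 10_000_001)
print(p, float(p))  # 10/11, about 0.909
```

Under these assumptions, a match statistic of "one in a million" still leaves about a 91% chance that a matching person is innocent, because about ten innocent people in the pool are expected to match as well.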

## Regression Toward The Mean

The tendency for an extreme result to be followed by results that are closer to the average. Neglecting regression toward the mean is a common error of statistical analysis, such as crediting an intervention for an improvement that would have occurred anyway.
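Regression toward the mean can be seen in a simple simulation. The sketch below assumes each test score is a fixed skill plus independent luck, selects the top 10% of scorers on a first test and looks at how the same people do on a second test. The model, the seed and the names are assumptions for illustration only.

```python
import random
from statistics import mean

def simulate(n=10_000, seed=1):
    """Return the top decile's average score on test 1 and on test 2."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]
    test1 = [s + rng.gauss(0, 1) for s in skill]  # skill plus luck
    test2 = [s + rng.gauss(0, 1) for s in skill]  # same skill, fresh luck
    top = sorted(range(n), key=lambda i: test1[i], reverse=True)[: n // 10]
    return mean(test1[i] for i in top), mean(test2[i] for i in top)

first, second = simulate()
print(f"top 10% on test 1: {first:.2f}, same people on test 2: {second:.2f}")
```

The top group still scores above the population average of zero on the second test, but noticeably lower than before, even though nothing about them changed. Any treatment applied between the two tests would appear to have made them worse.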

## Significance

Drawing conclusions from a sample that is too small to distinguish a real effect from random chance.

## Subject Matter Interpretations

Interpretations that are made without input from an expert on a topic. For example, a data analysis that suggests a particular vitamin has health benefits that is performed by a statistician without involvement of medical doctors and other professionals who might spot errors and alternative explanations for observed data.

## Tyranny Of Averages

A tendency for averages to convey little practical information about a data distribution. For example, an average is often greatly influenced by outliers in data.
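A short example of an outlier dominating an average, using made-up salary figures for nine typical employees and one executive:

```python
from statistics import mean, median

# Hypothetical annual salaries: nine typical employees and one executive.
salaries = [40_000] * 9 + [1_000_000]

average = mean(salaries)    # pulled upward by the single outlier
typical = median(salaries)  # unaffected by the outlier

print(f"mean:   {average:,.0f}")   # mean:   136,000
print(f"median: {typical:,.0f}")   # median: 40,000
```

The mean is more than three times what nine of the ten people earn, which is why a median or a view of the full distribution is often reported alongside it.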

## Selection Bias

Using a sample that is not representative of an entire population. For example, if you ask for volunteers for a study about personality, the sample would likely contain a selection bias because people who volunteer for studies may have different characteristics from the general population.

## Publication Bias

A tendency for studies with dramatic or statistically significant results to be published while studies that find no effect or merely confirm conventional theories are not.

## Hawthorne Effect

The Hawthorne effect is a tendency for people to modify their behavior while they are participating in a study. For example, if you are participating in a study that measures productivity, you may tend to work harder.

## Survivorship Bias

Considering only the winners in a particular population. For example, looking at the historical returns of all companies currently in a stock index without considering companies that were dropped from the index due to poor performance.
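The stock index example can be sketched with simulated data. The code below assumes company returns are drawn from a normal distribution and that any company returning less than -10% is dropped from the index. The distribution, the cutoff and the names are hypothetical.

```python
import random
from statistics import mean

def simulate_index(n_companies=1_000, drop_below=-0.10, seed=2):
    """Return the mean return of all companies and of the survivors only."""
    rng = random.Random(seed)
    returns = [rng.gauss(0.05, 0.20) for _ in range(n_companies)]
    # Companies with poor returns are removed from the index.
    survivors = [r for r in returns if r > drop_below]
    return mean(returns), mean(survivors)

all_mean, survivor_mean = simulate_index()
print(f"all companies: {all_mean:.1%}, current index members: {survivor_mean:.1%}")
```

Looking only at current index members overstates the historical return because the worst performers have been filtered out of the sample.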

## Confounding

Confounding occurs where it looks like there is a relationship between two variables because some third unidentified variable affects both. For example, if you find that people who drive grey cars get in fewer accidents, you might look for a confounding variable such as the average age of people who purchase a conservative color such as grey.
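The grey car example can be simulated. In the sketch below, age is the only thing that affects accident risk, and it also affects the chance of choosing a grey car. All of the probabilities and names are invented for illustration.

```python
import random

def simulate_drivers(n=100_000, seed=4):
    """Return counts[(older, grey)] = [drivers, accidents]."""
    rng = random.Random(seed)
    counts = {(o, g): [0, 0] for o in (True, False) for g in (True, False)}
    for _ in range(n):
        older = rng.random() < 0.5
        # Older drivers are assumed more likely to buy grey cars.
        grey = rng.random() < (0.6 if older else 0.2)
        # Accident risk depends on age only, never on car color.
        accident = rng.random() < (0.05 if older else 0.15)
        counts[(older, grey)][0] += 1
        counts[(older, grey)][1] += accident
    return counts

def rate(counts, keys):
    """Accident rate across the given (older, grey) groups."""
    drivers = sum(counts[k][0] for k in keys)
    accidents = sum(counts[k][1] for k in keys)
    return accidents / drivers

counts = simulate_drivers()
grey_rate = rate(counts, [(True, True), (False, True)])
other_rate = rate(counts, [(True, False), (False, False)])
older_gap = rate(counts, [(True, True)]) - rate(counts, [(True, False)])
younger_gap = rate(counts, [(False, True)]) - rate(counts, [(False, False)])

print(f"grey: {grey_rate:.1%}, other colors: {other_rate:.1%}")
print(f"gap among older drivers: {older_gap:+.1%}, among younger: {younger_gap:+.1%}")
```

Grey cars appear safer overall, but the gap all but disappears once drivers are compared within the same age group, which points to age rather than color as the explanation.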
