# 17 Misuses of Statistics

Updated on April 19, 2019
There are three kinds of lies: lies, damned lies and statistics. ~ popularized by Mark Twain
A misuse of statistics is a pattern of unsound statistical analysis. Misuses variously relate to data quality, statistical methods and interpretation. Statistics are occasionally misused on purpose to persuade, influence and sell. Misuse can also stem from analytical mistakes that lead to poor decisions and failed strategies. The following are common misuses of statistics.

## Apples & Oranges

Comparing things that are not comparable or using unfair or impractical criteria of comparison.

## Cognitive Biases

Misinterpreting numbers because of systematic errors in thinking, such as confirmation bias.

## Correlation vs Causation

The invalid assumption that because two things are correlated, one causes the other.

## Data Dredging

Looking for patterns in data using brute force methods that try a large number of statistical models until matches are found. Data dredging has valid applications for exploratory data analysis. However, it is generally considered poor practice to draw conclusions using data dredging as it tends to find random patterns that are meaningless.
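The effect can be sketched with a small simulation. The example below is an illustration under stated assumptions, not a standard procedure: the function names `pearson_r` and `dredge`, the sample sizes and the seed are all hypothetical. It tests 200 variables of pure random noise against a random target and counts how many clear a correlation cutoff of 0.444, which is roughly the two-sided 5% significance cutoff for 20 samples.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def dredge(n_samples=20, n_candidates=200, threshold=0.444, seed=0):
    """Return (index, r) for every noise variable that 'correlates' with the target."""
    rng = random.Random(seed)
    # The target and every candidate are independent noise: no real relationship exists.
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = []
    for i in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_samples)]
        r = pearson_r(candidate, target)
        if abs(r) > threshold:
            hits.append((i, r))
    return hits

hits = dredge()
print(f"{len(hits)} of 200 pure-noise variables look 'significant'")
```

At a 5% cutoff, around 10 of the 200 noise variables would be expected to pass by chance alone, which is why patterns found this way need to be confirmed on fresh data.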

## Estimation Error

Neglecting estimation error in results, such as reporting an estimate without its margin of error or confidence interval.
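As a rough illustration, consider a hypothetical poll in which 52% of 400 respondents favor an option. The sketch below computes an approximate 95% interval using the normal approximation for a proportion. The function name `proportion_interval` and the poll numbers are assumptions made for this example.

```python
import math

def proportion_interval(p_hat, n, z=1.96):
    """Approximate confidence interval for a proportion (normal approximation)."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(0.52, 400)
print(f"52% of 400 respondents: 95% interval roughly {low:.3f} to {high:.3f}")
```

The interval runs from about 47% to 57%, so reporting "a majority favors the option" would ignore the estimation error.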

## Garbage In Garbage Out

Low-quality data produces low-quality statistical analysis.

## Omitting A Controversy

Failing to mention controversial assumptions in your data. For example, representing the result of a particular IQ test as "intelligence."

## Out Of Context Data

Using data without understanding its context.

## Overcomplexity

Graphs and data visualizations that are too complex to be interpreted by your audience. This may prevent data from being challenged and validated.

## Overfitting

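Fitting a model so closely to one set of data that it captures random noise rather than the underlying pattern and performs poorly on new data. Testing too many theories against the same data has a similar effect, as random patterns are sure to be found.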

## Prosecutor's Fallacy

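Confusing the probability of the evidence given innocence with the probability of innocence given the evidence. For example, a 1 in 10,000 chance that a random person matches the evidence does not mean there is a 1 in 10,000 chance that a matching suspect is innocent. More generally, an invalid interpretation of a valid statistic.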
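A commonly cited instance of the prosecutor's fallacy is treating the chance of a random match as the chance that a matching suspect is innocent. The sketch below works through the arithmetic with Bayes' rule. It assumes a hypothetical match probability of 1 in 10,000, a pool of 1,000,000 people who are equally likely to be the source before the evidence is considered, and a true source who always matches. The function name is also an assumption for this example.

```python
def probability_source_given_match(match_prob, population):
    """P(suspect is the source | suspect matches), assuming a uniform prior
    over the population and a true source who always matches."""
    innocent_people = population - 1
    # Expected number of innocent people who match purely by chance.
    expected_innocent_matches = innocent_people * match_prob
    # One true source competes with all of the chance matches.
    return 1 / (1 + expected_innocent_matches)

posterior = probability_source_given_match(match_prob=1 / 10_000, population=1_000_000)
print(f"Probability the matching suspect is the source: {posterior:.4f}")
```

Under these assumptions about 100 innocent people would also match, so the match alone puts the probability that the suspect is the source at about 1%, not 99.99%.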

## Regression Toward The Mean

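The tendency for an extreme result to be followed by results that are closer to the average. For example, a team with an unusually good season will usually do worse the next season. Neglecting regression toward the mean is a common error of statistical analysis, as natural variation is mistaken for the effect of some change.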
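A small simulation can make regression toward the mean concrete. The sketch below assumes a hypothetical model in which each test score is a stable skill level plus random luck, with made-up means, spreads and seed. It picks the top 10% of scorers on a first test and compares their average on a second test.

```python
import random
import statistics

rng = random.Random(42)
n = 10_000

# Hypothetical model: score = stable skill + independent luck on each test.
skill = [rng.gauss(100, 10) for _ in range(n)]
test1 = [s + rng.gauss(0, 10) for s in skill]
test2 = [s + rng.gauss(0, 10) for s in skill]

# Select the people who scored in the top 10% on the first test.
cutoff = sorted(test1)[int(0.9 * n)]
top = [i for i in range(n) if test1[i] >= cutoff]

top_first = statistics.mean([test1[i] for i in top])
top_second = statistics.mean([test2[i] for i in top])
overall_second = statistics.mean(test2)

print(f"Top group, first test:  {top_first:.1f}")
print(f"Top group, second test: {top_second:.1f}")
print(f"Everyone, second test:  {overall_second:.1f}")
```

The top group still scores above average on the second test, but noticeably lower than on the first, even though nothing about the group changed. Part of what made their first scores extreme was luck, and luck does not repeat.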

## Significance

Drawing conclusions from a sample that is too small to produce statistically significant results.

## Subject Matter Interpretations

Interpretations that are made without input from an expert on a topic. For example, a statistician concludes that a particular vitamin has health benefits without involving medical doctors and other professionals who might spot errors and alternative explanations for the observed data.

## Tyranny Of Averages

A tendency for averages to convey little practical information about a data distribution. For example, a mean can be greatly influenced by a few outliers in data.
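The influence of an outlier is easy to show with a small, made-up data set. The sketch below uses hypothetical incomes in thousands, nine ordinary values and one very large one, and compares the mean with the median.

```python
import statistics

# Hypothetical incomes in thousands: nine typical values and one outlier.
incomes = [30, 32, 35, 38, 40, 42, 45, 48, 50, 5000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"Mean:   {mean_income}")
print(f"Median: {median_income}")
```

The mean is 536 while the median is 41. Nobody in the data set earns anything close to the mean, so reporting it alone would misdescribe every person in the group.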

