“The probability that the results are due to chance”

One of the (wrong) explanations that you often see of what a p-value means is “the probability that data have arisen by chance.” I think people may struggle to see why this is wrong, as I did for a long time. A p-value is the probability of getting the data (or more extreme data) if the null hypothesis (no difference) is correct – right? So that would mean the specific result you got must have been due to chance variation, doesn’t it? So why isn’t the p-value the probability that the result was due to chance?

The problem is that there are two ways of interpreting “the probability that a result is due to chance.”
1. The probability that chance or random variation was the process that produced the result;
2. The probability of getting the specific data (or more extreme data) that you got in your experiment, if chance was the only process operating.

The second of these is what the p-value tells you; but the first is the interpretation that most people give it. The p-value tells you nothing about the process that produced the result, because it is calculated on the assumption that the null hypothesis is correct.

Original post: http://blogs.warwick.ac.uk/simongates/entry/8220the_probability_that/ 12 November 2016


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s