
> What it can tell you is the probability of seeing the data assuming the null hypothesis is true.

Sigh. When will internet explainers ever get this right? :-) Your explanation cannot be literally correct because, if the null hypothesis is true (the coin is fair), the "probability of seeing the data" is exactly the same for all possible sequences of six coinflips. What hypothesis testing actually does is arbitrarily pick a class of data that's "at least as extreme" as what was actually seen, and report its total probability.
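To make the equiprobability point concrete, here is a quick sketch (the second sequence is just an arbitrary example for comparison): under a fair-coin null, every specific six-flip sequence has the same probability, (1/2)^6 = 1/64, so the raw "probability of the data" cannot by itself mark any outcome as special.

```python
# Under a fair-coin null, every specific six-flip sequence is
# equally likely, so the probability of the observed sequence
# alone cannot distinguish TTTTTH from, say, HTHTHT.
p_ttttth = 0.5 ** 6  # P(TTTTTH) for a fair coin
p_hththt = 0.5 ** 6  # P(HTHTHT) for a fair coin
print(p_ttttth, p_hththt)  # both 1/64 = 0.015625
```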

To quote Eliezer Yudkowsky rephrasing Steven Goodman's example:

> So lo and behold, I flip the coin six times, and I get the result TTTTTH. Is this result statistically significant, and if so, what is the p-value - that is, the probability of obtaining a result at least this extreme? Well, that depends. Was I planning to flip the coin six times, and count the number of tails? Or was I planning to flip the coin until it came up heads, and count the number of trials? In the first case, the probability of getting "five tails or more" from a fair coin is 11%, while in the second case, the probability of a fair coin requiring "at least five tails before seeing one heads" is 3%.
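The two p-values in the quote can be reproduced directly. A quick Python sketch, assuming the two stopping rules as described (fixed six flips vs. flip-until-heads):

```python
from math import comb

n, k = 6, 5  # six planned flips, five tails observed

# Stopping rule 1: fixed n = 6 flips, count tails.
# p-value = P(at least 5 tails in 6 fair flips)
p_fixed_n = sum(comb(n, t) for t in range(k, n + 1)) / 2 ** n
print(p_fixed_n)  # 7/64 = 0.109375, the ~11% figure

# Stopping rule 2: flip until the first heads appears.
# p-value = P(at least 5 tails before the first heads)
#         = P(first 5 flips are all tails)
p_until_heads = 0.5 ** k
print(p_until_heads)  # 1/32 = 0.03125, the ~3% figure
```

The same data yield two different p-values because "at least as extreme" is defined relative to the sampling plan, which is exactly the arbitrariness being pointed out.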

Also see the "voltmeter story": http://en.wikipedia.org/wiki/Likelihood_principle#The_voltme... .



> arbitrarily pick a class of data that's "at least as extreme" as what was actually seen

You are absolutely correct; in my haste to explain one misinterpretation of p-values, I stumbled into another gross oversimplification. A correct statement must always involve some language about the deviation or extremity of the data being at least as great as what was observed.

It goes to show that properly reporting p-value statistics takes lawyerly care with language; while writing the post I had to consider what the mathematical interpretation of "is a fluke" should be. I think the definition I chose is what most people will take it to mean. However, the word "fluke" on its own just means "unlikely chance occurrence," so saying "there is a 1% chance the result is an unlikely chance occurrence" is uninformative if taken literally. It has to imply one of two things:

1. You are referring to the null hypothesis as the unlikely occurrence, i.e. "There is a 1% chance the null hypothesis is true [therefore making this data surprising]."

2. You are temporarily assuming it is true to make such a statement, i.e. "There is a 1% chance of seeing data at least this surprising [if the null hypothesis were true]."

Someone who does know what a p-value is might be generous and assume you meant the latter, which is a correct statement. However, I think most people hear it the first way.

The fact that p-value reporting (when properly done) involves thinking through double negatives and easily overlooked conditions probably indicates that it's time for other measures of significance to gain wider acceptance.



