
> What it can tell you is the probability of seeing the data assuming the null hypothesis is true.

Sigh. When will internet explainers ever get this right? :-) Your explanation cannot be literally correct because, if the null hypothesis is true (the coin is fair), the "probability of seeing the data" is exactly the same for all possible sequences of six coinflips. What hypothesis testing actually does is arbitrarily pick a class of data that's "at least as extreme" as what was actually seen, and report its total probability.
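To make the equiprobability point concrete, here is a quick sketch (the second sequence is just an arbitrary example for comparison): under a fair-coin null, every specific six-flip sequence has the same probability, (1/2)^6 = 1/64, so the raw "probability of the data" cannot by itself mark any outcome as special.

```python
# Under a fair-coin null, every specific six-flip sequence is
# equally likely, so the probability of the observed sequence
# alone cannot distinguish TTTTTH from, say, HTHTHT.
p_ttttth = 0.5 ** 6  # P(TTTTTH) for a fair coin
p_hththt = 0.5 ** 6  # P(HTHTHT) for a fair coin
print(p_ttttth, p_hththt)  # both 1/64 = 0.015625
```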

To quote Eliezer Yudkowsky rephrasing Steven Goodman's example:

> So lo and behold, I flip the coin six times, and I get the result TTTTTH. Is this result statistically significant, and if so, what is the p-value - that is, the probability of obtaining a result at least this extreme? Well, that depends. Was I planning to flip the coin six times, and count the number of tails? Or was I planning to flip the coin until it came up heads, and count the number of trials? In the first case, the probability of getting "five tails or more" from a fair coin is 11%, while in the second case, the probability of a fair coin requiring "at least five tails before seeing one heads" is 3%.
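The two p-values in the quote can be reproduced directly. A quick Python sketch, assuming the two stopping rules as described (fixed six flips vs. flip-until-heads):

```python
from math import comb

n, k = 6, 5  # six planned flips, five tails observed

# Stopping rule 1: fixed n = 6 flips, count tails.
# p-value = P(at least 5 tails in 6 fair flips)
p_fixed_n = sum(comb(n, t) for t in range(k, n + 1)) / 2 ** n
print(p_fixed_n)  # 7/64 = 0.109375, the ~11% figure

# Stopping rule 2: flip until the first heads appears.
# p-value = P(at least 5 tails before the first heads)
#         = P(first 5 flips are all tails)
p_until_heads = 0.5 ** k
print(p_until_heads)  # 1/32 = 0.03125, the ~3% figure
```

The same data yield two different p-values because "at least as extreme" is defined relative to the sampling plan, which is exactly the arbitrariness being pointed out.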

Also see the "voltmeter story": http://en.wikipedia.org/wiki/Likelihood_principle#The_voltme... .



> arbitrarily pick a class of data that's "at least as extreme" as what was actually seen

You are absolutely correct; in my haste to explain one misinterpretation of p-values, I stumbled into another gross oversimplification. A correct statement must always involve some language about the deviation or extremity of the data being at least as great as what was observed.

It goes to show that properly reporting p-value statistics takes lawyerly care with language; while writing the post I had to consider what the mathematical interpretation of "is a fluke" should be. I think the definition I chose is what most people will take it to mean. However, the word "fluke" on its own just means "unlikely chance occurrence," so saying "there is a 1% chance the result is an unlikely chance occurrence" is uninformative if taken literally. It has to imply one of two things:

1. You are referring to the null hypothesis as the unlikely occurrence, i.e. "There is a 1% chance the null hypothesis is true [therefore making this data surprising]."

2. You are temporarily assuming it is true to make such a statement, i.e. "There is a 1% chance of seeing data at least this surprising [if the null hypothesis were true]."

Someone who does know what a p-value is might be generous and assume you meant the latter, which is a correct statement. However, I think most people hear it the first way.

The fact that p-value reporting (when properly done) involves thinking through double negatives and easily overlooked conditions probably indicates that it's time for other measures of significance to gain wider acceptance.



