Everything about P-value totally explained
In statistical hypothesis testing, the p-value is the probability of obtaining a value of the test statistic at least as extreme as the one that was actually observed, given that the null hypothesis is true. The fact that p-values are based on this assumption is crucial to their correct interpretation.
More technically, a p-value of an experiment is a random variable defined over the sample space of the experiment such that its distribution under the null hypothesis is uniform on the interval [0,1] (exactly so when the test statistic is continuous; for discrete statistics, such as a count of heads, only approximately). Many p-values can be defined for the same experiment.
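The uniformity property can be checked by simulation. The following is a minimal sketch, not taken from the original article: it assumes a two-sided z-test of a zero mean with known unit variance, and the function names, sample size, and seed are all hypothetical choices made for illustration. If the p-values are uniform under the null hypothesis, the share falling at or below any threshold should be close to that threshold.

```python
import random
from statistics import NormalDist


def two_sided_z_p_value(sample):
    """Two-sided p-value for H0: mean = 0, assuming known unit variance."""
    n = len(sample)
    z = sum(sample) / n * n ** 0.5  # sample mean scaled by sqrt(n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_null_p_values(n_experiments=20000, sample_size=10, seed=0):
    """Run many experiments in which the null hypothesis really is true."""
    rng = random.Random(seed)
    return [
        two_sided_z_p_value([rng.gauss(0, 1) for _ in range(sample_size)])
        for _ in range(n_experiments)
    ]


p_values = simulate_null_p_values()
for threshold in (0.05, 0.25, 0.5):
    share = sum(p <= threshold for p in p_values) / len(p_values)
    print(f"share of p-values <= {threshold}: {share:.3f}")
```

A continuous test statistic is used here on purpose: with a discrete statistic the p-value can take only finitely many values, so the match to the uniform distribution is only approximate.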
Coin flipping example
For example, say an experiment is performed to determine whether a coin flip is fair (50% chance of landing heads or tails) or unfairly biased, either toward heads (> 50% chance of landing heads) or toward tails (< 50% chance of landing heads). Since we consider both biased alternatives, a two-tailed test is performed. The null hypothesis is that the coin is fair, and that any deviations from the 50% rate can be ascribed to chance alone. Suppose that the experimental results show the coin turning up heads 14 times out of 20 total flips. The p-value of this result would be the chance of a fair coin landing on heads at least 14 times out of 20 flips plus the chance of a fair coin landing on heads 6 or fewer times out of 20 flips. In this case the test statistic T, the number of heads, has a binomial distribution under the null hypothesis. The probability that 20 flips of a fair coin would result in 14 or more heads is 0.0577. By symmetry, the probability that 20 flips of the coin would result in 14 or more heads or 6 or fewer heads is 0.0577 × 2 = 0.115.
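The two tail probabilities in the coin example can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library; the function and variable names are hypothetical and chosen for illustration.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lower_tail(n, k, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


n_flips = 20
heads = 14

upper = upper_tail(n_flips, heads)            # 14 or more heads
lower = lower_tail(n_flips, n_flips - heads)  # 6 or fewer heads
two_tailed = upper + lower

print(round(upper, 4))       # 0.0577
print(round(two_tailed, 3))  # 0.115
```

For a fair coin the two tails are equal, which is why the text can simply double the one-tailed value.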
Interpretation
Generally, one rejects the null hypothesis if the p-value is smaller than or equal to the significance level, often represented by the Greek letter α (alpha). With a level of 0.05, the null hypothesis is rejected only when a result at least as extreme as the one observed would occur no more than 5% of the time if the null hypothesis were true.
In the above example we have:
- null hypothesis (H0): fair coin;
- observation (O): 14 heads out of 20 flips; and
- p-value: the probability, given H0, of a result at least as extreme as O, which is 0.0577 × 2 (two-tailed) ≈ 0.115 = 11.5%.
The calculated p-value exceeds 0.05, so the observation is consistent with the null hypothesis — that the observed result of 14 heads out of 20 flips can be ascribed to chance alone — as it falls within the range of what would happen 95% of the time were this in fact the case. In our example, we fail to reject the null hypothesis at the 5% level. Although the coin didn't fall evenly, the deviation from expected outcome is just small enough to be reported as being "not statistically significant at the 5% level".
However, had a single extra head been obtained, the resulting p-value (two-tailed) would be 0.0414 (4.14%). This time the null hypothesis - that the observed result of 15 heads out of 20 flips can be ascribed to chance alone - is rejected. Such a finding would be described as being "statistically significant at the 5% level".
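The comparison between 14 and 15 heads can be sketched as follows. This is an illustrative example rather than part of the original article: the helper names are hypothetical, and the doubling of the upper tail relies on the symmetry of the fair-coin null hypothesis.

```python
from math import comb


def two_tailed_p_value(n_flips, heads):
    """Two-tailed p-value for H0: fair coin, using the symmetry of Binomial(n, 0.5)."""
    extreme = max(heads, n_flips - heads)  # distance from n/2, folded to the upper tail
    upper = sum(comb(n_flips, i) for i in range(extreme, n_flips + 1)) / 2**n_flips
    return min(1.0, 2 * upper)  # cap at 1 for results at the centre


def decide(p_value, alpha=0.05):
    """Apply the rule from the text: reject H0 when p-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


for heads in (14, 15):
    p = two_tailed_p_value(20, heads)
    print(heads, round(p, 4), decide(p))
# 14 0.1153 fail to reject H0
# 15 0.0414 reject H0
```

The significance level alpha is passed in as a parameter fixed beforehand, so the decision depends only on the comparison and not on how the p-value was obtained.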
Critics of p-values point out that the criterion used to decide "statistical significance" is based on the somewhat arbitrary choice of level (often set at 0.05). A proposed replacement for the p-value is p-rep. It is necessary to use a reasonable null hypothesis to assess the result fairly. The choice of null hypothesis entails assumptions.
Frequent misunderstandings
Comparing the p-value to a significance level yields one of two results: either the null hypothesis is rejected, or the null hypothesis cannot be rejected at that significance level. The null hypothesis cannot be accepted simply on the basis of that comparison (here, 11% > 5%); other tests, such as some "goodness of fit" tests, would have to be performed. It would be irresponsible to conclude that the null hypothesis must be accepted merely because the p-value is larger than the chosen significance level.
The use of p-values is widespread; however, such use has come under heavy criticism due both to its inherent shortcomings and the potential for misinterpretation.
There are several common misunderstandings about p-values.
1. The p-value is not the probability that the null hypothesis is true (a belief sometimes invoked to justify the "rule" of treating p-values close to zero as significant). In fact, frequentist statistics does not, and cannot, attach probabilities to hypotheses. Comparison of Bayesian and classical approaches shows that a p-value can be very close to zero while the posterior probability of the null is very close to unity. This is the Jeffreys-Lindley paradox.
2. The p-value is not the probability that a finding is "merely a fluke" (again, a belief used to justify the "rule" of considering small p-values as "significant"). As the calculation of a p-value is based on the assumption that a finding is the product of chance alone, it cannot simultaneously be used to gauge the probability of that assumption being true. What the p-value does give is the probability, assuming chance alone is at work, of a result at least as extreme as the one observed.
3. The p-value is not the probability of falsely rejecting the null hypothesis. This error is a version of the so-called prosecutor's fallacy.
4. The p-value is not the probability that a replicating experiment wouldn't yield the same conclusion.
5. 1 − (p-value) is not the probability of the alternative hypothesis being true (see item 1).
6. The significance level of the test is not determined by the p-value. The significance level of a test is a value that should be decided upon by the agent interpreting the data before the data are viewed, and is compared against the p-value or any other statistic calculated after the test has been performed.
7. The p-value doesn't indicate the size or importance of the observed effect (compare with effect size).
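The last point, that a p-value does not measure effect size, can be illustrated with the coin example. The sketch below is an assumption-laden illustration, not data from the original article: the second experiment (1100 heads in 2000 flips) is invented for contrast, and the helper names are hypothetical.

```python
from math import comb


def two_tailed_p_value(n_flips, heads):
    """Two-tailed p-value for H0: fair coin, using exact integer arithmetic."""
    extreme = max(heads, n_flips - heads)
    # Sum the tail counts as integers first to avoid floating-point underflow.
    tail_count = sum(comb(n_flips, i) for i in range(extreme, n_flips + 1))
    return min(1.0, 2 * tail_count / 2**n_flips)


experiments = {
    "small sample, large effect": (20, 14),      # 70% heads, from the text
    "large sample, small effect": (2000, 1100),  # 55% heads, hypothetical
}

for label, (n_flips, heads) in experiments.items():
    effect = heads / n_flips - 0.5  # deviation from the fair-coin rate
    p = two_tailed_p_value(n_flips, heads)
    print(f"{label}: effect = {effect:+.2f}, p-value = {p:.2g}")
```

The larger deviation from 50% comes with the larger p-value here, because the p-value also depends on the number of flips; the size of the effect has to be reported separately.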