In one sense, our hypothesis test is complete. We've constructed a test statistic, figured out its sampling distribution if the null hypothesis is true, and then constructed the critical region for the test. Nevertheless, I've actually omitted the most important number of all, the p-value. It is to this topic that we now turn. There are two somewhat different ways of interpreting a p-value, one proposed by Sir Ronald Fisher and the other by Jerzy Neyman. Both versions are legitimate, though they reflect very different ways of thinking about hypothesis tests. Most introductory textbooks tend to give Fisher's version only, but I think that's a bit of a shame. To my mind, Neyman's version is cleaner and actually better reflects the logic of the null hypothesis test. You might disagree though, so I've included both. I'll start with Neyman's version.
A softer view of decision making
One problem with the hypothesis testing procedure that I've described is that it makes no distinction at all between a result that is "barely significant" and one that is "highly significant". For instance, in my ESP study the data I obtained only just fell inside the critical region, so I did get a significant effect, but it was a pretty near thing. In contrast, suppose that I'd run a study in which X = 97 out of my N = 100 participants got the answer right. This would obviously be significant too, but by a much larger margin, such that there's really no ambiguity about this at all. The procedure that I have already described makes no distinction between the two. If I adopt the standard convention of allowing α = 0.05 as my acceptable Type I error rate, then both of these are significant results.
This is where the p-value comes in handy. To understand how it works, let's suppose that we ran lots of hypothesis tests on the same data set, but with a different value of α in each case. When we do that for my original ESP data, what we'd get is something like this:
Value of α       | 0.05 | 0.04 | 0.03 | 0.02 | 0.01
Reject the null? | Yes  | Yes  | Yes  | No   | No
When we test the ESP data (X = 62 successes out of N = 100 observations), using α levels of 0.03 and above we'd always find ourselves rejecting the null hypothesis. For α levels of 0.02 and below we always end up retaining the null hypothesis. Therefore, somewhere between 0.02 and 0.03 there must be a smallest value of α that would allow us to reject the null hypothesis for this data. This is the p-value. As it turns out, the ESP data has p = 0.021. In short,
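To make this concrete, here is a short sketch of how the calculation might be run in Python. The use of scipy is my assumption; the text itself doesn't tie the calculation to any particular software.

```python
# Hypothetical check of the ESP numbers above, using scipy (an
# assumption -- the text doesn't prescribe any particular software).
from scipy.stats import binom

N, X, p_null = 100, 62, 0.5   # trials, observed successes, chance rate

# Two-sided p-value: the upper tail P(X >= 62), doubled, because
# Binomial(100, 0.5) is symmetric around 50.
p_value = 2 * binom.sf(X - 1, N, p_null)   # sf(k) = P(X > k)
print(round(p_value, 3))                   # about 0.021

# Re-running the "many tests" idea from the table above:
for alpha in (0.05, 0.04, 0.03, 0.02, 0.01):
    print(alpha, "reject" if p_value <= alpha else "retain")
```

As the loop shows, the same p-value settles every one of the tests in the table at once: reject whenever p ≤ α, retain otherwise.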
p is defined to be the smallest Type I error rate (α) that you have to be willing to tolerate if you want to reject the null hypothesis.
If it turns out that p describes an error rate that you find intolerable, then you must retain the null. If you're comfortable with an error rate equal to p, then it's okay to reject the null hypothesis in favour of your preferred alternative.
In effect, p is a summary of all the possible hypothesis tests that you could have run, taken across all possible α values. And as a consequence it has the effect of "softening" our decision process. For those tests in which p ≤ α you would have rejected the null hypothesis, whereas for those tests in which p > α you would have retained the null. In my ESP study I obtained X = 62, and as a consequence I've ended up with p = 0.021. So the error rate I have to tolerate is 2.1%. In contrast, suppose my experiment had yielded X = 97. What happens to my p-value now? This time it's shrunk to p = 1.36 × 10⁻²⁵, which is a tiny, tiny[1] Type I error rate. For this second case I would be able to reject the null hypothesis with a lot more confidence, because I only have to be "willing" to tolerate a Type I error rate of about 1 in 10 trillion trillion in order to justify my decision to reject.
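The same sketch can be applied to the hypothetical X = 97 result (again assuming scipy). One caveat: the exact figure depends on how the two tails are combined, so the take-away is the order of magnitude, around 10⁻²⁵, rather than the precise digits.

```python
from scipy.stats import binom

# Same two-sided tail calculation as before, now with 97 successes.
# The exact value depends on the tail convention; what matters is
# that it is on the order of 1e-25 -- an absurdly small error rate.
p_value_97 = 2 * binom.sf(97 - 1, 100, 0.5)
print(p_value_97)
```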
The probability of extreme data
The second definition of the p-value comes from Sir Ronald Fisher, and it's actually this one that you tend to see in most introductory statistics textbooks. Notice how, when I constructed the critical region, it corresponded to the tails (i.e., extreme values) of the sampling distribution? That's not a coincidence: almost all "good" tests have this characteristic (good in the sense of minimising our Type II error rate, β). The reason for that is that a good critical region almost always corresponds to those values of the test statistic that are least likely to be observed if the null hypothesis is true. If this rule is true, then we can define the p-value as the probability that we would have observed a test statistic that is at least as extreme as the one we actually did get. In other words, if the data are extremely implausible according to the null hypothesis, then the null hypothesis is probably wrong.
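Fisher's reading can be made concrete with the same kind of sketch (scipy again being my assumption): enumerate every outcome at least as far from the null expectation of 50 as the observed X = 62, and add up their probabilities under the null.

```python
from scipy.stats import binom

N, p_null, X_obs = 100, 0.5, 62
expected = N * p_null             # 50 successes expected under the null
distance = abs(X_obs - expected)  # observed outcome is 12 away from 50

# "At least as extreme": every outcome at least 12 away from 50,
# in either direction (x <= 38 or x >= 62).
extreme = [x for x in range(N + 1) if abs(x - expected) >= distance]
p_value = sum(binom.pmf(x, N, p_null) for x in extreme)
print(round(p_value, 3))   # about 0.021, matching the Neyman calculation
```

That the two definitions land on the same number for this symmetric sampling distribution is the point: they are different interpretations of one quantity, not two different quantities.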
A common mistake
Okay, so you can see that there are two rather different but legitimate ways to interpret the p-value, one based on Neyman's approach to hypothesis testing and the other based on Fisher's. Unfortunately, there is a third explanation that people sometimes give, especially when they're first learning statistics, and it is absolutely and completely wrong. This mistaken approach is to refer to the p-value as "the probability that the null hypothesis is true". It's an intuitively appealing way to think, but it's wrong in two key respects. First, null hypothesis testing is a frequentist tool, and the frequentist approach to probability does not allow you to assign probabilities to the null hypothesis. According to this view of probability, the null hypothesis is either true or it is not; it cannot have a "5% chance" of being true. Second, even within the Bayesian approach, which does let you assign probabilities to hypotheses, the p-value would not correspond to the probability that the null is true. This interpretation is entirely inconsistent with the mathematics of how the p-value is calculated. Put bluntly, despite its intuitive appeal, there is no justification for interpreting a p-value as the probability that the null hypothesis is true. Never do it.
[1] That's p = 0.000000000000000000000000136 for folks who don't like scientific notation!