**Statistical Hypothesis Testing**

A **statistical hypothesis test** is a method of making decisions using data, whether from a controlled experiment or an uncontrolled observational study. In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, as judged against a predetermined threshold probability called the significance level. The phrase "test of significance" was coined by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first."

These tests are used to determine which outcomes of an experiment would lead to rejection of the null hypothesis at a pre-specified level of significance. They help decide whether experimental results contain enough information to cast doubt on conventional wisdom. Hypothesis testing is sometimes called **confirmatory data analysis**, in contrast to exploratory data analysis.

Statistical hypothesis tests answer the question: *Assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at least as extreme as the value that was actually observed?* That probability is known as the p-value.
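As a rough illustration of this definition, the sketch below computes an exact one-sided p-value for a simple binomial setting. The scenario (20 coin flips, 15 heads, a null hypothesis that the coin is fair) and the function name `upper_tail_p_value` are assumptions chosen for the example, not part of the text above; "at least as extreme" is taken here to mean "at least as many heads".

```python
from math import comb


def upper_tail_p_value(observed, n, p0):
    """P(X >= observed) for X ~ Binomial(n, p0), i.e. under the null hypothesis."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(observed, n + 1)
    )


# Hypothetical data: 15 heads in 20 flips, null hypothesis p0 = 0.5.
p_value = upper_tail_p_value(15, 20, 0.5)
alpha = 0.05  # significance level, fixed before looking at the data

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

With these assumed numbers the p-value is about 0.0207, which falls below the 0.05 significance level, so the null hypothesis of a fair coin would be rejected in this one-sided test.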

Statistical hypothesis testing is a key technique of frequentist statistical inference. The Bayesian approach to hypothesis testing is to base rejection of the hypothesis on the posterior probability. Other approaches to reaching a decision based on data are available via decision theory and optimal decisions.

The *critical region* of a hypothesis test is the set of all outcomes that cause the null hypothesis to be rejected in favor of the alternative hypothesis.
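A critical region can be made concrete with a small sketch. The example below assumes a one-sided (upper-tailed) binomial test with 20 trials, a null success probability of 0.5, and a significance level of 0.05; these values and the helper names `upper_tail` and `critical_region` are illustrative assumptions. It searches for the largest set of high counts whose total probability under the null hypothesis does not exceed the significance level.

```python
from math import comb


def upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def critical_region(n, p0, alpha):
    """Upper-tailed critical region {c, ..., n} with P(X >= c) <= alpha under H0.

    The tail probability shrinks as c grows, so the first c that satisfies
    the bound gives the largest such region. Returns an empty list if even
    the single outcome n is more probable than alpha.
    """
    for c in range(n + 1):
        if upper_tail(c, n, p0) <= alpha:
            return list(range(c, n + 1))
    return []


# Assumed setting: 20 trials, H0: p = 0.5, significance level 0.05.
region = critical_region(20, 0.5, 0.05)
print(region)
```

Under these assumptions the region is the set of counts from 15 to 20: observing 14 successes would not lead to rejection (its tail probability is about 0.058), while observing 15 or more would.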

