Statistical Hypothesis Testing - Definition of Terms

The following definitions are mainly based on the exposition in the book by Lehmann and Romano:

Statistical hypothesis
A statement about the parameters describing a population (not a sample).
Statistic
A value calculated from a sample, often to summarize the sample for comparison purposes.
Simple hypothesis
Any hypothesis which specifies the population distribution completely.
Composite hypothesis
Any hypothesis which does not specify the population distribution completely.
Null hypothesis (H0)
A hypothesis (often simple) associated with a contradiction to a theory one would like to prove.
Alternative hypothesis (H1)
A hypothesis (often composite) associated with a theory one would like to prove.
Statistical test
A procedure whose inputs are samples and whose result is a decision about a hypothesis: to reject the null hypothesis or to fail to reject it.
Region of acceptance
The set of values of the test statistic for which we fail to reject the null hypothesis.
Region of rejection / Critical region
The set of values of the test statistic for which the null hypothesis is rejected.
Critical value
The threshold value delimiting the regions of acceptance and rejection for the test statistic.
Power of a test (1 − β)
The test's probability of correctly rejecting the null hypothesis when it is false. The complement of the false negative rate, β. Power is termed sensitivity in biostatistics. ("This is a sensitive test. Because the result is negative, we can confidently say that the patient does not have the condition.") See sensitivity and specificity and Type I and type II errors for exhaustive definitions.
Size / Significance level of a test (α)
For simple hypotheses, this is the test's probability of incorrectly rejecting the null hypothesis, that is, the false positive rate. For composite hypotheses, this is the least upper bound (supremum) of the probability of rejecting the null hypothesis over all cases covered by the null hypothesis. The complement of the false positive rate, (1 − α), is termed specificity in biostatistics. ("This is a specific test. Because the result is positive, we can confidently say that the patient has the condition.") See sensitivity and specificity and Type I and type II errors for exhaustive definitions.
p-value
The probability, assuming the null hypothesis is true, of observing a value of the test statistic at least as extreme as the one actually observed.
Statistical significance test
A predecessor to the statistical hypothesis test (see the Origins section). An experimental result was said to be statistically significant if a sample was sufficiently inconsistent with the (null) hypothesis. This was variously considered common sense, a pragmatic heuristic for identifying meaningful experimental results, a convention establishing a threshold of statistical evidence, or a method for drawing conclusions from data. The statistical hypothesis test added mathematical rigor and philosophical consistency to the concept by making the alternative hypothesis explicit. The term is loosely used to describe the modern version, which is now part of statistical hypothesis testing.
Conservative test
A test is conservative if, when constructed for a given nominal significance level, the true probability of incorrectly rejecting the null hypothesis is never greater than the nominal level.
Exact test
A test in which the significance level or critical value can be computed exactly and without any approximation. In some contexts this term is restricted to tests applied to categorical data and to permutation tests, in which computations are carried out by complete enumeration of all possible outcomes and their probabilities.
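
Several of the terms above can be made concrete with a small exact test. The following is a minimal sketch, not taken from Lehmann and Romano: it assumes a one-sided binomial test of H0: p = 0.5 against H1: p > 0.5 with n = 20 trials and a nominal level of 0.05, and the function names (binom_pmf, upper_tail, critical_value) and the observed count of 16 are hypothetical choices made for illustration. The test statistic is the number of successes. Because that statistic is discrete, the true size falls below the nominal level, so the test is both exact and conservative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by complete enumeration."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def critical_value(n, p0, alpha):
    """Smallest c such that P(X >= c | p = p0) <= alpha.

    The region of rejection is {c, ..., n}; the region of
    acceptance is {0, ..., c - 1}.
    """
    for c in range(n + 2):
        if upper_tail(c, n, p0) <= alpha:
            return c

# Illustrative settings (assumptions, not from the source text).
n, p0, alpha = 20, 0.5, 0.05

c = critical_value(n, p0, alpha)   # critical value
size = upper_tail(c, n, p0)        # true size: P(reject H0 | H0 true)
observed = 16                      # hypothetical observed number of successes
p_value = upper_tail(observed, n, p0)
power = upper_tail(c, n, 0.8)      # power at the alternative p = 0.8

print(f"critical value c = {c}")
print(f"nominal level    = {alpha}, true size = {size:.4f}")
print(f"p-value for {observed} successes = {p_value:.4f}")
print(f"power at p = 0.8 = {power:.4f}")
print("reject H0" if observed >= c else "fail to reject H0")
```

With these settings the rejection region starts at 15 successes, and the true size is about 0.021 rather than 0.05, which is what makes the test conservative in the sense defined above.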


A statistical hypothesis test compares a test statistic (z or t, for example) to a threshold. The test statistic (the formula found in the table below) is chosen on grounds of optimality: for a fixed Type I error rate, use of these statistics minimizes the Type II error rate (equivalent to maximizing power). The following terms describe tests in terms of such optimality:

Most powerful test
For a given size or significance level, the test with the greatest power (probability of rejection) at a given value of the parameter(s) being tested that lies in the alternative hypothesis.
Uniformly most powerful test (UMP)
A test that, for a given size or significance level, has the greatest power at every value of the parameter(s) being tested that lies in the alternative hypothesis.
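
The idea of uniformly greater power can be illustrated numerically. The sketch below assumes a normal sample with known variance, H0: μ = 0 against H1: μ > 0, and a z statistic whose mean under the alternative is a standardized shift delta (all of these are assumptions made for the example). It compares the one-sided z-test, which is known to be uniformly most powerful in this setting, with a two-sided z-test of the same size. Beating a single competitor does not prove optimality; the comparison only shows what "greater power at every alternative" looks like. The function names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power_one_sided(delta, alpha=0.05):
    """Power of the test that rejects when z > z_(1 - alpha).

    delta is the mean of the z statistic under the alternative.
    """
    crit = Z.inv_cdf(1 - alpha)
    return 1 - Z.cdf(crit - delta)

def power_two_sided(delta, alpha=0.05):
    """Power of the test that rejects when |z| > z_(1 - alpha/2)."""
    crit = Z.inv_cdf(1 - alpha / 2)
    return (1 - Z.cdf(crit - delta)) + Z.cdf(-crit - delta)

# At delta = 0 (the null hypothesis) both tests have size alpha.
# For every delta > 0 the one-sided test has the higher power.
for delta in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"delta = {delta:.1f}  "
          f"one-sided = {power_one_sided(delta):.4f}  "
          f"two-sided = {power_two_sided(delta):.4f}")
```

Both tests reject with probability α when delta is 0, so the comparison is between tests of the same size, as the definitions require.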
