Statistical Hypothesis Testing and Statistical Decisions

The primary objective of statistics is to draw conclusions about populations from data obtained from samples. Since it is rarely practical to examine an entire population, we resort to sampling and base our decisions on the information the sample provides. These decisions are called statistical decisions because they are not absolute: they rest on probability and on the analysis of sample data. In this way, we can attribute certain features or parameter values to the population, although some degree of uncertainty always remains.
A characteristic example of the usefulness of statistical decisions is the study of a river's water level during flood periods. If engineers need to know whether the average rise in the water level exceeds a critical value in order to design a bridge safely, statistical methods must be applied to answer the question. Confidence interval estimation is one approach, but a more direct and systematic procedure is hypothesis testing, which allows the researcher to decide whether an initial assumption about the population should be retained or rejected.

Statistical Hypotheses

Statistical hypotheses are the fundamental assumptions made prior to data analysis. They are usually related to population parameters such as the mean or variance, but they can also concern the form of the distribution, the comparison of sample distributions with theoretical ones, or even relationships between variables such as correlation or independence. The hypothesis testing procedure is always based on the formulation of two opposing scenarios.
The first is the null hypothesis, commonly denoted as H0. It is the hypothesis assumed to be true initially and serves as the point of reference for the analysis. In its simplest form, it is expressed as H0: θ = θ0, where θ is the parameter of interest and θ0 is the specific hypothesized value. The second is the alternative hypothesis, denoted as H1, which expresses the opposite scenario. If the sample data are incompatible with the null hypothesis, we reject H0 in favor of H1. The alternative hypothesis can take different forms, depending on the research question. It may state that the parameter is different from θ0 (H1: θ ≠ θ0), leading to a two-tailed test, or that it is smaller or greater than θ0 (H1: θ < θ0 or H1: θ > θ0), leading to one-tailed tests.
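For reference, the null hypothesis and the three possible forms of the alternative can be written compactly as follows. The notation is the same θ and θ0 used in the text; only the typesetting is new.

```latex
H_0:\ \theta = \theta_0
\qquad \text{versus} \qquad
\begin{cases}
H_1:\ \theta \neq \theta_0 & \text{(two-tailed test)} \\
H_1:\ \theta < \theta_0 & \text{(lower one-tailed test)} \\
H_1:\ \theta > \theta_0 & \text{(upper one-tailed test)}
\end{cases}
```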

One-Tailed and Two-Tailed Tests

The choice between a one-tailed and a two-tailed test depends on the research objective. When the interest lies exclusively in values that fall below a certain threshold, a lower one-tailed test is applied. Similarly, when we are only concerned about values above a threshold, an upper one-tailed test is used. If, however, we are interested in any kind of deviation, either lower or higher, a two-tailed test is selected. For example, if we want to test whether the mean weight of a product is equal to the value stated on its packaging, without caring whether the deviation is upward or downward, a two-tailed test is appropriate. If instead we are only concerned that the weight may be less than the stated value, a lower one-tailed test is appropriate.

Significance Level

One of the most important elements of the methodology is the significance level, denoted by α. It is defined as the probability of rejecting the null hypothesis when it is in fact true, which corresponds to a type I error. In practice, the most commonly used significance levels are 0.05 and 0.01. The value of α determines the size of the rejection region and, consequently, the strictness of the test. The smaller the α, the smaller the probability of incorrectly rejecting H0, but, for a fixed sample size, the greater the probability of failing to reject a false null hypothesis, which corresponds to a type II error.
The null hypothesis states that θ0 is the true value of the parameter. Values of the sample statistic close to θ0 are consistent with H0, whereas values that deviate significantly from it provide grounds for its rejection. Thus, all possible values of the test statistic are divided into two regions: the acceptance region of H0 and the rejection region, denoted as R. The significance level, together with the sampling distribution of the test statistic, determines the boundary between these two regions.
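The trade-off between the two types of error can be illustrated numerically. The sketch below assumes an upper one-tailed test on a mean with a known population standard deviation, so that the test statistic is standard normal under H0. The function name `type_ii_error` and the numbers passed to it (a true shift of 0.5, a standard deviation of 2.0, a sample of 30) are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha: float, effect: float, sigma: float, n: int) -> float:
    """Probability of failing to reject H0 in an upper one-tailed z-test.

    Assumes the true mean exceeds the hypothesized mean by `effect`
    and that the population standard deviation `sigma` is known.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Critical value: H0 is rejected when the statistic exceeds it.
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # Under the alternative, the statistic is centred at this shift.
    shift = effect * sqrt(n) / sigma
    # Type II error: the statistic stays below the critical value.
    return std_normal.cdf(z_crit - shift)


# Hypothetical scenario used only to compare two significance levels.
beta_at_05 = type_ii_error(alpha=0.05, effect=0.5, sigma=2.0, n=30)
beta_at_01 = type_ii_error(alpha=0.01, effect=0.5, sigma=2.0, n=30)
print(f"alpha = 0.05 -> beta = {beta_at_05:.3f}")
print(f"alpha = 0.01 -> beta = {beta_at_01:.3f}")
```

With everything else held fixed, the stricter level α = 0.01 yields the larger type II error probability, which is the behavior described above.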

Distributions and Critical Regions

The hypothesis testing procedure requires the selection of the appropriate distribution for the statistic under study. For large samples, thanks to the Central Limit Theorem, the normal distribution can be used for tests on the mean. For smaller samples drawn from a normal population with unknown variance, Student's t distribution must be employed instead. In other cases, such as variance testing, the chi-square distribution (for a single variance) or the F distribution (for the comparison of two variances) is used.
Based on the significance level, critical values are determined. These values establish the boundary between the acceptance region and the rejection region. If the test statistic calculated from the sample lies within the rejection region, the null hypothesis is rejected. If it does not, the null hypothesis is not rejected. The decision is always probabilistic and never provides absolute certainty.
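As a minimal sketch of how critical values follow from α, the code below handles only the case in which the test statistic is standard normal (for example, a large-sample test on a mean). The function names `critical_values` and `rejects_h0` and the tail labels "two", "upper", and "lower" are hypothetical choices for this illustration, not part of any standard interface.

```python
from statistics import NormalDist


def critical_values(alpha: float, tail: str) -> tuple[float, float]:
    """Return (lower, upper) bounds of the acceptance region for a z statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # alpha is split equally between the two tails.
        upper = std_normal.inv_cdf(1.0 - alpha / 2.0)
        return (-upper, upper)
    if tail == "upper":
        # The whole of alpha lies in the upper tail.
        return (float("-inf"), std_normal.inv_cdf(1.0 - alpha))
    if tail == "lower":
        # The whole of alpha lies in the lower tail.
        return (std_normal.inv_cdf(alpha), float("inf"))
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


def rejects_h0(z: float, alpha: float, tail: str) -> bool:
    """True when z falls in the rejection region R."""
    lower, upper = critical_values(alpha, tail)
    return z < lower or z > upper


# The same statistic can lead to different decisions depending on the tail.
z = 1.8  # hypothetical value of the test statistic
print(critical_values(0.05, "two"), rejects_h0(z, 0.05, "two"))
print(critical_values(0.05, "upper"), rejects_h0(z, 0.05, "upper"))
```

For α = 0.05 the two-tailed critical values are approximately ±1.96 and the upper one-tailed critical value is approximately 1.645, so a statistic of 1.8 falls in the rejection region only in the one-tailed case.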

Rejection or Acceptance of the Null Hypothesis

Rejection of the null hypothesis occurs when the value of the test statistic falls within the rejection region. Acceptance occurs when the value lies within the acceptance region. It is important to emphasize that accepting H0 does not mean that we have proven its truth but rather that there is not enough evidence in the sample to reject it at the chosen significance level. Similarly, rejecting H0 does not mean with absolute certainty that it is false, but that the data are inconsistent with it to a degree that would be unlikely if it were true.

Step-by-Step Procedure

Applying a hypothesis test requires a systematic sequence of steps. Initially, the parameter of interest and the corresponding sample statistic, such as the mean or variance, are identified. Next, the null and alternative hypotheses are formulated. The significance level is set, and it is determined whether the test is one-tailed or two-tailed. The appropriate sampling distribution is then selected, the test statistic is calculated from the sample, and, where necessary, the degrees of freedom are identified. The critical values are subsequently defined, and the acceptance and rejection regions are established. Finally, by comparing the test statistic with the critical values, the final conclusion is drawn.
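The steps above can be sketched for the packaging example mentioned earlier, where the concern is that the mean weight may be less than the stated value. This is an illustration under stated assumptions, not a general-purpose implementation: the weights are invented for the example, the stated value of 500 g is hypothetical, and the population standard deviation is assumed to be known (2 g), which is what justifies using the normal distribution with such a small sample. With an unknown variance, Student's t distribution would be needed instead.

```python
from math import sqrt
from statistics import NormalDist, mean


def lower_tailed_z_test(sample, mu0, sigma, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu < mu0 with known sigma."""
    n = len(sample)
    # Sample statistic of interest: the sample mean.
    sample_mean = mean(sample)
    # Test statistic: standardized distance of the sample mean from mu0.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Critical value: the rejection region is everything below it.
    z_crit = NormalDist().inv_cdf(alpha)
    # Decision: compare the test statistic with the critical value.
    return {
        "sample_mean": sample_mean,
        "z": z,
        "critical_value": z_crit,
        "reject_h0": z < z_crit,
    }


# Hypothetical weights in grams, invented for this illustration.
weights = [497.2, 498.5, 499.1, 496.8, 500.3, 498.0, 497.5, 499.4, 498.9, 497.7]
result = lower_tailed_z_test(weights, mu0=500.0, sigma=2.0, alpha=0.05)
print(result)
```

For these invented numbers the test statistic is about -2.62, which lies below the critical value of about -1.645, so H0 is rejected at the 0.05 level.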

Conclusion

Statistical hypothesis tests are a fundamental tool in statistical analysis and are applied across a wide range of scientific disciplines, from engineering and medicine to social sciences and economics. They allow researchers to evaluate hypotheses in a systematic and rigorous manner, keeping the risk of erroneous decisions under explicit control through the significance level. Although they never offer absolute certainty, they establish a scientifically sound framework that allows decisions to be made based on probability and the analysis of sample data. They form the link between theoretical statistics and practical application to real-world problems, where the ultimate goal is always to reach reliable and well-documented conclusions.