Historical Background
The Wilcoxon signed-rank test is named after Frank Wilcoxon (1892–1965), who introduced it in 1945 together with the rank-sum test for two independent samples. Sidney Siegel (1956) later helped popularize the test through his book on nonparametric statistical methods, which became a key reference for data analysis in the social and behavioral sciences.
Theoretical Introduction
The Wilcoxon test is a nonparametric test used to compare two related samples, matched samples, or repeated measurements on the same sample. Its purpose is to assess whether the two conditions differ systematically, which it does by testing whether the median of the paired differences is zero. It is not a test of means. The term “nonparametric” does not imply a lack of knowledge about the population, but rather that we do not assume that the data follow a normal distribution. For this reason, the Wilcoxon test is often used as an alternative to the paired t-test when the assumption of normality is violated.
Uses of the Test
The Wilcoxon test is applied in a wide range of research fields when a researcher wants to determine whether two measurements on the same sample differ, such as scores before and after a therapeutic intervention. It can also be used to examine whether two dependent samples come from populations with the same distribution. It is particularly valuable in psychological, medical, and social science research, where data often do not follow a normal distribution.
Categories
The Wilcoxon test comes in two main forms. The first is the one-sample Wilcoxon signed-rank test, which compares a sample’s values to a hypothesized median. The second is the Wilcoxon matched-pairs signed-rank test, which computes the difference within each pair, ranks the absolute differences, and uses the signs attached to those ranks to test whether the median difference is zero.
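The ranking step shared by both forms can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `signed_rank_sums` and the sample data are hypothetical, zero differences are dropped (one common convention), and tied absolute differences receive the average of the ranks they span.

```python
def signed_rank_sums(before, after):
    """Return (w_plus, w_minus, n) for paired observations.

    Assumption: pairs with a zero difference are discarded before ranking.
    """
    # Signed differences, excluding pairs that did not change.
    diffs = [a - b for b, a in zip(before, after) if a != b]

    # Positions of the differences, ordered by absolute size.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))

    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j to cover the run of tied absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        # Average of the 1-based ranks i+1 .. j+1.
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Hypothetical scores for five participants (not data from any study).
before = [7, 5, 6, 8, 4]
after = [5, 5, 7, 4, 1]
print(signed_rank_sums(before, after))  # (1.0, 9.0, 4)
```

For the one-sample form, the same function can be called with `before` set to a list that repeats the hypothesized median once per observation.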
Assumptions of the Test
For the results of the Wilcoxon test to be valid, certain assumptions must be met. The dependent variable should be measured on at least an ordinal scale, or on a continuous (interval or ratio) scale. The two groups being compared must be related, meaning that the same individuals have been measured twice or that the samples are paired. Finally, the distribution of the differences between the two measurements should be approximately symmetric. If this assumption is not met, the sign test, which uses only the direction of each difference, can be used as an alternative.
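The sign-test fallback mentioned above can be sketched as an exact binomial calculation. This is an illustrative sketch under stated assumptions: the helper name `sign_test_p` is hypothetical, pairs with no change are excluded, and the two-sided p-value is taken as twice the smaller tail. The counts in the usage line match the acupuncture example in this article (four higher scores, eleven lower).

```python
from math import comb


def sign_test_p(n_positive, n_negative):
    """Two-sided exact sign-test p-value.

    Under the null hypothesis, each nonzero difference is positive
    with probability 0.5, so the count follows Binomial(n, 0.5).
    """
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # Probability of observing k or fewer of the rarer sign.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


p = sign_test_p(n_positive=4, n_negative=11)
print(f"sign test p = {p:.3f}")  # sign test p = 0.118
```

Because the sign test ignores the size of each difference, it usually has less power than the signed-rank test when the symmetry assumption does hold.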
Example of Application
A typical application of the Wilcoxon test is a study examining whether acupuncture therapy reduces lower back pain. The researcher recruits 25 participants and asks them to rate their pain on a scale of 1 to 10 before and after four weeks of treatment. In this example, eleven participants report lower pain scores after the treatment, four report higher scores, and ten show no change. The Wilcoxon test produces Z = -1.807 and p = 0.071. Because p exceeds the conventional 0.05 threshold, the reduction in pain is not statistically significant.
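The relationship between the reported Z and p can be checked with the standard normal approximation. The sketch below is a simplified illustration: the function names are hypothetical, and `z_from_w` applies no tie or continuity correction, so it may not reproduce a statistics package's Z exactly when the data contain ties.

```python
from math import erfc, sqrt


def z_from_w(w, n):
    """Approximate Z for a signed-rank sum w over n nonzero differences.

    Assumption: no correction for ties or continuity is applied.
    """
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / sd


def two_sided_p(z):
    """Two-sided p-value for a standard normal statistic."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return erfc(abs(z) / sqrt(2))


# The Z value reported in the example above.
print(round(two_sided_p(-1.807), 3))  # 0.071
```

The printed value agrees with the reported p = 0.071, which confirms that the example uses a two-sided test.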
Procedure in SPSS
In SPSS, the test is run through the menu path Analyze → Nonparametric Tests → Legacy Dialogs → 2 Related Samples. In the dialog box that appears, the two variables representing the measurements are entered as a pair and the Wilcoxon option is selected. SPSS produces three main tables: the Descriptive Statistics table, the Ranks table showing the positive and negative differences, and the Test Statistics table displaying the Z statistic and the p-value. Together, these tables allow the researcher to judge whether a statistically significant difference exists between the two sets of measurements.
Reporting Results
The presentation of results in a research context should combine statistical information with the broader research framework. In the example mentioned, the results could be reported as follows: “A Wilcoxon signed-rank test showed that a four-week course of acupuncture therapy did not lead to a statistically significant reduction in pain (Z = -1.807, p = 0.071). The median pain score was 5.0 both before and after treatment.” In this way, the interpretation of the result is complete, integrating both the statistical findings and their substantive research significance.