The Mann–Whitney U test is a widely used non-parametric test for comparing two independent groups when the data do not meet the assumptions of parametric tests such as the independent t-test. It assesses whether one group tends to have higher or lower values than the other by comparing the ranks of all observations.
It is mathematically equivalent to the Wilcoxon Rank Sum Test. In many textbooks, both terms are used interchangeably. The Mann–Whitney U test is especially useful when the sample size is small, data is skewed, or the scale of measurement is ordinal.
When to Use the Mann Whitney U Test?
- When comparing two independent samples.
- When data is non-normal or ordinal.
- When sample size is small (n < 30).
- When assumptions of the independent t-test are violated.
- When values represent ranks or ordinal scores rather than measurements on an interval scale.
Hypotheses
- H₀: The two samples come from identical populations.
- H₁: The two samples come from populations with different distributions.
Principle of the Test
Instead of comparing means, the Mann–Whitney test compares ranked values. All observations are ranked together, and the sum of ranks for each group is used to compute the U statistic.
Steps in Performing the Mann Whitney U Test
- Combine observations from both groups.
- Rank all values from lowest to highest.
- If ties occur, assign average ranks.
- Calculate the sum of ranks for each group (R₁ and R₂).
- Compute the U statistic using:
Formula
U₁ = n₁n₂ + (n₁(n₁ + 1) / 2) − R₁
U₂ = n₁n₂ + (n₂(n₂ + 1) / 2) − R₂
The test statistic is:
U = smaller of U₁ and U₂
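The ranking steps and the formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `average_ranks` and `mann_whitney_u` are invented for this sketch, and the two score lists are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def average_ranks(values):
    """Rank values from lowest to highest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U, U1, U2) using the rank-sum formulas given in the text."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))  # rank the pooled data
    r1 = sum(ranks[:n1])   # R1: rank sum of the first group
    r2 = sum(ranks[n1:])   # R2: rank sum of the second group
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return min(u1, u2), u1, u2


# Hypothetical scores, for illustration only.
group_a = [3, 5, 6, 8]
group_b = [1, 2, 4, 7]
u, u1, u2 = mann_whitney_u(group_a, group_b)
print(u, u1, u2)  # R1 = 22, R2 = 14, so U1 = 4.0, U2 = 12.0, U = 4.0
```

A useful check on any hand calculation is that U₁ + U₂ always equals n₁n₂ (here 4 + 12 = 16).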
Large Sample Approximation (Z-Test)
Z = (U − μU) / σU
Where:
- μU = n₁n₂ / 2
- σU = √[n₁n₂(n₁ + n₂ + 1) / 12]
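As a rough sketch of the approximation above, the following Python function computes Z from U and the two sample sizes, together with a two-sided p-value from the standard normal distribution. The function name `mann_whitney_z` and the input values are hypothetical; the sketch applies no tie or continuity correction, since the formula above does not include one.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation to U, without tie or continuity correction."""
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu_u) / sigma_u
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical inputs: n1 = 25, n2 = 49, U = 437.5
# mu_U = 612.5 and sigma_U = 87.5, so Z = -2.0 and p is about 0.0455
z, p = mann_whitney_z(437.5, 25, 49)
print(z, round(p, 4))
```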
Assumptions of Mann Whitney U Test
- Samples must be independent.
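- Data must be at least ordinal (ordinal, interval, or ratio); normality is not required.
- If the result is to be interpreted as a difference in medians, the two distributions should have roughly similar shapes.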
- Observations are mutually independent.
Interpreting the Result
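For small samples, if the calculated U value is less than or equal to the critical U value (from tables), the null hypothesis is rejected, indicating a significant difference between the groups. When the large-sample approximation is used, the null hypothesis is rejected if |Z| exceeds the critical value of the standard normal distribution (1.96 for a two-sided test at α = 0.05).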
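When no table is at hand, the decision for small samples can be reproduced by enumerating every possible assignment of ranks to the first group, since all assignments are equally likely under H₀. The sketch below assumes there are no ties and uses a hypothetical function name, `exact_two_sided_p`. Rejecting H₀ when this p-value is at most α corresponds to the table lookup described above for a two-sided test.

```python
from itertools import combinations
from math import comb


def exact_two_sided_p(u_observed, n1, n2):
    """P(min(U1, U2) <= u_observed) under H0, assuming no tied values."""
    n = n1 + n2
    hits = 0
    # Each choice of n1 ranks out of 1..n is equally likely under H0.
    for ranks_1 in combinations(range(1, n + 1), n1):
        r1 = sum(ranks_1)
        u1 = n1 * n2 + n1 * (n1 + 1) // 2 - r1
        u2 = n1 * n2 - u1  # because U1 + U2 = n1 * n2
        if min(u1, u2) <= u_observed:
            hits += 1
    return hits / comb(n, n1)


# With n1 = n2 = 4 there are 70 possible rank assignments.
p_u0 = exact_two_sided_p(0, 4, 4)  # 2/70, about 0.029
p_u4 = exact_two_sided_p(4, 4, 4)  # 24/70, about 0.343
print(round(p_u0, 3), round(p_u4, 3))
```

Enumeration grows quickly with sample size, so this approach is only practical for small groups; beyond that, the normal approximation is the usual choice.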
Example (Illustration)
A researcher compares pain reduction scores in two independent groups treated with different analgesics. After ranking all scores together and calculating U, an observed U at or below the critical value suggests a statistically significant difference between the two treatments.
Advantages
- Does not require normal distribution.
- Simple and robust for many types of data.
- Works well for small sample sizes.
- Suitable for ordinal and ranked data.
Limitations
- Slightly less powerful than the t-test when the data are normally distributed.
- Cannot be used for paired samples (use Wilcoxon Signed-Rank Test).
- Interpreting the result as a difference in medians assumes similar distribution shapes across groups.
Applications
- Clinical trials comparing two treatment groups.
- Comparing patient satisfaction scores.
- Evaluating symptom improvement across independent groups.
- Biomedical research with ordinal or non-normal data.