22. MANN WHITNEY U TEST

The Mann Whitney U Test is a widely used non-parametric test that compares two independent groups when the data does not meet the assumptions of parametric tests like the independent t-test. It determines whether one group tends to have higher or lower values than the other by comparing the ranks of all observations.

It is mathematically equivalent to the Wilcoxon Rank Sum Test. In many textbooks, both terms are used interchangeably. The Mann–Whitney U test is especially useful when the sample size is small, data is skewed, or the scale of measurement is ordinal.

When to Use the Mann Whitney U Test?

When comparing two independent samples.
When data is non-normal or ordinal.
When sample size is small (n < 30).
When assumptions of the independent t-test are violated.
When values are ranks or ordinal scores (for example, rating scales).

Hypotheses

H₀: The two samples come from identical populations.
H₁: The two samples come from populations with different distributions.

Principle of the Test

Instead of comparing means, the Mann–Whitney test compares ranked values. All observations are ranked together, and the sum of ranks for each group is used to compute the U statistic.

Steps in Performing the Mann Whitney U Test

Combine observations from both groups.
Rank all values from lowest to highest.
If ties occur, assign average ranks.
Calculate the sum of ranks for each group (R₁ and R₂).
Compute the U statistic using:

Formula

U₁ = n₁n₂ + (n₁(n₁ + 1) / 2) − R₁

U₂ = n₁n₂ + (n₂(n₂ + 1) / 2) − R₂

The test statistic is:

U = smaller of U₁ and U₂
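The steps and formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (average_ranks, mann_whitney_u) and the two sample lists are made up for this example, and tied values receive the average of the ranks they span.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values get the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Compute R1, R2, U1, U2 and U = min(U1, U2) with the formulas above."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))  # rank the combined data
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return {"R1": r1, "R2": r2, "U1": u1, "U2": u2, "U": min(u1, u2)}


# Made-up scores, for illustration only (the value 5 is tied across groups).
group_a = [3, 5, 6, 8, 9]
group_b = [1, 2, 4, 5, 7]
result = mann_whitney_u(group_a, group_b)
print(result)
```

For these illustrative data, R₁ = 34.5 and R₂ = 20.5, so U₁ = 5.5, U₂ = 19.5 and U = 5.5. A quick check on any calculation is that U₁ + U₂ always equals n₁n₂ (here 25).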

Large Sample Approximation (Z-Test)

Z = (U − μ_U) / σ_U

Where:

μ_U = n₁n₂ / 2
σ_U = √[n₁n₂(n₁ + n₂ + 1) / 12]
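As a rough sketch, the approximation can be computed as follows. It applies the formulas exactly as written above, so it includes neither a correction for ties nor a continuity correction. The function names and the input values (U = 120 with 20 observations per group) are hypothetical and chosen only to show the arithmetic.

```python
import math


def u_to_z(u, n1, n2):
    """Convert a U statistic to Z using mu_U and sigma_U as defined above."""
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mu_u) / sigma_u


def two_sided_p(z):
    """Two-sided tail probability of the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical inputs, for illustration only.
z = u_to_z(120, 20, 20)
p = two_sided_p(z)
print(round(z, 3), round(p, 4))
```

With these inputs, μ_U = 200 and σ_U is about 36.97, giving Z of about −2.16. When U is the smaller of U₁ and U₂, Z is never positive, so only its magnitude is compared with the critical value.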

Assumptions of Mann Whitney U Test

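The two samples must be independent of each other.
Data must be at least ordinal (ordinal, interval, or ratio); normality is not required.
Distribution shapes should be roughly similar if the result is to be read as a difference in medians.
Observations within each sample are mutually independent.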

Interpreting the Result

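If the calculated U value is less than or equal to the critical U value (from tables), the null hypothesis is rejected, indicating a significant difference between the groups. When the large sample approximation is used, the null hypothesis is rejected if |Z| is greater than or equal to the standard normal critical value (for example, 1.96 for a two-tailed test at the 5% level).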
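Critical U values in tables come from the exact distribution of U under the null hypothesis, in which every assignment of ranks to the two groups is equally likely. For small samples this distribution can be enumerated directly. The sketch below assumes there are no tied values, and the function name exact_two_sided_p is made up for this illustration.

```python
from itertools import combinations


def exact_two_sided_p(u_observed, n1, n2):
    """Exact two-sided p-value for U = min(U1, U2), assuming no ties.

    Enumerates every way of assigning n1 of the ranks 1..(n1 + n2) to group 1.
    """
    total = n1 + n2
    rank_sum_all = total * (total + 1) // 2
    as_extreme = 0
    arrangements = 0
    for ranks_group1 in combinations(range(1, total + 1), n1):
        r1 = sum(ranks_group1)
        r2 = rank_sum_all - r1
        u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
        u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
        if min(u1, u2) <= u_observed:
            as_extreme += 1
        arrangements += 1
    return as_extreme / arrangements


# With 5 observations per group there are 252 possible arrangements.
print(exact_two_sided_p(2, 5, 5))  # 8 / 252, below 0.05
print(exact_two_sided_p(3, 5, 5))  # 14 / 252, above 0.05
```

This is why a table lists 2 as the two-tailed critical value at the 5% level for n₁ = n₂ = 5: U = 2 is the largest value whose exact p-value stays at or below 0.05. The number of arrangements grows quickly, so this enumeration is practical only for small groups.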

Example (Illustration)

A researcher compares pain reduction scores in two independent groups treated with different analgesics. After ranking all scores together and calculating U, an observed U at or below the critical value indicates a statistically significant difference between the two treatments.

Advantages

Does not require normal distribution.
Simple and robust for many types of data.
Works well for small sample sizes.
Suitable for ordinal and ranked data.

Limitations

Slightly less powerful than the t-test when the data are actually normally distributed.
Cannot be used for paired samples (use Wilcoxon Signed-Rank Test).
Assumes similar distribution shapes across groups.

Applications

Clinical trials comparing two treatment groups.
Comparing patient satisfaction scores.
Evaluating symptom improvement across independent groups.
Biomedical research with ordinal or non-normal data.
