Nonparametric tests of differences in medians: comparison of the
Wilcoxon-Mann-Whitney and robust rank-order tests
Nick Feltovich
The nonparametric Wilcoxon-Mann-Whitney test is commonly used by experimental
economists for detecting differences in central tendency between two samples. This
test is only theoretically appropriate under certain assumptions concerning the
population distributions from which the samples are drawn, and is often used in
cases where it is unclear whether these assumptions hold, and even when they clearly
do not hold. Fligner and Policello's (1981) robust rank-order test is a modification
of the Wilcoxon-Mann-Whitney test, designed to be appropriate in more situations than
Wilcoxon-Mann-Whitney. This paper uses simulations to compare the performance of the
two tests under a variety of distributional assumptions. The results are mixed. The
robust rank-order test tends to yield too many false positive results for medium-sized
samples, but this liberalness is relatively invariant across distributional assumptions,
and seems to be due to a deficiency of the normal approximation to its test statistic's
distribution, rather than the test itself. The performance of the Wilcoxon-Mann-Whitney
test varies hugely, depending on the distributional assumptions; in some cases it is
conservative, in others extremely liberal. The tests have roughly similar power. Overall,
the robust rank-order test performs better than Wilcoxon-Mann-Whitney, though when
critical values for the robust rank-order test are not available, so that the normal
approximation must be used, their relative performance depends on the underlying
distributions, the sample sizes, and the level of significance used.
Feltovich, Nick (2003), "Nonparametric tests of differences in medians: comparison of the
Wilcoxon-Mann-Whitney and robust rank-order tests," Experimental Economics 6 (3), pp.
273-297. DOI: 10.1023/A:1026273319211.
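
As a rough illustration of the robust rank-order statistic discussed in the abstract, the following Python sketch computes it in the usual textbook "placements" form. It is not taken from the paper: the function name robust_rank_order is hypothetical, counting ties as one half is an assumption, and the p-value uses the normal approximation, which the abstract notes can be too liberal for medium-sized samples.

```python
import math


def robust_rank_order(x, y):
    """Sketch of the Fligner-Policello robust rank-order test.

    Returns (statistic, two-sided p-value from the normal approximation).
    The statistic is positive when y tends to exceed x.
    Assumption: a tie between an x and a y counts as one half.
    """
    # Placement of each x: how many y values lie below it.
    px = [sum((yj < xi) + 0.5 * (yj == xi) for yj in y) for xi in x]
    # Placement of each y: how many x values lie below it.
    py = [sum((xi < yj) + 0.5 * (xi == yj) for xi in x) for yj in y]

    mean_px = sum(px) / len(px)
    mean_py = sum(py) / len(py)

    # Sums of squared deviations of the placements about their means.
    vx = sum((p - mean_px) ** 2 for p in px)
    vy = sum((q - mean_py) ** 2 for q in py)

    numerator = sum(py) - sum(px)
    denominator = 2.0 * math.sqrt(vx + vy + mean_px * mean_py)

    if denominator == 0.0:
        # The samples do not overlap at all, so the statistic is infinite.
        return math.copysign(math.inf, numerator), 0.0

    stat = numerator / denominator
    # Two-sided tail probability under the standard normal distribution.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up illustrative data, not from the paper.
    a = [1.2, 3.4, 5.0, 2.2]
    b = [2.0, 6.1, 7.3, 4.4, 5.5]
    print(robust_rank_order(a, b))
```

For small samples, exact critical values for the statistic would be preferable to the normal approximation used here, in line with the abstract's conclusion.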