Consider the problem of sorting n integer keys in the range that are distributed equally over a p-processor distributed memory machine. An efficient algorithm is radix sort that decomposes each key into groups of r-bit blocks, for a suitably chosen r, and sorts the keys by sorting on each of the r-bit blocks beginning with the block containing the least significant bit positions [30]. Let . Assume (w.l.o.g.) that the number of processors is a power of two, say , and hence is an integer . Our algorithm demonstrates efficient uses of the transpose communication primitive, as well as the h-relation communication scheme.