Given a shared input array A on a p-processor partition,
distributed with one element per processor, the CONCAT
Communication Library Primitive returns an array
that rearranges the data so that each processor holds
a local copy of the array A. In the BDM model, this
CONCAT communication algorithm has the following complexity: