A data parallel algorithm can be viewed as a sequence of parallel
synchronous steps in which each parallel step consists of an arbitrary
number of concurrent primitive data operations. The complexity of a
data parallel algorithm can be expressed in terms of two
architecture-independent parameters, the parallel time, i.e.,
the number of parallel steps, and the work, i.e., the total
number of operations used by the algorithm [20]. However we
concentrate here on evaluating the scalability of our algorithms on
two distinct architectures, namely the Connection Machines CM-2 and
CM-5. In this case we express the complexity of a parallel algorithm
with respect to two measures: the computation complexity \
which is the time spent by a processor on local computations, and the
communication complexity
which is the time spent on
interprocessor communication of the overall algorithm. We use the
standard sequential model for estimating
. However an estimate
of the term
depends in general on the data layout and the
architecture of the machine under consideration. Our goal is to split
the overall computation almost equally among the processors in such a
way that
is minimum. We discuss these issues next.