The above analysis indicates that, for fixed p and k, the communication complexity is independent of the problem size. Hence, as n increases, we expect the local computation to dominate.
The histogramming algorithm has been implemented on a CM-5 with p=16, 32, 64, and 128 processors, and the algorithm's performance is plotted in Figure 3 for 256 grey levels for images ranging from to pixels in size, and expanded in Figures 12 - 14 for to images on 16, 32, and 64 processors. Corresponding performance graphs are given for the IBM SP-1 in Figure 18 and for the IBM SP-2 in Figure 20. Plots indicate quadratic performance as a function of n for fixed p, and scalability in terms of p. Hence, our theoretical analysis is supported.
Figure 3: Histogramming and Connected Components Scalability on the CM-5
Please refer to the plot in Figure 3 for an illustration of the scalability of the histogramming algorithm's performance. Since computation dominates for large n, the algorithm runs as O. We have plotted vs. time for four configurations of the CM-5. The resulting plot shows the linear relationship between time and image size for each fixed p. Also, when the number of processors double, the running time approximately halves.