Up: Practical Parallel Algorithms for Personalized Communication Previous: Counting Sort Algorithm

References

1
B. Abali, F. Özgüner, and A. Bataineh. Balanced Parallel Sort on Hypercube Multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 4(5):572--581, 1993.

2
A. Alexandrov, M. Ionescu, K. Schauser, and C. Scheiman. LogGP: Incorporating Long Messages into the LogP model - One step closer towards a realistic model for parallel computation. In 7th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 95--105, Santa Barbara, CA, July 1995.

3
R.H. Arpaci, D.E. Culler, A. Krishnamurthy, S.G. Steinberg, and K. Yelick. Empirical Evaluation of the CRAY-T3D: A Compiler Perspective. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pages 320--331, Santa Margherita Ligure, Italy, June 1995. ACM Press.

4
D. Bader. Randomized and Deterministic Routing Algorithms for h-Relations. ENEE 648X Class Report, April 1, 1994.

5
D. A. Bader and J. JáJá. Parallel Algorithms for Image Histogramming and Connected Components with an Experimental Study. Technical Report CS-TR-3384 and UMIACS-TR-94-133, UMIACS and Electrical Engineering, University of Maryland, College Park, MD, December 1994.

6
D. A. Bader and J. JáJá. Parallel Algorithms for Image Histogramming and Connected Components with an Experimental Study. In Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 123--133, Santa Barbara, CA, July 1995. To appear in Journal of Parallel and Distributed Computing.

7
D. A. Bader and J. JáJá. Practical Parallel Algorithms for Dynamic Data Redistribution, Median Finding, and Selection. Technical Report CS-TR-3494 and UMIACS-TR-95-74, UMIACS and Electrical Engineering, University of Maryland, College Park, MD, July 1995. To be presented at the 10th International Parallel Processing Symposium, Honolulu, HI, April 15-19, 1996.

8
D. Bailey, E. Barszcz, J. Barton, D. Browning, R. Carter, L. Dagum, R. Fatoohi, S. Fineberg, P. Frederickson, T. Lasinski, R. Schreiber, H. Simon, V. Venkatakrishnan, and S. Weeratunga. The NAS Parallel Benchmarks. Technical Report RNR-94-007, Numerical Aerodynamic Simulation Facility, NASA Ames Research Center, Moffett Field, CA, March 1994.

9
V. Bala, J. Bruck, R. Cypher, P. Elustondo, A. Ho, C.-T. Ho, S. Kipnis, and M. Snir. CCL: A Portable and Tunable Collective Communication Library for Scalable Parallel Computers. IEEE Transactions on Parallel and Distributed Systems, 6:154--164, 1995.

10
D.P. Bertsekas, C. Özveren, G.D. Stamoulis, P. Tseng, and J.N. Tsitsiklis. Optimal Communication Algorithms for Hypercubes. Journal of Parallel and Distributed Computing, 11:263--275, 1991.

11
G. E. Blelloch, C. E. Leiserson, B. M. Maggs, C. G. Plaxton, S. J. Smith, and M. Zagha. A Comparison of Sorting Algorithms for the Connection Machine CM-2. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 3--16, July 1991.

12
S.H. Bokhari. Complete Exchange on the iPSC-860. ICASE Report No. 91-4, ICASE, NASA Langley Research Center, Hampton, VA, January 1991.

13
S.H. Bokhari. Multiphase Complete Exchange on a Circuit Switched Hypercube. In Proceedings of the 1991 International Conference on Parallel Processing, pages I--525 -- I--529, August 1991. Also appeared as NASA ICASE Report No. 91-5.

14
S.H. Bokhari and H. Berryman. Complete Exchange on a Circuit Switched Mesh. In Proceedings of Scalable High Performance Computing Conference, pages 300--306, Williamsburg, VA, April 1992.

15
W.W. Carlson and J.M. Draper. AC for the T3D. Technical Report SRC-TR-95-141, Supercomputing Research Center, Bowie, MD, February 1995.

16
Cray Research, Inc. SHMEM Technical Note for C, October 1994. Revision 2.3.

17
D.E. Culler, A. Dusseau, S.C. Goldstein, A. Krishnamurthy, S. Lumetta, S. Luna, T. von Eicken, and K. Yelick. Introduction to Split-C. Computer Science Division - EECS, University of California, Berkeley, version 1.0 edition, March 6, 1994.

18
D.E. Culler, R.M. Karp, D.A. Patterson, A. Sahay, K.E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: Towards a Realistic Model of Parallel Computation. In Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993.

19
V.V. Dimakopoulos and N.J. Dimopoulos. Optimal Total Exchange in Linear Arrays and Rings. In Proceedings of the 1994 International Symposium on Parallel Architectures, Algorithms, and Networks, pages 230--237, Kanazawa, Japan, December 1994.

20
A.C. Dusseau. Modeling Parallel Sorts with LogP on the CM-5. Technical Report UCB//CSD-94-829, Computer Science Division, University of California, Berkeley, 1994.

21
N. Folwell, S. Guha, and I. Suzuki. A Practical Algorithm for Integer Sorting on a Mesh-Connected Computer. In Proceedings of the High Performance Computing Symposium, pages 281--291, Montreal, Canada, July 1995. Preliminary Version.

22
A.V. Gerbessiotis and L.G. Valiant. Direct Bulk-Synchronous Parallel Algorithms. Journal of Parallel and Distributed Computing, 22(2):251--267, 1994.

23
S. Heller. Congestion-Free Routing on the CM-5 Data Router. In Proceedings of the First International Workshop on Parallel Computer Routing and Communication, pages 176--184, Seattle, WA, May 1994. Springer-Verlag.

24
S. Hinrichs, C. Kosak, D.R. O'Hallaron, T.M. Strickler, and R. Take. An architecture for optimal all-to-all personalized communication. Technical Report CMU-CS-94-140, School of Computer Science, Carnegie Mellon University, September 1994.

25
T. Horie and K. Hayashi. All-to-All Personalized Communication on a Wrap-around Mesh. In Proceedings of the Second Fujitsu-ANU CAP Workshop, Canberra, Australia, November 1991. 10 pp.

26
J. JáJá and K.W. Ryu. The Block Distributed Memory Model. Technical Report CS-TR-3207, Computer Science Department, University of Maryland, College Park, January 1994.

27
J.F. JáJá and K.W. Ryu. The Block Distributed Memory Model for Shared Memory Multiprocessors. In Proceedings of the 8th International Parallel Processing Symposium, pages 752--756, Cancún, Mexico, April 1994. (Extended Abstract).

28
S.L. Johnsson and C.-T. Ho. Optimal Broadcasting and Personalized Communication in Hypercubes. IEEE Transactions on Computers, 38(9):1249--1268, 1989.

29
M. Kaufmann, J.F. Sibeyn, and T. Suel. Derandomizing Algorithms for Routing and Sorting on Meshes. In Proceedings of the 5th Symposium on Discrete Algorithms, pages 669--679. ACM-SIAM, 1994.

30
D.E. Knuth. The Art of Computer Programming: Sorting and Searching, volume 3. Addison-Wesley Publishing Company, Reading, MA, 1973.

31
D. Krizanc. Integer Sorting on a Mesh-Connected Array of Processors. Information Processing Letters, 47(6):283--289, 1993.

32
Y.-D. Lyuu and E. Schenfeld. Total Exchange on a Reconfigurable Parallel Architecture. In Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, pages 2--10, Dallas, TX, December 1993.

33
Message Passing Interface Forum. MPI: A Message-Passing Interface Standard. Technical report, University of Tennessee, Knoxville, TN, June 1995. Version 1.1.

34
S.R. Öhring and S.K. Das. Efficient Communication in the Folded Petersen Interconnection Networks. In Proceedings of the Sixth International Parallel Architectures and Languages Europe Conference, pages 25--36, Athens, Greece, July 1994. Springer-Verlag.

35
S. Ranka, R.V. Shankar, and K.A. Alsabti. Many-to-many Personalized Communication with Bounded Traffic. In The Fifth Symposium on the Frontiers of Massively Parallel Computation, pages 20--27, McLean, VA, February 1995.

36
S. Rao, T. Suel, T. Tsantilas, and M. Goudreau. Efficient Communication Using Total-Exchange. In Proceedings of the 9th International Parallel Processing Symposium, pages 544--550, Santa Barbara, CA, April 1995.

37
T. Schmiermund and S.R. Seidel. A Communication Model for the Intel iPSC/2. Technical Report CS-TR 9002, Dept. of Computer Science, Michigan Tech. Univ., April 1990.

38
D.S. Scott. Efficient All-to-All Communication Patterns in Hypercube and Mesh Topologies. In Proceedings of the 6th Distributed Memory Computing Conference, pages 398--403, Portland, OR, April 1991.

39
T. Suel. Routing and Sorting on Meshes with Row and Column Buses. Technical Report UTA//CS-TR-94-09, Department of Computer Sciences, University of Texas at Austin, October 1994.

40
R. Take. A Routing Method for All-to-All Burst on Hypercube Networks. In Proceedings of the 35th National Conference of Information Processing Society of Japan, pages 151--152, 1987. In Japanese. Translation by personal communication with R. Take.

41
R. Thakur and A. Choudhary. All-to-All Communication on Meshes with Wormhole Routing. In Proceedings of the 8th International Parallel Processing Symposium, pages 561--565, Cancún, Mexico, April 1994.

42
R. Thakur, A. Choudhary, and G. Fox. Complete Exchange on a Wormhole Routed Mesh. Report SCCS-505, Northeast Parallel Architectures Center, Syracuse University, Syracuse, NY, July 1993.

43
R. Thakur, R. Ponnusamy, A. Choudhary, and G. Fox. Complete Exchange on the CM-5 and Touchstone Delta. Journal of Supercomputing, 8:305--328, 1995. (An earlier version of this paper was presented at Supercomputing '92.)

44
L.G. Valiant. A Bridging Model for Parallel Computation. Communications of the ACM, 33(8):103--111, 1990.

45
J.-C. Wang, T.-H. Lin, and S. Ranka. Distributed Scheduling of Unstructured Collective Communication on the CM-5. Technical Report CRPC-TR94502, Syracuse University, Syracuse, NY, 1994.

46
S.C. Woo, M. Ohara, E. Torrie, J.P. Singh, and A. Gupta. The SPLASH-2 Programs: Characterization and Methodological Considerations. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pages 24--36, June 1995.




David A. Bader
dbader@umiacs.umd.edu