We describe a methodology for developing high-performance programs
running on clusters of SMP nodes. Our methodology is based on a small
kernel (SIMPLE) of collective communication primitives that make
efficient use of the hybrid shared-memory and message-passing environment.
We are addressing a number of basic combinatorial computations that
are commonly needed in various large-scale applications. One such
computation is sorting, a problem that has been studied extensively in
the literature because of its many important applications and because
of its intrinsic theoretical significance. Our research has produced
the fastest known high-level, practical algorithms for many
combinatorial problems, such as sorting, personalized communication,
selection, list ranking, and data redistribution, on general-purpose
parallel machines.
Image processing applications are well suited to high-performance
computing techniques for several reasons: uniform grid layouts,
spatial locality, and highly regular kernels. Images used for analysis
are produced by a variety of applications, for example remote
sensing, image understanding, face recognition, detection of surface
defects in industrial manufacturing, and military target recognition.
Some of the processing is low-level, such as image calibration or
enhancement, while other analysis is intermediate- or high-level, such
as segmenting an image into objects or regions and classifying each.
We have developed a suite of practical high-performance algorithms for
image processing that appear to outperform all other known implementations
on the same platforms.
dbader@umiacs.umd.edu
Clusters of SMPs
Recent Publications
SIMPLE: A Methodology for Programming High Performance Algorithms on Clusters of Symmetric Multiprocessors (SMPs)
Sorting on Clusters of SMPs
Combinatorial Computing
Recent Publications
Designing Practical Efficient Algorithms for Symmetric Multiprocessors
Prefix Computations on Symmetric Multiprocessors
A Randomized Parallel Sorting Algorithm With an Experimental Study
A New Deterministic Parallel Sorting Algorithm With an Experimental Evaluation
Parallel Algorithms for Personalized Communication and Sorting With an Experimental Study (Extended Abstract)
Practical Parallel Algorithms for Personalized Communication and Integer Sorting
Practical Parallel Algorithms for Dynamic Data Redistribution, Median Finding, and Selection
The Block Distributed Memory Model
Image Processing
Recent Publications
Parallel Algorithms for Image Enhancement and Segmentation by Region Growing with an Experimental Study
Parallel Algorithms for Image Histogramming and Connected Components with an Experimental Study
Efficient Image Processing Algorithms on the Scan Line Array Processor
Scalable Parallel Algorithms for Texture Synthesis and Compression
Experimental Data Sets