What is 3D Fast Fourier Transform Library for Blue Gene/L?
Many real-world scientific and engineering applications depend on computing three-dimensional Fast Fourier Transforms (FFTs). Although many FFT libraries are available, 3D Fast Fourier Transform Library for Blue Gene®/L (BGL3DFFT) is specifically designed to take advantage of the IBM® Blue Gene architecture, enabling applications that use three-dimensional FFTs to scale to thousands of Blue Gene/L processors. Most alternative parallel libraries compute three-dimensional FFTs with a technique called slab decomposition. However, the scalability of slab-based methods is limited by the number of data elements along a single dimension of the three-dimensional FFT. For instance, a three-dimensional FFT of size 128³ can scale only up to 128 processors. In BGL3DFFT, the three-dimensional FFT implementation is based on a full distribution of the n² 1D FFTs required for each phase of the row-column method for computing a three-dimensional FFT. This approach enables scalability up to 128² = 16,384 processors for the same 128³ FFT.
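The scaling limits quoted above follow from simple counting, which the short C++ sketch below illustrates. The function names are hypothetical and are not part of BGL3DFFT; the sketch only assumes that a slab decomposition needs at least one full plane per processor, while a full distribution needs at least one 1D FFT per processor in each phase.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the BGL3DFFT API.

// Slab decomposition: each processor owns at least one full 2D plane of the
// n x n x n array, so at most n processors can take part.
std::uint64_t maxProcsSlab(std::uint64_t n) {
    assert(n > 0);
    return n;
}

// Full distribution: each phase of the row-column method consists of n*n
// independent 1D FFTs, so at most n*n processors can each own one of them.
std::uint64_t maxProcsFullDistribution(std::uint64_t n) {
    assert(n > 0);
    return n * n;
}
```

For n = 128, `maxProcsSlab(128)` gives 128 and `maxProcsFullDistribution(128)` gives 16,384, matching the figures in the text.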
BGL3DFFT is a scalable C++ library for computing distributed, three-dimensional Fast Fourier Transforms on the Blue Gene/L platform. The library performs transforms on 3D arrays of complex data in which each array dimension is a power of two. It runs on Blue Gene/L processor partitions in which each dimension of the 3D partition is also a power of two.
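An application could check the power-of-two requirement before setting up a transform. The following is a minimal sketch of such a check; the helper names are assumptions made for illustration and do not come from the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical validation helpers; not part of the BGL3DFFT API.

// A positive integer is a power of two exactly when it has a single bit set.
bool isPowerOfTwo(std::size_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// True when all three extents (of the data array or of the processor
// partition) are powers of two.
bool allPowersOfTwo(std::size_t nx, std::size_t ny, std::size_t nz) {
    return isPowerOfTwo(nx) && isPowerOfTwo(ny) && isPowerOfTwo(nz);
}

// Exponent e such that v == 2^e; the caller must pass a power of two.
unsigned log2Exact(std::size_t v) {
    assert(isPowerOfTwo(v));
    unsigned e = 0;
    while (v > 1) {
        v >>= 1;
        ++e;
    }
    return e;
}
```

With these helpers, a 128 x 128 x 128 array passes the check, while a 128 x 96 x 128 array does not.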
This FFT library is a three-dimensional library specific to the torus architecture. The torus is one of the five major interconnection networks of Blue Gene/L and is used for both point-to-point messaging and collective communications. The 3D FFT algorithm is designed to scale effectively to thousands of processors through its fine-tuned decomposition and its effective use of the three-dimensional torus topology of the Blue Gene/L supercomputer. Performance measurements indicate that the algorithm scales well up to 8192 processors for a 128³ FFT using the MPI (Message Passing Interface) communication layer. BGL3DFFT can be used to enable the scaling of large-scale scientific or engineering codes.
BGL3DFFT includes an object file for the Blue Gene/L platform, a directory with header files, a test driver, and brief documentation. The test driver can serve as an example to show how the library may be used in the user's own applications.
How does it work?
The BGL3DFFT library uses the row-column method to compute 3D FFTs. In this approach, the 3D FFT is computed as successive sets of independent, one-dimensional FFTs, one set along each dimension. All 1D FFTs are performed locally.
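The row-column method can be illustrated on a single node. The sketch below is not the library's implementation: it is serial, involves no data distribution or communication, and uses a naive O(n²) DFT as a stand-in for an optimized serial 1D FFT routine. The function names and the memory layout (z index varying fastest) are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(n^2) 1D DFT of the n elements data[offset + j*stride], in place.
// Stand-in for an optimized serial 1D FFT; hypothetical helper.
void dft1d(std::vector<Complex>& data, std::size_t offset,
           std::size_t stride, std::size_t n) {
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            // Reduce j*k modulo n to keep the angle small and accurate.
            const double angle = -2.0 * pi *
                static_cast<double>((j * k) % n) / static_cast<double>(n);
            sum += data[offset + j * stride] *
                   Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    for (std::size_t k = 0; k < n; ++k) {
        data[offset + k * stride] = out[k];
    }
}

// In-place 3D DFT of an nx x ny x nz array stored with z varying fastest:
// element (x, y, z) lives at index (x*ny + y)*nz + z.
void dft3dRowColumn(std::vector<Complex>& a, std::size_t nx,
                    std::size_t ny, std::size_t nz) {
    assert(a.size() == nx * ny * nz);
    // Phase 1: nx*ny independent 1D transforms along z.
    for (std::size_t x = 0; x < nx; ++x)
        for (std::size_t y = 0; y < ny; ++y)
            dft1d(a, (x * ny + y) * nz, 1, nz);
    // Phase 2: nx*nz independent 1D transforms along y.
    for (std::size_t x = 0; x < nx; ++x)
        for (std::size_t z = 0; z < nz; ++z)
            dft1d(a, x * ny * nz + z, nz, ny);
    // Phase 3: ny*nz independent 1D transforms along x.
    for (std::size_t y = 0; y < ny; ++y)
        for (std::size_t z = 0; z < nz; ++z)
            dft1d(a, y * nz + z, ny * nz, nx);
}
```

For a cubic n x n x n array, each phase consists of n² independent 1D transforms. In a distributed setting, these are the units of work that can be spread across processors, with data exchanged between phases so that each 1D transform operates on locally held data.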
The library is built on top of the MPI communication layer and can use any serial 1D FFT as a building block. In this release, it uses a subroutine from the IBM Engineering and Scientific Subroutine Library (ESSL) to perform sequential, one-dimensional Fast Fourier Transforms (1D FFTs). This approach takes advantage of the high-performance routines in ESSL for the Blue Gene/L architecture.
The code uses C++ templates and takes the data precision as a template parameter. Currently, double precision (double) and single precision (float) are supported.
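Passing the precision as a template parameter might look like the sketch below. The class `ComplexGrid3D` is hypothetical and is not the library's interface; it only shows how a single template can serve both float and double data.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical container for illustration; not part of the BGL3DFFT API.
// Real is the precision parameter, for example float or double.
template <typename Real>
class ComplexGrid3D {
public:
    using value_type = std::complex<Real>;

    ComplexGrid3D(std::size_t nx, std::size_t ny, std::size_t nz)
        : nx_(nx), ny_(ny), nz_(nz), data_(nx * ny * nz) {}

    // Access element (x, y, z); z varies fastest in memory.
    value_type& at(std::size_t x, std::size_t y, std::size_t z) {
        assert(x < nx_ && y < ny_ && z < nz_);
        return data_[(x * ny_ + y) * nz_ + z];
    }

    // Storage used by the grid, which depends on the chosen precision.
    std::size_t bytes() const { return data_.size() * sizeof(value_type); }

private:
    std::size_t nx_, ny_, nz_;
    std::vector<value_type> data_;
};
```

Here `ComplexGrid3D<float>` and `ComplexGrid3D<double>` share the same code and differ only in element type, so an application can trade accuracy for memory and bandwidth by changing one template argument.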