ARSC HPC Users' Newsletter 238, February 4, 2002
FFTs on Chilkoot
[ Thanks to Tom Logan of ARSC for this article. ]
I have been testing the various FFT routines available at ARSC, with an eye to performance. This article describes results for Chilkoot, the Cray SV1ex.
For comparison purposes, I started with a fairly quick FFT code taken from the INFO-MAC hyper archive whose original author was John Green (I shall refer to this as Green's FFT). Green's FFT routines were reported to be 2-3 times faster than the standard Numerical Recipes FFT algorithms.
I compared Green's complex 1-D FFT with the equivalent CRAY LibSci routine CCFFT and with the IMSL Math Library routines FFTCF/FFTCB. The tests were run on randomly generated complex arrays of varying lengths by calling both the forward and the reverse FFT 10000 times each. The run times in seconds for vectors of lengths from 64 to 16384 are summarized in the following table:
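The test procedure described above can be sketched as follows. This is only an illustration of the timing loop, not the code used for the article: the actual tests called Green's FFT, CCFFT, and FFTCF/FFTCB, while the sketch substitutes a small pure-Python radix-2 FFT as a stand-in. The function names `fft` and `time_round_trips` are made up for this example.

```python
import cmath
import random
import time


def fft(x, inverse=False):
    """Unscaled recursive radix-2 FFT (stand-in routine, not LibSci or IMSL).

    len(x) must be a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    half = n // 2
    out = [0j] * n
    for k in range(half):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + half] = even[k] - tw
    return out


def time_round_trips(n, repeats, seed=0):
    """Time `repeats` forward and reverse transforms of a random complex vector.

    Returns (elapsed seconds, maximum round-trip error).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    rng = random.Random(seed)
    data = [complex(rng.random(), rng.random()) for _ in range(n)]
    start = time.perf_counter()
    for _ in range(repeats):
        spectrum = fft(data)
        restored = fft(spectrum, inverse=True)
    elapsed = time.perf_counter() - start
    # Both transforms are unscaled, so the round trip returns n * data.
    max_err = max(abs(r / n - d) for r, d in zip(restored, data))
    return elapsed, max_err


if __name__ == "__main__":
    # Small sizes and repeat counts keep the pure-Python stand-in quick.
    for n in (64, 128, 256):
        elapsed, err = time_round_trips(n, repeats=20)
        print(f"{n:6d} {elapsed:8.3f} s  max round-trip error {err:.2e}")
```

Checking the round-trip error alongside the time is a cheap way to confirm that each routine being timed is actually computing a correct transform.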
                 FFT Routine Name
Length      Green       IMSL     LibSci
======     ======     ======     ======
    64       0.49       0.52       0.24
   128       1.08       0.88       0.48
   256       2.28       1.64       0.94
   512       4.75       3.35       2.04
  1024      10.92       6.86       1.76
  2048      24.97      14.27       3.37
  4096      51.95      29.17       8.18
  8192     132.03      65.07      21.85
 16384     296.99     177.55      62.75
The results overwhelmingly favor the CRAY LibSci routine, which was the fastest of the three at every length tested.
Figure 1 - FFT Comparison on Chilkoot:
Graphically we see a 2 to 7.5 times speedup over Green's implementation when using LibSci.
Figure 2 - Speedup on SV1ex: LibSci versus Green's:
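The speedup figures quoted here follow directly from the run times in the 1-D table. A minimal sketch of the arithmetic, with the timings copied from that table (the variable and function names are just for illustration):

```python
# Run times in seconds, copied from the 1-D timing table in this article.
lengths = [64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384]
green = [0.49, 1.08, 2.28, 4.75, 10.92, 24.97, 51.95, 132.03, 296.99]
imsl = [0.52, 0.88, 1.64, 3.35, 6.86, 14.27, 29.17, 65.07, 177.55]
libsci = [0.24, 0.48, 0.94, 2.04, 1.76, 3.37, 8.18, 21.85, 62.75]


def speedups(baseline, faster):
    """Ratio of baseline time to the faster routine's time, per length."""
    return [b / f for b, f in zip(baseline, faster)]


green_vs_libsci = speedups(green, libsci)
imsl_vs_libsci = speedups(imsl, libsci)

if __name__ == "__main__":
    print("Length  Green/LibSci  IMSL/LibSci")
    for n, g, i in zip(lengths, green_vs_libsci, imsl_vs_libsci):
        print(f"{n:6d}  {g:12.2f}  {i:11.2f}")
```

The smallest Green-to-LibSci ratio occurs at length 64 and the largest at length 2048, which is where the "2 to 7.5 times" range comes from.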
As a further comparison, I performed a similar test using the 2-D complex FFTs from LibSci (CCFFT2D) and from IMSL (F2T2D/F2T2B), running each 2000 times forward and reverse on matrices of varying sizes. Not surprisingly, the results were quite similar to the 1-D cases, with the LibSci routine running 2.5 to 5 times faster than the IMSL version.
Figure 3 - 2-D FFT Comparison on Chilkoot:
Upon reading the CCFFT2D man page more closely, I noticed under the performance tips the comment "it is very important to make the leading dimensions of the arrays odd numbers to avoid memory bank conflicts." Doing this provided an additional 1.2 to 1.4 times speed up over the already speedy LibSci FFT routine.
         FFT Leading Dimension
Length          N        N+1
======     ======     ======
    64       2.07       1.43
   128      11.47       7.91
   256      52.54      38.81
   512     263.31     214.83

Here's a graph showing the speedup obtained:
Figure 4 - Speedup Changing Leading Dimension on SV1ex:
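To see why an odd leading dimension helps, consider a deliberately simplified model of interleaved memory in which the word at address a lives in bank a mod B. In a column-major array, consecutive elements of one row are a full leading dimension apart, so a power-of-two leading dimension can send every access to the same bank. The sketch below counts the banks touched while walking one row. The bank count of 64 is a hypothetical value chosen for illustration, not the SV1ex's actual configuration, and the model ignores details such as the number of words per complex element.

```python
def banks_touched(leading_dim, n_cols, n_banks):
    """Set of banks hit when walking one row of a column-major array.

    Consecutive row elements are `leading_dim` words apart, and word
    address `a` is assumed to live in bank `a % n_banks`.
    """
    return {(j * leading_dim) % n_banks for j in range(n_cols)}


N_BANKS = 64  # hypothetical bank count, for illustration only

if __name__ == "__main__":
    for leading_dim in (64, 65):
        used = len(banks_touched(leading_dim, n_cols=64, n_banks=N_BANKS))
        print(f"leading dimension {leading_dim}: {used} of {N_BANKS} banks used")
```

In this model, a leading dimension of 64 puts all 64 row elements in a single bank, while a leading dimension of 65 spreads them across all 64 banks, so successive accesses no longer wait on one another.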
I think the conclusion to this exercise is obvious - if your application on the SV1ex uses FFTs, then you will be richly rewarded by faster run times when you use the CRAY LibSci routines and read the man pages for usage tips.
T3E Programming Environment and OS Upgrades
As announced in news/motd, yukon's default programming environment was upgraded to PE 126.96.36.199 last Wednesday, and the default message passing toolkit will be upgraded to MPT 188.8.131.52 this Wednesday.
As always, we'd like to hear of any problems, performance improvements, or other changes you notice. You'll have to recompile your code for the upgrade to have any effect.
UNICOS/mk will be upgraded soon. Watch news/motd.
"slayall" on Cluster
Clusters need to be watched a bit more than HPC systems like the T3E. One common problem is that job processes are inadvertently left running after a job completes.
Users of ARSC's Linux cluster, quest, can ensure that all of their processes have been terminated by issuing the "slayall" command before leaving the system:

    slayall $USER
(Where USER is the environment variable with your username in it. You can use the command just as it's given above or substitute your actual username. For example, if you were user "farquat," you could use the command, "slayall farquat".)
Note that slayall will also kill any jobs you have running under PBS. If you have no jobs running, please run "slayall" prior to logging off.
Quick-Tip Q & A
(The quick tip is still in hibernation... If you have any tips to share, we'd love to see them.)
[[ Answers, Questions, and Tips Graciously Accepted ]]
Ed Kornkven      ARSC HPC Specialist             ph: 907-450-8669
Kate Hedstrom    ARSC Oceanographic Specialist   ph: 907-450-8678

Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Subscribe to (or unsubscribe from) the e-mail edition of the
ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.