ARSC T3D Users' Newsletter 34, May 5, 1995
T3D Jobs Are Lost When the Y-MP Goes Down
Once a job begins on the T3D, it cannot be interrupted until it finishes. So when the Y-MP goes down for preventive maintenance or testing, all jobs executing on the T3D are aborted when the Y-MP is shut down and will not be restarted. All planned shutdowns are announced in the logon MotD ("Message of the Day"). Usually testing on ARSC's machine is scheduled for Tuesday night starting at 5:30 PM Alaska time, but check the MotD in /etc/motd to be sure.
MPI on the T3D

The following announcement appeared on Usenet in the past week:
> MPI for the Cray T3D Release Notice
> (MPICH with Cray T3D Shared Memory Driver)
> Alpha Version 0.1a
>
> April 19, 1995
>
> We are happy to announce alpha release 0.1a of MPICH on the Cray
> T3D.  Previously, only a subset of the MPI specification existed on
> the T3D using the native message passing software T3DPVM.  This
> new technology offers a more complete implementation of the MPI
> standard and is expected to offer substantially higher performance.
>
> MPICH, the model implementation of MPI, is a joint research
> project between Argonne (Bill Gropp and Rusty Lusk) and Mississippi
> State (Tony Skjellum and Nathan Doss).  MPICH is the most widely used
> public implementation of MPI.  For more information on MPICH, please
> see http://www.info.mcs.anl.gov/mpi/ or http://www.erc.msstate.edu/mpi/.
>
> Current status of the Cray T3D implementation (as of April 19, 1995):
>
> 1. Most MPI functions are supported.  The only functions known not
>    to work are MPI_Rsend, MPI_Irsend, MPI_Ssend, and MPI_Issend.
>
> 2. The current limit on the number of processes is 256.
>
> 3. There are quite a few known optimizations that have not yet been
>    done.  The main goal of this initial release is to provide a
>    functional, working version of MPI.
>
> 4. The code has not been thoroughly or systematically tested.
>
> 5. This software is not yet part of the standard MPICH release,
>    and will not be until it has been tested and upgraded further.
>
> 6. Optimization of collective operations remains to be done.
>
> Note: We will work quickly to remove these limitations.
>
> ------------------------------------------------------------------------
> Ron Brightwell                     Anthony Skjellum
> bright@ERC.MsState.Edu             tony@CS.MsState.Edu
>
> Mississippi State University       NSF Engineering Research Center
> P.O. Box 6176                      Fax: 601-325-7692
> Mississippi State, MS 39762        Telephone: 601-325-2497
> ------------------------------------------------------------------------

From Ron Brightwell I got the ftp address to download the T3D version of MPICH, and I have installed the libraries and include files in:
   /usr/local/examples/mpp/mpich

on denali. Also in this directory is a subdirectory of examples that compile and execute correctly on the ARSC T3D. By examining the Makefile and the examples in that subdirectory, a user could start using MPI on the T3D now. This is a preliminary version and I haven't done any timings or testing other than running the example programs.
There is an MPI home page at http://www.mcs.anl.gov under MCS Research Topics, then under Computer Sciences, then under Programming Tools. It has reports, manuals, documentation, and more examples on MPI.
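For readers who haven't used MPI before, the following is a minimal sketch of the kind of program the examples directory contains. It is not taken from that directory; it is a generic C program using only calls from the MPI standard, and it deliberately uses plain MPI_Send rather than the MPI_Ssend/MPI_Rsend variants that the announcement above lists as not yet working. It requires an MPI installation to compile and run.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this PE's number */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of PEs */

    if (rank != 0) {
        /* every worker PE sends its rank to PE 0 */
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else {
        int i, r;
        MPI_Status status;
        printf("PE 0 of %d is up\n", size);
        for (i = 1; i < size; i++) {
            MPI_Recv(&r, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                     MPI_COMM_WORLD, &status);
            printf("got greeting from PE %d\n", r);
        }
    }

    MPI_Finalize();
    return 0;
}
```

The same pattern (rank 0 collecting from all other PEs) appears throughout the MPICH example programs and is a reasonable first test that the library is installed correctly.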
GAMESS on the T3D and Denali

From Mike Schmidt of the Iowa State Quantum Chemistry Group I got a copy of the GAMESS ab initio quantum chemistry code. I have compiled, linked, run, tested, and timed the program on both the Y-MP and various T3D configurations. This is a big program of almost 175,000 lines of Fortran, with many options and capabilities. It comes in both uniprocessor and multiprocessor versions and was ported to the T3D by Nick Nystrom of the Pittsburgh Supercomputing Center and Carlos Sosa of Cray Research. If you are interested in running GAMESS on the ARSC T3D, please contact Mike Ess.
The package comes with a benchmark suite of problems and a table of execution times for that suite on many different platforms. I have augmented the table with the execution times I got on ARSC machines. I can e-mail this table, along with a graph of the ARSC timings, to anyone who is interested.
The vector_fastmath Routines of benchlib

In newsletter #29 (3/31/95) I announced the availability of benchlib on the ARSC T3D. The sources for these libraries are available on the ARSC ftp server in the file:
   pub/submissions/libbnch.tar.Z

The compiled libraries are also available on Denali in:
   /usr/local/examples/mpp/lib/lib_32.a
   /usr/local/examples/mpp/lib/lib_scalar.a
   /usr/local/examples/mpp/lib/lib_util.a
   /usr/local/examples/mpp/lib/lib_random.a
   /usr/local/examples/mpp/lib/lib_tri.a
   /usr/local/examples/mpp/lib/lib_vect.a

and the sources are available in:
   /usr/local/examples/mpp/src

In previous newsletters I've described the contents of some of the libraries:
   #30 (4/7/95)   the "pref" routine of lib_util.a
   #32 (4/28/95)  the fast scalar math routines in lib_scalar.a

In this newsletter, I describe the fast vector routines of lib_vect.a. This library provides routines that take an input vector and produce an output vector, computing one result for each element of the input. The general calling sequence is:
   call routine_v( vectorlength, inputvector, outputvector )

There is no restriction on the vector length (a call with a vector length of zero or less produces no error message). These routines can replace element-by-element loops; for example:

      do 10 i = 1,n
         y(i) = sqrt(x(i))
 10   continue

becomes:

      call sqrt_v(n,x,y)

Like all vector operations, the cost (in clock ticks) per element computed goes down as the length of the vector increases. For the table below, I ran each routine and computed the cost for vector lengths 1, 10, 100, and 1000. As expected, the cost per element decreases as a function of vector length, and the savings can be substantial compared to the cost of calling the default Fortran intrinsic. The crossover point between these vector routines and the Fortran intrinsic routines is always less than 10 elements.
Cost (in clock ticks) per element computed for different vector lengths and the default Fortran intrinsics:
          Performance of benchlib's fast vector routines
          ----------------------------------------------
   in lib_vect.a          vector lengths               default
                    1       10      100     1000     intrinsic
   sqrt_v        1261.0   126.2    57.3    39.5       226.0
   sqrti_v        821.0   168.5    54.5    37.2       300.7
   exp_v          909.0   163.8    61.5    45.6       198.8
   alog_v        1178.0   169.1    67.9    52.6       246.7
   aln2_v         552.0    92.6    35.2    43.7       309.3
   rtor_v        1482.0   215.9    95.0    99.2      1473.5
   twotox_v       602.0   112.9    41.6    36.0      1404.0
   oneover_       819.0   143.1    41.2    21.9        75.0
   sin_v         1092.0   200.4    68.3    57.0       203.1
   cos_v          802.0   163.8    64.0    56.7       195.9
   atan_v         952.0   234.5    78.5    61.6       258.5
   vscale         537.0    31.8     9.5     6.0        10.8
   vset           279.0    26.2     4.4     2.8         5.4
   zcopy          465.0    44.9     9.0     4.3         8.4

The performance case for the last three entries in the table is not so clear:
   call vscale(n,s,a,b)   is   do i = 1, n
                                  a(i) = s * b(i)
                               enddo

   call vset(n,s,a)       is   do i = 1, n
                                  a(i) = s
                               enddo

   call zcopy(n,a,b)      is   do i = 1, n
                                  a(i) = b(i)
                               enddo

Also note that the positions of the input and output vectors are reversed in vscale and zcopy from what they are in the first 11 routines in the table.
I have not done extensive testing of the benchlib routines to see how accurate they are. For the test cases I tried, I didn't see any difference, but I suspect the default libraries are more accurate.
List of Differences Between T3D and Y-MP

The current list of differences between the T3D and the Y-MP is:
- Data type sizes are not the same (Newsletter #5)
- Uninitialized variables are different (Newsletter #6)
- The effect of the -a static compiler switch (Newsletter #7)
- There is no GETENV on the T3D (Newsletter #8)
- Missing routine SMACH on T3D (Newsletter #9)
- Different Arithmetics (Newsletter #9)
- Different clock granularities for gettimeofday (Newsletter #11)
- Restrictions on record length for direct I/O files (Newsletter #19)
- Implied DO loop is not "vectorized" on the T3D (Newsletter #20)
- Missing Linpack and Eispack routines in libsci (Newsletter #25)
- F90 manual for Y-MP, no manual for T3D (Newsletter #31)
Ed Kornkven        ARSC HPC Specialist            ph: 907-450-8669
Kate Hedstrom      ARSC Oceanographic Specialist  ph: 907-450-8678

Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Subscribe to (or unsubscribe from) the e-mail edition of the
ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.