ARSC T3D Users' Newsletter 22, February 13, 1995
ARSC T3D Upgrades
In the next month we will be upgrading the T3D Programming Environment (libraries, tools and compilers) from P.E. 1.1 to P.E. 1.2.
Users will be notified of the upgrade schedule in mailings to the ARSC T3D users' group (i.e., those who receive this newsletter).
Upgrade to the T3D Memory
On February 7th, ARSC upgraded the memory on each PE from 2 MWs to 8 MWs. If any users have questions about this, please contact Mike Ess.
Upgrade to MAX 1.2
On January 31st, ARSC upgraded to the 1.2 version of MAX, the T3D operating system. If any users notice differences in their codes running on the T3D, they should notify Mike Ess.
What's in the Libraries
In the directory /mpp/bin there is an MPP utility called nm that prints the object files that are in a library. It performs the same functions for T3D libraries as /bin/nm does for Y-MP libraries, and those functions are described in the man page on denali.
In going from P.E. 1.1 to P.E. 1.2 this utility provides a quick method to see what's new in the library. First we run:

    /mpp/bin/nm /mpp/lib/lib*.a | grep lib > pe11.objs

then after ARSC has upgraded to P.E. 1.2 we run:

    /mpp/bin/nm /mpp/lib/lib*.a | grep lib > pe12.objs

Now a diff between pe11.objs and pe12.objs will show us what's been added in the new release.
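The comparison step might look like the sketch below. It is only an illustration under stated assumptions: the listing lines are made-up placeholders (the actual output format of /mpp/bin/nm is not shown in this newsletter), and the names work, added.objs and removed.objs are hypothetical. It uses sort and comm instead of diff so that additions and removals end up in separate files.

```shell
# Work in a scratch directory so nothing existing is overwritten.
work=$(mktemp -d)
export LC_ALL=C   # keep sort and comm on the same collation order

# Placeholder stand-ins for the two saved listings. On the T3D these
# files would come from the /mpp/bin/nm commands, not from printf.
printf '%s\n' 'libexample.a:alpha.o' 'libexample.a:beta.o' \
    > "$work/pe11.objs"
printf '%s\n' 'libexample.a:alpha.o' 'libexample.a:beta.o' \
    'libexample.a:gamma.o' > "$work/pe12.objs"

# comm requires sorted input.
sort "$work/pe11.objs" > "$work/pe11.sorted"
sort "$work/pe12.objs" > "$work/pe12.sorted"

# -13 keeps lines found only in the second file (new in P.E. 1.2);
# -23 keeps lines found only in the first file (dropped since P.E. 1.1).
comm -13 "$work/pe11.sorted" "$work/pe12.sorted" > "$work/added.objs"
comm -23 "$work/pe11.sorted" "$work/pe12.sorted" > "$work/removed.objs"

cat "$work/added.objs"
```

A plain diff of the two listings gives the same information in one report; splitting it this way is just a convenience when the listings are long.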
More Speed
Once a T3D application is up and running we always need more speed. One place to look for increased speed is in the libsci routines. Libsci has single PE versions of all of the BLAS 1, 2, and 3 routines and many of the LAPACK routines. However, before we add them to our code expecting a speed improvement, we should time both the code they replace and the routine itself. It may be that calling the routine in libsci is actually slower than the code it replaces. For example, the overhead of the subroutine call to libsci could be as significant as the function performed.
Here is a small program I've used to see if replacing code with a libsci routine will help:

      real a( 2000 ), b( 2000 )
      integer index( 16 )
      data index / 0, 1, 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 400,
     +             500, 1000, 2000 /
      do 10 i = 1, 2000
         a( i ) = i
         b( i ) = i
   10 continue
      do 100 i = 1, 16
         t1 = second()
         s1 = sdot( index( i ), a, 1, b, 1 )    ! libsci replacement
         t2 = second()
         s2 = 0.0
         do 20 j = 1, index( i )
            s2 = s2 + a( j ) * b( j )           ! code replaced
   20    continue
         t3 = second()
         s3 = snrm2( index( i ), a, 1 )         ! libsci replacement
         t4 = second()
         s4 = 0.0
         do 30 j = 1, index( i )
            s4 = s4 + a( j ) * a( j )           ! code replaced
   30    continue
         s4 = sqrt( s4 )
         t5 = second()
         if( s1 .ne. s2 ) then
            write( 6, 600 ) s1, s2              ! same answers ?
            stop
         endif
         if( s3 .ne. s4 ) then
            write( 6, 601 ) s3, s4              ! same answers ?
            stop
         else
            write( 6, 602 ) i, index( i ), t2-t1,t3-t2,t4-t3,t5-t4
         endif
  100 continue
  600 format( " Error in sdot , found ", f10.2, " should be ", f10.2 )
  601 format( " Error in snrm2, found ", f10.2, " should be ", f10.2 )
  602 format( I3, i8, 4f10.6 )
      end
      real function second( )
      second = dble( irtc( ) ) / 150000000.0
      end

The results for the above program are:

  1       0  0.000009  0.000001  0.000004  0.000007
  2       1  0.000010  0.000002  0.000017  0.000006
  3       2  0.000011  0.000002  0.000017  0.000007
  4       3  0.000011  0.000002  0.000018  0.000007
  5       4  0.000011  0.000002  0.000017  0.000007
  6       5  0.000011  0.000002  0.000017  0.000007
  7      10  0.000012  0.000003  0.000020  0.000008
  8      20  0.000015  0.000004  0.000023  0.000009
  9      40  0.000020  0.000006  0.000027  0.000011
 10      50  0.000020  0.000007  0.000030  0.000011
 11     100  0.000028  0.000020  0.000038  0.000015
 12     200  0.000031  0.000035  0.000055  0.000023
 13     400  0.000043  0.000067  0.000086  0.000038
 14     500  0.000048  0.000083  0.000102  0.000046
 15    1000  0.000079  0.000163  0.000184  0.000084
 16    2000  0.000138  0.000323  0.000426  0.000231

The columns are the test number, the vector length, and the times in seconds for sdot, the Fortran dot product loop, snrm2, and the Fortran norm loop.
In this program we are investigating replacing the code for computing the dot product of two vectors and the Euclidean 2-norm of a vector with calls to the libsci routines sdot and snrm2 (sdot and snrm2 are described in man pages on denali). From the times above it looks like replacing the Fortran code for a dot product with the libsci routine doesn't pay until the vectors are longer than 100 elements or so. For snrm2 it doesn't look like using the libsci version will ever pay off. With these small test cases it's easier to decide which libsci routines will improve the speed of a T3D program.
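The break-even reading of these timings can be checked mechanically. The sketch below is one possible way to do it, not part of the original test: it copies the timings above into a shell variable (the names results and crossover are hypothetical) and uses awk to report the first nonzero vector length at which each libsci routine beat the loop it replaces.

```shell
# Columns: vector length, sdot (libsci), dot product loop,
# snrm2 (libsci), norm loop. Times in seconds, copied from the results.
results='0 0.000009 0.000001 0.000004 0.000007
1 0.000010 0.000002 0.000017 0.000006
2 0.000011 0.000002 0.000017 0.000007
3 0.000011 0.000002 0.000018 0.000007
4 0.000011 0.000002 0.000017 0.000007
5 0.000011 0.000002 0.000017 0.000007
10 0.000012 0.000003 0.000020 0.000008
20 0.000015 0.000004 0.000023 0.000009
40 0.000020 0.000006 0.000027 0.000011
50 0.000020 0.000007 0.000030 0.000011
100 0.000028 0.000020 0.000038 0.000015
200 0.000031 0.000035 0.000055 0.000023
400 0.000043 0.000067 0.000086 0.000038
500 0.000048 0.000083 0.000102 0.000046
1000 0.000079 0.000163 0.000184 0.000084
2000 0.000138 0.000323 0.000426 0.000231'

# Skip the zero-length row (no arithmetic is done there) and remember
# the first length where the libsci time is below the loop time.
crossover=$(printf '%s\n' "$results" | awk '
  $1 > 0 {
    if (!sdot_n  && $2 < $3) sdot_n  = $1
    if (!snrm2_n && $4 < $5) snrm2_n = $1
  }
  END {
    print "sdot:",  (sdot_n  ? sdot_n  : "never")
    print "snrm2:", (snrm2_n ? snrm2_n : "never")
  }')
echo "$crossover"
```

On these numbers the first tested length where sdot wins is 200, and snrm2 never wins for a nonzero length, which matches the reading given above. The break-even point falls somewhere between the tested lengths of 100 and 200, and a single timing run at this clock resolution is only a rough guide.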
List of Differences Between T3D and Y-MP
The current list of differences between the T3D and the Y-MP is:
- Data type sizes are not the same (Newsletter #5)
- Uninitialized variables are different (Newsletter #6)
- The effect of the -a static compiler switch (Newsletter #7)
- There is no GETENV on the T3D (Newsletter #8)
- Missing routine SMACH on T3D (Newsletter #9)
- Different Arithmetics (Newsletter #9)
- Different clock granularities for gettimeofday (Newsletter #11)
- Restrictions on record length for direct I/O files (Newsletter #19)
- Implied DO loop is not "vectorized" on the T3D (Newsletter #20)
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.