ARSC T3D Users' Newsletter 46, August 4, 1995
The 1.2.2 Release of the Programming Environment
At ARSC, we now have the 1.2.2 PE and we are testing and timing it. One of the timing tests exposed a problem that users of the T3D should be aware of. One of the single-PE tests we run is an old EISPACK regression test, in which the eigenvalue routines are tested with matrices read in from a file. Here is a comparison of the times (in seconds) for the eigenvalue routines under the 1.2.1 and the 1.2.2 releases:
  ( > = new results (1.2.2 PE),  < = old results (1.2.1 PE) )

  eispackd.results
  1,9c1,9
  < dcg passed,time = 0.701
  < dch passed,time = 0.202
  < drbl passed,time = 0.248
  < drg passed,time = 0.884
  < drgg passed,time = 0.411
  < drl passed,time = 0.173
  < drs passed,time = 0.377
  < drsb passed,time = 1.011
  < drsg passed,time = 0.669
  ---
  > dcg passed,time = 29.501
  > dch passed,time = 17.045
  > drbl passed,time = 0.239
  > drg passed,time = 18.506
  > drgg passed,time = 35.593
  > drl passed,time = 16.945
  > drs passed,time = 17.205
  > drsb passed,time = 0.961
  > drsg passed,time = 14.822
  11,14c11,14
  < drsgba passed,time = 0.640
  < drsp passed,time = 0.371
  < drst passed,time = 0.124
  < drt passed,time = 0.168
  ---
  > drsgba passed,time = 0.660
  > drsp passed,time = 0.362
  > drst passed,time = 0.145
  > drt passed,time = 0.169

Why are the new results so much slower than the old results? I couldn't believe the new Fortran compiler had slowed down so much! So I ran the tests again and got this diff between the 1.2.1 PE and the 1.2.2 PE:
  eispackd.results
  1,13c1,13
  < dcg passed,time = 0.701
  < dch passed,time = 0.202
  < drbl passed,time = 0.248
  < drg passed,time = 0.884
  < drgg passed,time = 0.411
  < drl passed,time = 0.173
  < drs passed,time = 0.377
  < drsb passed,time = 1.011
  < drsg passed,time = 0.669
  < drsgab passed,time = 0.640
  < drsgba passed,time = 0.640
  < drsp passed,time = 0.371
  < drst passed,time = 0.124
  ---
  > dcg passed,time = 0.702
  > dch passed,time = 0.227
  > drbl passed,time = 0.301
  > drg passed,time = 0.903
  > drgg passed,time = 0.451
  > drl passed,time = 0.155
  > drs passed,time = 0.403
  > drsb passed,time = 1.008
  > drsg passed,time = 0.702
  > drsgab passed,time = 0.665
  > drsgba passed,time = 0.662
  > drsp passed,time = 0.357
  > drst passed,time = 0.145

This seemed the more reasonable result: nothing much had changed. The problem was that the timing program looked something like:
        program main
        t1 = second()
        call dcg()
        t2 = second()
        call dch()
        t3 = second()
        .
        .
        .
        end

        subroutine dcg()
        open( 10, FILE="FILE33", ... )
  c     compute
        ...
        end

The reason for the disparity between the first and second runs on the same PE with the same executable was that the input files used to generate the matrices had been migrated off the user disk, and it took a while for them to be restored. The time for the Y-MP agent to restore the files was counted as wall-clock time on the T3D. On the T3D, we usually have that:
  wall clock time = cpu time

because there is no multiprogramming on the T3D. So the problem can be "solved" by running the program twice or by being more careful about what you are timing.
The AC Compiler at ARSC

For the brave and resourceful, I have built and installed the AC compiler on denali. The introduction of the AC compiler was described at the Spring CUG and was summarized in T3D Newsletter #28 (03/24/95):
  > AC and the CRAY T3D, Jesse Draper and Bill Carlson, SRC and
  > IDA. This was a description of the port of the GNU C compiler,
  > gcc, to the T3D. The performance was in some cases 3 times
  > better than the CRI C compiler and the compiler has a single
  > extension 'dist' (for distributive) which allows shared arrays
  > in C. In one application the AC compiler produced an
  > executable 3 times faster than the CRI C compiler. I am trying
  > to get a copy of this compiler for use at ARSC. I have a copy
  > of the report and the slides.

I'll have more details in future newsletters, but the minimal instructions for using this compiler are:
- Read the documentation supplied by Bill Carlson on denali in:
    /usr/local/examples/mpp/AC/DISCLAIMER
    /usr/local/examples/mpp/AC/README.install (I have done this)
    /usr/local/examples/mpp/AC/README.dist

  I will mail copies of the slides and the report mentioned above to those who request them.
- Add to your search path /usr/local/examples/mpp/bin ahead of /bin
- Change your makefile to look something like:
    CC = "/usr/local/examples/mpp/bin/ac"
    CLD = $(CC)
    CFLAGS = "-I/usr/local/examples/mpp/AC/include -O"
    CLDFLAGS = -L/usr/local/examples/mpp/AC
    OBJS = main.o second.o

    .c.o:
            $(CC) -c $(CFLAGS) $<

    a.out: $(OBJS)
            $(CLD) $(CLDFLAGS) $(OBJS)
As a sample of the problems I had, the declaration:
  float dex[ 10 ];

works as a local declaration but aborts the AC compiler as a global declaration.
Accessing the LPAC HPC Articles Archive

From Roland Piquepaille of CRI-France, I got the following hint on speeding up access to the London Parallel Applications Centre High Performance Article Archive:
  > To avoid your readers wasting some time browsing at the
  > London Parallel Application Centre, the full URL of the
  > article database (with the search box) is:
  >
  > http://www.lpac.ac.uk/SEL-HPC/Articles/ArticleArchive.html
  >
  > Roland.
Installation of a New Version of MPICH

As of July 31, 1995, I have installed the entire 1.0.10 version of MPICH in the directory:
  /usr/local/examples/mpp/mpich

It is much more code than was in the preliminary version, "Alpha Version 0.1a", that was announced in Newsletter #34 (05/05/95). Using the installation provided, the libraries are now in:
  /usr/local/examples/mpp/mpich/lib/cray_t3d/t3d

where previously they were in:
  /usr/local/examples/mpp/mpich/lib

I moved all of the old version of MPICH to:
  /usr/local/examples/mpp/mpich/lib/old

I encourage users to try it and possibly compare it to the Edinburgh/CRI version of MPI, which was described in Newsletters #39 (6/9/95) and #41 (6/23/95). If you have any problems using this new version, please contact Mike Ess.
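Existing makefiles that pointed -L at the old library directory will need the new path. A fragment along these lines should work; the library directory is from the installation above, while the include-directory location and the library name are my assumptions based on the standard MPICH layout, so check them against what was actually installed:

```make
# Sketch of makefile settings for the relocated MPICH 1.0.10 install.
MPICH    = /usr/local/examples/mpp/mpich
CFLAGS   = -I$(MPICH)/include             # include location assumed
LDFLAGS  = -L$(MPICH)/lib/cray_t3d/t3d    # new 1.0.10 library location
LIBS     = -lmpi                          # library name assumed

ring: ring.o
	$(CC) $(LDFLAGS) ring.o $(LIBS) -o ring

ring.o: ring.c
	$(CC) -c $(CFLAGS) ring.c
```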
List of Differences Between T3D and Y-MP

The current list of differences between the T3D and the Y-MP is:
- Data type sizes are not the same (Newsletter #5)
- Uninitialized variables are different (Newsletter #6)
- The effect of the -a static compiler switch (Newsletter #7)
- There is no GETENV on the T3D (Newsletter #8)
- Missing routine SMACH on T3D (Newsletter #9)
- Different Arithmetics (Newsletter #9)
- Different clock granularities for gettimeofday (Newsletter #11)
- Restrictions on record length for direct I/O files (Newsletter #19)
- Implied DO loop is not "vectorized" on the T3D (Newsletter #20)
- Missing Linpack and Eispack routines in libsci (Newsletter #25)
- F90 manual for Y-MP, no manual for T3D (Newsletter #31)
- RANF() and its manpage differ between machines (Newsletter #37)
- CRAY2IEG is available only on the Y-MP (Newsletter #40)
- Missing sort routines on the T3D (Newsletter #41)
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678

Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Subscribe to (or unsubscribe from) the e-mail edition of the
ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.