ARSC HPC Users' Newsletter 300, September 24, 2004
- 300 Issues !!!
- Hello from Mike Ess
- Hello from Guy Robinson
- Meet Ed Kornkven, ARSC Vector Specialist
- Meet Rebecca Koskela, ARSC Bioinformatics Specialist
- 300th Issue Contest! 3 Prizes!
- MPICH2 Release 0.971
- IBM LL Scripts Should Specify "network.MPI" as "shared"
- Barrier Implied when Allocating Co-Arrays
- Quick Tip
300 Issues !!!
We just celebrated the 10th Anniversary Issue but can't ignore a number like THREE - ZERO - ZERO! For this huge event, I thought you might enjoy catching up with past editors, meeting new ARSC staff members, and winning prizes in YET ANOTHER contest!
Thanks again to all readers, subscribers, contributors, and answerers of Quick-Tips!
Hello from Mike Ess
[ Mike started this newsletter in '94 and produced issues (weekly!) for almost two years... ]
What a great thing the internet is! ARSC's newsletter has been spreading the word on High Performance Computing for ten years and we are all better for it.
Twenty years ago (in Cray-XMP time and b4 internet) each supercomputing site had its own newsletter; it was usually a monthly hardcopy text-only version. Newsletters like Cray Research Marketing Newsletter and the newsletters from ECMWF, NCAR, LLL, and Sandia were the only way that Cray Users knew they weren't alone. Lucky was the site that had a librarian collecting all the other newsletters and mailing out the site's own. ARSC continues this tradition of getting the word out. Congratulations!
Mike Ess, http://mywebpages.comcast.net/mike_ess
Hello from Guy Robinson
[ Guy was co-editor from early '97 through late '03. ]
Is there life after newsletter editing?
One of my desires during my "mature gap year" was to travel, and thanks to a world wide network of friends in a great many places this was actually pretty easy to do.
To the surprise of many folks, when I stopped work at ARSC I didn't fly back to Europe immediately. While spending a lot of time putting all my stuff into storage, I made many day trips and also flew off to see Vancouver, Missoula, and Portland.
The highlight of the stop in Missoula was Don Morton taking me flying so I had a better idea of the way the university hides itself in the hills. (Don used to be a major contributor to this newsletter, but now I think he'd rather be flying.) Then there was SC in Phoenix, back to Fairbanks, then back to the UK for a quiet family Christmas.
The first grand international trip was to Australia for the grand prix and then a tour 'round the continent. Darwin is interesting: it has the frontier feel of Alaska, and there are so many similarities that it is "sister city" to Anchorage. The big difference is the temperature and humidity, but both locations see tourism as a way forward. Catching up with Rob Bell in Melbourne, we drove along the grand prix circuit for a while before I headed back to the tail end of a Fairbanks winter.
Due to the odd way I've been purchasing airline tickets all my trips tend to start/finish in Fairbanks.
It was back to the UK, then back to Fairbanks in August to catch the start of Faculty Camp and see the continued development of this by Tom Logan and staff.
The next trip was to Russia, Moscow and St. Petersburg in particular. Walking round St. Petersburg seeing the palaces on a bright spring day with ice still on the river made for a great day out. Apart from the historical sites, it was good to see how wonderful a public transport system could be. (This feeling was compounded by the complete failure of the London underground on my return to the UK!)
Between travels I've been sorting and organizing as I try to put three households full of items together into one. A considerable volume of old magazines, references, and notes has been recycled. A disappointment has been to see how little has actually changed in ten years of scientific computing from the scientist's point of view. We have bigger, more complex models but the techniques and approach are essentially similar.
I've never been a great fan of having a computer at home, preferring to spend that time on other interests. Of course, if home is the office, some sort of system becomes necessary, so a PC has made its way into my life. After a reasonably good start, the ugly world of viruses and other incidents soon started to take up more time than actual work. Thus, I've now installed Linux, and hopefully that will get me around the problems.
Being just a short train ride from London it is relatively easy to drop down there for meetings. (Short means an hour and a half each way, but it is a great time to catch up with reading.) I've been to various technical meetings on the development of UK, European, and international grids, and a number of climate working groups. In Europe, many projects are building grids out of existing networks and formalising the connections between groups of scientists to build the next generation of working environments. In the evening there are shows and other events to go to. I particularly recommend the free BBC recordings; see http://www.bbc.co.uk/tickets , if you are ever in London.
In October I start working with various folks at the UK Meteorological Office, helping to port several climate codes to the Earth simulator. I'll be based in Exeter, UK, for a little while and then it's off to Japan.
Finally, thanks to all the friends who have helped out while I've been travelling, looked after things, provided rides and housing, etc. This adventure wouldn't have been possible without your support.
I'd also like to thank all those who have submitted items to the newsletter; it was always fun to coerce items and to talk with readers about the impact it had on the way they worked. If any reader manages to catch me during my coming travels I'd gladly buy them a pint and talk more about our adventures, either in the physical world or the virtual world of supercomputing.
Meet Ed Kornkven, ARSC Vector Specialist
Edward Kornkven has been on the ARSC staff as a Vector Specialist since fall of 2002. He came to ARSC from Colorado Springs where he was a Senior Software Engineer at Ambeo, Inc. At Ambeo he wrote and maintained components of their data warehouse management tools, including their SQL parser and database storage module.
Before that, Ed held various academic positions including teaching at the University of Wyoming and the South Dakota School of Mines and Technology.
It was at the University of Illinois that he first entered the high performance computing arena, working with researchers from the Illinois Natural History Survey and NCSA on models of insect pests using the Cray YMP and Connection Machines CM-2 and CM-5, as well as writing visualization programs for those models. He also was a research assistant at UI's Parallel Programming Lab investigating parallel programming models and their implementation.
Ed is married to Elizabeth, and spends most of his free time enjoying her and their six children, whom they have schooled at home for the past thirteen years. Ed is a Wyoming native, and it still warms his heart to see western garb on the clerks celebrating the "Rancher Days" meat sale at the Fairbanks Safeway store.
Meet Rebecca Koskela, ARSC Bioinformatics Specialist
Rebecca Koskela has a background in both bioinformatics and high performance computing. Prior to joining ARSC, she was a member of the senior management team of the Aventis Cambridge Genomics Center and manager of the Scientific Computing group in Cambridge, MA. She was a bioinformatics specialist at the Mayo Clinic, director of informatics in the Department of Genetics at Stanford University and also worked at Cold Spring Harbor Laboratory with the Dana Consortium for the Genetic Basis of Manic-Depression Illness. In addition to her bioinformatics experience, Rebecca specialized in system performance and analysis at Sandia National Laboratories, Los Alamos National Laboratory, Cray Research and Intel.
Rebecca was raised on the Kenai Peninsula and is happy to be "home" again.
As a Bioinformatics Specialist, Rebecca will be working with the Institute of Arctic Biology Bioinformatics group and other biological projects at the university.
300th Issue Contest! 3 Prizes!
Yes, there will be prizes! Which will involve, you guessed it, Alaska Sourdough Starter and Zipper-Pull Thermometers good to 40 below.
Define the term: "Supercomputer."
- Your "definition" MUST be 42 words or less (as judged by "wc").
- Scoring will depend on a variety of factors. E.g., Does it educate or stimulate discussion? Is it humorous or original? Feel free to quote other good definitions of "supercomputer" if you'd like, but that won't help much for "originality." I'll try to recruit a balanced panel of judges.
- Anyone who subscribes to the email edition of this newsletter, including ARSC staff, can enter, but prizes are only for non-staff.
- Deadline: September 30, 2004
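For anyone who wants to check an entry against the 42-word rule before sending it in, here is a minimal sketch. It assumes the definition is saved in a plain-text file; the function name "within_limit" and the file name "definition.txt" are made up for illustration and are not part of the contest rules.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the file named in $1
# holds 42 words or fewer, counted the same way "wc -w" counts them.
within_limit() {
    # The $(( ... )) wrapper strips the leading blanks that some
    # versions of wc print in front of the number.
    words=$(( $(wc -w < "$1") ))
    [ "$words" -le 42 ]
}
```

Usage might look like: within_limit definition.txt && echo "short enough"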
MPICH2 Release 0.971
We received this announcement from Rusty Lusk:
> A new release of MPICH2 is now available from the web site
>
>   http://www.mcs.anl.gov/mpi/mpich2
>
> This has nearly all MPI-2 functionality. Details can be found in the
> RELEASE-NOTES file in the distribution.
>
> This is release 0.971. (We are saving the number 1.0 for when
> absolutely all of the complete MPI-2 Standard is fully implemented.)
> For Linux and Windows clusters, this MPICH is more robust and
> higher-performing than any versions of MPICH-1, the current version of
> which is 1.2.6.
IBM LL Scripts Should Specify "network.MPI" as "shared"
LoadLeveler allows you to specify "node_usage" as shared or not_shared and "network.MPI" as shared or not_shared.
Even if you specify node usage as shared, specifying network as not_shared makes network adapters unavailable to other jobs. Recently, we have seen several instances where jobs on iceberg were sharing nodes but not sharing adapters, meaning that nodes were left empty when there was plenty of work waiting.
Iceberg users: in the future, please specify network use as "shared" in ALL LoadLeveler scripts. This will work appropriately for node_usage of "shared" or "not_shared".
ARSC is updating "news queues" and our other documentation with this change. Here's an example:
  #!/bin/csh
  #
  # @ error = Error
  # @ output = Output
  # @ notification = never
  # @ job_type = parallel
  # @ node = 5
  # @ tasks_per_node = 8
  # @ node_usage = not_shared
  # @ network.MPI = sn_single,shared,us
  # @ class = standard
  # @ wall_clock_limit=3600
  # @ queue

  ./a.out
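To check existing scripts for this setting, something like the following sketch may help. It is not an ARSC or LoadLeveler tool: "ll_network_usage" is a hypothetical helper name, and the sketch assumes the directive sits on a single "# @ network.MPI = ..." line and looks only at the first such line in the file.

```shell
#!/bin/sh
# Hypothetical helper: report how a LoadLeveler script sets adapter
# sharing on its network.MPI line.
# Prints "shared", "not_shared", or "unspecified".
ll_network_usage() {
    # Pick out the first "# @ network.MPI = ..." directive and fold it
    # to lower case so the comparisons below ignore case.
    line=$(grep -iE '^#[[:space:]]*@[[:space:]]*network\.mpi[[:space:]]*=' "$1" \
           | head -n 1 | tr 'A-Z' 'a-z')
    case "$line" in
        *not_shared*) echo "not_shared" ;;   # must be tested before *shared*
        *shared*)     echo "shared" ;;
        *)            echo "unspecified" ;;  # no directive, or no usage field
    esac
}
```

Usage might look like: ll_network_usage myjob.ll (where "myjob.ll" stands for your own script). Anything other than "shared" is worth a second look.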
Barrier Implied when Allocating Co-Arrays
Last week, an ARSC user discovered (the hard way) the following fact about Co-Array Fortran. From CrayDoc manual: "S-3694-52: Fortran Language Reference Manual, Volume 3", Section: "8.1.7. Specifying Allocatable Co-arrays":
------------------
Caution: Execution of ALLOCATE and DEALLOCATE statements containing co-array objects causes an implicit barrier synchronization of all images. All images must participate in the execution of these statements, or deadlock can occur.
------------------
Implicit barriers occur in many other places (at the end of an OpenMP PARALLEL DO, for instance, and many MPI collectives synchronize in practice), but this one may be unexpected as it's tucked into a regular Fortran language construct. On the other hand, it makes sense. It's allocating a global object, so it's reasonable that all images should wait until the allocation has completed and the array is thus safe to use.
The user was allocating the array on the only two images that actually needed it... which also makes sense... But, in terms of debugging time, this turned out to be an expensive mistake.
Quick-Tip Q & A
A: Does anyone know a work-around for this? Everything compiled, but it fails to link.

  % mpxlf_r -O3 -qarch=pwr4 -qtune=pwr4 -qcache=auto -qhot -qstrict \
      -bmaxdata:1000000000 -bmaxstack:256000000 -o myprog \
      modules/*.o grid/*.o tools/*.o local/*.o atmos/*.o \
      land/*.o coupler/*.o main/*.o
  /bin/sh: /usr/bin/mpxlf_r: 0403-027 The parameter list is too long.

#
# Short answer from Hank Happ:
#
Perhaps I'm being naive, but wouldn't first putting all the *.o files into a library file (*.a) using the "ar" command work?

#
# "Answer -v", from Greg Newby:
#
This problem isn't related to the compiler or linker, but a limitation on the command length of the Unix shell (/bin/sh in this case, although some ARSC systems reportedly use the Korn shell, ksh, as /bin/sh).

What's happening is, the subdirectories and names of all those .o files add up to a length that's longer than the shell can handle. (Yes, the shell uses a static array of characters to handle its arguments, rather than allocating memory on the fly.)

The solution I recommend is creating a .a archive file using ar. Put all those .o files into it, then just add the one .a file to the compiler command line above. The compiler's linker will look into the .a file and extract all the .o's it needs. Since you'll insert into the .a one .o at a time, rather than all at once, the command line length won't be an issue.

Presumably you're using a set of Makefiles to build all this code (I hope so!). Just add an extra step after the compile step (that creates the .o) to add the .o to your .a. For example, if you have a Makefile in the coupler directory:

  # Get to the subdirectory:
  cd coupler
  # Make coupler.o
  mpxlf_r -c coupler.f
  # Add the .o to the .a:
  ar -r ../mylibrary.a coupler.o

See "man ar" for complete syntax details. "ar" is similar to "tar", but is used directly by the compiler/linker. The "-r" says to replace an old coupler.o, if one exists.

If you have a top-level "make clean", make sure you remove mylibrary.a, too, so you don't accidentally link against old .o's.

Once your .a contains all of the needed .o's, your compile line above will shorten to something like this:

  % mpxlf_r -O3 -qarch=pwr4 -qtune=pwr4 -qcache=auto -qhot -qstrict \
      -bmaxdata:1000000000 -bmaxstack:256000000 -o myprog mylibrary.a

There are a few details you'll need to work out for yourself, depending on your Makefile, such as dependencies, but the above should help to get you started.

#
# And another alternative, borrowed from a user's makefile:
#
This solution assumes that, for each subdirectory, "*.o" expands to a sufficiently short string that the shell can handle it. It uses relocatable loads to combine multiple object files into single object files, by subdirectory:

  % ld -r -o ./modulesObjs.o modules/*.o
  % ld -r -o ./gridObjs.o grid/*.o
  % ld -r -o ./toolsObjs.o tools/*.o
  % ld -r -o ./localObjs.o local/*.o
  % ld -r -o ./atmosObjs.o atmos/*.o
  % ld -r -o ./landObjs.o land/*.o
  % ld -r -o ./couplerObjs.o coupler/*.o
  % ld -r -o ./mainObjs.o main/*.o
  % mpxlf_r -O3 -qarch=pwr4 -qtune=pwr4 -qcache=auto -qhot -qstrict \
      -bmaxdata:1000000000 -bmaxstack:256000000 -o myprog \
      ./*Objs.o

Q: I frequently want to delete a bunch of files from a large directory... but with a catch. A simple wild-card expression describes the files I want to RETAIN, but there's no simple expression for the files I need to remove.

   E.g., I want to delete everything except: "*.f90"

   Is there an "rm" option like grep "-v", which says "select the files not matching the expression"? Other suggestions?
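Returning for a moment to the link-line question answered above: for builds that do not yet add each .o to the archive as it is compiled, the archive can also be filled in one pass. This is a minimal sketch, not taken from any of the answers. "build_archive" is a hypothetical helper name, and the sketch assumes the object file names contain no blanks and that no two subdirectories hold objects with the same base name (ar stores members by base name, so "-r" would replace one with the other).

```shell
#!/bin/sh
# Hypothetical helper: build_archive ARCHIVE DIR...
# Adds every .o found under the listed directories to ARCHIVE without
# ever putting all of the file names on a single command line.
build_archive() {
    archive=$1
    shift
    rm -f "$archive"    # start clean so stale members cannot linger
    # find prints one path per line; xargs runs "ar" as many times as
    # needed, keeping each invocation under the argument-length limit.
    # r = add or replace members, c = create the archive quietly.
    find "$@" -name '*.o' -print | xargs ar -rc "$archive"
}
```

Usage might look like: build_archive mylibrary.a modules grid tools local atmos land coupler main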
[[ Answers, Questions, and Tips Graciously Accepted ]]
Ed Kornkven
ARSC HPC Specialist
ph: 907-450-8669

Kate Hedstrom
ARSC Oceanographic Specialist
ph: 907-450-8678

Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Subscribe to (or unsubscribe from) the e-mail edition of the
ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.