ARSC T3D Users' Newsletter 67, December 29, 1995
Using the -base Flag on mppexec
The man page for mppexec on Denali shows many different options for submitting jobs to the T3D:
> MPPEXEC(1)            Cray Research, Inc.            UNICOS MAX 1.0
>
> NAME
>      mppexec - Initiates and services a user application on a
>      CRAY T3D system
>
> SYNOPSIS
>      mppexec a.out [-base node] [-bwire c] [-debug] [-nosleep]
>      [-npes n or n-m] [-pool pool_name] [-time seconds] [-wired]
>      [-ypesim host:[path]] [user_options] [user_args]

So a user can submit a job in either of two ways:
  mppexec a.out -base 0x010

or

  a.out -base 0x010

They both work the same.
The -base switch is described in the mppexec manpage as:
> -base node   Causes the partition to be allocated such that
>              virtual PE0 will be located on the specified
>              node.  If the PEs on node are currently busy,
>              the application will sleep waiting for them to
>              be released.  For example:
>
>                   a.out -base 0x020 -npes 2
>
>              This example will assign the 2 PEs on node
>              0x020 to the application.  If the PEs on node
>              0x020 are currently busy, the application will
>              sleep waiting for them to be released.

From this description we know that we can at least specify which of the 128 PEs on the ARSC T3D will be PE0. Because there are 2 PEs to a single node board on the T3D, it is not possible to get only 1 PE. (You can always run on 1 PE, but the other PE on the node board is necessarily idle.) And by experimenting with the -npes flag you can see that a user is always restricted to a power of 2 PEs.
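The PE arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper name nodes_for and the idea of rejecting a request up front are assumptions, not behavior of mppexec. The 2-PEs-per-node and 128-PE figures come from the text.

```python
PES_PER_NODE = 2   # from the text: 2 PEs on each node board
TOTAL_PES = 128    # from the text: size of the ARSC T3D


def nodes_for(npes):
    """Return (nodes, idle_pes) for a request of npes PEs.

    Hypothetical helper: raises ValueError when npes is not a power
    of 2 or does not fit on the machine.
    """
    is_power_of_two = npes >= 1 and (npes & (npes - 1)) == 0
    if not is_power_of_two or npes > TOTAL_PES:
        raise ValueError("npes must be a power of 2 between 1 and %d" % TOTAL_PES)
    nodes = (npes + PES_PER_NODE - 1) // PES_PER_NODE  # round up to whole nodes
    idle = nodes * PES_PER_NODE - npes                 # PEs left idle on those nodes
    return nodes, idle
```

With these assumptions, a 1-PE request still occupies a whole node and leaves its second PE idle, while any larger power of 2 fills its nodes exactly.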
From the mppview utility on Denali it is possible to assign a base node to each node of 2 PEs on the T3D:
   _________________________________________________________
   \ 0x000  0x002  0x004  0x006  0x008  0x00a  0x00c  0x00e \
    \0x010  0x012  0x014  0x016  0x018  0x01a  0x01c  0x01e  \
     \________________________________________________________\
   _________________________________________________________
   \ 0x100  0x102  0x104  0x106  0x108  0x10a  0x10c  0x10e \
    \0x110  0x112  0x114  0x116  0x118  0x11a  0x11c  0x11e  \
     \________________________________________________________\
   _________________________________________________________
   \ 0x200  0x202  0x204  0x206  0x208  0x20a  0x20c  0x20e \
    \0x210  0x212  0x214  0x216  0x218  0x21a  0x21c  0x21e  \
     \________________________________________________________\
   _________________________________________________________
   \ 0x300  0x302  0x304  0x306  0x308  0x30a  0x30c  0x30 \
    \0x310  0x312  0x314  0x316  0x318  0x31a  0x31c  0x31  \
     \________________________________________________________\

Although this is the mapping that mppview uses and is the mapping used in /usr/spool/mpp/mppsyslog, I found that to get PE0 into the intended position in the display, I had to transpose the first two digits of the node address. So:
  a.out -base 0x130 goes to the lower left hand corner as 0x310
  a.out -base 0x13e goes to the lower right hand corner as 0x31e
  a.out -base 0x12e goes to 0x21e

This switch only assigns PE0 and does not specify the shape of the partition used by the job. The partitions generally look like a square or a rectangle in the mppview display, but they can sometimes wrap around the torus and produce what looks like two disjoint rectangles.
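The digit transposition described above can be written as a small Python sketch. The function name display_to_base is hypothetical, and the rule is inferred only from the examples in this newsletter, not from any Cray documentation.

```python
def display_to_base(display_addr):
    """Swap the first two hex digits of a 3-digit node address.

    Hypothetical helper: given the position wanted in the mppview
    display, return the value to pass to -base (per the examples in
    the text, e.g. display 0x310 <-> -base 0x130).
    """
    if not 0 <= display_addr <= 0xFFF:
        raise ValueError("expected a 3-hex-digit node address")
    first = (display_addr >> 8) & 0xF    # first hex digit
    second = (display_addr >> 4) & 0xF   # second hex digit
    third = display_addr & 0xF           # third hex digit, left unchanged
    return (second << 8) | (first << 4) | third


for shown in (0x310, 0x21e):
    print("a.out -base 0x%03x  ->  displayed at 0x%03x"
          % (display_to_base(shown), shown))
```

Because the operation is a swap, applying it twice returns the original address, so the same function converts in either direction.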
Knowing exactly how the partition is allocated would give a user of the -base switch the ability to avoid certain PEs. But in any case, knowing where the partition begins allows the user to avoid particular PEs. For example:
  a.out -base 0x020 -npes 1
  a.out -base 0x020 -npes 2
  a.out -base 0x020 -npes 4
  a.out -base 0x020 -npes 8
  a.out -base 0x020 -npes 16
  a.out -base 0x020 -npes 32
  a.out -base 0x020 -npes 64

would all avoid the PE 0x01a.
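A list of command lines like the one above is easy to generate for any base node. The following Python sketch only builds the strings; it does not run mppexec and makes no claim about which PEs a partition will actually cover. The function name base_commands and its parameters are hypothetical.

```python
def base_commands(base, max_pes=64, exe="a.out"):
    """Build one command line per power-of-2 PE count up to max_pes.

    Hypothetical helper: base is the node address passed to -base,
    exe is the name of the user's executable.
    """
    commands = []
    npes = 1
    while npes <= max_pes:
        commands.append("%s -base 0x%03x -npes %d" % (exe, base, npes))
        npes *= 2   # only powers of 2 are accepted, per the text
    return commands


for line in base_commands(0x020):
    print(line)
```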
NQS and the T3D
Stuart Paton of the Edinburgh Parallel Computing Centre writes in about the limitations of NQS in scheduling the T3D (Newsletter #65, 12/15/95):
> Here at EPCC the NQS-T3D problems you mention were very visible
> in the early period of production these have now been basically
> eliminated by careful tuning. The system manager, Mike Brown,
> recently delivered a paper at CUG on how this was achieved. I
> am not sure whether you have seen this but it should address
> the problems that you are seeing. I assume that it can be
> retrieved somehow either from the CUG home page or direct from
> M.D.Brown@ed.ac.uk.

On the CUG home page, I saw no way of getting the papers from any of the past CUG conferences, and the paper was not available through the EPCC web page either. But I know Mike Brown presented a description of EPCC's work with NQS and the T3D at the Alaska CUG. I missed most of that presentation, but I do have a copy of the slides, and we are all waiting for the Proceedings of the Alaska CUG to come out.
Thanks to Dale Clark
As this is the end of the year, I'd like to thank Dale Clark, a Research Assistant at ARSC and a graduate student at UAF, for making these newsletters into web documents.
Ed Kornkven    ARSC HPC Specialist             ph: 907-450-8669
Kate Hedstrom  ARSC Oceanographic Specialist   ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.