ARSC T3E Users' Newsletter 169, May 21, 1999

MPI Quiz

Can you spot three mistakes in the code below? It compiles and runs, but hangs... :-(


!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      program mpi_quiz

      include 'mpif.h'

      integer idata
      parameter (idata=1000)

! code data
      integer irbuf,isbuf
      dimension irbuf(idata),isbuf(idata)
      integer i
      integer ndata

! MPI data
      integer myid, numprocs, ierr, rc
      integer mroot

      call MPI_INIT( ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
      print *, "Process ", myid, " of ", numprocs, " is alive"


      irbuf=0
      ndata=10

      mroot=0


      do i=1,ndata
        isbuf(i)=i*100
      enddo

      if(myid.eq.mroot) then
        call MPI_REDUCE(isbuf, irbuf, ndata, MPI_INT, MPI_SUM,
     !    mroot, MPI_COMM_WORLD)
        do i=1,ndata
          write(6,*) ' data on ',myid,' is ',isbuf(i),irbuf(i)
        enddo
      endif

      call MPI_FINALIZE(rc)
      stop
      end
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Here's a sample session showing the output:

  yukon$ f90 mpiquiz.f
  yukon$ mpprun -n2 ./a.out
   Process  0  of  2  is alive
   Process  1  of  2  is alive
At this point the program is hung--no more output appears. Here is the dump produced when the program is interrupted by CTRL-C:

  SIGNAL: Interrupt ( from process 0 )

 Beginning of Traceback (PE 0):
  Interrupt at address 0x8000d9f0c in routine '???'.
  Called from line 1422 (address 0x8000f500c) in routine '_T3DMPI_reduce_zero'.
  Called from line 1524 (address 0x8000f6580) in routine 'MPI_Reduce'.
  Called from line 2430 (address 0x800171760) in routine 'MPI_REDUCE'.
  Called from line 36 (address 0x80000154c) in routine 'MPI_QUIZ'.
  Called from line 475 (address 0x800000c98) in routine '$START$'.
 End of Traceback.

IJCR: Research on Parallel Computing


>                      CALL FOR PAPERS
> 
> Special Issue of the "International Journal of Computer Research"
> (http://www.softlab.ntua.gr/~mastor/IJCR.htm) on:
> 
> 
>          INDUSTRIAL APPLICATIONS OF PARALLEL COMPUTING
> 
> Parallel scientific and engineering computing is becoming of paramount
> importance in several industrial applications, especially when the
> solution of large and complex problems must cope with harder and harder
> time scheduling.
> 
> In this special issue, we would like to report on relevant research
> representing the state-of-the-art in the following areas of parallel
> computing:
> 
>   1. parallel and distributed combinatorial and global optimization
>      methods, as applied to industrial/practical problems
> 
>   2. parallel and distributed computing techniques and software systems,
>      as applied to industrial/practical problems
> 
> The application areas are to be understood very broadly and include,
> but are not limited to: computational fluid mechanics, structural
> engineering, computational chemistry, electronic and electromagnetic
> circuits, signal and image processing, etc.
> 
> Submission deadline: September 15th, 1999
[ For submission details, see URL, above. ]

Fortran Information List

The quarterly Fortran Information List submitted to comp.fortran.90 by Mike Metcalf provides, as usual, a tremendous list of information and resources for anyone involved in Fortran.

The contents of the May 20, 1999 list include:

  • WHAT'S NEW?
    • Since 20 April:
      • Update Cray and SGI entries to Fortran 95.
      • Update Lahey's entry.
      • Update Compaq e-addresses.
      • Add Alan Miller's source form converter.
    • Since 23 March:
      • Add "The DIGITAL Visual Fortran Programmer's Guide".
      • Add Fortran 95 standard electronic ordering information.
      • Update Compaq (Digital) entry for Linux.
    • Since 22 February:
      • Update Wagener's book entry.
      • Update Sun's entry.
      • Delete Meissner's e-book.
      • Add advice on downloading Dubois's lecture notes.
      • Replace IBM's entry - for Fortran 95 compliance etc.
      • Replace Fujitsu's entry - for Fortran 95 compliance etc.
      • Add Lahey/Fujitsu version of f90gl.
  • WHERE CAN I OBTAIN A FORTRAN COMPILER?
  • OTHER USEFUL PRODUCTS
  • WHAT BOOKS ARE AVAILABLE? References provided to books in: Chinese, Dutch, English, Finnish, French, German, Japanese, Russian, and Swedish
  • WHERE CAN I OBTAIN COURSES, COURSE MATERIAL OR CONSULTANCY?
  • WHERE CAN I FIND THE FORTRAN AND HPF STANDARDS?

We've made the May 20, 1999 list accessible at:

http://www.arsc.edu/support/news/T3Enews/misc/i169FortranList.html

To subscribe:

(hpff@cs.rice.edu is a mailing list for announcements related to High Performance Fortran.)

To (un)subscribe to this list, send mail to hpff-request@cs.rice.edu. Leave the subject line blank, and in the body put the line (un)subscribe <email-address>

MPI Quiz: Answers

Here are the three mistakes:
  1. The collective reduction operation (MPI_REDUCE) isn't called on all the processors in the MPI_COMM_WORLD communicator.
  2. MPI_INT is specified as the data type in the MPI_REDUCE call. Oops!

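    If you work in both C and Fortran, remember that it is MPI_INT in the former and MPI_INTEGER in the latter. Other data type names differ as well (for example, MPI_DOUBLE in C versus MPI_DOUBLE_PRECISION in Fortran). Fortran's IMPLICIT NONE can help: since MPI_INT isn't defined in mpif.h header files, the compiler then reports it as an undeclared name instead of silently treating it as an uninitialized integer variable.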

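  3. The IERR parameter is missing from the MPI_REDUCE call. In the Fortran binding, MPI subroutines take a final integer argument that receives the error code, and it must be supplied on every call.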
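
As a small illustration of the IMPLICIT NONE advice in item 2, here is a minimal sketch. The program name type_check and the variable nbytes are ours, not part of the quiz, and we assume (as the item says) that mpif.h does not define MPI_INT.

```fortran
      program type_check
      implicit none
      include 'mpif.h'

      integer ierr, nbytes

      call MPI_INIT( ierr )
! MPI_INTEGER is declared in mpif.h, so this compiles. Writing
! MPI_INT here instead would be an undeclared name, which
! IMPLICIT NONE turns into a compile-time error (assuming
! mpif.h does not define it).
      call MPI_TYPE_SIZE( MPI_INTEGER, nbytes, ierr )
      print *, 'MPI_INTEGER occupies ', nbytes, ' bytes'
      call MPI_FINALIZE( ierr )
      end
```

Without IMPLICIT NONE, the same typo would compile quietly, because a name beginning with "M" is implicitly typed as an integer variable.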

We can see the different effects of these problems by correcting them one at a time. The error messages all provide clues.

As shown above, problem #1 causes the program to hang: PE 0 enters MPI_REDUCE and waits for contributions from the other PEs, which never make the call. To correct this problem, the "if" statement can be commented out, like this:


!!      if(myid.eq.mroot) then
        call MPI_REDUCE(isbuf, irbuf, ndata, MPI_INT, MPI_SUM,
     !    mroot, MPI_COMM_WORLD)
        do i=1,ndata
          write(6,*) ' data on ',myid,' is ',isbuf(i),irbuf(i)
        enddo
!!      endif
With this change, MPI_REDUCE will be called on all processors. This version produces this output:

yukon$ mpprun -n4 ./t1.a.out 
 Process  1  of  4  is alive
 Process  3  of  4  is alive
 Process  0  of  4  is alive
 Process  2  of  4  is alive
-MPI- ERROR FATAL: world rank SIGNAL-MPI- ERROR FATAL: world rank 0, c: 2, comm Operand range erroromm 0
   (0
  In c [0] In call memory management faultall _T3DMPI_op_sum)_T3DMPI_op_sum, [-
, [-490
 Beginning of Traceback (PE 1):
490, class   Interrupt at address 0x80017175c in routine 'MPI_REDUCE'.
, class MPI_ERR_OP  Called from line 36 (address 0x8000014b8) in routine 'MPI_QUIZ'.
MPI_ERR_OP],
  [  Called from line 475 (address 0x800000c98) in routine '$START$'.
],
  [Invalid op End of Traceback.
Invalid op] MP] MPOperand range error(coredump)
Correcting problem #1 has exposed problem #2. To correct problem #2, we replace MPI_INT with MPI_INTEGER, like this:

        call MPI_REDUCE(isbuf, irbuf, ndata, MPI_INTEGER, MPI_SUM,
     !    mroot, MPI_COMM_WORLD)
The code is recompiled and rerun:

yukon$ mpprun -n4 ./t2.a.out
 Process  3  of  4  is alive
 Process  1  of  4  is alive
 Process  2  of  4  is alive
 Process  0  of  4  is alive
SIGNAL: Operand range error ( [0] memory management fault)

 Beginning of Traceback (PE 3):
  Interrupt at address 0x80017175c in routine 'MPI_REDUCE'.
  Called from line 36 (address 0x8000014c0) in routine 'MPI_QUIZ'.
  Called from line 475 (address 0x800000c98) in routine '$START$'.
 End of Traceback.
Operand range error(coredump)
Correcting problem #2 has exposed problem #3. We add the IERR parameter to the MPI_REDUCE call:

        call MPI_REDUCE(isbuf, irbuf, ndata, MPI_INTEGER, MPI_SUM,
     !    mroot, MPI_COMM_WORLD, ierr)
Again, the code is recompiled and rerun:

yukon$ mpprun -n4 ./t3.a.out
 Process  3  of  4  is alive
 Process  0  of  4  is alive
 Process  1  of  4  is alive
 Process  2  of  4  is alive
  data on  1  is  100,  0
  data on  3  is  100,  0
  data on  2  is  100,  0

[... snipped ...]

  data on  3  is  1000,  0
  data on  0  is  1000,  4000
  data on  2  is  1000,  0
 STOP (PE 2)   executed at line 44 in Fortran routine 'MPI_QUIZ'
 STOP (PE 0)   executed at line 44 in Fortran routine 'MPI_QUIZ'
 STOP (PE 3)   executed at line 44 in Fortran routine 'MPI_QUIZ'
 STOP (PE 1)   executed at line 44 in Fortran routine 'MPI_QUIZ'
This seems to be the final correction, but let us know if you spot anything else.
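
For reference, the three corrections taken together might look like the sketch below. It is not the exact source we ran: the program name mpi_quiz_fixed, the IMPLICIT NONE statement, the test of ierr, and the "if" restored around the print loop (so that only the root prints the summed result) are our additions for illustration.

```fortran
      program mpi_quiz_fixed
      implicit none
      include 'mpif.h'

      integer idata
      parameter (idata=1000)

! code data
      integer irbuf(idata), isbuf(idata)
      integer i, ndata

! MPI data
      integer myid, numprocs, ierr
      integer mroot

      call MPI_INIT( ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )

      irbuf=0
      ndata=10
      mroot=0

      do i=1,ndata
        isbuf(i)=i*100
      enddo

! Fix 1: the collective call is made by every PE, not just root.
! Fix 2: MPI_INTEGER is the Fortran name for the data type.
! Fix 3: ierr is passed as the final argument.
      call MPI_REDUCE(isbuf, irbuf, ndata, MPI_INTEGER, MPI_SUM,
     &    mroot, MPI_COMM_WORLD, ierr)
      if (ierr.ne.MPI_SUCCESS) then
        print *, 'MPI_REDUCE failed on ', myid, ' ierr = ', ierr
      endif

! Only the root receives the sums, so only the root prints them.
      if (myid.eq.mroot) then
        do i=1,ndata
          write(6,*) ' data on ',myid,' is ',isbuf(i),irbuf(i)
        enddo
      endif

      call MPI_FINALIZE(ierr)
      end
```

Run on four PEs, the last line printed by PE 0 should show 1000 and 4000, as in the run above. Since MPI errors are fatal by default, the test of ierr is mostly a reminder that the argument is there to be checked.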

Quick-Tip Q & A



A:{{ "I can't login! I keep on trying... The Kerberos server
   accepts my 'kerberos password,' asks for my 'card-code,' which 
   I enter, but then it says:

       
       Enter Next Token:

   I enter my SecurID PIN into my SecurID card (AGAIN), type the 'next
   token' which appears on the card, but it doesn't work!"

   (What should this person do?)  }}



For politically correct brevity, let this person be a "he."

Because he's had too many failed logins, the SecurID Server has put
this user's account into "next token mode."

At the "Enter Next Token:" prompt, he should NOT reenter the PIN into
the card to obtain a new passcode.  Instead, he should wait for the
passcode displayed on the card to change naturally (this will take at
least 60 seconds).  The new passcode is the "next token" requested.  He
should enter it at the "Enter Next Token:" prompt.

If this attempt fails, the user should call his computer center's help
desk.  Possible solutions:

  o The consultant can cancel "next token mode."

  o If the card has lost synchronization, the consultant can try to
    resync it.

  o If the card's battery is fading, or it cannot be resynchronized,
    the consultant can provide a temporary login mechanism and ship a
    new card.





Q: What's the best restaurant in Minneapolis and why?  

   (CUG '99 is in Minneapolis next week.  Hungry attendees may seek out
   Guy or Tom for preliminary responses to this question.  Satisfied
   attendees are welcome to submit opinions.)

[ Answers, questions, and tips graciously accepted. ]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.