ARSC HPC Users' Newsletter 243, April 12, 2002

Etnus Totalview 5.0 Installed on Icehawk

ARSC has just installed the Etnus Totalview 5.0 parallel debugger on its IBM SP Cluster, icehawk.

To use it, add this:

/usr/local/adm/pkg/flexlm/license.dat

to the settings of your LM_LICENSE_FILE environment variable. Recompile with the -g compiler option, and launch totalview against the resulting executable. E.g.,

icehawk1$ totalview ./a.out
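
For sh/ksh users, the environment setup above might look like the following sketch. It assumes you want to append to any existing LM_LICENSE_FILE value rather than overwrite it (FlexLM accepts a colon-separated list on UNIX). The variable name ARSC_LICENSE is only a placeholder for illustration.

```shell
# License path taken from the text above.
ARSC_LICENSE=/usr/local/adm/pkg/flexlm/license.dat

# Append to an existing setting instead of clobbering it;
# entries in LM_LICENSE_FILE are separated by colons.
if [ -n "${LM_LICENSE_FILE:-}" ]; then
    LM_LICENSE_FILE="${LM_LICENSE_FILE}:${ARSC_LICENSE}"
else
    LM_LICENSE_FILE="${ARSC_LICENSE}"
fi
export LM_LICENSE_FILE
```

Placing these lines in your shell startup file makes the setting persist across logins.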

Totalview documentation is built in. Click on "Help".

(When the bugs have been exterminated and you start production runs, be sure to recompile WITHOUT the -g option, which disables optimizations and kills performance.)

Upcoming Visitors, Talks, and Events

Visitors:

ARSC is hosting several visitors in the next couple of weeks to participate in evaluation of large-scale storage systems. They include Robert Bell (Bureau of Meteorology / CSIRO), David McGee (NAVO DSRC), and Roy Campbell (ERDC DSRC).

The visitors will all be giving presentations open to the wider UAF community.

Talks:

Title: "Our NEC of the Woods: Oz, CSIRO, HPCCC and NEC"
Date: Thursday 25th April 2002
Speaker: Dr. Robert Bell

Abstract:

After a brief introduction to Australia and CSIRO, which is the largest research agency in Australia, this talk will give some background on the computing history of CSIRO. This will lead on to the formation of the Bureau of Meteorology / CSIRO High Performance Computing and Communications Centre (HPCCC).

The HPCCC acquired its first system, an NEC SX-4, in August 1997, and now has a core of two NEC SX-5 systems with associated storage and front-end systems.

The talk will describe the HPCCC core and associated systems, and some of the applications which use the HPCCC facilities.

Biographical Note:

Dr. Robert Bell is Deputy Manager of the HPCCC. He has worked for CSIRO since 1974, firstly in numerical modeling for atmospheric research, and since 1987 in the management of computing facilities for scientific research.

He is Asia/Pacific Representative on the Cray User Group Board of Directors.

As soon as it's available, the full schedule of talks will be posted in the "Hot Topics" section of http://www.arsc.edu.

Events:
ARSC Faculty Camp:
Expression of interest is required by May 1st to participate in ARSC's 2002 "Faculty Camp." See: http://www.arsc.edu/pubs/bulletins/FacultyCamp2002.shtml
WOMPAT 2002 at ARSC/UAF, Aug 5-7:
Student Employment Opportunities:
Check our "vacancies" page, http://www.arsc.edu/misc/jobs.html . Within a week or so, we expect to announce summer/fall openings for student positions.
ARSC Summer Tours:
ARSC welcomes tourists and other drop-in visitors at 1:00 pm every Wednesday, June 12 - Aug 28, for a one-hour tour. Just show up at the ARSC machine-room viewing window in the basement of the Butrovich Building on UAF's West Ridge.

SV1 Craylib Problem Isolated

In issue #224, we noted this:

    > Unresolved issues in two SV1 user codes were cleared up recently when
    > the users switched back from the default craylibs version to craylibs
    > 3.3.0.2. It is suspected that this is an issue with the FFT routines,
    > but investigation is ongoing.
    >
    > If you feel the need to try this, you should use the command:
    >
    >     module switch craylib.3.3.0.2

Through a lot of difficult trouble-shooting on one of the two user codes, Tom Logan of ARSC was able to narrow the problem down to CTRSM, a low-level Level 3 BLAS routine (complex triangular solve) that is distributed with LAPACK. The code reaches CTRSM through a call to the LAPACK routine CHEGST. The problem manifested itself as a failure of the algorithm to converge when run on 1 CPU, while runs on multiple CPUs gave correct and repeatable results.

It turns out that Cray already had a problem report open on CTRSM; a fix is in testing and will be integrated in a future release of craylib. The Cray SPR gives a more precise statement of the problem: "For the argument N > 64 and odd, the libsci_sv1 version of CTRSM gives wrong answers."

For now, you can download the netlib reference Fortran source, ctrsm.f, and add it to your compilation. ARSC users can contact consult@arsc.edu for this Fortran file. Linking your own ctrsm.o file will preempt the libsci routine of the same name and thus allow you to safely use everything else from the latest installation of craylib (which, on chilkoot, is release 3.5.0.1).
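
A minimal sketch of that workaround follows. It assumes the Cray f90 compiler driver and a hypothetical main program, myprog.f. The function name, file names, and FC variable are placeholders, not ARSC-supplied commands. Object files named explicitly on the link line are resolved before libraries are searched, which is why the local ctrsm.o takes precedence over the libsci copy.

```shell
# Compiler driver; f90 is an assumption, override FC if yours differs.
FC=${FC:-f90}

# Hypothetical helper: build myprog with a local copy of CTRSM.
build_with_local_ctrsm() {
    # Compile the netlib reference routine into ctrsm.o.
    "$FC" -c ctrsm.f &&
    # Compile your own code (myprog.f is a placeholder name).
    "$FC" -c myprog.f &&
    # List ctrsm.o explicitly so the linker uses it instead of
    # the CTRSM found in the system library.
    "$FC" -o myprog myprog.o ctrsm.o
}

# Run it from the directory holding both source files:
# build_with_local_ctrsm
```

Once a fixed craylib is installed, drop ctrsm.o from the link line to return to the tuned library version.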

As usual, please report problems and mysteries to us (consult@arsc.edu).

UAF Computational Physics Programs

The Sloan Foundation maintains a web page listing a number of Physics and Chemistry Masters programs:

http://www.sciencemasters.com/science.php

It now includes UAF's Computational Physics program:

MS Program in Computational Physics. University of Alaska at Fairbanks.

A new professional masters degree is offered by the Physics Department for students with undergraduate backgrounds in physics or a closely related discipline.

The degree is appropriate for students seeking careers in industry, government, and research that require expertise in modeling and simulation of physical systems. Many department faculty have joint appointments with the Geophysical Institute and International Arctic Research Center and provide a range of interesting computational research projects in the fields of space physics, atmospheric physics, complex system dynamics and turbulence, data analysis techniques, ice-mechanics, and ice-ocean dynamics.

Local access to advanced high-performance computational resources is provided through ready student access to the Arctic Region Supercomputing Center. In addition to courses in physics, mathematics, and numerical methods, special topics courses such as parallel processing techniques are offered.

Contact: Brenton Watkins, Professor of Physics
E-mail: ffbjw@uaf.edu
Web: http://www.uaf.edu/physics/

Quick-Tip Q & A

Bonus! Two answers in one week!

A:[[ SV1 Totalview question from last issue ... How to view entire
  [[ assumed-size arrays, like:
  [[      COMPLEX         Z( LDZ, * )

  # 
  # Thanks to Ed Anderson:
  # 
  To view the full array, dive on one of the variables, such as A.  The
  window shows "Type: COMPLEX(147,1)".  Click on COMPLEX(147,1) (or go
  to the Edit->Type menu), and change the type to COMPLEX(147,147).  The
  data object window should update automatically.  You might find it
  easier to view the array with the menu option Display->Array Browser.
  Unfortunately, the debugger doesn't remember this info when you close
  the data object window.
  # 
  # Editor's Note:
  # 
  # This works on the T3E, fails on the SV1.  It looks like a problem
  # with the SV1 totalview, but is under investigation.
  # 


A:[[ I've been connecting remotely to an SGI Octane2. I use the DISPLAY
  [[ environment variable to export the X Windows display back to my
  [[ personal workstation.
  [[
  [[ For some reason, when I sit down at this SGI and log onto the
  [[ console, the screen flashes, and it immediately logs me off. I'm
  [[ definitely not over-quota, my account is active, and everything works
  [[ perfectly when I connect remotely again.
  [[
  [[ Any ideas what's up?

  # 
  # Thanks to Richard Griswold:
  # 
  I had a similar problem when my home directory was shared between Linux
  and AIX.  Something about the .Xauthority file was different between the
  two OSes, so when I logged in on the console of one system, I had to
  delete the file before I could log in on the console of the other
  system and run X apps.
  
  You could check for this file and try deleting it.
  # 
  # Editor's answer:
  # 
  The specific incident hit an ARSC user because he had put this:
    setenv DISPLAY <HIS_PERSONAL_WORKSTATION>
  in his SGI .cshrc file. When logging directly onto the SGI at the
  console, the IRIX window manager objected to its display being sent
  elsewhere, and bailed.
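
  One way to avoid this is sketched below in POSIX sh (the user's
  .cshrc would need the csh equivalent). It sets DISPLAY only when
  nothing has set it already and the session is remote. The test on
  SSH_CONNECTION (set by OpenSSH for remote logins), the function
  name, and the workstation name are all assumptions for illustration.

```shell
# Hypothetical workstation display; substitute your own.
MY_WORKSTATION="myhost.example.edu:0"

set_remote_display() {
    # If the console login (or an X-forwarding ssh) already set
    # DISPLAY, leave it alone.
    if [ -n "${DISPLAY:-}" ]; then
        return 0
    fi
    # Redirect X output only for remote sessions.
    # Assumption: OpenSSH sets SSH_CONNECTION on remote logins.
    if [ -n "${SSH_CONNECTION:-}" ]; then
        DISPLAY="$MY_WORKSTATION"
        export DISPLAY
    fi
}

set_remote_display
```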

Q: Am I going nuts? 
    $ ls -d DATA
      DATA
    $ ls DATA
      D2001  index.txt
    $ cd DATA
      sh-56 ksh: DATA: not found.
  
   Why is it doing this to me?

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.