ARSC HPC Users' Newsletter Number 423 2013-07-17

The Arctic Region Supercomputing Center Users' Newsletter provides a platform for discourse relevant to users of high performance computing systems. Topics include: programming, commands, tools, applications, and more.

Please send items, ideas, announcements, etc. to owner-hpc_users@arsc.edu.

To subscribe or manage your subscription, visit:

ARSC's 20th Year

By: Greg Newby, ARSC Director

The Arctic Region Supercomputing Center was initiated in 1992, and first opened its electronic doors in 1993. This year, we are marking 20 years of providing supercomputing cycles and support for a broad range of scientists, engineers, students, and others.

Feedback on git article from Issue #422

Issue #422 featured Kate Hedström's article, "Getting into and out of trouble with git." We received this additional input from Jed Brown:

From the article:

9. git add Apps/Arctic
10. git merge <sha1 from fixes>

Oops! A git status shows that many things changed - all those private
updates came along for the ride (and introduced a few conflicts). I
wanted a cherry-pick, not a merge. No big deal:

11. git reset --hard
12. git cherry-pick <sha1 from fixes>

[...]

13. git fsck
14. git show <sha1 from a dangling blob>
15. git show <sha1 from a dangling blob> > Apps/Arctic/filename

Jed wrote:

FWIW, if you had committed the merge, then the full state would be in the reflog, in which case you could recover without this manual process. Alternatively, using 'git reset' (or 'git merge --abort') would have left the tree intact. Or if you caught it earlier, 'git stash' is handy.

Compiling Large Dynamically Allocated Arrays in PGI Fortran

By: Ed Kornkven, HPC Specialist

Fortran programmers know that they can avoid some of the headaches of large arrays by allocating them dynamically instead of declaring them statically. Dynamically allocated arrays are created with the ALLOCATE statement, and their storage comes from the program's heap at run time. As an example, here is a simple Fortran program that creates some arrays using ALLOCATE. This example is adapted from the PGI Compiler User's Guide, release 13.

program mat_allo
    integer i, j
    integer size, m, n
    parameter (size=1291)
    parameter (m=size,n=size)
    double precision, allocatable::a(:,:),b(:,:),c(:,:,:)
    allocate(a(m,n), b(m,n), c(m,n,m))    ! c alone has 1291**3 = 2,151,685,171 elements, more than 2**31
    do i = 100, m, 1
        do j = 100, n, 1
            a(i,j) = 10000.0D0 * dble(i) + dble(j)
            b(i,j) = 20000.0D0 * dble(i) + dble(j)
        enddo
    enddo
    call mat_add_3d(a,b,c,m,n)
    print *, "M =",m,",N =",n
    print *, "c(M,N,M) = ", c(m,n,m)
end

subroutine mat_add_3d(a,b,c,m,n)
    integer m, n, i, j, k
    double precision a(m,n),b(m,n),c(m,n,m)
    !$omp do
    do i = 1, m
        do j = 1, n
            do k = 1, m
                c(i,j,k) = a(i,j) + b(i,j)
            enddo
        enddo
    enddo
    return
end

There is a pitfall with this program when it is compiled by the PGI compiler (and perhaps others). The problem is that the array C has more than 2G elements: 1291*1291*1291 = 2,151,685,171, which exceeds 2^31 = 2,147,483,648. Note that the problem is not the size of the array in bytes; it is the number of elements. In order to index into an array, the compiler computes offsets, and the offsets for C exceed the largest value a signed 32-bit integer can hold (2^31 - 1, since one bit is reserved for the sign). The upshot is that the compiler needs to generate 64-bit offsets. This is requested in PGI Fortran with the "-Mlarge_arrays" compiler flag. If we are going to have 64-bit (8-byte) offsets, we may also want our index variables to be 8 bytes long, though in this example it isn't necessary; 8-byte default integers are requested with the "-i8" flag. Our compilation command, then, could be something like:

pgfortran -o mat_allo mat_allo.f90 -i8 -Mlarge_arrays

A spot check on Copper, a Cray XE, reveals that this is not an issue when compiling with Cray's crayftn, GNU's gfortran, or Intel's ifort; those compilers apparently handle array offsets exceeding 2G automatically. When compiled with the PGI compiler, however, the program generates a "Memory fault" which a debugger traces to the assignment to c(i,j,k). A pretty obscure bug, but easily avoided by using the -Mlarge_arrays option.
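For readers who want to check the arithmetic themselves, here is a minimal, self-contained sketch (my own, not part of the PGI example) that compares the element count of C with the largest signed 32-bit integer. It assumes a compiler that supports the Fortran 2008 ISO_FORTRAN_ENV kind constants; if yours does not, selected_int_kind(18) works just as well for the 64-bit kind.

program check_offsets
    use iso_fortran_env, only: int32, int64
    implicit none
    integer(int64), parameter :: n = 1291_int64
    integer(int64) :: nelems
    ! Compute the element count of c(n,n,n) in 64-bit arithmetic so the
    ! product itself cannot overflow.
    nelems = n * n * n
    print *, "elements in c:         ", nelems
    print *, "largest 32-bit integer:", huge(1_int32)
    if (nelems > huge(1_int32)) then
        print *, "32-bit offsets would overflow; use -Mlarge_arrays with PGI"
    end if
end program check_offsets

Running it prints an element count of 2,151,685,171 against a 32-bit maximum of 2,147,483,647, which is exactly the overflow that -Mlarge_arrays works around.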

Quick-Tip Q & A

Last time, we asked:

Q: I need to copy some large files between computers. What is the fastest way to do that?

A: Martin Lüthi suggests rsync:

rsync is the standard tool for this. It copies only files that have changed, and it copies large files chunk-wise, so only the changed chunks are transferred, which speeds up the process.

Convenient command line options are:

-z: use compression
-v: verbose mode (visual feedback)
-r: recursive mode (copy whole directories)

rsync -rvz user@remote.host.uaf.edu:/scratch/user/file /home/user/results

A: Dan Stahlke also advocates rsync, and provides some additional usage information:

I am a fan of rsync over ssh. The advantage of rsync is that you can resume an aborted transfer, although if it was in the middle of sending a large file it still needs to start over with that file. The command is:

rsync -av -e 'ssh -c arcfour' source_directory hostname:target_directory

This will transfer, for example, "source_directory/xyz.dat" to "target_directory/source_directory/xyz.dat". If you put a slash at the end of the source path ("source_directory/"), it would instead transfer "source_directory/xyz.dat" to "target_directory/xyz.dat". You don't need to transfer entire directories; you can also copy individual files.

The "-a" option tells it to recurse into subdirectories and preserve all attributes. "-v" is of course for verbosity. The "-e 'ssh -c arcfour'" is optional; it tells ssh to use the arcfour cipher (RC4) which is weaker than the default, but still pretty secure, and requires much less CPU usage. If you pass the "--delete" option, then files that exist at the target but not the source are deleted. Use that with extreme caution! When using "--delete" you should always also pass the "--dry-run" option which causes rsync to do nothing but to tell you what it would have done.

Rsync is a powerful command that can be used for many things, and it is worth looking at the man page and searching online for blog posts. I use it to make incremental backups of my home computer (kind of like Time Machine on Mac OS), to push changes to my web server, and to copy music and documents to my phone.

By the way, if you are on a modern system and using bash as your shell, then you should have tab completion for the remote machine. So, if you are typing the above command, and for the last part you type "hostname:/home/me/" and hit tab twice you should get a list of subdirectories on the remote machine (although there may be some delay).

Another option to always keep in mind, if you have a truly huge amount of data and want the absolute fastest transfer possible, is to just take the hard drive out of the one machine and put it in the other. If your network connection is 100 Mbit or less this will be much faster. With a 1 Gbit connection, just use the network.

A: Jed Brown offers some perspective:

The fastest way is not to copy them.

The definition of "large files" is eternally in flux, but if you are even considering moving it, then it's not big at all. For example, a few terabytes is a lot to move between geographically-distant sites, but the larger supercomputers can hold more than 1 PB in memory at any given time, can rewrite the contents of memory in about a second, can write the contents of memory to local disk in about an hour, and would take weeks to transfer to a different data center. On high-end machines (this applies to anyone doing science because if you have a good justification, you can get an allocation on those machines), "big data" starts when it no longer fits in memory---more than a petabyte. If you have petabytes of data, transferring it between computers is too expensive for most projects, so you will need to process it in-situ.

One exception is if there are so many people trying to analyze the data that the analysis must be distributed to other machines. Data from the Large Hadron Collider is one such example, with thousands of physicists around the world analyzing about 25 PB/year of data, but that project has a budget in the billions.

[There are many options for moving intermediate-sized files (as we've seen above, "large" files almost never move), depending on the environment, scale, and structure of the data, but I suspect others will suggest some of those tools.]

Here is a new question for next month's newsletter.

Q: Do you have a useful custom Unix/Linux shell prompt? If so, please share the code you use to produce your prompt, and mention which shell you use (e.g., an entry for .bashrc, .cshrc, etc.).


[[ Answers, Questions, and Tips Graciously Accepted ]]


About ARSC

The Arctic Region Supercomputing Center (ARSC) provides research and high performance computing, large-scale storage, and related services to the students, faculty and staff of the University of Alaska.

Editor

Greg Newby, ARSC Director, gbnewby@alaska.edu, 907-450-8663

Publication Schedule

Monthly, on approximately the third Wednesday.

Subscription Information

Subscribing and unsubscribing:

Quick tip answers and other correspondence:

Back Issues are Available
