Introduction to the CLSP Grid

Grid intro

We assume you have a username and password. If you do not have one, you can obtain one by emailing clsphelp at clsp.jhu.edu

The default shell we give to new users is bash. Our machines run Debian Linux. The "a" machines (a01 through a18) each have many CPUs and a lot of memory (typically around 100G). The "b" machines (b01 through b19) additionally have GPUs.

Accessing the grid (login nodes)

To access our cluster you should ssh to login.clsp.jhu.edu or login2.clsp.jhu.edu. For basic help and advanced tricks on using ssh and running jobs remotely see Remote Access.

When you first log in you should change your password from the one our administrator sent you; you can do this with the command yppasswd, and it will work from any machine. Also, please run ypchfn, put your email address in the "office" field, and mention someone you are working with.
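
For example (both commands prompt interactively, so no arguments are needed; the exact prompts may vary slightly):

 yppasswd   ## prompts for your old password, then your new one; the change applies grid-wide
 ypchfn     ## prompts for contact fields; put your email in "office" and mention a collaborator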

Accessing other grid nodes

To make it easier to get around the grid you can run the following commands after you have logged in:

 ssh-keygen  ## just press enter a couple of times at the prompts
 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys  ## authorize your own key so other nodes accept it

This will enable you to ssh to any node without entering your password, for instance:

 ssh a05

It is better to do things like compilation and interactive work such as editing on a randomly chosen node rather than on login or login2; this reduces load on the login machines. You are encouraged to learn to use the program screen so that your work doesn't get interrupted if you lose your ssh connection. (Run screen from a random node, e.g. a05, and never run screen inside a `qlogin` session or from `login` or `login2`).
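
A minimal sketch of that workflow (the node name a05 and session name mywork are just examples):

 ssh a05              ## work on a randomly chosen node, not login/login2
 screen -S mywork     ## start a named screen session on that node
 ## ... edit, compile, etc.; detach with Ctrl-a d if you like ...
 screen -r mywork     ## reattach later, e.g. after your ssh connection dropped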

In the following section we detail the grid usage policy; we then discuss general grid usage (SGE) and more application-specific information.

Resource usage policy

The grid is a shared resource. Please follow best practices to ensure efficient and fair use. Examples of bad things you should not do, and which will prompt us to email you, are:

  • Filling up disks (see Storage)
  • Causing excessive disk load by running too many I/O heavy jobs at once
  • Using up too much memory without a corresponding request to qsub (e.g. "-l mem_free=10G,ram_free=10G"; see the sketch after this list)
  • Running "too many" jobs. An example of "too many" jobs is
    • Submitting a parallel job with 1000 concurrent parts
    • Running 100 jobs that take a week to finish and are not part of some important government contract that is paying for our grid. (see Parallel jobs)
  • Running more than a handful of processes outside GridEngine (not through qsub)
  • Copying data onto or off of the grid with excessive bandwidth
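
For reference, a rough sketch of requesting memory at submission time (my_job.sh is just a placeholder for your own script):

 qsub -l mem_free=10G,ram_free=10G my_job.sh   ## tell GridEngine you need ~10G so it schedules the job on a machine with enough free memory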

You *can* do the following:

  • ssh directly to a node (e.g. ssh a10) and run a reasonable number of interactive jobs there, e.g. compilation and debugging.

The main rule you should be aware of regarding storage is: don't cause too much I/O to bdc01, as this will make the whole cluster unresponsive. bdc01 is the disk server where most home directories are located (type df ~ to find out where yours is). We ask that you limit the size of your home directory to 50G.
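
A quick sketch of the relevant checks (both are standard commands and can be run from any node):

 df ~       ## shows which server/filesystem holds your home directory
 du -sh ~   ## shows the total size of your home directory (please keep it under 50G)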

We don't have individual disk-usage limits on the "a" disks, but we do keep track of it and we might start to notice if you use more than a few hundred gigabytes.

Please never use a GPU (i.e. never run a process that will use a GPU) UNLESS:

  • you have reserved a GPU through GridEngine, e.g. by doing qlogin -l gpu=1 -l h_rt=8:00:00 -now no and running the process in that shell (in which case log out promptly when done), or
  • you are running the process in a script which you submit to the queue with -l gpu=1 (see the sketch after this list).
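
As a sketch, the two sanctioned patterns look roughly like this (train.sh is just a placeholder for your own script):

 ## interactive: reserve one GPU for up to 8 hours, then run your process in that shell
 qlogin -l gpu=1 -l h_rt=8:00:00 -now no
 ## batch: reserve one GPU for a queued job script
 qsub -l gpu=1 train.sh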

If you use a GPU without reservation, you will kill people's jobs and cause a large amount of wasted time.

Please never run 'screen' (or 'tmux') from inside a `qlogin` session; it disrupts the ability of GridEngine to manage your session. If you want to run 'screen', ssh to a random node (e.g. a03), run it there, and you can run `qlogin` *from* a screen session. Just not the other way round.

We allow all our collaborators to have an account, but for people outside CLSP, if we feel you are using too many resources we may limit the number of jobs you can run, or ask you to limit your disk usage or otherwise do something differently.

Storage and I/O

When you run experiments, create a directory for yourself on some other disk such as /export/a01 through /export/a14, except /export/a03 (which doesn't exist) and a06 and a07 (which are reserved). The 'b' and 'c' nodes also have /export/ directories you can use. The directory name should be the same as your username; for example, jbloggs would create a directory with

  mkdir /export/a11/jbloggs

You can create such directories on multiple disks if you need.

Most "a", "b", and "c", machines have a large disk in /mnt/data which is also exported with names such as /export/a10. With the exception of a06 (which is on a08:/mnt/data2) and a07 (which is on a12:/mnt/data2), you can refer to the directory as either <node>:/mnt/data or /export/<node> in your scripts (you can verify the location of an export directory , i.e: ypcat -k auto.export | grep a10). All machines have a space in /tmp which you can, in principle, use for local scratch space, but be careful not to fill up /tmp space.

Avoid disks that are already nearly full, since filling up a disk can crash the machine; run diskinfo to find out which disks to use.
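
For example (diskinfo is the local helper script mentioned above, and a10 is just an example disk name):

 diskinfo                          ## shows which /export disks have space and which to avoid
 ypcat -k auto.export | grep a10   ## shows which machine and path /export/a10 actually maps to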

See File Transfer for more information about transferring files to/from the CLSP grid.

Heavy I/O use

If you need to do very I/O heavy jobs, that is, jobs that do a lot of reading from and writing to disk (particularly continuous access to disk, where the total I/O of the job would be in the tens of gigabytes or greater), consider running the jobs on the same machine where data is stored, so that you don't use all of the bandwidth of the host and server machines. (A lot of people move large data across filesystems by directly calling mv; that's very bad. Use rsync --bwlimit=2000 to avoid taking all the bandwidth.)
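
A rough sketch of a bandwidth-limited copy (the source and destination paths are placeholders; rsync's --bwlimit is in KB/s, so 2000 is roughly 2 MB/s):

 rsync -av --bwlimit=2000 /export/a11/jbloggs/data/ /export/b02/jbloggs/data/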

Memory mapping is one of the more antisocial things you can do on the grid, and is rarely a good idea, especially for large files that are located on NFS shares. The reason is that memory mapping a large region over an NFS share forces the transfer of the entire file in a single kernel call, which can hang the machine. The Linux kernel is very complicated and we don't fully understand why this happens, but it does; please don't do it. Memory mapping of a file that is stored locally *might* be OK, but it can also cause bad things. One issue is that while a process is memory-mapping a file, whether local or over NFS, it goes into a 'disk sleep' state (`D` when you view the process with `ps`); attempts to acquire process-related information or otherwise interact with the job will then hang, which can in turn cause programs like `top` to hang. If at all possible, please avoid doing this.

For jobs that require significant grid-external network resources, including jobs that copy large files off or onto the grid, see File Transfer. The pipe we have between the grid and outside is quite small (one gigabit), so you have to be more careful with file transfer to/from outside.

Backups

The directories /export/aXX, /export/bXX, /export/cXX (and most other /export/ directories) are not backed up. They are RAIDed, which reduces the chance of failure, but they sometimes do die. If you have code or scripts which you can't afford to lose, it's your responsibility to back them up. If you put things on bdc01 (i.e. your home directory, if located there), they will be backed up as long as they are under 50GB, but don't run experiments from your home directory. Personally (Dan Povey) I use git version control for all my important files, which I host on GitHub. This can also work with local repositories, for example ones hosted in your home directory. You can alternatively periodically copy your code or scripts to a disk physically hosted on a different machine.
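
A rough sketch of the local-repository approach (the repository location, remote name, and project name are placeholders; your default branch may be main rather than master):

 ## one-time setup: a bare repository in your (backed-up) home directory
 git init --bare ~/repos/myproject.git
 ## from your working copy on an /export disk, add it as a remote and push to it
 cd /export/a11/jbloggs/myproject
 git remote add backup ~/repos/myproject.git
 git push backup master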

Running (CPU and GPU) jobs

Information about running jobs on the grid is split into general and GPU-specific information. To learn about running GPU jobs you first need to learn about running general (CPU) jobs:

  1. A gentle introduction to sge and qsub
  2. GPUs on the grid

File transfer

See File Transfer for information about transferring files to and from the grid.

(Programming) language details

The following pages contain important information about using specific programming languages on the grid:

Printing

See Printing for more information about using the printers at CLSP.

Wireless networks

See wireless for information about connecting to wireless networks.

Software licenses

See the JHU WSE software page for information about obtaining licenses for commercial software not already provided on the grid. (You must be connected to the campus network or via VPN to access this page.)

Mail client configuration (faculty/staff)

See the JHU WSE Exchange user instructions for help on configuring your mail client (faculty and staff only).