HPC novice

Logging in

Overview

Teaching: 45 min
Exercises: 5 min
Questions
  • What is a super computer?

  • Where is a super computer?

  • How do I connect to it?

  • How do I transfer data to and from it?

Objectives
  • Use ssh to open an interactive shell on a cluster.

  • Inspect a directory with ls.

  • Transfer a file to the cluster.

  • Transfer files/folders from the cluster to your local machine.

Throughout this material, we will assist Lola Curious and look over her shoulder as she starts working at the Institute of Things, a side job to earn some extra money during her studies.

On the first day, her supervisor greets her warmly and welcomes her to the job. He explains what her task is and tells her that she will need to use the HPC cluster on the campus. Lola has so far used her laptop at home for her studies, so the idea of using a super computer appears a bit intimidating to her. Her supervisor notices her anxiety and tells her that she will receive an introduction to the super computer after she has requested an account on the cluster. The word super computer, however, does not put Lola at ease.

Lola walks to the IT department and finishes the paperwork to get an account. One of the admins, called Rob, promises to sit down with her in the morning to show her the way around the machine. As Lola expected, the institute does not own a super computer. Rob explains that Lola will use a small to mid-range HPC cluster.

A Super Computer?

Generally, the term super computer refers to the world’s fastest machines available, irrespective of their design, with the one limitation that they need to be general purpose. Smaller machines of similar design are commonly referred to as High Performance Computing (HPC) farms, batch farms, HPC clusters, etc. A list of the fastest super computers is available on top500.org.

First of all, Rob asks Lola to connect to the cluster. To do so, she opens a terminal on her laptop and types in the following command:

$ ssh lola@cray-1

Logging in

If you work through this material on your own, be sure to replace lola with the username that has been assigned to you on cray-1. When you hit enter, a prompt like this might appear:

lola@cray-1's password:

Now is your chance to type in your password. But watch out: the characters you type are not displayed on the screen.

Last login: Tue Mar 14 14:13:14 2017 from lolas_laptop
-bash-4.1$ 

Rob explains to Lola that she is using the secure shell, or ssh. This establishes a temporary encrypted connection between Lola’s laptop and cray-1. The word before the @ symbol, lola in this case, is the user account name that Lola has access permissions for on the cluster.
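To make the user@machine form more concrete, here is a minimal shell sketch that splits the login target from this lesson into its two parts. The variable names target, user and host are made up for illustration; nothing here connects to a cluster.

```shell
#!/bin/sh
# The login target used in this lesson.
target=lola@cray-1

# Everything before the @ is the user account name ...
user=${target%%@*}
# ... and everything after it is the machine to connect to.
host=${target#*@}

echo "user: $user"
echo "host: $host"
```

If you use a different account or machine, only the value of target changes.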

Where do I get this ssh from?

On Linux and/or macOS, the ssh command line utility is typically pre-installed. Just open a terminal and you are good to go. At the time of writing, OpenSSH support on Microsoft Windows is still fairly recent. Alternatives are PuTTY, Bitvise SSH or mRemoteNG. Download one of them, install it and open its GUI. These tools typically ask for your user name and the destination address or IP. Once provided, you will be prompted for your password just like in the example above.

Rob tells her to use a UNIX command called ls (for list directory contents) to have a look around.

$ ls

Unsurprisingly, there is nothing in there. Rob asks Lola to issue a command to see which machine she is currently on.

$ hostname
cray-1

Lola wonders why you would need a dedicated command to tell you where you are, but Rob explains that he has so many machines under his responsibility that the output of hostname is often very valuable.

Am I in the cloud now?

Not really, sorry. At the time of writing, there are a couple of distinctive features that separate cloud computing from HPC.

Rob explains to Lola that she has to work with this remote shell session in order to run programs on the HPC cluster. Launching programs that open a Graphical User Interface (GUI) is possible, but interaction with the GUI will be slow because everything has to be transferred through the WiFi network her laptop is currently connected to. Before Rob continues, he suggests that Lola leave the cluster node again. For this, she can type logout or exit.

$ logout

He goes on to explain that people typically perform computationally heavy tasks on the cluster and prepare files that contain the results, or a subset of the data, from which the final results are created on the individual’s laptop. So communication to and from the cluster is done mostly by transferring files. As an example, Rob asks Lola to pick a file of her liking and transfer it over. For this, he advises her to use the secure copy command, scp. As before, this establishes a temporary encrypted connection between Lola’s laptop and the cluster, just for the sake of transferring the files. After the transfer has completed, scp closes the connection again.

$ scp todays_canteen_menu.pdf lola@cray-1:todays_canteen_menu.pdf
todays_canteen_menu.pdf                                              100%   28KB  27.6KB/s   00:00

She can now ssh into the cluster again and check whether the file she just uploaded has arrived:

$ ssh lola@cray-1
Last login: Tue Mar 14 14:17:44 2017 from lolas_laptop
-bash-4.1$ ls
todays_canteen_menu.pdf

Great. Now, let’s try the other way around, i.e. downloading a file from the cluster to Lola’s laptop. For this, Lola has to swap the two arguments of the scp command she just issued.

$ scp lola@cray-1:todays_canteen_menu.pdf todays_canteen_menu_downloaded.pdf

Lola notices how the command line changed. First, she has to enter the source (lola@cray-1) then put a : and continue with the path of the file she wants to download. After that, separated by a space, the destination has to be provided, which in this case is a file todays_canteen_menu_downloaded.pdf in the current directory.

todays_canteen_menu.pdf                                                100%   28KB  27.6KB/s   00:00
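The argument order described above (source first, destination second, with the remote side written as user@machine:path) can be sketched as follows. The helper remote_spec is hypothetical and only assembles text; the scp commands are printed rather than executed, so the sketch runs without access to a cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the "user@host:path" form that scp expects
# for a file on a remote machine.
remote_spec() {
    # $1 = user name, $2 = host name, $3 = path on the remote machine
    printf '%s@%s:%s\n' "$1" "$2" "$3"
}

user=lola
host=cray-1

# Upload: local source first, remote destination second.
echo "scp todays_canteen_menu.pdf $(remote_spec "$user" "$host" todays_canteen_menu.pdf)"

# Download: remote source first, local destination second.
echo "scp $(remote_spec "$user" "$host" todays_canteen_menu.pdf) todays_canteen_menu_downloaded.pdf"
```

The two printed lines match the upload and download commands Lola typed; only the position of the remote part changes.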

Lola has a look in the current directory and indeed finds todays_canteen_menu_downloaded.pdf. She opens it with her PDF reader and can tell that it indeed contains the same content as the original. Rob explains that if she had used the same name for the destination, i.e. todays_canteen_menu.pdf, scp would have overwritten her local copy.
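Because scp overwrites an existing destination file without asking, one cautious habit is to check the local name before downloading. The following is a minimal sketch of that idea, not part of scp itself; the helper safe_destination is hypothetical, and the scp command is only printed so that the sketch runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if nothing exists yet under the
# given local name.
safe_destination() {
    if [ -e "$1" ]; then
        echo "refusing to overwrite existing file: $1" >&2
        return 1
    fi
    return 0
}

destination=todays_canteen_menu_downloaded.pdf

if safe_destination "$destination"; then
    # Printed rather than executed; remove the echo to run the transfer.
    echo "scp lola@cray-1:todays_canteen_menu.pdf $destination"
fi
```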

To finish, Rob tells Lola that she can also transfer entire directories. He prepared a temporary directory on the cluster for her under /tmp/this_weeks_canteen_menus. He asks Lola to obtain a copy of the entire directory onto her laptop.

$ scp -r lola@cray-1:/tmp/this_weeks_canteen_menus .
canteen_menu_day_2.pdf                                                 100%   28KB  27.6KB/s   00:00    
canteen_menu_day_3.pdf                                                 100%   28KB  27.6KB/s   00:00    
canteen_menu_day_5.pdf                                                 100%   28KB  27.6KB/s   00:00    
canteen_menu_day_4.pdf                                                 100%   28KB  27.6KB/s   00:00    
canteen_menu_day_1.pdf                                                 100%   28KB  27.6KB/s   00:00

The trailing . is a short-hand to signify the current working directory that Lola calls scp from. When inspecting the current directory, Lola sees the transferred directory:

$ ls
this_weeks_canteen_menus/  todays_canteen_menu_downloaded.pdf  todays_canteen_menu.pdf

A closer look into that directory using the relative path with respect to the current one:

$ ls this_weeks_canteen_menus/

reveals the transferred files.

canteen_menu_day_1.pdf  canteen_menu_day_2.pdf  canteen_menu_day_3.pdf  canteen_menu_day_4.pdf  canteen_menu_day_5.pdf
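If you do not have a cluster at hand, the effect of a recursive copy into . can be tried out locally. The sketch below uses cp -r as a stand-in for scp -r; for this simple case both place the directory, under its own name, into the current directory. All paths are made up for illustration and live in a scratch directory created by mktemp.

```shell
#!/bin/sh
# Local stand-in for the remote side: a scratch directory with two empty files.
scratch=$(mktemp -d)
mkdir -p "$scratch/remote/this_weeks_canteen_menus" "$scratch/laptop"
touch "$scratch/remote/this_weeks_canteen_menus/canteen_menu_day_1.pdf" \
      "$scratch/remote/this_weeks_canteen_menus/canteen_menu_day_2.pdf"

# Change into the "laptop" directory; from here on, . refers to it.
cd "$scratch/laptop" || exit 1

# cp -r plays the role of scp -r here: copy the whole directory into .
cp -r "$scratch/remote/this_weeks_canteen_menus" .

# The directory arrives under its own name inside the current directory.
ls this_weeks_canteen_menus
```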

Rob suggests that Lola consult the man page of scp for further details by calling:

$ man scp

All mixed up

Lola needs to obtain a file called results.data from a remote machine called safe-store-1. This machine is hidden behind the login node cray-1. However, she has somehow mixed up the commands needed to get the file onto her laptop. Help her and rearrange the following commands into the right order!

$ ssh lola@cray-1
$ logout
$ scp lola@cray-1:results.data .
$ scp lola@safe-store-1:results.data .

Solution

$ ssh lola@cray-1
$ scp lola@safe-store-1:results.data .
$ logout
$ scp lola@cray-1:results.data .

Who is hanging around?

The w utility displays a list of logged-in users and what they are currently doing. Use it to check:

  1. that nobody but yourself is logged into your laptop/desktop
  2. that a lot of people use the login node of your cluster cray-1

Where did they go?

Rob has a zip file stored under /tmp/passwords.zip on the login node of the cluster cray-1. He wants to unzip it on his laptop under /important/passwords. How does he do that?

  1. $ ssh rob@cray-1
    $ unzip /tmp/passwords.zip
    
  2. $ scp cray-1@rob:/tmp/passwords.zip .
    $ unzip passwords.zip
    
  3. $ cd /important/passwords
    $ scp rob@cray-1:passwords.zip .
    $ unzip passwords.zip
    
  4. $ cd /important/passwords
    $ scp rob@cray-1:/tmp/passwords.zip .
    $ unzip passwords.zip
    

Solution

  1. No: Rob only unpacks the zip file on the cluster, but does not transfer the unpacked files onto his laptop
  2. No: Rob mixed up the syntax for scp
  3. No: Rob did not specify the correct path of /tmp/passwords.zip on the login node of the cluster cray-1
  4. Yes: you may also use unzip foo.zip -d /somewhere if you want to omit the first command

Key Points