Egor Demidov

Using a remote linux machine: PART 2

This tutorial covers the remaining basics of linux: installing and running software, setting environment variables, and automating tasks with BASH scripts. See part 1 of the tutorial for prerequisites.

Our linux box info
Must be connected to NJIT VPN to access
Hostname: linux.edemidov.com
Port: 22

Installing and running software

When using the terminal, almost every command calls a program that is installed in a system directory. The command name corresponds to the name of the program. For example, the cat command runs the program “cat” installed in /usr/bin/cat, the command wget runs the program “wget” installed in /usr/bin/wget, etc. You do not need to type the full path every time you want to run cat or wget because the terminal is configured to search for executables in the /usr/bin directory. So, every program installed in /usr/bin can be run from anywhere in the system with only its name. Sidenote: a few commands, such as cd or source, are not separate programs. They are built into the BASH shell itself and are called “builtin commands” (or simply “builtins”).
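As a quick way to check the above on your own machine, the sketch below asks the terminal where it finds cat and how it treats cd. The exact path printed for cat may differ between systems (for example /bin/cat instead of /usr/bin/cat).

```shell
# Ask the shell which program file is behind the cat command
cat_path=$(command -v cat)
echo "cat resolves to: $cat_path"

# For a builtin there is no program file, so only the name is printed
cd_lookup=$(command -v cd)
echo "cd resolves to: $cd_lookup"

# 'type' describes in words what kind of command a name refers to
type cat
type cd
```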

If using linux at home, you would likely use the system package manager to install software to /usr/bin. However, writing to /usr/bin has side effects for all users on the system and requires administrator privileges. Thus, on our shared linux box, you will be installing software into some folder under your home directory. I will now show how to create a designated software directory, install something into it, and configure the terminal to search for executables in your custom installation directory.

First, create a folder for software in your home directory. Run:

mkdir ~/Software

The tilde (~) is a shorthand for your home directory and expands to /home/<username>. Change into the newly created directory:

cd ~/Software
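The tilde shorthand is only expanded when it is not inside quotes, which is an easy thing to trip over in scripts. A small sketch of the difference (the variable names expanded and quoted are made up for this illustration):

```shell
# Unquoted, the tilde is replaced with the path of your home directory
expanded=~
echo "unquoted tilde: $expanded"

# Inside quotes, the tilde is kept as a literal character
quoted="~/Software"
echo "quoted tilde:   $quoted"

# $HOME holds the same path and does work inside double quotes
echo "using HOME:     $HOME/Software"
```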

Now, let us install soot-dem here. First, the most recent version needs to be downloaded from version control. For more details, see the version control tutorial.

git clone https://github.com/egor-demidov/soot-dem.git
cd soot-dem
git submodule update --init

Now we will create a build directory in soot-dem and compile the program in it. For more details, see the C/C++ compilation tutorial.

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -G Ninja ..
ninja
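If one of the steps above fails with a “command not found” error, a build tool is probably not installed or not visible to the terminal. A minimal check, assuming that git, cmake, and ninja are the tools of interest (the project may need more than these, such as a C++ compiler):

```shell
# Count how many of the tools used above are missing
missing=0
checked=0
for tool in git cmake ninja; do
    checked=$(( checked + 1 ))
    if command -v "$tool" > /dev/null 2>&1; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: NOT found"
        missing=$(( missing + 1 ))
    fi
done
echo "$missing of $checked tools missing"
```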

If compilation was successful, there should be several executables now present in the build directory. List the files with ls and you should see something like:

afm_breaking_necks
afm_necking_fraction
aggregate_deposition
aggregation
anchored_restructuring
mechanics_testing
restructuring
restructuring_breaking_necks

Each of these files is an executable corresponding to a certain DEM simulation type and each of them can be used as a command in the terminal. For example, to run a restructuring simulation while in the build directory, you could use:

./restructuring <path_to_input_file>

To make restructuring a global command accessible from anywhere on the computer, the terminal needs to be configured to look for executables in the ~/Software/soot-dem/build directory. This can be done by adding this directory to the PATH environment variable. Environment variables are system variables that affect the behavior of the terminal and other programs running in the system. PATH is a special variable used by the terminal: it contains a colon-separated list of directories that are searched, in order, for executables. As you write your own programs, you can define your own environment variables and expect the user to set them. As we will see later, environment variables are very useful for automating tasks.
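To see the PATH lookup in action without touching any real installation, the sketch below creates a throwaway directory with a tiny script in it and adds that directory to PATH. The names demo_dir and hello_demo are invented for this illustration, and the change to PATH lasts only for the current terminal session.

```shell
# Print each directory in PATH on its own line, in search order
echo "$PATH" | tr ':' '\n'

# Create a throwaway directory containing a tiny executable script
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo"\n' > "$demo_dir/hello_demo"
chmod +x "$demo_dir/hello_demo"

# Put the directory at the front of PATH (current session only)
PATH="$demo_dir:$PATH"

# The script can now be called by name alone, from any directory
demo_output=$(hello_demo)
echo "$demo_output"

# Clean up the throwaway directory
rm -r "$demo_dir"
```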

Now, back to adding the soot-dem executables to PATH. There is a special file in the home directory called .bashrc, which is read every time a terminal session is started to initialize the user’s environment variables. You want to add a line to the .bashrc file that adds the current directory (where you built soot-dem) to the front of PATH. Make sure you are still in the build directory, then run:

echo export PATH="$(pwd)":'$PATH' >> ~/.bashrc

Let’s break down what every part of this command does. The command echo simply outputs its arguments back to the terminal. The operator >> captures the output and appends it to a file instead of printing it to the terminal. The file that is being appended to is on the right hand side of operator >>. The line export PATH="$(pwd)":'$PATH' is being appended to ~/.bashrc. If you now read the last line of .bashrc by running:

tail -n 1 ~/.bashrc

The output will be export PATH=/home/<username>/Software/soot-dem/build:$PATH. The $(pwd) part was replaced with the working directory in which the echo command was called, while $PATH, protected by single quotes, was written out literally so that it is evaluated each time .bashrc is read. The line itself starts with the keyword export, which means that the variable PATH that we set should be accessible to the child processes launched from the terminal. PATH=<value> sets the value of PATH to the path with soot-dem executables. Lastly, since we do not want to overwrite paths that were previously stored in the PATH variable, we add :$PATH at the end of the line.
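If you would like to experiment with the quoting before editing your real .bashrc, the same echo command can be pointed at a scratch file. This is only a sketch; scratch_rc is a made-up name for a temporary file.

```shell
# Use a scratch file so that the real ~/.bashrc is left untouched
scratch_rc=$(mktemp)

# Same command as above, but appending to the scratch file
echo export PATH="$(pwd)":'$PATH' >> "$scratch_rc"

# Show what was written: $(pwd) was expanded, $PATH was kept literal
written_line=$(tail -n 1 "$scratch_rc")
echo "$written_line"

# Remove the scratch file
rm "$scratch_rc"
```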

In order for the changes to take effect, you need to re-initialize your terminal by running:

source ~/.bashrc

Now restructuring, aggregate_deposition, afm_necking_fraction, etc. are commands that can be run from anywhere in the system. Go to the folder with example input files under soot-dem/dist/soot-dem/example and run a restructuring simulation:

cd ~/Software/soot-dem/dist/soot-dem/example
restructuring restructuring_input.xml

If installation was successful, the simulation should start running. You do not have to wait for the run to finish. Use the Ctrl+C shortcut on your keyboard to terminate a running program.

The soot-dem suite uses the OMP_NUM_THREADS environment variable to determine the number of threads to use when executing parallelized tasks. If OMP_NUM_THREADS is not set, soot-dem sets the number of threads to the number of logical cores present in the system. Say you want to constrain soot-dem to two threads. Then run it with:

export OMP_NUM_THREADS=2
restructuring restructuring_input.xml

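The above can also be combined into a single command. In this form, the variable is set only for that one run instead of for the rest of the terminal session: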

OMP_NUM_THREADS=2 restructuring restructuring_input.xml
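The difference between the two forms can be checked with a stand-in for the simulation. In the sketch below, sh -c 'echo ...' plays the role of restructuring: it is a child process that simply reports the value of OMP_NUM_THREADS that it received.

```shell
# Make sure the variable starts out unset in this session
unset OMP_NUM_THREADS

# One-off form: the variable exists only in the environment of that command
inline_value=$(OMP_NUM_THREADS=2 sh -c 'echo "$OMP_NUM_THREADS"')
echo "seen by the child process: $inline_value"

# Afterwards the variable is still unset in the session itself
after_inline=${OMP_NUM_THREADS:-unset}
echo "in the session afterwards: $after_inline"

# Exported form: every command started from now on inherits the value
export OMP_NUM_THREADS=2
exported_value=$(sh -c 'echo "$OMP_NUM_THREADS"')
echo "seen by a later child process: $exported_value"
```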

Automation with BASH scripting

The advantage of command line tools, like soot-dem, is that their usage can be easily automated with scripts. A script is a sequence of terminal commands listed in a text file. Scripts can contain control statements, such as for loops and if statements. Let us run a batch of restructuring simulations with soot-dem, which was installed in the previous section.

First, download and extract the archive with input files that was created for this tutorial. Run:

wget https://www.edemidov.com/linux-tutorial.tar.gz
tar -xzvf linux-tutorial.tar.gz
rm linux-tutorial.tar.gz

A directory named “linux-tutorial” should appear. Its structure is as follows:

linux-tutorial
|-- aggregate_bank
|   |-- aggregate_X.xml
|-- restructuring_simulations
    |-- common_all_simulations.xml
    |-- aggregate_X
        |-- common_aggregate.xml
        |-- frac_Y
            |-- restructuring_input.xml

Here X takes the values 0, 1, 2 and Y takes the values 0, 30, 70, 90. This folder structure represents a matrix of simulations: there are three different aggregates and for each aggregate you want to run restructuring simulations with four necking fractions. This adds up to 12 simulations. I am sure you would not want to launch every simulation manually, wait for it to complete, then launch the next one, and so on…
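To make the matrix concrete, the sketch below only prints the relative path of each input file and counts them, without running anything. It uses the simplest form of a for loop, which walks over a list of values; the variable name count is chosen for this illustration.

```shell
# Enumerate every aggregate / necking fraction combination
count=0
for i in 0 1 2; do
    for j in 0 30 70 90; do
        echo "restructuring_simulations/aggregate_$i/frac_$j/restructuring_input.xml"
        count=$(( count + 1 ))
    done
done
echo "Total simulations: $count"
```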

Luckily, we can automate the tedious task of launching every simulation with this simple BASH script:

#!/bin/bash

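# Do not continue if an error is encountered. The pipefail option makes a
# failure of restructuring count as an error even though it is piped into tee
set -eo pipefail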

# Necking fractions used in restructuring simulations
declare -a arr=(0 30 70 90)

# Open the restructuring simulation folder
cd restructuring_simulations

# Iterate over aggregates
for (( i = 0; i < 3; i ++ ))
do
    cd aggregate_$i
    # Iterate over necking fractions
    for j in "${arr[@]}"
    do
        cd frac_$j
        # Output a message to the user
        echo "Processing aggregate_$i, frac_$j"
        # Run restructuring
        restructuring restructuring_input.xml | tee stdout.txt
        cd ..
    done
    cd ..
done

# Return to the original folder
cd ..

Every statement in the script is a terminal command. For loops are the only new feature. They are similar in semantics to for loops in other programming languages. There are also several new commands used in this script. The command set -e stops the script as soon as one of its commands fails, instead of carrying on with the remaining commands. The “pipe” operator | sends the output of one command to the input of another command. The command tee copies its input to two places: it prints it to the terminal and writes it to a text file at the same time. We use tee here so that you can see the output from soot-dem to track its progress in the terminal and at the same time save the output to a file for record keeping.
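The behavior of tee can be tried out in isolation. In the sketch below, printf stands in for a program that prints progress messages, and work_dir is a made-up name for a scratch directory. The second half illustrates a caveat: by default the exit status of a pipeline is that of its last command, so a failure on the left side of a pipe goes unnoticed unless the BASH option pipefail is set.

```shell
# Work in a scratch directory so that nothing in your project is touched
work_dir=$(mktemp -d)

# The two lines appear in the terminal and are written to the file
printf 'step 1\nstep 2\n' | tee "$work_dir/stdout.txt"

# Read back what tee saved
saved_output=$(cat "$work_dir/stdout.txt")

# Exit status of a pipeline whose first command fails. Each case runs in
# a fresh bash process so that the option does not leak into your session
status_default=$(bash -c 'false | tee /dev/null; echo $?')
status_pipefail=$(bash -c 'set -o pipefail; false | tee /dev/null; echo $?')
echo "without pipefail: $status_default, with pipefail: $status_pipefail"

# Clean up
rm -r "$work_dir"
```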

Now, change into the linux-tutorial folder, copy the above script, and save it to a file run-restructuring.sh. Text files can be edited in the terminal with the nano editor. To start editing the file, run:

nano run-restructuring.sh

Paste the contents of the script provided above, press Ctrl+O followed by Enter to save the file, and Ctrl+X to close the nano text editor. Now, we need to make the script executable. Run:

chmod a+x run-restructuring.sh

Finally, the script can be executed with:

./run-restructuring.sh
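To see what chmod changes, the sketch below repeats the same steps on a throwaway script. The names demo_dir and demo.sh are invented for this illustration. In the ls -l output, the letters x in the permission column indicate that the file may be executed.

```shell
# Create a throwaway script; new files are not executable by default
demo_dir=$(mktemp -d)
printf '#!/bin/bash\necho "script ran"\n' > "$demo_dir/demo.sh"

# Check the execute permission before chmod
if [ -x "$demo_dir/demo.sh" ]; then before="yes"; else before="no"; fi
echo "executable before chmod: $before"

# Grant execute permission to all users and inspect the result
chmod a+x "$demo_dir/demo.sh"
ls -l "$demo_dir/demo.sh"

# The script can now be run by giving its path
run_output=$("$demo_dir/demo.sh")
echo "$run_output"

# Clean up
rm -r "$demo_dir"
```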

Running a program in the background

Any program you run from the terminal is by default attached to your terminal session. If you get disconnected, the program will terminate. In order to avoid that and keep the program running until completion, use the screen command when launching it.

screen <command>

You can then use the Ctrl+A followed by Ctrl+D shortcut to detach from the session. After detaching, you can safely exit the session and the program will keep running. To reattach to a detached program and see its output, use:

screen -R

To view all currently running programs, use top (press q to exit top). Note that every running program has a unique identifier called the PID. To stop a program that is running in the background, look up its PID with top and then run:

kill <PID>
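The kill command can be tried safely on a harmless process. In the sketch below, sleep stands in for a long-running simulation. Instead of looking the PID up in top, the sketch reads it from $!, which holds the PID of the most recently started background command.

```shell
# Start a stand-in process in the background and remember its PID
sleep 300 &
sim_pid=$!
echo "started with PID $sim_pid"

# kill -0 sends no signal; it only checks that the process exists
kill -0 "$sim_pid" && echo "still running"

# Ask the process to terminate, then wait for it to finish
kill "$sim_pid"
exit_status=0
wait "$sim_pid" 2>/dev/null || exit_status=$?

# For a process ended by a signal, the status is 128 plus the signal number
echo "exit status after kill: $exit_status"
```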

As the last exercise in this tutorial, execute the run-restructuring.sh script created in the previous part. This time, use the screen command, detach from the session, and log out of the remote machine with the simulation still running.

Copyright © 2024 Egor Demidov. Content on this website is made available under CC BY 4.0 licence unless specified otherwise