Using a remote linux machine: PART 2
This tutorial covers the remaining basics of linux: installing and running software, setting environment variables, and automating tasks with BASH scripts. See part 1 of the tutorial for prerequisites.
Installing and running software
When using the terminal, almost every command calls a program that is installed in a system directory. The command name corresponds to the name of the program. For example, the cat command runs the program “cat” installed in /usr/bin/cat, the wget command runs the program “wget” installed in /usr/bin/wget, etc. You do not need to type the full path every time you want to run cat or wget because the terminal is configured to search for executables in the /usr/bin directory. So, every program installed in /usr/bin can be run from anywhere in the system with only its name.
Sidenote: a few commands, such as cd or source, are not separate programs. They are built into the BASH shell itself and are called “builtin commands” (or simply “builtins”).
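To see the difference between a builtin and an installed program for yourself, you can ask the shell how it resolves a name. The sketch below uses command -v, which prints only the name for a builtin and a full path for a program found on disk. The exact path printed for cat depends on your system.

```shell
# Ask the shell how it would resolve each command name
cd_resolution=$(command -v cd)    # a builtin has no file path, so only its name is printed
cat_resolution=$(command -v cat)  # an installed program resolves to a full path

echo "cd  -> $cd_resolution"
echo "cat -> $cat_resolution"
```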
If you were using linux at home, you would likely use the system package manager to install software to /usr/bin. However, writing to /usr/bin has side effects for all users on the system and requires administrator privileges. Thus, on our shared linux box, you will be installing software into a folder under your home directory. I will now show how to create a designated software directory, install something into it, and configure the terminal to search for executables in your custom installation directory.
First, create a folder for software in your home directory. Run:
mkdir ~/Software
The tilde (~) is a shorthand for your home directory and expands to /home/<username>. Open the newly created directory:
cd ~/Software
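One detail about the tilde that is easy to trip over: the shell only expands it when it is not quoted. A minimal sketch (the variable names expanded and literal are made up for this illustration):

```shell
# Unquoted tilde: the shell replaces it with your home directory
expanded=~/Software
# Quoted tilde: kept as a literal "~" character, which is usually a mistake
literal="~/Software"

echo "$expanded"
echo "$literal"
```

If a path with a tilde ever produces a "No such file or directory" error, check whether it ended up inside quotes.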
Now, let us install soot-dem here. First, the most recent version needs to be downloaded from version control. For more details, see the version control tutorial.
git clone https://github.com/egor-demidov/soot-dem.git
cd soot-dem
git submodule update --init
Now we will create a build directory in soot-dem and compile the program in it. For more details, see the C/C++ compilation tutorial.
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -G Ninja ..
ninja
If compilation was successful, there should now be several executables in the build directory. List the files with ls and you should see something like:
afm_breaking_necks
afm_necking_fraction
aggregate_deposition
aggregation
anchored_restructuring
mechanics_testing
restructuring
restructuring_breaking_necks
Each of these files is an executable corresponding to a certain DEM simulation type, and each of them can be used as a command in the terminal. For example, to run a restructuring simulation while in the build directory, you could use:
./restructuring <path_to_input_file>
To make restructuring a global command accessible from anywhere on the computer, the terminal needs to be configured to look for executables in the ~/Software/soot-dem/build directory. This can be done by adding this directory to the PATH environment variable. Environment variables are system variables that affect the behavior of the terminal and other programs running in the system. PATH is a standard variable that contains a colon-separated list of directories in which the terminal searches for executables. As you write your own programs, you can define your own environment variables and expect the user to set them. As we will see later, environment variables are very useful for automating tasks.
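To get a feel for what PATH currently contains, you can print it one directory per line. This is only an inspection sketch and changes nothing. The variable name path_entries is made up for the example.

```shell
# Print every directory on PATH on its own line, in the order the shell searches them
echo "$PATH" | tr ':' '\n'

# Count the entries (tr -d strips padding that some versions of wc add)
path_entries=$(echo "$PATH" | tr ':' '\n' | wc -l | tr -d ' ')
echo "PATH has $path_entries entries"
```

Directories listed earlier take precedence: if two of them contain a program with the same name, the first one found is the one that runs.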
Now, back to adding the soot-dem executables to PATH. There is a special file in the home directory called .bashrc, which is read every time a terminal session is started to initialize the user’s environment variables. You want to add a line to the .bashrc file that prepends the current directory (where you built soot-dem) to PATH. To do this, make sure you are still in the build directory and run:
echo export PATH="$(pwd)":'$PATH' >> ~/.bashrc
Let’s break down what every part of this command does. The command echo simply outputs its arguments back to the terminal. The operator >> captures the output and appends it to a file instead of printing it to the terminal. The file that is being appended to is on the right-hand side of the operator >>. Here, the line export PATH="$(pwd)":'$PATH' is being appended to ~/.bashrc.
Now read the last line of .bashrc by running:
tail -n 1 ~/.bashrc
The output will be export PATH=/home/<username>/Software/soot-dem/build:$PATH. The $(pwd) statement was evaluated to the working directory in which the echo command was called, while '$PATH' was written out literally because it was enclosed in single quotes. The line itself starts with the keyword export, which means that the variable PATH that we set should be accessible to the child processes launched from the terminal. PATH=<value> sets the value of PATH to the path with soot-dem executables. Lastly, since we do not want to overwrite paths that were previously stored in the PATH variable, we add :$PATH at the end of the line.
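The role of the two kinds of quotes can be checked without touching ~/.bashrc. The sketch below runs the same echo command in a throwaway directory and captures the result in a variable instead of appending it to a file. The names here and line are made up for the example.

```shell
# Work in a throwaway directory so the example does not depend on your setup
cd "$(mktemp -d)"
here=$(pwd)

# Same quoting as in the tutorial command, but captured instead of appended to ~/.bashrc
line=$(echo export PATH="$(pwd)":'$PATH')

# $(pwd) inside double quotes was replaced by the directory;
# $PATH inside single quotes was left as literal text
echo "$line"
```

Keeping $PATH literal matters: it must be expanded later, each time .bashrc is read, not once at the moment the line is written.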
In order for the changes to take effect, you need to re-initialize your terminal by running:
source ~/.bashrc
Now restructuring, aggregate_deposition, afm_necking_fraction, etc. are commands that can be run from anywhere in the system.
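If you want to convince yourself of the mechanism without involving soot-dem, here is a self-contained sketch. It creates a tiny stand-in program called my_tool (a hypothetical name) in a temporary directory, prepends that directory to PATH for the current session only, and then runs the program by name.

```shell
# Hypothetical stand-in for a freshly built program
demo_bin=$(mktemp -d)
printf '#!/bin/sh\necho "hello from my_tool"\n' > "$demo_bin/my_tool"
chmod +x "$demo_bin/my_tool"

# Prepend the directory to PATH for this session only (nothing is written to ~/.bashrc)
export PATH="$demo_bin:$PATH"

# The shell now finds the program by name alone
resolved=$(command -v my_tool)
output=$(my_tool)
echo "$resolved"
echo "$output"
```

The same check works for the real installation: command -v restructuring should print a path inside ~/Software/soot-dem/build.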
Go to the folder with example input files under soot-dem/dist/soot-dem/example and run a restructuring simulation:
cd ~/Software/soot-dem/dist/soot-dem/example
restructuring restructuring_input.xml
If installation was successful, the simulation should start running. You do not have to wait for the run to finish.
Use the Ctrl+C shortcut on your keyboard to terminate a running program.
The soot-dem suite uses the OMP_NUM_THREADS environment variable to determine the number of threads to use when executing parallelized tasks. If OMP_NUM_THREADS is not set, soot-dem sets the number of threads to the number of logical cores present in the system. Say you want to constrain soot-dem to two threads. Then run it with:
export OMP_NUM_THREADS=2
restructuring restructuring_input.xml
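The above can also be combined into a single command, in which case the variable is set only for that one run and not for the rest of the terminal session: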
OMP_NUM_THREADS=2 restructuring restructuring_input.xml
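The difference between the two forms is easy to observe with a throwaway variable. In this sketch DEMO_THREADS is a hypothetical stand-in for OMP_NUM_THREADS, and sh -c plays the role of the program being launched; it simply reports what it sees.

```shell
unset DEMO_THREADS

# Prefix form: the variable exists only for this one command
inline=$(DEMO_THREADS=2 sh -c 'echo "${DEMO_THREADS:-unset}"')

# The next command no longer sees it
after_inline=$(sh -c 'echo "${DEMO_THREADS:-unset}"')

# export form: every command launched afterwards sees the variable
export DEMO_THREADS=4
after_export=$(sh -c 'echo "${DEMO_THREADS:-unset}"')

echo "$inline $after_inline $after_export"
```

The prefix form is convenient for one-off runs, while export is the better fit when a whole batch of simulations should share the same setting.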
Automation with BASH scripting
The advantage of command line tools, like soot-dem, is that their usage can be easily automated with scripts. A script is a sequence of terminal commands listed in a text file. Scripts can contain control statements, such as for loops and if statements. Let us run a batch of restructuring simulations with soot-dem, which was installed in the previous section.
First, download and extract the archive with input files that was created for this tutorial. Run:
wget https://www.edemidov.com/linux-tutorial.tar.gz
tar -xzvf linux-tutorial.tar.gz
rm linux-tutorial.tar.gz
A directory named “linux-tutorial” should appear. Its structure is as follows:
linux-tutorial
├── aggregate_bank
│   └── aggregate_X.xml
└── restructuring_simulations
    ├── common_all_simulations.xml
    └── aggregate_X
        ├── common_aggregate.xml
        └── frac_Y
            └── restructuring_input.xml
Here, X takes the integer values 0, 1, 2 and Y takes the integer values 0, 30, 70, 90. This folder structure represents a matrix of simulations: there are three different aggregates, and for each aggregate you want to run restructuring simulations with four necking fractions, which adds up to 12 simulations. I am sure you would not want to launch every simulation manually, wait for it to complete, then launch the next one, and so on…
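The matrix can be enumerated with two nested loops. The sketch below only prints the input file path of each simulation implied by the layout above and counts them; it does not run anything and does not need the archive to be present.

```shell
count=0
# Outer loop: aggregates (X), inner loop: necking fractions (Y)
for i in 0 1 2; do
    for j in 0 30 70 90; do
        echo "restructuring_simulations/aggregate_$i/frac_$j/restructuring_input.xml"
        count=$((count + 1))
    done
done
echo "$count simulations in total"
```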
Luckily, we can automate the tedious task of launching every simulation with this simple BASH script:
#!/bin/bash
# Do not continue if an error is encountered
set -e
# Also treat a failure of any command in a pipeline (such as the one feeding tee below) as an error
set -o pipefail
# Necking fractions used in restructuring simulations
declare -a arr=(0 30 70 90)
# Open the restructuring simulation folder
cd restructuring_simulations
# Iterate over aggregates
for (( i = 0; i < 3; i++ ))
do
    cd aggregate_$i
    # Iterate over necking fractions
    for j in "${arr[@]}"
    do
        cd frac_$j
        # Output a message to the user
        echo "Processing aggregate_$i, frac_$j"
        # Run restructuring, showing the output and saving it to stdout.txt
        restructuring restructuring_input.xml | tee stdout.txt
        cd ..
    done
    cd ..
done
# Return to the original folder
cd ..
Every statement in the script is a terminal command. For loops are the only new feature. They are similar in semantics to for loops in other programming languages. There are also several new commands used in this script. The command set -e stops the script as soon as any command fails, instead of carrying on with the remaining commands. The “pipe” operator | passes the output of one command to another command as its input. The command tee accepts input and splits it into two streams: it prints its input to the terminal and writes it to a text file at the same time. We use tee here so that you can track the progress of soot-dem in the terminal and at the same time save the output to a file for record keeping.
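These commands can be tried in isolation. The sketch below uses false as a stand-in for a failing program and a temporary directory for the output file. It also illustrates one caveat worth knowing: a pipeline reports the exit status of its last command, so with set -e alone a failure upstream of tee goes unnoticed, while the BASH option set -o pipefail makes the whole pipeline fail. The variable names are made up for the example.

```shell
work=$(mktemp -d)

# tee prints its input and also writes it to the named file
echo "step 1 done" | tee "$work/stdout.txt"
saved=$(cat "$work/stdout.txt")

# With set -e the script stops at the first failing command: "after" is never printed
printed=$(bash -c 'set -e; echo before; false; echo after' || true)

# The pipeline's status is that of tee, so the failure of "false" is hidden...
bash -c 'false | tee /dev/null' && plain=hidden || plain=caught
# ...unless pipefail is enabled
bash -c 'set -o pipefail; false | tee /dev/null' && strict=hidden || strict=caught

echo "$saved / $printed / $plain / $strict"
```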
Now, change into the linux-tutorial folder, copy the above script, and save it to a file run-restructuring.sh. Text files in the terminal can be edited with the nano command. Run:
nano run-restructuring.sh
This opens the file for editing. Paste the contents of the script provided above, press Ctrl+O followed by Enter to save the file, and Ctrl+X to close the nano text editor. Now, we need to make the script executable. Run:
chmod a+x run-restructuring.sh
Finally, the script can be executed with:
./run-restructuring.sh
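What chmod a+x changes can be observed on a throwaway script. In this sketch hello.sh is a hypothetical file created in a temporary directory, and the -x test asks whether the current user is allowed to execute it.

```shell
work=$(mktemp -d)
printf '#!/bin/sh\necho "script ran"\n' > "$work/hello.sh"

# A newly created file has no execute permission
if [ -x "$work/hello.sh" ]; then before=executable; else before=not-executable; fi

# Grant execute permission to all users (a = all, x = execute)
chmod a+x "$work/hello.sh"
if [ -x "$work/hello.sh" ]; then after=executable; else after=not-executable; fi

# Only now can the file be run directly by its path
output=$("$work/hello.sh")
echo "$before -> $after: $output"
```

Skipping the chmod step and running ./run-restructuring.sh directly would end with a "Permission denied" error.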
Running a program in the background
Any program you run from the terminal is by default attached to your terminal session. If you get disconnected, the program will terminate. In order to avoid that and keep the program running until completion, use the screen command when launching it.
screen <command>
You can then use the Ctrl+A followed by Ctrl+D shortcut to detach from the session. After detaching, you can safely log out of the remote machine and the program will keep running. To reattach to a detached program and see its output, use:
screen -R
To view all currently running programs, use top (press q to exit top). Note that every running program has a unique identifier called the PID (process ID). To stop a program that is running in the background, look up its PID with top and then run:
kill <PID>
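The PID and kill mechanics can be practiced safely on a harmless process. In the sketch below, sleep 300 stands in for a long simulation. It is started with &, which makes it a plain background job of the current shell (unlike screen, it does not survive a disconnect), and the shell variable $! gives its PID so there is no need to look it up in top.

```shell
# Start a long-running placeholder program in the background
sleep 300 &
pid=$!    # $! holds the PID of the most recently started background job

# kill -0 sends no signal; it only checks that the process exists
if kill -0 "$pid" 2>/dev/null; then before=running; else before=stopped; fi

# Ask the process to terminate, then wait until it is gone
kill "$pid"
wait "$pid" 2>/dev/null || true
if kill -0 "$pid" 2>/dev/null; then after=running; else after=stopped; fi

echo "PID $pid: $before -> $after"
```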
As the last exercise in this tutorial, execute the run-restructuring.sh script created in the previous part. This time, use the screen command, detach from the session, and log out of the remote machine with the simulation still running.