This lecture is about programming, but it would be a big mistake not to include at least a short introduction to Linux.
Every student in (geo-)sciences should know about the existence of Linux and know the basics of it. Atmospheric Sciences students in particular will have to use it sooner or later, since most of the tools and data they are using are running or have been created on Linux systems.
The web is going to give a better answer than me to this question.
My very personal answer is a clear: yes! I can't force you to switch though, and most of the exercises of this lecture can be done on a Windows computer too. If you are running Mac OS X then you have access to a Unix terminal anyway, which behaves much like the Linux one, and you can ignore my recommendation (even if I don't like Apple for several other reasons - totally subjective opinion again).
Linux has always been an environment for programmers and has the reputation of being "geeky" and "complicated". This is less true today, with many Linux distributions becoming mainstream and easy to use (my personal favorite is Linux Mint, but you have probably heard mostly of Ubuntu, which I also recommend).
I believe that Linux is much more user-friendly than Windows: once some of the particularities of Linux are understood (which can be frustrating in the beginning since it works quite differently from Windows), it appears that there is much less "hidden" in Linux than in Windows (especially when it comes to installing/uninstalling software, using the command line, or protecting yourself from viruses and intruders).
As a scientist and programmer there are many reasons to prefer Linux (or Mac OS X) to Windows. Let me list some reasons here:
There are also some reasons not to use Linux of course. For one, Linux can be surprising at first and you'll need some time to get used to it. Also, it doesn't come pre-installed on most computers and you'll have to install it yourself (this is going to change: several retailers are already selling cheaper Ubuntu laptops to students). It is possible (but unlikely) that some of your hardware will not be compatible with Linux. Finally, some programs simply aren't available on Linux: Microsoft Office, the Adobe Suite, and games are the most prominent examples. The free and open-source alternatives (LibreOffice and GIMP) are good, but not as good as their commercial counterparts. I have used Linux exclusively for more than 10 years now, and I have never regretted it. Recently I had to install Windows in a virtual machine in order to use the MS Office software provided by the University, and it works perfectly. Another option is to use Wine.
Here are a few ways you can try Linux and form your own opinion:
Copyright notice: this section was largely inspired by the first parts of Michael Stonebank's UNIX tutorial.
To open a terminal window, click on the "Terminal" icon from the Applications/System Tools
menu. You can add an icon to your "quick launch" taskbar simply by dragging the icon to it.
A terminal window should appear with a $
prompt, waiting for you to start entering commands. The command line plays a very important role in Linux (as compared to Windows, where hardly anybody uses it) since many tasks can be done much more efficiently with simple commands.
ls (list)
When you first log in, your current working directory is your home directory. Your home directory has the same name as your username, for example, c7071047, and it is where your personal files and subdirectories are saved.
To find out what is in your home directory, type:
$ ls
The ls
command lists the contents of your current working directory.
ls
does not, in fact, cause all the files in your home directory to be listed, but only those whose name does not begin with a dot (.). Files beginning with a dot (.) are known as hidden files and usually contain important program configuration information. They are hidden because you should not change them unless you know what you are doing.
To list all files in your home directory including those whose names begin with a dot, type:
$ ls -a
As you can see, ls -a
lists files that are normally hidden.
ls
is an example of a command which can take options: -a
is an example of an option. Options change the behaviour of the command. There are online manual pages that tell you which options a particular command can take, and how each option modifies the behaviour of the command (type man ls to read the one for ls). ls -lh
is another way to call ls
with two options: -l
for "long listing format" and -h
for "human-readable" file sizes.
Note: Linux file names and commands are case sensitive, i.e. Test.txt
is different from test.txt
, and both names could coexist in the same directory.
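To make the effect of the options concrete, here is a small sketch you can paste into a terminal (the $ prompt is left out on purpose). It works in a throwaway directory created with mktemp, and the file names notes.txt and .hidden_config are made up for the illustration.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
demo=$(mktemp -d)
cd "$demo"

# Create one regular file and one hidden file (hypothetical names).
touch notes.txt .hidden_config

ls        # lists only notes.txt
ls -a     # also lists ., .. and .hidden_config
ls -lh    # long format, with sizes in human-readable units
ls -alh   # several options can be combined behind a single dash
```

The exact columns printed by ls -lh depend on your system, so no output is shown here.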
All the files are grouped together in the directory structure. The file-system is arranged in a hierarchical structure, like an inverted tree. The top of the hierarchy is traditionally called root (written as a slash /
)
When logging in we are automatically located in our personal home
directory, which is aptly named because:
- we will stay in home during our exercises
- we are allowed to do what we want in home, while we are not allowed to write, delete or change things in the other directories.

To know where we are in the directory structure, there is a useful command:
$ pwd
which prints the path of the current working directory.
mkdir (make directory)
We will now make a subdirectory in your home directory to hold the files you will be creating and using in the course of this tutorial. To make a subdirectory called unixstuff
in your current working directory type:
$ mkdir unixstuff
To see the directory you have just created, type:
$ ls
cd (change directory)
The command cd directory
means change the current working directory to "directory". The current working directory may be thought of as the directory you are in, i.e. your current position in the file-system tree.
To change to the directory you have just made, type:
$ cd unixstuff
Type ls
to see the contents (which should be empty)
Exercise: make a directory called backup
in the unixstuff
directory
Still in the unixstuff
directory, type:
$ ls -a
As you can see, in the unixstuff directory (and in all other directories), there are two special directories called (.) and (..)
The current directory (.)
In Linux, (.) means the current directory, so typing:
$ cd .
means "stay where you are" (the unixstuff directory).
This may not seem very useful at first, but using (.) as the name of the current directory will save a lot of typing, as we shall see.
The parent directory (..)
(..) means the parent of the current directory, so typing:
$ cd ..
will take you one directory up the hierarchy (back to your home directory). Try it now.
Note: typing cd with no argument always returns you to your home directory. This is very useful if you are lost in the file system.
~ (your home directory)
Home directories can also be referred to by the tilde ~
character. It can be used to specify paths starting at your home directory. So typing:
$ ls ~/unixstuff
will list the contents of your unixstuff directory, no matter where you currently are in the file system.
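The navigation commands seen so far can be tried out together. The following is only a sketch: it builds its own unixstuff/backup tree inside a scratch directory (the variable name start is my choice) so that it does not depend on what you have already created.

```shell
# Scratch directory standing in for "home" in this sketch.
start=$(mktemp -d)
mkdir -p "$start/unixstuff/backup"

cd "$start/unixstuff"
pwd                   # prints a path ending in /unixstuff
cd .                  # "." is the current directory: nothing changes
cd ..                 # ".." is the parent: one level up
pwd                   # prints the scratch directory itself

cd                    # no argument: jump to your real home directory
echo ~                # the shell replaces ~ by the path of your home

cd "$start/unixstuff/backup"   # an absolute path works from anywhere
cd ../..                       # ".." can be chained: two levels up
pwd                            # back in the scratch directory
```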
cp (copy)
cp file1 file2
is the command which makes a copy of file1
in the current working directory and calls it file2
What we are going to do now, is to take a file stored in an open access area of the file system, and use the cp
command to copy it to your unixstuff directory.
First, cd
to your unixstuff directory:
$ cd ~/unixstuff
Then type:
$ cp /scratch/c707/c7071047/tuto/science.txt .
Note: Don't forget the dot . at the end. Remember, in Linux, the dot means the current directory.
The above command means "copy the file science.txt to the current directory, keeping the name the same".
Note: The directory /scratch/c707/c7071047/tuto
is an area to which everyone in the University has read and copy access. If you are from outside the University, you can grab a copy of the file from the internet. For this, you can use another very useful command, wget
:
$ wget http://www.ee.surrey.ac.uk/Teaching/Unix/science.txt
This will download the file science.txt
to your current directory
Exercise: Create a backup of your science.txt
file by copying it to a file called science.bak
Note: directories can also be copied with the -r
option added to cp
.
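Here is a short sketch of the different ways cp can be called. It does not need the real science.txt: a one-line stand-in file is created in a scratch directory, and the names copy.txt, archive and archive_copy are invented for the example.

```shell
work=$(mktemp -d)
cd "$work"
echo "some text" > science.txt        # stand-in for the tutorial file

cp science.txt copy.txt               # copy within the current directory
mkdir archive
cp science.txt archive/               # copy into a directory, keeping the name
cp science.txt archive/renamed.txt    # copy and give the copy a new name
cp -r archive archive_copy            # -r copies a directory with its contents
ls -R                                 # list everything, subdirectories included
```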
[TAB]
is very useful in the Linux (and other) command line: it uses automatic completion to finish the commands and paths you are typing. For example, try to type $ cd ~/uni
and then TAB
. When several completions are possible, the shell makes suggestions instead (usually after you press TAB a second time); this works for commands as well as for file names.
mv (move)
mv file1 file2
moves (or renames) file1
to file2
To move a file from one place to another, use the mv
command. This has the effect of moving rather than copying the file, so you end up with only one file rather than two.
It can also be used to rename a file, by moving the file to the same directory, but giving it a different name.
We are now going to move the file science.bak
to your backup
directory.
First, change directories to your unixstuff
directory. Then type:
$ mv science.bak backup/
Type ls
and ls backup
to see if it has worked.
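Renaming and moving are really the same operation, which the following sketch tries to show. The file and directory names (draft.txt, final.txt, old, v2.txt) are hypothetical and live in a scratch directory.

```shell
work=$(mktemp -d)
cd "$work"
echo "draft" > draft.txt
mkdir old

mv draft.txt final.txt       # same directory, new name: a rename
mv final.txt old/            # other directory, same name: a move
mv old/final.txt ./v2.txt    # move back here and rename in one step
ls                           # shows the directory old and the file v2.txt
```

At every step there is only one copy of the file: the content written to draft.txt ends up in v2.txt.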
rm (remove), rmdir (remove directory)
To delete (remove) a file, use the rm
command. As an example, we are going to create a copy of the science.txt
file and then delete it.
Inside your unixstuff
directory, type:
$ cp science.txt tempfile.txt
$ ls
$ rm tempfile.txt
$ ls
You can use the rmdir
command to remove a directory (make sure it is empty first). Try to remove the backup
directory. You will not be able to, since Linux will not let you remove a non-empty directory with rmdir. To delete a non-empty directory with all its subdirectories you can use rm with the option -r
(r for recursive):
$ rm -r /path/to/some/directory
This command may ask you for confirmation before deleting write-protected files. If you are very sure of what you are doing, you can add a -f
to the command (f for force):
$ rm -rf /path/to/some/directory/that/i/am/very/sure/to/delete
Note: files and directories deleted with rm
are lost forever. They don't go to the trash, they are just deleted.
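Because deleted files cannot be recovered, it is worth practising in a place where nothing can go wrong. This sketch creates its own scratch directory with made-up contents (empty_dir, full_dir, tempfile.txt) and then removes them again.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p empty_dir full_dir/sub
touch tempfile.txt full_dir/sub/data.txt

rm tempfile.txt      # delete a single file
rmdir empty_dir      # works: the directory is empty
# rmdir refuses to delete a directory that still has content:
rmdir full_dir || echo "rmdir refused: full_dir is not empty"
rm -r full_dir       # removes the directory and everything below it
ls -a                # only . and .. remain
```

If you want a safety net, rm -i asks for confirmation before each deletion.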
clear (clear screen)
Before you start the next section, you may like to clear the terminal window of the previous commands so the output of the following commands can be clearly understood.
At the prompt, type:
$ clear
This will clear all text and leave you with the $
prompt at the top of the window.
cat (concatenate)
The command cat
can be used to display the contents of a file on the screen. Type:
$ cat science.txt
As you can see, the file is longer than the size of the window. You can scroll back but this is not very convenient.
less
The command less
writes the contents of a file onto the screen a page at a time. Type:
$ less science.txt
Press the [space-bar]
if you want to see another page, and type [q]
if you want to quit reading. As you can see, less
is used in preference to cat
for long files.
head
The head
command writes the first ten lines of a file to the screen.
First clear the screen then type:
$ head science.txt
Then type:
$ head -5 science.txt
What difference did the -5
option make to the output of the head command?
tail
The tail command writes the last ten lines of a file to the screen.
Clear the screen and type:
$ tail science.txt
Exercise: How can you view the last 15 lines of the file?
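With a file whose content is predictable it is easier to see what head and tail select. The sketch below uses seq to write the numbers 1 to 100 into a hypothetical file numbers.txt. It spells the line count as -n 3, which is the form described in the POSIX standard; the shorter -3 used above is an older shorthand that GNU head also accepts.

```shell
work=$(mktemp -d)
cd "$work"
seq 1 100 > numbers.txt    # the numbers 1 to 100, one per line

head numbers.txt           # first ten lines: 1 to 10
head -n 3 numbers.txt      # first three lines: 1, 2, 3
tail numbers.txt           # last ten lines: 91 to 100
tail -n 2 numbers.txt      # last two lines: 99, 100
```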
Simple searching using less
Using less
, you can search through a text file for a keyword (pattern). For example, to search through science.txt
for the word "science", type:
$ less science.txt
then, still in less
, type a forward slash [/]
followed by the word to search:
/science
And press [enter]
. Type [n]
to search for the next occurrence of the word.
grep
grep
is one of many standard Linux utilities. It searches files for specified words or patterns. First clear the screen, then type:
$ grep science science.txt
As you can see, grep
has printed out each line containing the word science.
Or has it?
Try typing:
$ grep Science science.txt
The grep
command is case sensitive; it distinguishes between Science and science.
To ignore upper/lower case distinctions, use the -i
option, i.e. type:
$ grep -i science science.txt
To search for a phrase or pattern that contains spaces, you must enclose it in quotes; single quotes (the apostrophe symbol) are the safest choice. For example to search for spinning top, type:
$ grep -i 'spinning top' science.txt
Some of the other options of grep
are:
-v: display those lines that do NOT match
-n: precede each matching line with the line number
-c: print only the total count of matched lines

Try some of them and see the different results. Don't forget, you can use more than one option at a time. For example, the number of lines without the words science or Science is:
$ grep -ivc science science.txt
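On a four-line file the effect of each option can be checked by eye. The file sample.txt and its content are invented for this sketch; the results in the comments follow from those four lines only.

```shell
work=$(mktemp -d)
cd "$work"
# A tiny test file with four lines (hypothetical content).
printf '%s\n' \
  'Science is fun' \
  'I like science' \
  'a spinning top' \
  'nothing here' > sample.txt

grep science sample.txt             # one line: "I like science"
grep -i science sample.txt          # two lines: upper/lower case is ignored
grep -n 'spinning top' sample.txt   # prints "3:a spinning top"
grep -ic science sample.txt         # prints 2: the count of matching lines
grep -ivc science sample.txt        # prints 2: the count of non-matching lines
```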
We have only shown some examples of how to navigate in the Linux directory structure and use some simple commands. This is by far not sufficient to demonstrate the usefulness of the command line: the grep
command for example is extremely powerful to help you find files that you thought you had lost, and using wget
to fetch a file directly into the current directory is often much faster than downloading it in Firefox and copying it afterwards. For the time being, this short tutorial should help get you started.
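As an illustration of finding a "lost" file with grep, here is one possible approach using the options -r (search all files below a directory) and -l (print only the names of matching files). The directory layout and file contents are made up for the sketch.

```shell
work=$(mktemp -d)
mkdir -p "$work/projects/old" "$work/notes"
echo "glacier mass balance" > "$work/projects/old/report.txt"
echo "shopping list"        > "$work/notes/todo.txt"

# Which file below $work mentions "glacier"? (-i ignores case)
grep -ril "glacier" "$work"
```

The command prints the full path of report.txt and nothing else, since todo.txt does not contain the word.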
To go further, I recommend following Ryan's Linux tutorials: they are excellent!
Now that you are familiar with the command line, you must have noticed that Linux commands have some similarities with the statements you used in your bachelor programming course: when you type them in, the computer does things for you and gives you information back (often printed on screen, but not always). When you want to automate a series of commands you just typed in (for example renaming and moving files), a natural thing would be to write a script to do so.
Scripts are the simplest type of program one can write, and this can be seen as your first programming experience in this course. We are going to write a bash script and execute it: why we want to do this in a Python scientific programming course may not be clear right now, but it will make more sense later, I promise.
In your unixstuff
directory, create a new file named myscript.sh
. You can do this using the default (graphical) text editor of many Linux desktops, gedit
. At the command line:
$ gedit myscript.sh
Just type one simple line in the file:
echo Hello World!
Quit your editor and go back to the command line again. You can execute your script with the following command:
$ bash myscript.sh
Bash is a so-called "interpreter". It reads your file and understands how to execute the commands in it.
List the files in your directory, but with the option -l
for more information. Here is what it looks like on my computer:
mowglie@flappi-top ~/Documents/unixstuff $ ls -l
total 28
drwxrwxr-x 2 mowglie mowglie 4096 Feb 22 18:05 backup
-rw-rw-r-- 1 mowglie mowglie 19 Feb 22 18:44 myscript.sh
-rw-rw-r-- 1 mowglie mowglie 7767 Sep 1 2000 science.txt
The first string of characters on each line indicates the file's permissions. Now read the first section of Ryan's tutorial about permissions. So what do we learn from the above? That the file's owner (myself) is allowed to read and write the myscript.sh
file, but not to execute it. Let's change this:
$ chmod a+x myscript.sh
Now everybody (including me) is allowed to execute this file. It is a quite harmless script, so I'm not too worried. Now we can execute it:
$ ./myscript.sh
Nice! We could add many more commands to our script (possibly making it more useful), but this was enough to illustrate the point I wanted to make: files can be executable in Linux, and a whole new world opens to us.
Note: Most often, it is recommended to add a specific first line to your script, called a shebang. This line tells the computer which interpreter should be used to run the file. In our case it is the default interpreter (bash), but this may not always be the case. To be entirely explicit, we recommend always adding a shebang to your scripts. In this case, we should add #!/bin/bash
to our script:
#!/bin/bash
# Maybe some comment line about the purpose of this script
echo Hello World!
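The whole sequence (create the script, look at its permissions, make it executable, run it) can be replayed from the command line without an editor. This is a sketch under a few assumptions: it writes the script with printf into a scratch directory, and it uses chmod u+x, which gives execute permission to the owner only, instead of the a+x used above.

```shell
work=$(mktemp -d)
cd "$work"

# Write the three-line script into a file, one argument per line.
printf '%s\n' '#!/bin/bash' '# Print a greeting' 'echo Hello World!' > myscript.sh

ls -l myscript.sh      # permissions start with -rw: not executable yet
bash myscript.sh       # works anyway: we name the interpreter ourselves
chmod u+x myscript.sh  # give the owner (u) the execute (x) permission
ls -l myscript.sh      # the owner permissions now read rwx
./myscript.sh          # the shebang tells the system to run it with bash
```

Both bash myscript.sh and ./myscript.sh print Hello World!; only the second one needs the execute permission and the shebang.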
In Linux it is possible to write simple programs called "scripts". These scripts can be executed by the default interpreter, and be made "executable" very easily. If you want to know more about bash scripts, read Ryan's tutorial about the subject!
You may have asked yourself: how does the Linux command line know about the commands we are using? Where do I find them? A nice program that helps us find out is which. Let's use it:
$ which less
/usr/bin/less
Now we can use another command, whatis, to tell us what we just did:
$ whatis which
The which
command tells us that the less
program is located in /usr/bin
. We can also ask which
where to find which
:
$ which which
/usr/bin/which
$ ls -l /usr/bin/which
lrwxrwxrwx 1 root root 10 Nov 19 14:42 /usr/bin/which -> /bin/which
As expected, which
is an executable file. Some executables are binary files (i.e. not human readable), but in this case which
is actually a script:
$ less /usr/bin/which
Exercise: scroll through the script. Can you locate the shebang line? What does the code look like to you: easily understandable, or rather cryptic?
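Not every command is a file on disk, and which is not the only tool to find out. As a complement, the sketch below uses two features built into the shell itself, command -v and type; the exact paths printed will differ from one system to another.

```shell
command -v ls     # path of the ls executable, similar to "which ls"
type cd           # reports that cd is a shell builtin: there is no file for it
type ls           # may report a path or an alias, depending on your setup

# The path can be reused in other commands, here to inspect the file itself:
ls -l "$(command -v ls)"
```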
The Linux command line knows so-called environment variables. The echo
command can display their value:
$ echo $PATH
(the $
tells the shell to hand echo the value of the variable PATH rather than its name). The PATH variable contains a list of paths, separated by colons. It is an extremely important variable: it tells Linux where to look for executables and programs. Each time you type a command, Linux will look through the directories listed in the PATH and see if it finds it there. This is an extremely flexible and powerful system, as we are going to see in the next chapter (Installing Python).
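The PATH is easier to read with one directory per line. A simple way to get this, shown here as a sketch, is to send the output of echo through tr, which replaces every colon by a line break.

```shell
echo "$PATH"                 # one long line, entries separated by ":"
echo "$PATH" | tr ':' '\n'   # the same entries, one directory per line
echo "$HOME"                 # another environment variable: your home directory
```

The directories are listed in the order in which they are searched.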
The PATH can be extended to contain any other directory you find useful to add. Let's do it:
$ mkdir ~/myprograms
$ export PATH=$PATH:~/myprograms
export
assigns the new value to the variable PATH (overwriting the existing value) and makes it visible to the programs started from this terminal.
Exercise: now move your executable script myscript.sh
into the myprograms
directory. Verify that you can now execute the myscript.sh
program from any directory.
Note: we have extended the PATH
variable in this session only. If you close and reopen the terminal the changes will be lost. There are ways to make these changes permanent; we will learn them in the next chapter.
Note: if a program is listed in several PATH directories, Linux takes the first instance. Directories can be prepended and appended, as we will see in the next chapter.
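The mechanism can be tested end to end with a throwaway directory instead of ~/myprograms. In this sketch the variable bindir and the script content are my own choices; the change to PATH only lasts for the current terminal session.

```shell
bindir=$(mktemp -d)            # stand-in for ~/myprograms
printf '%s\n' '#!/bin/bash' 'echo Hello World!' > "$bindir/myscript.sh"
chmod u+x "$bindir/myscript.sh"

export PATH="$PATH:$bindir"    # append: this directory is searched last
cd /                           # any directory will do
myscript.sh                    # found through PATH, prints Hello World!
command -v myscript.sh         # shows which file the shell picked

# Prepending instead would make this directory win over system-wide
# programs of the same name:
# export PATH="$bindir:$PATH"
```

Appending is the more cautious choice: an existing system command can never be shadowed by accident.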
The PATH
variable tells Linux where to look for programs and scripts. This is a very simple and powerful way to customize Linux with your own scripts and programs. We are going to use this feature to install Python in the next chapter.