Installing Python

This chapter explains how to install Python on your university account (a mandatory exercise for UIBK students) and gives some advice on installing it on your personal computer. Python is free, so download it and try it at home!

Which Python version?

Short answer

As of today (Oct 2, 2019), the latest stable major release of Python is version 3.7. Python versions 3.6 and 3.7 are still actively maintained, and Python 3.8 will be released soon.

Long answer

If you have searched online for information about Python, you probably stumbled upon some discussion of the "Py2 vs Py3 problem". About 10 years ago, an evolution of the Python language introduced changes which were not backwards compatible. This means that code written in Python 2.7 may not work in Python 3. According to the Python developers, this change was necessary to get rid of old baggage and inconsistent syntax, but many considered it a "betrayal". Indeed, backwards compatibility is an important aspect of software development for large projects like Python: it is a contract between developers and users stipulating that the code users write will continue to work in subsequent releases. This contract was broken once, but this shouldn't happen again.
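To make the incompatibility concrete, here is a minimal sketch (the helper name compiles_in_this_python is made up for illustration). It asks the running interpreter whether a snippet is valid syntax, using the built-in compile function. The Python 2 print statement is the classic example of code that Python 3 rejects.

```python
def compiles_in_this_python(source):
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        # compile() checks and translates the code but does not run it
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


py2_style = "print 'hello'"   # print statement, valid in Python 2 only
py3_style = "print('hello')"  # print function, the Python 3 way

print(compiles_in_this_python(py2_style))
print(compiles_in_this_python(py3_style))
```

Run with Python 3, the first check fails and the second succeeds.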

Since updating code requires time and money, many people and companies continued to use Python 2 internally. Five years ago, I might have been reluctant to teach you Python 3 because the future of the language was uncertain and many tools existed only in Python 2. Today the wide adoption of Python 3 is finally accomplished and the future is looking very good. All major scientific Python tools support Python 3, and major libraries have announced that they will stop supporting Python 2 before or shortly after the end-of-life date of Python 2 (by 2020).

In this lecture we might talk about the differences between Python 2 and Python 3 at some point, but it's unlikely. You can safely concentrate on learning Python 3 only.

Installation at the University of Innsbruck

In this section we use the powerful linux PATH variable to "install" the same Python program in your personal Linux account. This is done in a couple of easy steps.

The default python on linux systems

Open a terminal and type:

$ python

This should have started the python command line, looking like this:

Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

The >>> is a prompt, waiting for you to type python commands. Type [CTRL+D] to exit.

Python is available by default on virtually any linux system, for the simple reason that many programs use python internally. However, as you can see, it runs python version 2. We can also run python 3 with a simple:

$ python3

Now, shouldn't this python program be enough? For simple uses like today's examples, yes, it would suffice. But as we are going to see, python alone isn't really useful for us scientists: we need so-called third party packages, tools developed with and for python, but not available by default. While some of them are so important that they are available as linux packages, it is much easier (and safer) to use our own python installation for installing those. This is the primary purpose of a personalized python installation.

Using a custom (pre-installed) python

It turns out I have "installed" a more recent python version (and some additional packages) on a shared repository available to everyone with a UIBK account. I wrote "installed" with quotation marks because, in linux, the definition of "installed" is subjective. Let's say that I've put the python executables somewhere where everybody can see them. Let's try it:

$ /project/c7071047/miniconda3/bin/python
Python 3.7.3 | packaged by conda-forge | (default, Jul  1 2019, 21:52:21) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

You should have been able to launch a new python interpreter: this version is more recent and provided by Anaconda, Inc. This is the python I'd like you to use from now on (when at the University - for your PC see the next section).
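If you are unsure which interpreter you just launched, you can ask python itself. The following lines are a small sketch using only the standard sys module; type them at the >>> prompt. The path printed on your machine depends on your installation.

```python
import sys

# Full path of the python executable that is currently running
print(sys.executable)

# Version as a (major, minor, micro) tuple, e.g. (3, 7, 3) for Python 3.7.3
version = tuple(sys.version_info[:3])
print(version)

# Simple guard: this course assumes Python 3
is_python3 = sys.version_info.major >= 3
print(is_python3)
```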

Note: the /project folder is a shared folder. I don't know where the device is located physically, but it definitely isn't in the computer room. This is yet another powerful feature of linux systems: shared folders look just like normal folders. Quite practical, huh? Now this may have one simple drawback: launching this python might be a bit slow sometimes (especially if all of you try to open it at the same time). Most of the time this shouldn't be a big problem, though.

Making this change "permanent"

Nobody wants to write such a long command to start python. So we are going to add the /project/c7071047/miniconda3/bin/ folder to our PATH, in such a way that it is remembered in your later sessions.

We are going to edit a special file in your HOME, ~/.bashrc. This file contains a list of commands which are executed automatically each time you open a terminal. How practical! Let's open this file and add the following two lines at the end of it:

# added for the python course:
export PATH="/project/c7071047/miniconda3/bin:$PATH"

CAREFUL! This will add a folder to the PATH variable, which is a fundamental element in linux: make sure that you add these two lines (and exactly these lines: respect the upper- and lower-case as well!) to your .bashrc

What did we just do? We added a folder to the PATH (remember what this is?) and added it at the beginning of it. This is important because linux is going to look for programs in order: if a python executable is found in the first folder, no need to look for another one (i.e. ignore the default linux program).
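The lookup order can be illustrated with a short sketch. This is a simplified model of what the shell does, not its actual implementation, and the function name first_match_on_path is made up for illustration. The standard library offers shutil.which for the same purpose.

```python
import os
import shutil


def first_match_on_path(program, path_string):
    """Return the first executable called `program` found in `path_string`.

    `path_string` is a list of folders separated by os.pathsep (":" on linux),
    just like the PATH variable. Returns None if nothing is found.
    """
    for folder in path_string.split(os.pathsep):
        candidate = os.path.join(folder, program)
        # Keep the first hit and ignore all later folders
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Which python does the current PATH select?
current_path = os.environ.get("PATH", "")
print(first_match_on_path("python", current_path))

# The standard library equivalent
print(shutil.which("python"))
```

Because the loop returns at the first hit, a folder placed at the beginning of the PATH wins over the system folders that come after it. On Windows, executables carry a file extension, so this simplified version would not find them there.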

Note: for this change to take effect, close your terminal and open a new one! ~/.bashrc is executed only once, at the opening of a new terminal.

If everything worked fine, after typing python in the command line you should be given the most recent version I prepared for you.

Installing Python on your PC

Note: if you don't want to install python on your PC or want to do it later, skip this section.

Unlike Matlab or IDL, Python's scientific track is not installed "out of the box" with a single installation file.

Fortunately, there are very useful tools out there to help us out. The most useful is Miniconda, which will help us to install both Python and the packages we need within a couple of minutes. The first installation step is platform specific (Windows, Linux, or Mac) while the other steps are the same on all platforms.

Install Miniconda

Go to the miniconda website and download the latest python 3.X installer for your platform (be careful to check whether you need a 64- or 32-bit version). As of today (Oct 2, 2019), Python versions 3.6 and 3.7 should work fine.

The installation is really easy and described here. Choose an installation directory where you have enough space available (conda installations can quickly grow larger than a few GB).

To see if everything worked well, close the terminal window you were using, open a new one (on Windows, a terminal is called a command prompt), and type in:

conda update conda

If you type:

python

A new python prompt should appear, with something like:

Python X.X.X |Continuum Analytics, Inc.| (default, Oct 19 2015, 21:52:17) 
xxxx
Type "help", "copyright", "credits" or "license" for more information.
>>>

You can type exit() or [CTRL+D] to get out of the python prompt.

Optional: make a new environment called "py3"

Conda also helps us to define so-called "environments". A conda environment is a directory that contains a specific collection of packages that you have installed.

Environments are not necessary, but helpful in the long term (e.g. if some packages are in conflict with each other). If this is your first installation you probably want to skip this part and go directly to Install the packages. Please have a short look at the conda introduction here before going on.

In the terminal, type:

conda create -n py3 python=3.7

This created a new environment called "py3" which has python 3.7 as default interpreter. This environment can be activated with one command:

source activate py3

or on windows:

activate py3

Note that after activation, the prompt changed to something like [py3]xxx>. Once in a specific environment, all the packages we install will be available in this environment only. This is very useful if you need different versions of python for different projects for example.
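To check from within python which environment is active, the sketch below looks at sys.prefix (the folder of the running installation) and at the CONDA_DEFAULT_ENV variable that conda sets on activation. The function name describe_environment is made up for illustration, and the variable may be absent if python was not started from an activated conda environment.

```python
import os
import sys


def describe_environment():
    """Return the installation folder and the active conda environment name."""
    return {
        # Folder containing the running python installation
        "prefix": sys.prefix,
        # Name of the active conda environment, or None if not set
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
print(info["prefix"])
print(info["conda_env"])
```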

Don't forget to activate this environment every time you want to work on the exercises or when you want to install a new package. You can deactivate the current environment with:

source deactivate

or on windows:

deactivate

Install the packages

First we are going to tell conda where to look for packages to install:

conda config --add channels conda-forge

This has to be done only once. conda-forge is a package repository: once you have set it up as default, you don't have to worry about it any more.

There are a couple of packages that you will always need, whatever you are working on. These are: ipython, Jupyter, numpy, scipy, matplotlib.

To install them, type:

conda install numpy scipy matplotlib ipython jupyter

This will download the packages and install them, as well as their dependencies. This can take a while! If successful, you should now be able to start ipython for example:

ipython

Use [CTRL+D] to close the interpreter and get back to the prompt.

We will need some other packages later on, but this will get you started.
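A quick way to verify that the packages are visible to your python is sketched below. It uses importlib.util.find_spec from the standard library, which looks a module up without importing it. The helper name find_missing is made up for illustration.

```python
import importlib.util


def find_missing(module_names):
    """Return the names in `module_names` that python cannot locate."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Import names of the packages installed above (ipython is imported as
# "IPython"; jupyter is left out here since it is mostly used as a command)
wanted = ["numpy", "scipy", "matplotlib", "IPython"]
print(find_missing(wanted))
```

An empty list means that all packages were found. Any name that is printed still needs to be installed in the python you are currently running.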

Optional: update the packages

After a while, it might be useful to update the packages you are using (careful! this is a good idea most of the time, but not always*).

If not already done, set conda-forge as the default channel:

conda config --add channels conda-forge

And then simply update them all:

conda update --all

*: some updates might include changes which alter the way functions are called, or (even worse) change your results. This is usually not the case for large, well-known software packages, but it can happen. The best ways to ensure consistency are to write tests, as we are going to learn, and to read up on the package updates.
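As a foretaste of what such a test can look like, here is a minimal sketch. Everything in it is made up for illustration: the function mean_temperature, the input data and the reference value. The idea is to record a result you trust once, then check after each update that the computation still returns it.

```python
import math
import statistics


def mean_temperature(values):
    """Toy analysis function: the mean of a list of temperatures."""
    return statistics.mean(values)


def test_mean_temperature():
    # Hypothetical data with a reference value computed by hand: 18.0 / 4 = 4.5
    data = [2.5, 3.5, 5.0, 7.0]
    expected = 4.5
    # Compare with a tolerance, floating point results are rarely exact
    assert math.isclose(mean_temperature(data), expected, rel_tol=1e-9)


test_mean_temperature()
print("test passed")
```

If an update changed the result, the assert statement would raise an AssertionError instead of letting the change go unnoticed.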

Troubleshooting package installation

Although conda and conda-forge have greatly improved the process of installing python packages, some issues remain. The main problems that occur at installation are related to so-called "external libraries", written in C or Fortran, that some python packages require to run properly. Typical examples of "problem packages" in the geosciences are rasterio and geopandas: both rely on the external libraries PROJ and GDAL.

Most of the time, the installation will work without error, but problems will happen at runtime. One typical problem will be of the kind:

import rasterio
ImportError: lib-blabla.so.66: cannot open shared object file: No such file or directory

It is not easy to solve these issues. One way worth trying is to set a strict channel priority, as suggested in the conda-forge documentation, and then install rasterio again:

conda config --set channel_priority strict

If this doesn't work, searching online for the error message might be the only solution, together with trying to install an older package version.
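When hunting for such problems, it helps to find out which imports fail and with which message. The sketch below (the helper name try_import is made up for illustration) actually imports each module and collects the error text instead of stopping at the first failure.

```python
import importlib


def try_import(name):
    """Try to import module `name`; return (success, error message or None)."""
    try:
        importlib.import_module(name)
        return True, None
    except ImportError as err:
        # Also catches ModuleNotFoundError, which is a subclass of ImportError
        return False, str(err)


for name in ["rasterio", "geopandas"]:
    ok, message = try_import(name)
    print(name, "OK" if ok else "FAILED: " + message)
```

The collected message is also a good starting point for an online search.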

Take home points

  • python is available by default on virtually any linux distribution
  • it is not available by default on windows, but can be installed
  • the system python is useful for scripting or simple tasks, but for real-world data analysis you probably want to install python (and related packages) on your own
  • the best tool to install a custom python is conda, installed via miniconda
  • on linux, a python "installation" can be as simple as adding the folder of a new python executable to the PATH
