Installing Python#
This chapter contains some instructions about how to install Python on your personal computer and at the University.
In previous years, it was possible to get by with using Python in the University’s computer lab only. Under this year’s rather special circumstances, it is probably easier to also install Python on your laptop.
Preamble: which Python version?#
Short answer: Python 3. At the time of writing, the latest stable release is Python version 3.9.
Long answer
If you have searched online for information about Python, you probably stumbled upon some discussion about the “Py2 vs Py3 problem”. About 10 years ago, an evolution of the Python language introduced changes which were not backwards compatible. This means that code written in Python 2.7 may not work in Python 3. According to the Python developers this change was necessary to get rid of old baggage and inconsistent syntax, but it was also considered by many as a “treason”. Indeed, backwards compatibility is an important aspect of software development for large projects like Python: it is a contract between developers and users stipulating that the code that users write will continue to work in subsequent releases. This contract was broken once, but this shouldn’t happen again.
Since updating code requires time and money, many people and companies continued to use Python 2 internally. Five years ago, I would maybe have been reluctant to teach you Python 3 because the future of the language was uncertain and many tools existed only in Python 2. Today the wide adoption of Python 3 is finally accomplished and the future is looking very good. All major scientific Python tools support Python 3, and major libraries announced that they would stop supporting Python 2 before or shortly after the end-of-life date of Python 2 (January 2020).
In this lecture we might talk about the differences between Python 2 and Python 3 at some point, but it’s unlikely. You can safely concentrate on learning Python 3 only.
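To make the incompatibility a bit more concrete, here is a minimal sketch of two well-known differences (the print statement and integer division). It is meant to be run with Python 3; the comments describe what Python 2.7 did instead. The variable names are only illustrative.

```python
# Two classic Python 2 vs Python 3 differences, run under Python 3.

# 1. print is a function in Python 3. The Python 2 statement form
#    (print "hello", without parentheses) no longer compiles.
try:
    compile('print "hello"', "<py2-snippet>", "exec")
    py2_print_compiles = True
except SyntaxError:
    py2_print_compiles = False

# 2. Dividing two integers with / gives a float in Python 3.
#    In Python 2.7 the same expression gave the integer 3.
ratio = 7 / 2

# Floor division with // behaves the same in both versions.
floor = 7 // 2

print("Python 2 print statement compiles:", py2_print_compiles)
print("7 / 2 =", ratio)
print("7 // 2 =", floor)
```

The second difference is the nastier one: such code still runs after the switch, but silently produces different numbers.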
Installation at the University of Innsbruck (optional)#
Note
I kept this section online for consistency with previous classes, but because of the restrictions you will not have normal access to the computer lab this year. If you want, you can ignore these instructions, but they still work and may be useful for your in-person classes.
In this section we use the powerful linux PATH variable to “install” the same Python program in your personal Linux account. This is done in a couple of easy steps.
The default python on linux systems#
Open a terminal and type:
$ python
This should have started the python command line, looking like this:
Python 2.7.5 (default, Apr 2 2020, 13:16:51)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
The >>> is a prompt, waiting for you to type python commands. Type [CTRL+D] to exit.
Python is available per default on virtually any linux system, for the simple reason that many programs actually use python internally. However, as you can see, it runs python version 2 (as of 2020, modern Linux distributions have removed the python command and only offer python3). We can also run python 3 with a simple:
$ python3
Now, shouldn’t this python program be enough? For simple usages like today’s examples, yes it would suffice. But as we are going to see, python alone isn’t really useful for us scientists: we need so-called third party packages, tools developed with and for python, but not available per default. While some of them are so important that they are available as linux packages, it is much easier (and safer) to use our own python installation for installing those. This is the primary purpose of a personalized python installation.
Using a custom (pre-installed) python at the university#
It turns out I have “installed” a more recent python version (and some additional packages) on a shared repository available to everyone with a UIBK account. I wrote “installed” with quotation marks because, in linux, the definition of “installed” is subjective. Let’s say that I’ve put the python executables somewhere where everybody can see them. Let’s try it:
$ /project/c7071047/miniconda3/bin/python
Python 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
You should have been able to launch a new python interpreter: this version is more recent and provided by Anaconda, Inc.
This is the python I’d like you to use from now on (when at the University - for your personal computer see the next section).
Note
The /project folder is a shared folder. I don’t know where the device is located physically, but it definitely isn’t in the computer room. This is yet another powerful feature of linux systems: shared folders look just like normal folders. Quite practical, huh? Now this may have one simple drawback: launching this python might be a bit slow sometimes (especially if all of you try to open it at the same time). Most of the time this shouldn’t be a big problem, though.
Making this change “permanent”#
Nobody wants to write such a long command to start python. So what we are going to do is to add the /project/c7071047/miniconda3/bin/ folder to our PATH, in such a way that it is remembered for your later sessions.
We are going to edit a special file in your HOME, ~/.bashrc. This file contains a list of commands which are executed automatically each time you open a terminal. How practical! Let’s open this file and add the following two lines at the end of it:
# added for the python course:
export PATH="/project/c7071047/miniconda3/bin:$PATH"
Warning
Careful! This will add a folder to the PATH variable, which is a fundamental element in linux: make sure that you add these two lines (and exactly these lines: respect the upper- and lower-case as well!) to your .bashrc file.
What did we just do? We added a folder to the PATH (remember what this is?), and we added it at the beginning. This is important because linux looks for programs in the PATH folders in order: if a python executable is found in the first folder, there is no need to look for another one (i.e. the default linux program is ignored).
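The “first folder wins” rule can be tried out safely from Python itself. The sketch below is only an illustration, assuming a Linux or macOS system: it creates two temporary folders that each contain a small executable with the made-up name mytool, and uses the standard library function shutil.which to look the program up with a custom search path. Nothing here touches your real PATH.

```python
import os
import shutil
import stat
import tempfile


def make_fake_program(folder, name):
    """Create a tiny executable shell script called `name` in `folder`."""
    path = os.path.join(folder, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\necho hello\n")
    # Add the "executable by owner" permission bit.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


# "mytool" is a hypothetical program name used for illustration only.
with tempfile.TemporaryDirectory() as custom, tempfile.TemporaryDirectory() as system:
    custom_prog = make_fake_program(custom, "mytool")
    system_prog = make_fake_program(system, "mytool")

    # Custom folder first, like the export line added to ~/.bashrc.
    search_path = os.pathsep.join([custom, system])
    found = shutil.which("mytool", path=search_path)
    first_wins = os.path.samefile(found, custom_prog)

    # Reversed order: now the "system" copy is found instead.
    reversed_path = os.pathsep.join([system, custom])
    found_reversed = shutil.which("mytool", path=reversed_path)
    reversed_wins = os.path.samefile(found_reversed, system_prog)

print("custom folder first  -> custom program found:", first_wins)
print("system folder first  -> system program found:", reversed_wins)
```

Both programs exist in both cases; only the order of the folders in the search path decides which one is picked up.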
Note
For this change to take effect, close your terminal and open a new one! ~/.bashrc is executed only once, at the opening of a new terminal.
If everything worked fine, after typing python in the command line you should be given the most recent version I prepared for you.
Installing Python on your PC#
Unlike Matlab or IDL, Python’s scientific stack is not installed “out of the box” with a single installation file. Fortunately, there are very useful tools out there to help us out. The most useful is Miniconda, which will help us to install both Python and the packages we need within a couple of minutes. The first installation step is platform specific (Windows, Linux, or Mac) while the other steps are the same on all platforms.
For Windows users
Windows is not a very friendly environment for software development and science [1]. Things tend to work, but they are often more complicated and slower than on linux or mac. Furthermore, since the majority of the scientific software developers and teachers (including me) do not work on Windows, bugs or problems are sometimes left unnoticed.
That being said: if you are not comfortable with computers or the command line, it is very much OK to stay with Windows. All of my past students managed to get python to work on Windows. It should work OK for you as well.
Install Miniconda#
Go to the miniconda website and download the latest installer for your platform (be careful to check whether you need a 64- or 32-bit version - most probably 64).
The installation is really easy and described here. Choose an installation directory where you have enough space available (conda installations can quickly grow larger than a few GB).
To see if everything worked well, open a terminal window (on Windows, the anaconda prompt), and type in:
conda update conda
If you type:
python
A new python prompt should appear, with something like:
Python X.X.X |Continuum Analytics, Inc.| (default, Oct 19 2015, 21:52:17)
xxxx
Type "help", "copyright", "credits" or "license" for more information.
>>>
You can type exit() or [CTRL+D] to get out of the python prompt.
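If you are unsure which python you have just started, you can ask the interpreter itself. The following is a small sketch using only the standard library; the helper name describe_interpreter is made up for this example. Type or paste it at the >>> prompt, or save it to a file and run it.

```python
import sys


def describe_interpreter():
    """Collect basic facts about the python that runs this code."""
    major, minor, micro = sys.version_info[:3]
    return {
        "executable": sys.executable,           # full path of the python program
        "version": f"{major}.{minor}.{micro}",  # e.g. "3.9.1"
        "is_python3": major >= 3,
    }


info = describe_interpreter()
print("executable:", info["executable"])
print("version:   ", info["version"])
print("python 3:  ", info["is_python3"])
```

With a fresh Miniconda installation, the executable path should point somewhere inside the installation directory you chose above.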
For Windows users
To have access to conda commands and to manage your environments and packages, the anaconda prompt is what you want to use.
You may have installed and tried git bash in the previous lecture. If you like it and want to use it further instead of the anaconda prompt, you can add your conda commands to it by following these instructions.
Recommended: make a new environment called “scipro”#
Conda also helps us to define so-called “environments”. A conda environment is an isolated directory that contains a specific collection of packages that you have installed.
Environments are not necessary, but helpful in the long term (e.g. if some packages are in conflict with each other). If this is your first installation you can skip this part and go directly to Install the packages.
If you want to try environments already (you’ll thank me later), please have a short look at the conda introduction here before going on.
In the terminal, type:
conda create -n scipro
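This creates a new environment called “scipro”. Note that conda create without any package names makes an empty environment: python itself is added when you install packages into it (see Install the packages). This environment can be activated with one command: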
conda activate scipro
Note that after activation, the prompt changed to something like [scipro]xxx>. Once in a specific environment, all the packages we install will be available in this environment only. This is very useful if you need different versions of python for different projects, for example.
Don’t forget to activate this environment every time you want to work on the exercises or when you want to install a new package. You can deactivate the current environment with:
conda deactivate
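Besides looking at the prompt, you can also check from within python which environment is active. The sketch below assumes that the environment was activated with conda activate, which sets the CONDA_DEFAULT_ENV environment variable; the helper name active_conda_env is made up for this example.

```python
import os
import sys


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None.

    Relies on the CONDA_DEFAULT_ENV variable set by `conda activate`.
    """
    return environ.get("CONDA_DEFAULT_ENV")


print("active environment:", active_conda_env())
# sys.prefix is the folder where the running python is installed.
print("interpreter prefix:", sys.prefix)
```

If the first line prints None or base while you expected scipro, you most likely forgot to activate the environment.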
Install the packages#
There are a couple of packages that you will always need, whatever you are working on. These are: ipython, Jupyter, numpy, scipy, matplotlib.
To install them, type:
conda install numpy scipy matplotlib ipython jupyter
This will download the packages and install them, as well as their dependencies. This can take a while! If successful, you should now be able to start ipython, for example:
ipython
Use [CTRL+D] to close the interpreter and get back to the prompt.
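To double-check that the installation worked, you can ask python whether it finds the packages. This is a minimal sketch based on the standard library function importlib.util.find_spec; the helper name missing_packages is made up, and note that the import name of ipython is IPython (jupyter is left out here because it is mostly used as a command line program).

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages installed above.
required = ["numpy", "scipy", "matplotlib", "IPython"]

missing = missing_packages(required)
if missing:
    print("missing packages:", ", ".join(missing))
else:
    print("all packages found")
```

If a package is reported missing, check that you activated the right environment before running conda install.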
We will need some other packages later on, but this will get you started. If you want to install the spyder development environment, do:
conda install spyder
And then run (the command may be called spyder3 instead, depending on your system):
spyder
Optional: update the packages#
After a while, it might be useful to update the packages you are using (careful! this is a good idea most of the time, but not always [2]).
You can update a single package:
conda update numpy
Or simply update them all:
conda update --all
Go safe: clone your working environments before installing new packages#
“Never change a winning team” (also known as “if it ain’t broke, don’t fix it”) is a true programming wisdom. In the course of your studies, you might have to install new packages to a perfectly fine, working environment. I’ve said often in class that conda environments are nothing to be afraid of (they are just files on your computer), but still: if your setup currently works, keeping it working is generally a good idea.
If you have a scipro environment that works and you are afraid of breaking it, it’s a good idea to clone it with:
conda create --name scipro_clone --clone scipro
Once this is done, you can try wild things in the scipro environment, knowing that you can always delete it if things go wrong and clone it back from scipro_clone anytime…
Advanced python installation instructions and troubleshooting#
Sooner or later, you will encounter issues when installing a package. For me (on my Windows machine that I installed just for you, my dear students!), it started at Assignment 03, when I wanted to install rasterio in my scipro environment. I was welcomed with the message that rasterio was not available for my python version (v3.9) in the default channels. Similar problems may also occur on Linux or MacOS, by the way.
Fortunately, I was able to solve this problem and install rasterio on Windows without hassle. I will now provide you with the tools that can help you to do that. Please use these tools only after having tried the default instructions on this page, and after you have understood the basic concepts of environments as explained above. These tools work very well, but I recommend understanding some conda basics before using them.
Now, we will address two main problems of the default conda installation:
conda commands are very slow. They become much slower as the number of packages you have installed increases. There is a bunch of information on the web about that, and I won’t detail it here. One solution to this problem is mamba.
The anaconda default channels (the online repositories delivering the packages you installed if you followed the instructions above) are not always up to date with the latest developments (this is why rasterio was not yet available with python 3.9 when I tried it). The solution to this problem is conda-forge.
Solve the performance problem: mamba#
mamba is a “drop-in” replacement for conda. This means that all conda commands you know about can be replaced by mamba commands and will work just the same (but considerably faster). Using mamba is quite safe - I recommend that you avoid mixing mamba commands with conda commands though: once you have started using mamba, use only that.
To use mamba, install mamba in your (base) environment:
$ conda install mamba -n base -c conda-forge
Once installed, you can use mamba in place of the typical conda commands to install packages (see below).
For git bash users
If you use git bash instead of the anaconda prompt to start spyder, you will have to add mamba to it as well. Edit your ~/.bashrc file to add a link to mamba. Here is what mine looks like:
. /c/Users/c7071047/Miniconda3/etc/profile.d/conda.sh
mamba() { /c/Users/c7071047/Miniconda3/condabin/mamba.bat "$@" ;}
Solve the package availability problem: use conda-forge#
conda-forge is a collection of packages maintained by a large community of volunteers and the package maintainers themselves. It is usually more up-to-date than the default conda channels. To use conda-forge, we will follow the instructions in their documentation.
We start by adding conda-forge to our channels and setting it as the default channel:
$ conda config --add channels conda-forge
$ conda config --set channel_priority strict
Then, we create a new environment (just to be sure):
$ conda create --name scipro_forge
$ conda activate scipro_forge
And finally, we install the packages we need:
$ mamba install numpy scipy matplotlib ipython jupyter spyder xarray
$ mamba install rasterio  # if you want!
It’s as simple as that!
Take home points#
python is available per default on almost all linux distributions, for the simple reason that many software packages use python internally
it is not available per default on windows, but it can be installed
the system python is useful for scripting or for simple tasks, but for real-world applications you want to install python (and related packages) on your own.
the best tool to install a custom python and selected packages is conda, installed via miniconda
conda-forge and mamba are drop-in replacements for the default anaconda channels and conda install commands, respectively. They are useful for more advanced use cases
on linux, a python “installation” is simply a link to a python executable file - on Windows, it’s more complicated
Windows excels at hiding details from its users: this can be good sometimes (for example when you want to play games or run microsoft word), but as a scientist I recommend you start to think and learn about how things work on the system you use every day. In short: stick to Windows if you like it, but learn how to use it properly and learn how it works from the inside.
[1] Some people would obviously disagree with this, but I personally stand by it: for daily tasks like programming, connecting to a supercomputer or a data server, analyzing and managing data, writing a document with latex, etc., linux is better than Windows.
[2] Some updates might include changes which change the way functions are called, or -even worse- change your results. This is usually not the case for large, well-known software packages, but it can happen. The best way to ensure consistency is to write tests - as we are going to learn - and to read up on what changed in the package updates.