Fabien Maussion
These lecture notes are kept for documentation and historical purposes only! For the latest version visit:
https://fabienmaussion.info/scientific_programming
There are plenty of excellent online resources to learn Python (see references below). So why write these lecture notes? Well, for one, because none of the tutorials is organized to fit exactly the 15-week semester of Austrian universities. Therefore, I had to make some choices regarding (i) what to teach and (ii) how to teach it. However, there is no need to reinvent the wheel and rewrite what much better teachers wrote before me: as you will see, I will rely heavily on external resources, all openly available. Following the open source philosophy, these lecture notes are also freely available.
These notes are updated on the go, as the course advances. I am trying to write them in such a way that they are understandable without actually attending the course, but I strongly recommend attending both the lectures and the practicals. Let me know if you encounter any typo / mistake / broken link / incomprehensible or difficult passage.
This class aims to teach modern programming techniques to (geo-)scientists. After finishing the class, attendees should be able to program in a structured, extensible and reproducible manner. They should understand how numbers are handled by computers and be aware of numerical accuracy errors. Furthermore, they will get acquainted with various programming tools and practices (IDEs, debugger, unit testing, object oriented programming, open development practices).
The target audience for this lecture is (geo-)science students at the master level with previous experience in programming. No prior knowledge of Python is required, but I'll assume that you are familiar with a similar language (Matlab, IDL, R...). This is not an introductory course, although we will briefly revisit programming basics in order to learn the Python syntax.
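To give a first taste of what "numerical accuracy errors" means in practice, here is a minimal sketch (an illustration added for this overview, not an excerpt from the course material). It assumes standard double-precision floats, which is what Python's built-in float type uses on common platforms:

```python
import math

# 0.1 and 0.2 cannot be represented exactly in binary floating point,
# so their sum is not exactly equal to the float closest to 0.3.
total = 0.1 + 0.2

print(total == 0.3)              # False: exact comparison fails
print(math.isclose(total, 0.3))  # True: comparison within a tolerance
```

The practical lesson is that floating point results should be compared within a tolerance rather than with an exact equality test.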
The course encompasses the following topics, which are developed by means of concrete examples in the Python programming language:
Scientific programming aims to solve scientific problems with the help of computers.
It is sometimes used as a synonym for computational science, but in my opinion these are not entirely the same. "Scientific programming" is not really a discipline, and therefore cannot be taught.
What are we doing here then? Well, we are going to learn programming first, and then programming as a tool to do science. We are going to apply our new skills to scientific problems, but not exclusively. Within the time given to us, we won't be able to learn everything about programming of course. My hope is that at the end of the lecture you'll have sufficient background and tools at your disposal to solve your own problems, and (this is the most important bit) that you'll know where to find solutions to the problems you encounter.
As a scientist you are going to either produce or analyze data; most of the time you'll do both. For a long time, scientists have seen programming as a "tool", a menial task to accomplish in order to answer the questions they were asking. Nowadays programming has taken a prominent place in a scientist's work, for several reasons:
In simple words, we have to become better programmers to be faster and better at what we do: science.
We will use the Python programming language in this course. In case you are wondering why this language and not any other like <name your favorite language here>, let me stop you right away: this course is not about "learning Python", it is about learning the general concepts of programming: algorithms, numerics, program structure, object oriented programming, testing, etc. Python is just the tool I chose to use for this purpose.
We could indeed have taken any other language, but there are several advantages to using Python. A quick web search will give you millions of reasons, but let me pick some of my favorites here:
There are many other reasons to use Python (and some arguments against Python as well of course), but I don't think it's relevant to list them here. My argument is the following: for a good programmer, switching languages is not a very big deal. It's not easy of course, but it's possible. Becoming a good programmer is the hard bit, and is a never-ending process.
To access the previous course content (winter semester 2018), visit: https://fabienmaussion.info/scipro_ws2018
These notes are written as a companion to the lectures. During class I will go through the major concepts (using slides or the good old blackboard); these notes are here to help you learn at home.
The notes are a mix of examples and small exercises. The exercises can appear in between the examples and are marked with a question mark logo. If you want the notebooks I used to write the notes, you can download them from the course's repository. This repository gets updated frequently, so you might have to re-download the files from time to time.
As you go through the examples in these notes, you will find that some sentences are marked in bold: this underlines their importance for the course. When single words are bold, this indicates new concepts or new definitions: they need to be understood (and googled if needed).
The lecture will be graded based on three assessments:
[...] climvis package): 30%
[...] pelita game): mandatory
A positive evaluation of each of these elements is mandatory to pass the class!
There will be bonus points for pointing me to typos / mistakes / broken links / incomprehensible or difficult passages in these lecture notes.
Each week there will be an assignment (unless specified otherwise, e.g. during the group projects). These assignments can be worked through alone or in groups. Each week, I will ask one randomly selected group to present their results to the rest of the class.
The class grants you 5 ECTS if successfully passed: in theory, this represents 8 to 10 hours of work per week (not including holidays). For this course, it means that you will spend about twice as much time doing homework as sitting in class.
Resources used (and linked) in these lecture notes:
Linux and bash scripting:
Preparation MOOCs:
Python tutorials:
Python reference:
Testing
Floating point precision errors
Numpy
Scientific Python
Python namespaces and scopes
Object Oriented Programming
The web is full of blog posts and basic tutorials about OOP in Python. Unfortunately, most of them do a poor job of explaining why OOP can be useful and when it is not. I will try to find better resources, but for now I recommend:
Documentation
Some YouTube videos about tech in general
(this list will be updated when the notes get written further)
Searching for information online is necessary and helpful at any level of programming skill. I would even argue that good programmers are the ones who know how to efficiently find information online.
When encountering an issue, the first question you should ask yourself is: "am I the only person likely to be affected by this problem?". The answer will be no in 99% of the cases. For these, here is a list of recommendations:
If everything else fails (i.e. the remaining 1% of the cases), then:
These lecture notes and exercises are licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.