The numpy and matplotlib libraries are two fundamental pillars of the scientific Python stack. You will find numerous tutorials for both libraries online. I am asking you to learn the basics of both tools by yourself, at the pace that suits you. I can recommend these two tutorials:
They can be quite long if you are new to numpy - I'm not asking you to do them all today! Sections 1.3.1.1 to 1.3.1.5 of the numpy tutorial should give you enough information for today's lectures.
This exercise can be done on a Linux machine only! You can use the university workstations for this.
Here is the C code sample from the lecture:
#include <stdio.h>

int main()
{
    int a = 2;
    int b = 3;
    int c = a + b;
    printf("Sum of two numbers : %d \n", c);
}
Write this code in a C source file, then compile and run it.
Now, replace the line int b = 3; with char b[] = "Hello"; and compile and run the program again (ignore the warnings at compilation). Does the output match your expectations? Can you explain what happens? Compare this behavior to Python's, and try to explain why this behavior can lead to faster execution times.
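For comparison, here is a minimal Python sketch of the same experiment (the variable names simply mirror the C code):

a = 2
b = 3
print('Sum of two numbers:', a + b)  # works: both names point to integers

b = 'Hello'  # in Python, a name can be re-bound to an object of any type
try:
    print(a + b)
except TypeError as err:
    # Python checks the operand types at runtime and refuses the addition
    print('TypeError:', err)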
A simple way to estimate $\pi$ using a computer is based on a Monte Carlo method: draw a sample of N points with random 2D coordinates (x, y) in the [0, 1[ range; the fraction of points that fall within the unit circle (i.e. with $x^2 + y^2 \leq 1$) gives an estimate of $\pi / 4$.
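In other words, if $N_{in}$ is the number of sampled points with $x^2 + y^2 \leq 1$, the estimate is $\pi \approx 4 \, N_{in} / N$ (the quarter disc has area $\pi / 4$, while the sampled square has area 1).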
Provide two implementations of the Monte Carlo estimation of $\pi$: a pure Python version (standard library only) and a vectorized version using numpy. Time their execution for N = [1e2, 1e3, ..., 1e7]. Plot the numpy speed-up as a function of N. (A sketch to get you started is given after the tip below.)
Optional: try the numpy version with N = 1e8 and above. Draw conclusions about the new trade-off that appears for large values of N.
Tip: you can mimic %timeit in your code by running each function at least three times and keeping the fastest of the three executions.
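Here is a minimal sketch of how the two implementations and the timing could look (the function names are my own, and the timing helper simply follows the tip above):

import random
import timeit
import numpy as np

def pi_python(n):
    """Pure Python (standard library) Monte Carlo estimate of pi."""
    count = 0
    for _ in range(n):
        x, y = random.random(), random.random()
        if x**2 + y**2 <= 1:
            count += 1
    return 4 * count / n

def pi_numpy(n):
    """Vectorized Monte Carlo estimate of pi."""
    x = np.random.random(n)
    y = np.random.random(n)
    return 4 * np.count_nonzero(x**2 + y**2 <= 1) / n

def best_time(func, n, repeat=3):
    """Fastest of `repeat` executions, in seconds (mimics %timeit)."""
    return min(timeit.repeat(lambda: func(n), number=1, repeat=repeat))

n = 10_000
print(pi_python(n), pi_numpy(n))
# the numpy speed-up for this value of n:
print(best_time(pi_python, n) / best_time(pi_numpy, n))

Repeating the last step for each value of N gives the data for the speed-up plot.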
Monthly averages of temperature data at Innsbruck can be downloaded from this lecture's GitHub repository via:
from urllib.request import Request, urlopen
import json
# Download the data from the given url
url = 'https://raw.githubusercontent.com/fmaussion/scientific_programming/master/data/innsbruck_temp.json'
req = urlopen(Request(url)).read()
# Read the data
inn_data = json.loads(req.decode('utf-8'))
(original data obtained from NOAA's Global Surface Summary of the Day)
Explore the inn_data variable. What is the type of inn_data, and of the data it contains? Convert the data series to numpy arrays.
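As a starting point, the exploration could look like the sketch below. Note that the key names used for the conversion ('TIME' and 'TEMP') are placeholders: check the actual keys first.

import numpy as np

print(type(inn_data))   # likely a dict: json.loads maps json objects to dicts
print(inn_data.keys())  # which data series are available?

# Convert the series to numpy arrays. 'TIME' and 'TEMP' are placeholder
# key names: replace them with the keys you found above.
time = np.array(inn_data['TIME'])
temp = np.array(inn_data['TEMP'])
print(time.shape, temp.shape)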
Using numpy/scipy, matplotlib, and the standard library only, compute and plot the mean monthly annual cycle for 1981-2010 and the mean annual temperature time series for 1977-2017. Compute the linear trend (using scipy.stats.linregress) of the average annual temperature over 1977-2017. Repeat with the winter (DJF) and summer (JJA) trends. (A sketch of one possible approach is given after the tips below.)
Tip 1: to select part of an array (indexing) based on a condition, you can use the following syntax:
import numpy as np
x = np.arange(10)
y = x**2
y[x > 4] # select y based on the values in x
Tip 2: there is more than one way to compute the annual and monthly means. Some use loops, some use reshaping on the original 1D array.
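As an illustration of the reshaping approach mentioned in Tip 2, here is a minimal sketch of one possible approach. It assumes that temp is a 1D numpy array of monthly means covering the complete years 1977-2017 (41 * 12 values) starting in January 1977; adapt it to the actual data.

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

years = np.arange(1977, 2018)
temp_2d = temp.reshape(len(years), 12)   # one row per year, one column per month

# Mean monthly annual cycle over 1981-2010
sel = (years >= 1981) & (years <= 2010)
annual_cycle = temp_2d[sel, :].mean(axis=0)
plt.plot(np.arange(1, 13), annual_cycle)
plt.show()

# Mean annual temperature time series and its linear trend over 1977-2017
annual_means = temp_2d.mean(axis=1)
res = stats.linregress(years, annual_means)
plt.plot(years, annual_means)
plt.plot(years, res.intercept + res.slope * years)
plt.show()
print('Trend: {:.3f} °C per year (p = {:.3f})'.format(res.slope, res.pvalue))

# For the seasonal trends, average the relevant columns only, e.g. JJA is
# temp_2d[:, 5:8].mean(axis=1). Be careful with DJF: December belongs to the
# following year's winter.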