Assignment 08

Due date: 24.05.2023

This week’s assignment has to be returned in the form of a Jupyter notebook.

Don’t forget the instructions!

01 - Data preparation

Rewrite the acinn_meteo_data function we used in week 05 to return a pandas dataframe instead of individual arrays. When creating the dataframe, rename the columns as follows:

  • “rr” -> “rainfall” (invalid: < 0)

  • “dd” -> “wind_dir” (invalid: < 0)

  • “ff” -> “wind_speed” (invalid: < 0)

  • “tp” -> “dewpoint” (invalid: < -50)

  • “p” -> “pressure” (invalid: < 500)

  • “tl” -> “temperature” (invalid: < -50)

  • “so” -> “sunshine_min” (invalid: < 0)

  • “rf” -> “relative_humidity” (invalid: < 0)

You should ignore the datumsec key and instead use the converted time as the index of the dataframe you’ll create. Not all stations have all the variables! Your algorithm should work whether or not a given variable is available.

Filter for missing data before placing the data in the pandas dataframe: replace all values which are below the invalid threshold in the list above with np.nan.
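As a rough illustration of the renaming and filtering steps described above, here is a minimal sketch on a small made-up payload. The dictionary `raw`, its values, and the helper name `rules` are assumptions for demonstration only; the real server response may differ, and this is not meant as the reference solution.

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

# Hypothetical payload mimicking the JSON structure (values are made up)
raw = {
    "datumsec": [0, 600_000, 1_200_000],
    "tl": [12.3, -99.9, 14.0],
    "rr": [0.0, -1.0, 0.2],
}

# original key -> (new column name, threshold below which data is invalid)
# Only a subset of the list above; "p" is deliberately absent from `raw`.
rules = {
    "tl": ("temperature", -50),
    "rr": ("rainfall", 0),
    "p": ("pressure", 500),
}

# Same time conversion as in the function template
time = [datetime(1970, 1, 1) + timedelta(milliseconds=ms) for ms in raw["datumsec"]]

columns = {}
for key, (name, threshold) in rules.items():
    if key not in raw:
        continue  # this station does not provide the variable
    values = np.asarray(raw[key], dtype=float)
    values[values < threshold] = np.nan  # mask invalid values before building the frame
    columns[name] = values

df = pd.DataFrame(columns, index=pd.DatetimeIndex(time))
print(df)
```

Converting to a float array first means np.nan can be stored without changing the dtype afterwards.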

def acinn_meteo_data(station="innsbruck", ndays=3):
    """Parse live meteorological data from the ACINN servers.

    Requires an internet connection and the pandas library!

    Parameters
    ----------
    station : str
        one of "innsbruck", "obergurgl"
    ndays : int
        either 3 or 7 days

    Returns
    -------
    df : pd.DataFrame
        the meteorological data in a pandas dataframe

    Examples
    --------
    >>> df = acinn_meteo_data()
    >>> type(df)
    <class 'pandas.core.frame.DataFrame'>
    >>> type(df.index)
    <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
    >>> len(df.columns)
    7
    >>> df['temperature'].dtype
    dtype('float64')
    """
    from urllib.request import Request, urlopen
    from datetime import datetime, timedelta
    import json

    url = f'https://acinn-data.uibk.ac.at/{station}/{ndays}'
    req = urlopen(Request(url)).read()
    # Read the data
    data = json.loads(req.decode('utf-8'))

    # Convert the time
    time = [datetime(1970, 1, 1) + timedelta(milliseconds=ds) for ds in data['datumsec']]
    
    <your code here>
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Your answer here
import doctest
doctest.testmod()
TestResults(failed=0, attempted=0)

02 - Wind

1. Write a new function called sd_to_uv which accepts two arrays (or equivalent) as input, wind speed and wind direction, and converts the values into u and v, the vector components of the wind.

def sd_to_uv(speed, direction):
    """Converts wind speed and direction to (u, v) vector components.

    Parameters
    ----------
    speed : ndarray-like
        wind speed in m/s
    direction : ndarray-like
        wind direction in degrees, meteorological convention (0° = North)

    Returns
    -------
    (u, v) : wind u and v vector components (unit: m/s)

    Examples
    --------
    >>> from numpy.testing import assert_allclose
    >>> assert_allclose(sd_to_uv(1, 90), [-1, 0], atol=1e-7)
    >>> assert_allclose(sd_to_uv(0, 234), [0., 0.], atol=1e-7)
    >>> u, v = sd_to_uv([1, 1, 1], [0, 90, 180])
    >>> assert_allclose(u, [0, -1, 0], atol=1e-7)
    >>> assert_allclose(v, [-1, 0, 1], atol=1e-7)
    """
    <your code here>

2. Write another function called uv_to_sd which does the conversion in the other direction.

def uv_to_sd(u, v):
    """Converts (u, v) wind vector components to wind speed and direction.

    Parameters
    ----------
    u : ndarray-like
        u-component of the wind speed in m/s
    v : ndarray-like
        v-component of the wind speed in m/s

    Returns
    -------
    (speed, direction) : wind speed (unit: m/s) and direction (° in the meteorological convention)

    Examples
    --------
    >>> from numpy.testing import assert_allclose
    >>> assert_allclose(uv_to_sd(1, 1), [2**0.5, 225])
    >>> assert_allclose(uv_to_sd(-1, -1), [2**0.5, 45])
    >>> u, v = sd_to_uv([1, 2], [90, 235])
    >>> s, d = uv_to_sd(u, v)  # round trip
    >>> assert_allclose(s, [1, 2])
    >>> assert_allclose(d, [90, 235])
    """
    <your code here>

This webpage contains all the info you need to compute this conversion. Don’t forget to run the tests!
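For orientation, the meteorological convention (direction is where the wind blows from, 0° = North) can be checked on a single value. The snippet below is only a scalar sketch using the standard `math` module, with a made-up westerly wind of 5 m/s; the signs are consistent with the doctests above, but the array-based functions are still yours to write.

```python
import math

# Made-up example: a 5 m/s wind blowing FROM the west (270°)
speed, direction = 5.0, 270.0

rad = math.radians(direction)
u = -speed * math.sin(rad)  # eastward component, here approximately +5
v = -speed * math.cos(rad)  # northward component, here approximately 0

# Back-conversion: the minus signs flip "blowing towards" into "coming from"
speed_back = math.hypot(u, v)
direction_back = math.degrees(math.atan2(-u, -v)) % 360

print(u, v, speed_back, direction_back)
```

A wind from the west moves air towards the east, which is why u comes out positive.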

To help you with one particular aspect, here is a useful function that I recommend using in your code:

def check_wind_dir(direction):
    """Makes sure that a wind direction value is comprised between 0 and 360°.

    Parameters
    ----------
    direction : ndarray-like
        wind direction in degrees, in the range [-360; +720]

    Returns
    -------
    direction : the direction, mapped to the [0; 360[ range.

    Examples
    --------
    >>> check_wind_dir(0)
    0
    >>> check_wind_dir(360)
    0
    >>> print(check_wind_dir([-180, 90, 420]))
    [180  90  60]
    """
    direction = np.asanyarray(direction)
    if np.any(direction < -360):
        raise ValueError("Invalid wind direction value")
    return np.fmod(direction + 360, 360)


doctest.testmod()
TestResults(failed=0, attempted=3)
# Your answer here
doctest.testmod()
TestResults(failed=0, attempted=3)

3. Use the sd_to_uv function to add two columns to the Innsbruck dataframe: u_wind and v_wind. Tip: this is very easy to do and does not require any complicated pandas function! Start by noticing that sd_to_uv and uv_to_sd return numpy arrays regardless of the type of the input (even with pd.Series), and then remember that numpy arrays can be added as columns to a pd.DataFrame.

4. Now compute the average wind statistics for the 3-day period (average speed, average direction), noting that wind speed is best averaged from the original data, whereas wind direction is best averaged in vector space (from the mean u and v) and then converted back to a direction. Compare the “naive” average with the “more correct” one.
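To see why the naive average of directions can mislead, consider a toy dataframe with two made-up observations on either side of north. The sketch below inlines the trigonometry instead of calling sd_to_uv, so that it runs on its own; the data and variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample: two unit-speed observations straddling north
df = pd.DataFrame({"wind_speed": [1.0, 1.0], "wind_dir": [340.0, 40.0]})

speed = df["wind_speed"].to_numpy()
rad = np.deg2rad(df["wind_dir"].to_numpy())

# Plain numpy arrays can be assigned directly as new columns
df["u_wind"] = -speed * np.sin(rad)
df["v_wind"] = -speed * np.cos(rad)

# Naive average: arithmetic mean of the angles
naive_dir = df["wind_dir"].mean()

# Vector average: mean of the components, converted back to an angle
u_mean, v_mean = df["u_wind"].mean(), df["v_wind"].mean()
vector_dir = np.mod(np.rad2deg(np.arctan2(-u_mean, -v_mean)), 360)

print(naive_dir, vector_dir)
```

Here the naive mean lands at 190° (roughly southerly), while the vector mean gives about 10°, the bisector of the two directions.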

# Your answer here

03 - Station comparison

For the two stations 'innsbruck' and 'obergurgl', parse the data and put all temperature variables in a single merged dataframe looking like this:

                     innsbruck  obergurgl
2023-05-10 16:10:00       11.4        2.2
2023-05-10 16:20:00       11.6        2.0
2023-05-10 16:30:00       11.6        1.9
2023-05-10 16:40:00       11.4        1.9
2023-05-10 16:50:00       11.4        1.8

Then, plot them all together in a single plot with legend.

Which station recorded the warmest temperature over the period? How do their standard deviations compare with each other?
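One possible way to build such a merged dataframe is pd.concat along the columns, sketched here with the five rows shown in the table above standing in for the parsed temperature series. The variable names are illustrative; in the assignment the two series would come from the dataframes returned by acinn_meteo_data.

```python
import pandas as pd

# Stand-in series built from the sample rows shown above
index = pd.date_range("2023-05-10 16:10", periods=5, freq="10min")
innsbruck = pd.Series([11.4, 11.6, 11.6, 11.4, 11.4], index=index)
obergurgl = pd.Series([2.2, 2.0, 1.9, 1.9, 1.8], index=index)

# The dict keys become the column names; rows are aligned on the time index
merged = pd.concat({"innsbruck": innsbruck, "obergurgl": obergurgl}, axis=1)

warmest_station = merged.max().idxmax()  # column holding the overall maximum
spread = merged.std()                    # one standard deviation per station

print(merged)
print(warmest_station)
print(spread)
```

With matplotlib installed, calling merged.plot() should draw both columns in one figure with a legend.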

# Your answer here