Assignment 08
Due date: 24.05.2023
This week’s assignment has to be returned in the form of a Jupyter notebook.
Don’t forget the instructions!
01 - Data preparation
Rewrite the acinn_meteo_data function we used in week 05 to return a pandas dataframe instead of individual arrays. When creating the dataframe, rename the columns as follows:
“rr” -> “rainfall” (invalid: < 0)
“dd” -> “wind_dir” (invalid: < 0)
“ff” -> “wind_speed” (invalid: < 0)
“tp” -> “dewpoint” (invalid: < -50)
“p” -> “pressure” (invalid: < 500)
“tl” -> “temperature” (invalid: < -50)
“so” -> “sunshine_min” (invalid: < 0)
“rf” -> “relative_humidity” (invalid: < 0)
You should ignore the datumsec key and instead use the converted time as the index of the dataframe you’ll create. Not all stations have all the variables! Your algorithm should work regardless of whether a variable is available or not.
Filter for missing data before placing the data in the pandas dataframe: replace all values which are below the invalid threshold in the list above with np.nan.
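As a minimal sketch of the filtering step, assuming a small hand-made dictionary in place of the real server response (the values and the names `rename` and `invalid_below` are illustrative only, and only a few of the variables are listed):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the parsed JSON; a real response has more keys.
data = {"tl": [12.1, -99.9, 13.0], "rr": [0.0, 0.2, -1.0]}
time = pd.date_range("2023-05-10 16:10", periods=3, freq="10min")

# Subset of the renaming table and invalid thresholds, for illustration.
rename = {"tl": "temperature", "rr": "rainfall", "so": "sunshine_min"}
invalid_below = {"tl": -50, "rr": 0, "so": 0}

columns = {}
for key, new_name in rename.items():
    if key not in data:
        continue  # variable not available at this station ("so" here)
    values = np.asarray(data[key], dtype=float)
    values[values < invalid_below[key]] = np.nan  # mask invalid values
    columns[new_name] = values

df = pd.DataFrame(columns, index=time)
```

Looping over the renaming table and skipping absent keys is one way to cope with stations that lack some variables; other designs are possible.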
def acinn_meteo_data(station="innsbruck", ndays=3):
    """Parse live meteorological data from the ACINN servers.

    Requires an internet connection and the pandas library!

    Parameters
    ----------
    station : str
        one of "innsbruck", "obergurgl"
    ndays : int
        either 3 or 7 days

    Returns
    -------
    df : pd.DataFrame
        the meteorological data in a pandas dataframe

    Examples
    --------
    >>> df = acinn_meteo_data()
    >>> type(df)
    <class 'pandas.core.frame.DataFrame'>
    >>> type(df.index)
    <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
    >>> len(df.columns)
    7
    >>> df['temperature'].dtype
    dtype('float64')
    """
    from urllib.request import Request, urlopen
    from datetime import datetime, timedelta
    import json
    url = f'https://acinn-data.uibk.ac.at/{station}/{ndays}'
    req = urlopen(Request(url)).read()
    # Read the data
    data = json.loads(req.decode('utf-8'))
    # Convert the time
    time = [datetime(1970, 1, 1) + timedelta(milliseconds=ds) for ds in data['datumsec']]
    <your code here>
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Your answer here
import doctest
doctest.testmod()
TestResults(failed=0, attempted=0)
02 - Wind
1. write a new function called sd_to_uv which accepts two arrays as input (or equivalent), wind speed and wind direction, and converts the values into u and v, the vector components of the wind.
def sd_to_uv(speed, direction):
    """Converts wind speed and direction to (u, v) vector components.

    Parameters
    ----------
    speed : ndarray-like
        wind speed in m/s
    direction : ndarray-like
        wind direction in degrees, meteorological convention (0° = North)

    Returns
    -------
    (u, v) : wind u and v vector components (unit: m/s)

    Examples
    --------
    >>> from numpy.testing import assert_allclose
    >>> assert_allclose(sd_to_uv(1, 90), [-1, 0], atol=1e-7)
    >>> assert_allclose(sd_to_uv(0, 234), [0., 0.], atol=1e-7)
    >>> u, v = sd_to_uv([1, 1, 1], [0, 90, 180])
    >>> assert_allclose(u, [0, -1, 0], atol=1e-7)
    >>> assert_allclose(v, [-1, 0, 1], atol=1e-7)
    """
    <your code here>
2. write another function called uv_to_sd which does the conversion in the other direction.
def uv_to_sd(u, v):
    """Converts (u, v) wind vector components to wind speed and direction.

    Parameters
    ----------
    u : ndarray-like
        u-component of the wind speed in m/s
    v : ndarray-like
        v-component of the wind speed in m/s

    Returns
    -------
    (speed, direction) : wind speed (unit: m/s) and direction (° in the meteorological convention)

    Examples
    --------
    >>> from numpy.testing import assert_allclose
    >>> assert_allclose(uv_to_sd(1, 1), [2**0.5, 225])
    >>> assert_allclose(uv_to_sd(-1, -1), [2**0.5, 45])
    >>> u, v = sd_to_uv([1, 2], [90, 235])
    >>> s, d = uv_to_sd(u, v)  # round trip
    >>> assert_allclose(s, [1, 2])
    >>> assert_allclose(d, [90, 235])
    """
    <your code here>
This webpage contains all the info you need to compute this conversion. Don’t forget to run the tests!
To help you out with one particular aspect, here is a useful function that I recommend using in your code:
def check_wind_dir(direction):
    """Makes sure that a wind direction value is comprised between 0 and 360°.

    Parameters
    ----------
    direction : ndarray-like
        wind direction in degrees, in the range [-360; +720]

    Returns
    -------
    direction : the direction, mapped to the [0; 360[ range.

    Examples
    --------
    >>> check_wind_dir(0)
    0
    >>> check_wind_dir(360)
    0
    >>> print(check_wind_dir([-180, 90, 420]))
    [180  90  60]
    """
    direction = np.asanyarray(direction)
    if np.any(direction < -360):
        raise ValueError("Invalid wind direction value")
    return np.fmod(direction + 360, 360)
doctest.testmod()
TestResults(failed=0, attempted=3)
# Your answer here
doctest.testmod()
TestResults(failed=0, attempted=3)
3. use the sd_to_uv function to add two columns to the Innsbruck dataframe: u_wind and v_wind. Tip: this is very easy to do and does not require any complicated pandas function! Start by noticing that sd_to_uv and uv_to_sd return numpy arrays regardless of the type of the input (even with pd.Series), and then remember that numpy arrays can be added as columns to a pd.DataFrame.
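The mechanics of the tip can be sketched on a toy dataframe. The function `to_arrays` and the column names `col_a` and `col_b` below are hypothetical placeholders and do not perform the wind conversion; they only mimic a function that takes pd.Series and returns plain numpy arrays:

```python
import numpy as np
import pandas as pd

# Toy dataframe standing in for the Innsbruck data.
df = pd.DataFrame({"wind_speed": [1.0, 2.0], "wind_dir": [90.0, 180.0]})


def to_arrays(speed, direction):
    """Hypothetical placeholder returning two plain numpy arrays."""
    return np.asarray(speed) * 2, np.asarray(direction) + 1


a, b = to_arrays(df["wind_speed"], df["wind_dir"])

# Plain item assignment is enough: the arrays have one value per row.
df["col_a"] = a
df["col_b"] = b
```

Because the arrays have the same length as the dataframe, pandas assigns them row by row without any index alignment.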
4. now compute the average wind statistics for the 3-day period (average speed, average direction), noting that wind speed is best averaged from the original data, while wind direction is best averaged in vector space (u, v) and then converted back to a direction. Compare the “naive” average with the “more correct” one.
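To see why the naive average of directions can mislead, here is a small sketch with two made-up observations on either side of North. It assumes the sign convention implied by the doctests above (direction is where the wind comes from), and the variable names are illustrative:

```python
import numpy as np

direction = np.array([340.0, 40.0])  # two roughly northerly winds, degrees
speed = np.array([1.0, 1.0])         # m/s

# Naive arithmetic mean: 190 degrees, i.e. a southerly wind.
naive_dir = direction.mean()

# Meteorological convention: components point where the wind blows to.
rad = np.deg2rad(direction)
u = -speed * np.sin(rad)
v = -speed * np.cos(rad)

# Average the components, then convert the mean vector back to a direction.
vector_dir = np.rad2deg(np.arctan2(-u.mean(), -v.mean())) % 360
```

With these values the vector-space average comes out at about 10°, consistent with two winds from the northern sector, whereas the naive mean points in the opposite direction.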
# Your answer here
03 - Station comparison
For the two stations 'innsbruck' and 'obergurgl', parse the data and put all temperature variables in a single merged dataframe looking like this:
                     innsbruck  obergurgl
2023-05-10 16:10:00       11.4        2.2
2023-05-10 16:20:00       11.6        2.0
2023-05-10 16:30:00       11.6        1.9
2023-05-10 16:40:00       11.4        1.9
2023-05-10 16:50:00       11.4        1.8
Then, plot them all together in a single plot with legend.
Which station recorded the highest temperature over the period? How do their standard deviations compare with each other?
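One possible way to build such a merged dataframe is sketched below, using made-up values in place of the parsed station data (the names `df_a` and `df_b` are hypothetical). Plotting is left out of the sketch:

```python
import pandas as pd

idx = pd.date_range("2023-05-10 16:10", periods=3, freq="10min")

# Made-up values standing in for the two parsed station dataframes.
df_a = pd.DataFrame({"temperature": [11.4, 11.6, 11.6]}, index=idx)
df_b = pd.DataFrame({"temperature": [2.2, 2.0, 1.9]}, index=idx)

# Align both temperature series on their time index, one column per station.
merged = pd.concat(
    {"innsbruck": df_a["temperature"], "obergurgl": df_b["temperature"]},
    axis=1,
)

warmest_station = merged.max().idxmax()  # column holding the overall maximum
spread = merged.std()                    # sample standard deviation per column
```

Passing a dictionary to pd.concat uses its keys as column names, which avoids a separate renaming step.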
# Your answer here