Assignment: scenario dependence of climate risks#
Import the packages. I’ll do this for you:
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
Part 1: past climate at Heathrow - observations and models#
Read the data#
Historical observed: global summary of the day#
Let’s read the observed Heathrow dataset (you should have it in your folder after last week). Let’s also fill the gaps in the time index and convert the units:
# Read the data
dfh = pd.read_csv('../data/csv/gsod-heathrow.csv', index_col=1, parse_dates=True)
# Missing values
dfh = dfh.replace([99.99, 999.9, 9999.9], np.nan)
dfh = dfh.reindex(pd.date_range(dfh.index[0], dfh.index[-1], freq='D'))
# Period selection - note that we go only from 1979 until 2014 here
dfh = dfh.loc['1979':'2014']
# Outlier filtering
dfh['PRCP'] = dfh['PRCP'].where(dfh['PRCP'] < 10)
# Keep the variables we want
dfh = dfh[['TEMP', 'MAX', 'MIN', 'PRCP']].copy()
# Convert units
dfh[['TEMP', 'MAX', 'MIN']] = (dfh[['TEMP', 'MAX', 'MIN']] - 32) * 5/9
dfh['PRCP'] = dfh['PRCP'] * 25.4
# Separate tas to stay organised
dftas = dfh[['TEMP']].copy()
dftas.columns = ['obs']
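To make the missing-value and unit steps above more concrete, here is a minimal sketch on a made-up three-day series. The values and the name toy are invented for illustration and are not part of the assignment data. Note that "filling the gaps" here means adding the missing calendar days as NaN, not interpolating them.

```python
import numpy as np
import pandas as pd

# Toy daily series (invented values): 3 January is missing, 2 January is a sentinel
idx = pd.to_datetime(['2000-01-01', '2000-01-02', '2000-01-04'])
toy = pd.DataFrame({'TEMP': [41.0, 9999.9, 50.0]}, index=idx)

# Sentinel values become NaN
toy = toy.replace([99.99, 999.9, 9999.9], np.nan)
# Reindexing on a complete daily calendar inserts the missing day as a NaN row
toy = toy.reindex(pd.date_range(toy.index[0], toy.index[-1], freq='D'))
# Fahrenheit to Celsius
toy['TEMP_C'] = (toy['TEMP'] - 32) * 5 / 9
print(toy)
```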
Exercise: familiarise yourself with the dftas dataframe. What time period does it cover? What are its units? Make sure you understand what the code above does.
# Your answer here
Reanalysis: W5E5#
Go to the download page (link to the files) to download the ISIMIP W5E5v2.0 tas dataset for Heathrow. W5E5 is derived from ERA5 and has been further corrected and blended with observational datasets to improve its representation of historical climate conditions. It provides a consistent, bias-adjusted dataset for climate impact studies (see the W5E5 description paper for more info).
I’ll read the data for you, and add a new column to dftas.
dfh_w5e5 = pd.read_csv('../data/csv/isimip/lhr/W5E5v2.0_tas_lhr_daily.csv',
                       index_col=0, parse_dates=True)
dftas['w5e5'] = dfh_w5e5['tas'] - 273.15 # Units!
Exercise: explore the dfh_w5e5 dataframe. Check the lon and lat values, and compare them to the true lon and lat values for Heathrow. Where does this difference come from?
# Your answer here
Historical simulated: gfdl-esm4#
Go to the download page (link to the files) to download the ISIMIP gfdl-esm4 historical tas dataset for Heathrow.
I’ll read the data for you, and add a new column to dftas:
dfh_hist = pd.read_csv('../data/csv/isimip/lhr/gfdl-esm4_r1i1p1f1_w5e5_historical_tas_lhr_daily.csv',
                       index_col=0, parse_dates=True)
dftas['gfdl_hist'] = dfh_hist['tas'] - 273.15 # Units!
A look into ISIMIP’s bias correction#
In Workshop 04, we established that climate models, such as GFDL-ESM4, often exhibit both systematic and random biases. Biases of several degrees for temperature or factors of 2 to 3 for precipitation amounts are not uncommon. The ISIMIP protocol for bias adjustment specifically addresses these biases to provide homogenized, bias-corrected inputs for impact models.
The ISIMIP methodology applies a process called quantile mapping to correct climate model biases, using the W5E5 dataset as a reference.
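To give a feeling for the idea, here is a minimal sketch of plain empirical quantile mapping on synthetic data. This is not the ISIMIP implementation, which is considerably more elaborate. The arrays ref and mod are invented stand-ins for a reference and a biased model series.

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic daily temperatures in °C (NOT real data)
ref = rng.normal(loc=11.0, scale=5.0, size=5000)   # stand-in for the reference
mod = rng.normal(loc=13.0, scale=7.0, size=5000)   # too warm and too variable

# Empirical quantiles of both distributions on a common probability grid
probs = np.linspace(0.01, 0.99, 99)
q_ref = np.quantile(ref, probs)
q_mod = np.quantile(mod, probs)

# Map each model value to the reference value found at the same quantile.
# np.interp clamps values outside the 1st-99th percentile range to the end
# values, so the far tails are handled crudely in this sketch.
mod_corrected = np.interp(mod, q_mod, q_ref)

print(f"mean: ref={ref.mean():.2f}, raw={mod.mean():.2f}, corrected={mod_corrected.mean():.2f}")
print(f"std:  ref={ref.std():.2f}, raw={mod.std():.2f}, corrected={mod_corrected.std():.2f}")
```

After the mapping, the corrected series has roughly the mean and spread of the reference. The clamping in the tails is one reason why extreme values deserve a separate check.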
Exercise: develop a few simple tests to verify that gfdl_hist does not exhibit strong, systematic biases compared to w5e5 at Heathrow. A well-structured analysis should include:
at least 2 key metrics to quantify the similarity between the gfdl_hist and w5e5 climatologies.
at least 3 plots to visually compare the datasets.
at least one metric related to extreme values, to check whether the bias correction properly preserves extreme events.
Hint: Keep in mind that gfdl_hist is not expected to reproduce the observed weather for specific years (e.g., whether a given year was cold or warm). Instead, it should represent the overall statistics of weather patterns well enough.
# Your answer here
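If you are unsure where to start, the following sketch shows what such metrics could look like. It runs on a synthetic stand-in dataframe called demo (an invented name, with invented data) so that it is self-contained. The choice of metrics is only one possibility among many, and the plots are left to you.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for dftas (NOT the real data): seasonal cycle plus noise
rng = np.random.default_rng(0)
time = pd.date_range('1979-01-01', '2014-12-31', freq='D')
doy = time.dayofyear.to_numpy()
seasonal = 11 + 7 * np.sin(2 * np.pi * (doy - 110) / 365.25)
demo = pd.DataFrame({
    'w5e5': seasonal + rng.normal(0, 3, len(time)),
    'gfdl_hist': seasonal + rng.normal(0, 3, len(time)),
}, index=time)

# Metric 1: mean bias (model minus reference)
mean_bias = (demo['gfdl_hist'] - demo['w5e5']).mean()
# Metric 2: difference between the monthly climatologies
monthly_clim = demo.groupby(demo.index.month).mean()
monthly_bias = monthly_clim['gfdl_hist'] - monthly_clim['w5e5']
# Extreme-value metric: difference in the 99th percentile of daily values
q99_bias = demo['gfdl_hist'].quantile(0.99) - demo['w5e5'].quantile(0.99)
# Distribution-level check: two-sample Kolmogorov-Smirnov statistic
ks_stat = stats.ks_2samp(demo['gfdl_hist'], demo['w5e5']).statistic

print(f"mean bias: {mean_bias:.3f} °C, q99 bias: {q99_bias:.3f} °C, KS: {ks_stat:.4f}")
```

All of these compare distributions rather than day-by-day values, which is consistent with the hint above: a correlation between the two daily series would not be a meaningful test here.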
Is W5E5 the same as “ground truth”?#
By now, you should have gained confidence that the bias correction method applied by ISIMIP is quite robust: most statistics are well preserved, even though extreme value statistics may differ more. However, so far, we have only compared GFDL-ESM4 with W5E5. How well does W5E5 match actual ground-truth observations at Heathrow?
Exercise: comparing W5E5 with observations. Repeat the comparisons you conducted above, but this time:
Compare W5E5 against actual observations at Heathrow instead of GFDL-ESM4.
Plot the annual average time series of temperature for both W5E5 and observations.
Confirm that W5E5, unlike a climate model, aims to represent actual weather at that location, not just long-term climate statistics.
Write down your findings, discussing any systematic biases.
Hints about expected findings:
There is a systematic bias between W5E5 and observations.
This bias is relatively constant throughout the year but tends to be slightly larger in summer.
Extreme temperatures tend to be slightly less extreme in W5E5 compared to observations.
Consider these aspects when analyzing your results.
# Your answer here
Conclusions part 1: past climate#
Despite the systematic biases between reanalysis and observations, it is very common for climate impact studies to skip the comparison step we just performed. This is mainly because:
High-quality, long-term observations like those available at Heathrow are not available for most locations.
Reanalysis datasets are not designed to represent the weather at a single location but rather the average climate over a larger grid area.
The appropriate next step depends on the specific research question:
If I am interested in predicting future climate extremes at Heathrow, I would apply an additional bias correction to both reanalysis and projected data. This is not straightforward to do well, but as a first approximation, adjusting for the systematic bias (~1.18°C for temperature) would be a reasonable starting point.
If I am only interested in changes in climate extremes, then no bias correction is necessary. Instead, I can focus on the relative changes in extremes over time. This is the approach we will take next.
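The second point can be checked with a few lines of arithmetic: a constant offset cancels out as soon as you take a difference between two periods. The sketch below uses invented annual maxima, and the offset is treated as a hypothetical constant whose sign would need to be established from the data.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic annual maxima in °C for a past and a future period (NOT real data)
past = rng.normal(30.0, 1.5, size=36)
future = rng.normal(32.0, 1.5, size=36)

offset = 1.18  # hypothetical constant correction; the sign depends on the data

# Change between periods, without and with the constant correction
change_raw = future.mean() - past.mean()
change_corrected = (future + offset).mean() - (past + offset).mean()

print(change_raw, change_corrected)
```

This only holds for a constant, additive bias. A bias that varies with the season or with the temperature itself would not cancel so neatly.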
Part 2: future climate at Heathrow#
Now return to the download page (link to the files), and download the gfdl-esm4 projections for the scenarios ssp126 and ssp585 at Heathrow. Put them in the same folder, and let me read the data for you:
# Read SSP126
dfh_ssp = pd.read_csv('../data/csv/isimip/lhr/gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lhr_daily.csv',
                      index_col=0, parse_dates=True)
dftas_ssp = dfh_ssp[['tas']] - 273.15 # Units!
dftas_ssp.columns = ['ssp126']
# Read SSP585
dftas_ssp['ssp585'] = pd.read_csv('../data/csv/isimip/lhr/gfdl-esm4_r1i1p1f1_w5e5_ssp585_tas_lhr_daily.csv',
                                  index_col=0, parse_dates=True)['tas'] - 273.15 # Units!
Exercise: explore the dftas_ssp dataframe. What period does it cover? What is the time resolution? Etc. Now plot the annual averages of temperature for w5e5, ssp126, and ssp585 on the same timeseries plot.
# Your answer here
Exercise: now compute the following three variables:
annual_max_hist: annual maximum temperature for w5e5 over the historical period
annual_max_ssp126: annual maximum temperature for ssp126 over the period 2065-2100
annual_max_ssp585: annual maximum temperature for ssp585 over the period 2065-2100
(verify that the length is 36 years in all three cases)
# Your answer here
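One possible pattern for this step is sketched below on a synthetic daily series: select the period with a label slice, then group by calendar year. The dataframe demo_ssp and its values are invented; grouping on the index year is just one option (resample would work too).

```python
import numpy as np
import pandas as pd

# Synthetic daily series (NOT real data), only here to make the sketch runnable
rng = np.random.default_rng(2)
time = pd.date_range('2050-01-01', '2100-12-31', freq='D')
doy = time.dayofyear.to_numpy()
demo_ssp = pd.DataFrame(
    {'ssp585': 12 + 7 * np.sin(2 * np.pi * (doy - 110) / 365.25)
               + rng.normal(0, 3, len(time))},
    index=time)

# Select the period, then take the maximum within each calendar year
selected = demo_ssp['ssp585'].loc['2065':'2100']
annual_max_demo = selected.groupby(selected.index.year).max()

print(len(annual_max_demo), annual_max_demo.index[0], annual_max_demo.index[-1])
```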
Now fit a GEV distribution for each of these samples, like we did in the lesson.
# Your answer here
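As a reminder of the mechanics, here is a minimal sketch of a GEV fit with scipy.stats.genextreme on a synthetic sample. The sample is drawn from a GEV with invented parameters, so it stands in for your annual maxima but is not real data. Keep in mind that scipy's shape parameter c has the opposite sign to the shape parameter ξ used in most textbooks.

```python
import numpy as np
from scipy import stats

# Synthetic sample of 36 "annual maxima" from a known GEV (NOT real data)
sample = stats.genextreme.rvs(c=0.1, loc=30.0, scale=1.5, size=36, random_state=3)

# Maximum-likelihood fit; scipy returns (shape, location, scale)
shape, loc, scale = stats.genextreme.fit(sample)

print(f"shape={shape:.3f}, loc={loc:.2f}, scale={scale:.2f}")
```

With only 36 values, the fitted parameters (the shape in particular) are quite uncertain, which is worth remembering when interpreting long return periods.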
Plot the GEV return values / periods for all three samples (hist, ssp126, ssp585) on the same plot. I’m looking for a plot similar to the one we discussed during the lecture (the superstorm Sandy sea-level with and without anthropogenic climate change).
# Your answer here
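The values behind such a plot can be computed as sketched below. The parameter tuples are hypothetical placeholders (not fitted to any data) and should be replaced with your own fits. The sketch relies on the definition that the T-year return level is exceeded with probability 1/T in any given year.

```python
import numpy as np
from scipy import stats

# Hypothetical (shape, loc, scale) values; replace with your own fitted parameters
params = {
    'hist': (0.1, 30.0, 1.5),
    'future': (0.1, 32.0, 1.5),
}

return_periods = np.array([2, 5, 10, 20, 50, 100, 200])

# isf is the inverse survival function: the level exceeded with probability 1/T
return_levels = {
    name: stats.genextreme.isf(1 / return_periods, c, loc=loc, scale=scale)
    for name, (c, loc, scale) in params.items()
}

for name, levels in return_levels.items():
    print(name, np.round(levels, 2))
```

For the plot itself, one option is to draw each curve with plt.semilogx(return_periods, levels), so that the return period axis is logarithmic.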
Finally, compute the return level of a 100-year event according to W5E5 at Heathrow during the historical period. Then, use this value to determine the return periods of this event under each SSP scenario. Write down the increased likelihood of such an event in each scenario.
Hint: Your answer should be in the form of a statement like:
“By 2100, the return period of a present-day 100-year event in the SSP126 scenario is x years (y times more likely).”
# Your answer here
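The logic of this last step can be sketched as follows, again with hypothetical GEV parameters rather than fitted ones: compute the 100-year level from the historical distribution, then ask how often that same level is exceeded under the future distribution.

```python
from scipy import stats

# Hypothetical (shape, loc, scale) values; replace with your own fitted parameters
hist_params = (0.1, 30.0, 1.5)
future_params = (0.1, 32.0, 1.5)

# Level exceeded with probability 1/100 per year in the historical distribution
level_100 = stats.genextreme.isf(1 / 100, *hist_params)

# Annual exceedance probability of that same level in the future distribution
p_future = stats.genextreme.sf(level_100, *future_params)
period_future = 1 / p_future

# How many times more likely the event becomes
likelihood_ratio = 100 / period_future

print(f"level: {level_100:.2f} °C, future return period: {period_future:.1f} years, "
      f"{likelihood_ratio:.1f} times more likely")
```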
Going further (optional): other variables / locations, include uncertainties#
The data download page lists several other locations / variables where you could run an extreme value analysis. You may also want to consider uncertainties!