Lesson: more tools for statistical and climatological analysis#

Last week we learned how to open a NetCDF file (a data format which is very common in atmospheric applications), select variables, do simple statistics on them, and plot them. Today we are going to introduce some more data-crunching tools and we are going to learn how to make our plots more precise and informative.

More control on plots#

Let’s compute the time average of the air temperature:

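import matplotlib.pyplot as plt  # plotting
import cartopy.crs as ccrs  # map projections
# `ds` is the dataset opened earlier in the lesson
t2_tavg = ds.t2m.mean(dim='time')  # average over the time dimension
ax = plt.axes(projection=ccrs.Robinson())
t2_tavg.plot(ax=ax, transform=ccrs.PlateCarree())
ax.coastlines(); ax.gridlines();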

../_images/01_Lesson_MoreDataCrunching_13_0.png

Discrete levels#

Smooth (continuous) color tables like the one above “look nice”, but the human eye is not good at telling such small differences in color apart. For example, it would be quite difficult to tell the temperature off the Peruvian coast (above 280 K? or below?). Sometimes, discrete levels are the way to go:

ax = plt.axes(projection=ccrs.Robinson())
t2_tavg.plot(ax=ax, transform=ccrs.PlateCarree(), levels=[240, 260, 280, 285, 290, 295, 300]) 
ax.coastlines(); ax.gridlines(); 

../_images/01_Lesson_MoreDataCrunching_16_0.png
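To see how a list of levels partitions the data, independently of any plot, the following sketch uses NumPy's `np.digitize` to find the interval each value falls into. The sample temperatures are made up for illustration and are not taken from the ERA5 data.

```python
import numpy as np

# Same level boundaries as in the plot above (in K)
levels = np.array([240, 260, 280, 285, 290, 295, 300])

# Hypothetical temperatures, for illustration only
samples = np.array([250.0, 283.0, 291.5])

# np.digitize returns i such that levels[i-1] <= value < levels[i]
idx = np.digitize(samples, levels)

for value, i in zip(samples, idx):
    print(f"{value} K lies between {levels[i - 1]} K and {levels[i]} K")
```

Every value inside one interval is drawn with the same color, so the width of each interval sets how precisely a temperature can be read off the map.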

Q: What is the use of the unevenly spaced levels we set? In which range can we place the temperature off the Peruvian coast? If your eyes still can't make out the difference, how can we be sure?

E: Make a new plot, but this time set levels=12.

# your answer here

Color tables#

Let’s make a new variable called t2c_tavg, which is t2_tavg converted to degrees Celsius:

t2c_tavg = t2_tavg - 273.15
ax = plt.axes(projection=ccrs.Robinson())
t2c_tavg.plot(ax=ax, transform=ccrs.PlateCarree()) 
ax.coastlines(); ax.gridlines(); 

../_images/01_Lesson_MoreDataCrunching_21_0.png

What happened to our plot? Note the location of the 0 on the colorbar and the automated choice of a new colorscale: because the data now contain both negative and positive values, xarray switches to a diverging colormap centered on 0. Note also that the data range is mostly dictated by very cold temperatures in Antarctica. These automated choices are not always meaningful. Let’s play around a little bit:

ax = plt.axes(projection=ccrs.Robinson())
t2c_tavg.plot(ax=ax, transform=ccrs.PlateCarree(), cmap='inferno', center=False, 
              vmin=-40, vmax=20, levels=7, cbar_kwargs={'label': '°C'}) 
ax.set_title('Average annual 2m air temperature, ERA5 1979-2018')
ax.coastlines(); ax.gridlines(); 

../_images/01_Lesson_MoreDataCrunching_23_0.png

Q: Try to understand the role of each keyword by using each of them separately. If you’re still unsure, a look at xarray’s documentation might be helpful.

Note: a list of matplotlib’s color tables can be found in the matplotlib documentation.

# your playground here
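The automatic color scale that appeared after the conversion to °C can be understood with a rough sketch: when the data contain both negative and positive values and no limits are given, xarray centers the scale on zero and makes it symmetric. The helper `symmetric_limits` below is hypothetical (it is not an xarray function) and only mimics that behaviour in simplified form; the sample values are made up.

```python
import numpy as np

def symmetric_limits(data, center=0.0):
    """Hypothetical helper: color limits symmetric around `center`.

    If the data do not straddle `center`, the plain min/max are returned.
    """
    vmin, vmax = np.nanmin(data), np.nanmax(data)
    if vmin < center < vmax:
        half_range = max(abs(vmin - center), abs(vmax - center))
        return center - half_range, center + half_range
    return vmin, vmax

# Made-up temperatures in °C: one very cold value dominates the range
temps_c = np.array([-55.0, -10.0, 5.0, 28.0])
lo_c, hi_c = symmetric_limits(temps_c)
print(lo_c, hi_c)

# The same values in K are all positive, so no centering takes place
temps_k = temps_c + 273.15
lo_k, hi_k = symmetric_limits(temps_k)
print(lo_k, hi_k)
```

In this sketch the single coldest value fixes both ends of the scale, which is why much of the positive half of the colorbar stays unused. Passing `center=False`, `vmin` and `vmax` to the plot call overrides the automatic choice.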

Slightly faster map plots#

xarray’s .plot method internally uses matplotlib’s pcolormesh which, for reasons too long to explain here, is the more accurate way to represent gridded data on a map. If you are willing to sacrifice some accuracy (not visible to the naked eye at the global scale), you can also use imshow:

t2_tavg = ds.t2m.mean(dim='time')
ax = plt.axes(projection=ccrs.Robinson())
t2_tavg.plot.imshow(ax=ax, transform=ccrs.PlateCarree())
ax.coastlines(); ax.gridlines();

../_images/01_Lesson_MoreDataCrunching_28_0.png

This plot should render about 4 times faster than the default plot, which is useful for data exploration. It should not be used for final rendering or for regional plots, though.
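Since `imshow` draws the data as an image made of equally sized pixels, it is only appropriate when the coordinates are evenly spaced. A minimal check could look like the sketch below; `is_evenly_spaced` is a hypothetical helper and both coordinate arrays are made up for illustration.

```python
import numpy as np

def is_evenly_spaced(coord, rtol=1e-6):
    """Hypothetical helper: True if all consecutive coordinate steps are equal."""
    steps = np.diff(np.asarray(coord, dtype=float))
    return bool(np.allclose(steps, steps[0], rtol=rtol))

# A regular latitude axis with 0.75° spacing, ordered north to south
regular_lat = np.arange(90, -90.1, -0.75)

# An irregular axis, with finer spacing near the equator
irregular_lat = np.array([-90, -60, -30, -10, 0, 10, 30, 60, 90])

print(is_evenly_spaced(regular_lat))
print(is_evenly_spaced(irregular_lat))
```

With the lesson's dataset, the same check could be applied to `ds.latitude.values` and `ds.longitude.values` before switching to `imshow`.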

Dimensional juggling!#

I am now going to apply a series of commands to our data. Let’s see if you can follow each step:

t2_m_reg = ds.t2m.sel(longitude=slice(-20, 60), latitude=slice(40, -40)).groupby('time.month').mean(dim='time') - 273.15
t2_m_reg_z = t2_m_reg.mean(dim='longitude')
t2_m_reg_z.T.plot();

../_images/01_Lesson_MoreDataCrunching_83_0.png

Can you follow what I did? If not, decompose each step and see if you can follow the operations one by one.
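One possible way to decompose the chain is sketched below on a small synthetic array, so that it runs without the lesson's data file. The array `t2m` and its grid are made up and only stand in for `ds.t2m`; the intermediate variable names are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for ds.t2m: two years of monthly values on a coarse grid
time = pd.date_range("2000-01-01", periods=24, freq="MS")
lat = np.arange(90, -91, -10.0)   # ordered north to south
lon = np.arange(-180, 180, 20.0)
rng = np.random.default_rng(0)
t2m = xr.DataArray(
    280 + rng.normal(size=(time.size, lat.size, lon.size)),
    coords={"time": time, "latitude": lat, "longitude": lon},
    dims=("time", "latitude", "longitude"),
)

# Step 1: cut out the region (latitude runs from north to south, hence 40 -> -40)
region = t2m.sel(longitude=slice(-20, 60), latitude=slice(40, -40))

# Step 2: average all Januaries, all Februaries, etc. ("time" becomes "month")
clim = region.groupby("time.month").mean(dim="time")

# Step 3: convert from K to °C
clim_c = clim - 273.15

# Step 4: average over longitude, leaving only "month" and "latitude"
zonal = clim_c.mean(dim="longitude")

for name, arr in [("region", region), ("clim", clim), ("zonal", zonal)]:
    print(name, dict(arr.sizes))
```

Printing the sizes after each step shows which dimension was reduced or replaced, which is usually the quickest way to follow a long chain of operations.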

The plot above is called a Hovmöller diagram, used very often in climatology. Q: Can you describe its features?

Contour plots#

Reconsider the Africa plot and the Hovmöller plot above. Both are quite “pixelated” (the first one because at the regional scale, the coarse spatial resolution of these data becomes visible). Xarray’s method of choice to display 2D data is to represent it as if it were an “image”. This is fine most of the time, but sometimes you’d like a smoother, contoured plot. For example:

t2_m_reg_z.T.plot.contourf(levels=np.linspace(10, 30, 11));

../_images/01_Lesson_MoreDataCrunching_87_0.png
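The `levels` argument in the call above is built with `np.linspace`, which returns evenly spaced values and includes both end points. A quick look at what is actually passed to `contourf`:

```python
import numpy as np

# 11 boundaries between 10 and 30 (°C), end points included
levels = np.linspace(10, 30, 11)

print(levels)
print(np.diff(levels))  # spacing between consecutive boundaries
```

Eleven boundaries delimit ten filled bands, each 2 °C wide.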

Multi-line plots#

Maps and contours are nice, but often the most powerful and quantitative way to plot data is to use line plots. We already showed that 1D data is plotted by xarray as a line automatically. Can we plot multiple lines from a 2D data array?

# The following line selects data along given latitudes (here in 10° steps)
# 'nearest' tells xarray to select the closest available coordinate when the 
# exact one is not available
t2_m_reg_sel = t2_m_reg_z.sel(latitude=np.linspace(-30, 30, 7), method='nearest')

# Let's plot the lines!
t2_m_reg_sel.plot(hue='latitude');

../_images/01_Lesson_MoreDataCrunching_90_0.png
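To make the effect of `method='nearest'` more concrete, here is a small sketch on a made-up latitude axis with 0.75° spacing, where most of the requested latitudes do not exist exactly. The array `da` and its values are purely illustrative.

```python
import numpy as np
import xarray as xr

# Made-up latitude axis with 0.75° spacing, ordered north to south
lat = np.arange(40, -40.1, -0.75)
da = xr.DataArray(
    np.cos(np.deg2rad(lat)),
    coords={"latitude": lat},
    dims="latitude",
)

# Requested latitudes in 10° steps, as in the lesson
wanted = np.linspace(-30, 30, 7)

# For each requested value, pick the closest grid point
picked = da.sel(latitude=wanted, method="nearest")

print(wanted)
print(picked.latitude.values)
```

The coordinates of the result are those of the grid points actually picked, so the labels in a line plot legend may differ slightly from the requested values.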

What’s next?#

You are now ready for this week’s assignment! If you want to know more about xarray’s plotting capabilities, visit the excellent documentation.

Physics of the Climate System
