Global Biases Diagnostic

Description

The GlobalBiases class is a tool for analyzing and visualizing 2D biases between two datasets. It enables a comparative analysis where one dataset is treated as the reference, often observational data like ERA5. However, it can also be used to compare two model datasets, making it suitable for examining differences between historical and scenario experiments.

This class provides functionality for bias analysis, including the ability to:

- Plot bias maps to visualize spatial variations.
- Analyze seasonal variations in biases.
- Generate vertical profiles to assess biases across pressure levels.
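As an illustration of what a bias map represents, the sketch below computes the difference between the time mean of a dataset and the time mean of a reference on a toy 2x2 grid. It uses plain Python lists rather than xarray, the helper names time_mean and bias_map are hypothetical, and the sign convention (data minus reference) is an assumption; it is not the implementation used by GlobalBiases.

```python
from statistics import fmean


def time_mean(fields):
    """Average a list of 2D fields (lists of rows) over the time axis."""
    n_lat, n_lon = len(fields[0]), len(fields[0][0])
    return [[fmean(f[i][j] for f in fields) for j in range(n_lon)]
            for i in range(n_lat)]


def bias_map(data, data_ref):
    """Return time-mean(data) minus time-mean(data_ref), cell by cell.

    Assumption: the bias is defined as data minus reference.
    """
    mean_data, mean_ref = time_mean(data), time_mean(data_ref)
    return [[d - r for d, r in zip(row_d, row_r)]
            for row_d, row_r in zip(mean_data, mean_ref)]


# Two time steps on a toy 2x2 grid (illustrative temperatures in K)
model = [[[280.0, 282.0], [290.0, 291.0]],
         [[282.0, 284.0], [292.0, 293.0]]]
reference = [[[280.0, 281.0], [291.0, 291.0]],
             [[280.0, 283.0], [291.0, 295.0]]]

bias = bias_map(model, reference)
print(bias)
```

Positive values mark grid cells where the dataset is warmer than the reference, negative values where it is colder.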

Structure

global_biases.py: contains the GlobalBiases class
cli_global_biases.py: the command line interface (CLI) script to run the diagnostic.

Input variables

The diagnostic requires the variables that the user wants to analyse. A list of the variables that are compared automatically when running the full diagnostic is provided in the configuration files available in the config/diagnostics/global_biases directory.

Some of the variables that are typically used in this diagnostic are:

2m temperature (2t)
Total Precipitation (tprate)
Zonal and meridional wind (u, v)
Specific humidity (q)

The data retrieved through the provided functions come from the Low Resolution Archive (LRA), with monthly time steps and a 1x1 degree resolution. A higher resolution is not necessary for this diagnostic.

Basic usage

The basic usage of this diagnostic is explained with a working example in the notebook provided in the notebooks/diagnostics/global_biases directory. The basic structure of the analysis is the following:

from aqua import Reader
from aqua.diagnostics import GlobalBiases

# Define and retrieve the data to use for the analysis
reader_ifs_nemo = Reader(model='IFS-NEMO', exp='historical-1990', source='lra-r100-monthly')
data_ifs_nemo = reader_ifs_nemo.retrieve()

# Retrieve the reference dataset (ERA5 monthly data)
reader_era5 = Reader(model='ERA5', exp='era5', source='monthly')
data_era5 = reader_era5.retrieve()

# Set up the diagnostic for 2m temperature and plot the bias map
global_biases = GlobalBiases(data=data_ifs_nemo, data_ref=data_era5, var_name='2t', loglevel='INFO',
                             model='IFS-NEMO', exp='historical-1990', model_obs='ERA5')

global_biases.plot_bias()

The user can also set the start and end dates of the analysis period, separately for the dataset (startdate_data, enddate_data) and for the reference dataset (startdate_obs, enddate_obs).
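To make the effect of a start and end date concrete, the following sketch filters a list of monthly time stamps to a chosen period. It is a standalone illustration using only the standard library. The helper select_period is hypothetical, and both the ISO date string format and the inclusive bounds are assumptions, not a description of how GlobalBiases handles dates internally.

```python
from datetime import date


def select_period(times, startdate=None, enddate=None):
    """Keep the time stamps inside [startdate, enddate].

    Assumptions: dates are ISO strings (YYYY-MM-DD), both bounds are
    inclusive, and None leaves that side of the period open.
    """
    start = date.fromisoformat(startdate) if startdate else date.min
    end = date.fromisoformat(enddate) if enddate else date.max
    return [t for t in times if start <= t <= end]


# Monthly time stamps covering 1990 and 1991
months = [date(year, month, 1) for year in (1990, 1991) for month in range(1, 13)]

selected = select_period(months, startdate="1990-06-01", enddate="1991-02-28")
print(len(selected), selected[0], selected[-1])
```

Applying different periods to the dataset and to the reference allows, for example, a scenario run to be compared against a historical climatology.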

Note

A catalogs argument can be passed to the class to define the catalogs to use for the analysis. If not provided, the Reader will identify the catalogs to use based on the models, experiments and sources provided.

CLI usage

The diagnostic can be run from the command line interface (CLI) by running the following command:

cd $AQUA/src/aqua_diagnostics/global_biases
python cli_global_biases.py --config <path_to_config_file>

Additionally, the CLI can be run with the following optional arguments:

--catalog: Catalog to use for the analysis. It can be defined in the config file.
--model: Model to analyse. It can be defined in the config file.
--exp: Experiment to analyse. It can be defined in the config file.
--source: Source to analyse. It can be defined in the config file.
--cluster: Dask cluster address.
--config, -c: Path to the configuration file.
--loglevel, -l: Logging level. Default is WARNING.
--nworkers, -n: Number of workers to use for parallel processing.
--outputdir: Output directory for the plots.
--regrid: Target grid for regridding.
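The options listed above could be declared with argparse roughly as follows. This is a sketch, not the parser shipped in cli_global_biases.py. The helper names build_parser and override_data_block are hypothetical, the rule that a command-line value takes precedence over the config file is an assumption, and the experiment name 'ssp370' is only a placeholder.

```python
import argparse


def build_parser():
    """Declare the documented CLI options (illustrative sketch only)."""
    parser = argparse.ArgumentParser(description="Global Biases CLI (sketch)")
    parser.add_argument("--config", "-c", type=str, help="Path to the configuration file")
    parser.add_argument("--catalog", type=str, help="Catalog to use for the analysis")
    parser.add_argument("--model", type=str, help="Model to analyse")
    parser.add_argument("--exp", type=str, help="Experiment to analyse")
    parser.add_argument("--source", type=str, help="Source to analyse")
    parser.add_argument("--cluster", type=str, help="Dask cluster address")
    parser.add_argument("--loglevel", "-l", type=str, default="WARNING", help="Logging level")
    parser.add_argument("--nworkers", "-n", type=int, help="Number of workers")
    parser.add_argument("--outputdir", type=str, help="Output directory for the plots")
    parser.add_argument("--regrid", type=str, help="Target grid for regridding")
    return parser


def override_data_block(config, args):
    """Return a copy of config['data'] updated with CLI values.

    Assumption: a value given on the command line replaces the one
    found in the configuration file.
    """
    data = dict(config.get("data", {}))
    for key in ("catalog", "model", "exp", "source"):
        value = getattr(args, key)
        if value is not None:
            data[key] = value
    return data


# Parse an explicit argument list so the sketch runs without a real command line
args = build_parser().parse_args(["-c", "config.yaml", "--exp", "ssp370", "-n", "4"])
config = {"data": {"catalog": None, "model": "IFS-NEMO",
                   "exp": "historical-1990", "source": "lra-r100-monthly"}}
data_block = override_data_block(config, args)
print(data_block)
```

Options left unset on the command line keep the value read from the configuration file.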

Config file structure

The configuration file is a YAML file that contains the following information:

data and obs: dictionaries with the information about the model and reference data to use for the analysis.

data:
  catalog: null
  model: 'IFS-NEMO'
  exp: 'historical-1990'
  source: 'lra-r100-monthly'

obs:
  catalog: null
  model: 'ERA5'
  exp: 'era5'
  source: 'monthly'

output: a block describing the details of the output. It contains:
- outputdir: the output directory for the plots.
- rebuild: a boolean that enables the rebuilding of the plots.
- filename_keys: a list of keys for constructing the output filenames.
- save_netcdf: a boolean that enables the saving of the plotted data in NetCDF format.
- save_pdf: a boolean that enables the saving of the plots in pdf format.
- save_png: a boolean that enables the saving of the plots in png format.
- dpi: the resolution of the plots.
diagnostic_attributes: a block with the following information:
- variables: the list of variables to analyse.
- plev: the specific pressure level to analyse (default: null)
- seasons: a boolean that enables the seasonal analysis.
- seasons_stat: the statistic to use for the seasonal analysis (e.g., ‘mean’).
- vertical: a boolean that enables the vertical profiles.
- regrid: the grid you want your data to be regridded to (e.g. ‘r100’).
- startdate_data: the start date of the dataset.
- enddate_data: the end date of the dataset.
- startdate_obs: the start date of the reference dataset.
- enddate_obs: the end date of the reference dataset.
biases_plot_params: a block defining colorbar limits for plotting biases. Each variable can have its own range.
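The sketch below shows one way the blocks of a parsed configuration could be mapped onto the keyword arguments of the GlobalBiases constructor. The configuration is written as a Python dictionary mirroring the YAML structure described above. The helper global_biases_kwargs is hypothetical, the mapping itself is an assumption about how the pieces fit together, and the output directory and dates are placeholder values.

```python
def global_biases_kwargs(config, var_name):
    """Collect GlobalBiases keyword arguments from a parsed config dictionary.

    Assumption: this mapping is illustrative; the real CLI may differ.
    """
    data, obs = config["data"], config["obs"]
    attrs = config.get("diagnostic_attributes", {})
    output = config.get("output", {})
    return {
        "var_name": var_name,
        "plev": attrs.get("plev"),
        "outputdir": output.get("outputdir"),
        "model": data["model"],
        "exp": data["exp"],
        "model_obs": obs["model"],
        "startdate_data": attrs.get("startdate_data"),
        "enddate_data": attrs.get("enddate_data"),
        "startdate_obs": attrs.get("startdate_obs"),
        "enddate_obs": attrs.get("enddate_obs"),
    }


# Dictionary equivalent of a YAML configuration (placeholder values)
config = {
    "data": {"catalog": None, "model": "IFS-NEMO",
             "exp": "historical-1990", "source": "lra-r100-monthly"},
    "obs": {"catalog": None, "model": "ERA5", "exp": "era5", "source": "monthly"},
    "output": {"outputdir": "./output", "save_pdf": True, "save_png": True, "dpi": 300},
    "diagnostic_attributes": {
        "variables": ["2t", "tprate"],
        "plev": None,
        "seasons": True,
        "seasons_stat": "mean",
        "vertical": False,
        "startdate_data": "1990-01-01",
        "enddate_data": "1999-12-31",
        "startdate_obs": "1990-01-01",
        "enddate_obs": "1999-12-31",
    },
}

# One set of constructor arguments per variable listed in the config
kwargs_per_variable = {var: global_biases_kwargs(config, var)
                       for var in config["diagnostic_attributes"]["variables"]}
print(kwargs_per_variable["2t"])
```

The data and data_ref arguments are not included here because they are the datasets retrieved with the Reader, not configuration values.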

Output

The diagnostic generates three types of plots for each variable:

The global bias of the model compared to the reference dataset.

The global bias of the model compared to the reference dataset for each season.

If the variable is 3D, a vertical profile of the bias of the model compared to the reference dataset at each pressure level.

The plots are saved in PDF and PNG format, and the corresponding data are saved as NetCDF files.

Observations

The diagnostic uses ERA5 as a default reference dataset for the bias analysis. Custom reference datasets can be used.

Example Plots

All these plots can be produced by running the notebooks in the notebooks directory on LUMI HPC.

../_images/global_bias.png — Global mean temperature bias of IFS-NEMO historical-1990 with respect to ERA5 climatology.

../_images/seasonal_bias.png — Seasonal temperature bias of IFS-NEMO historical-1990 with respect to ERA5 climatology.

../_images/vertical_bias.png — Vertical bias of q of IFS-NEMO historical-1990 with respect to ERA5 climatology.

Available demo notebooks

Notebooks are stored in diagnostics/global_biases/notebooks:

global_biases.ipynb

Detailed API

This section provides a detailed reference for the Application Programming Interface (API) of the global biases diagnostic, produced from the diagnostic function docstrings.

class aqua.diagnostics.global_biases.GlobalBiases(data=None, data_ref=None, var_name=None, plev=None, outputdir=None, loglevel='WARNING', model=None, exp=None, startdate_data=None, enddate_data=None, model_obs=None, startdate_obs=None, enddate_obs=None)

Bases: object

A class to process and visualize global mean data.

data

Input data for analysis.

Type: xr.Dataset

data_ref

Reference data for comparison.

Type: xr.Dataset

var_name

Name of the variable to analyze.

Type: str

plev

Pressure level to select.

Type: float

outputdir

Directory to save output plots.

Type: str

loglevel

Logging level.

Type: str

model

Model name for labeling.

Type: str, optional

exp

Experiment name for labeling.

Type: str, optional

startdate_data

Start date of data period.

Type: str, optional

enddate_data

End date of data period.

Type: str, optional

model_obs

Obs name for labeling.

Type: str, optional

startdate_obs

Start date of reference period.

Type: str, optional

enddate_obs

End date of reference period.

Type: str, optional

plot_bias(stat='mean', vmin=None, vmax=None)

Plots global biases or a single dataset map if reference data is unavailable.

Parameters:

stat (str) – Statistic for calculation (‘mean’ by default).
vmin (float, optional) – Minimum colorbar value.
vmax (float, optional) – Maximum colorbar value.

Returns:

Matplotlib figure, axis objects, and xarray Dataset of the calculated bias if available.

Return type:

tuple
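The vmin and vmax arguments can be chosen per variable. The sketch below looks up colorbar limits for a variable in a dictionary and falls back to None, which leaves the choice of range to the plotting routine. The helper colorbar_limits is hypothetical, the dictionary layout (variable name mapped to vmin and vmax) is an assumed shape for a biases_plot_params block, and the numbers are placeholders.

```python
def colorbar_limits(plot_params, var_name):
    """Return (vmin, vmax) for var_name, or (None, None) if no range is set.

    Assumption: plot_params maps a variable name to a dict with
    'vmin' and 'vmax' keys.
    """
    limits = plot_params.get(var_name, {})
    return limits.get("vmin"), limits.get("vmax")


# Placeholder ranges, symmetric around zero so that no bias sits mid-scale
biases_plot_params = {
    "2t": {"vmin": -5.0, "vmax": 5.0},
    "tprate": {"vmin": -8.0, "vmax": 8.0},
}

vmin, vmax = colorbar_limits(biases_plot_params, "2t")
print(vmin, vmax)
print(colorbar_limits(biases_plot_params, "q"))
```

The resulting pair can then be passed on, for example as global_biases.plot_bias(vmin=vmin, vmax=vmax).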

plot_seasonal_bias(seasons_stat='mean', vmin=None, vmax=None)

Plots seasonal biases for each season (DJF, MAM, JJA, SON) and returns an xarray.Dataset containing the calculated seasonal biases.

Parameters:

seasons_stat (str) – Statistic for seasonal analysis (‘mean’ by default).
vmin (float, optional) – Minimum colorbar value.
vmax (float, optional) – Maximum colorbar value.

Returns:

Matplotlib figure and an xarray.Dataset of the calculated seasonal biases.

Return type:

tuple
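The seasonal grouping (DJF, MAM, JJA, SON) can be illustrated with a small standalone example. It averages monthly values into seasonal means and takes the difference between a dataset and a reference. For brevity it works on one scalar value per month instead of a full map. The helper names are hypothetical and the data are synthetic, so this sketches the idea under those assumptions, not the method used by plot_seasonal_bias.

```python
from statistics import fmean

# Calendar months belonging to each meteorological season
SEASONS = {"DJF": (12, 1, 2), "MAM": (3, 4, 5), "JJA": (6, 7, 8), "SON": (9, 10, 11)}


def seasonal_means(monthly_values):
    """Average (month, value) pairs into one mean per season."""
    return {season: fmean(v for m, v in monthly_values if m in months)
            for season, months in SEASONS.items()}


def seasonal_bias(data, data_ref):
    """Return the seasonal mean of data minus the seasonal mean of data_ref."""
    mean_data, mean_ref = seasonal_means(data), seasonal_means(data_ref)
    return {season: mean_data[season] - mean_ref[season] for season in SEASONS}


# Synthetic monthly series: the dataset is 2.0 too high in summer, 0.5 otherwise
reference = [(m, float(m)) for m in range(1, 13)]
model = [(m, m + (2.0 if m in SEASONS["JJA"] else 0.5)) for m in range(1, 13)]

result = seasonal_bias(model, reference)
print(result)
```

Splitting the bias by season exposes errors that cancel out in the annual mean, such as a warm summer bias offset by a cold winter bias.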

plot_vertical_bias(data=None, data_ref=None, var_name=None, plev_min=None, plev_max=None, vmin=None, vmax=None)

Calculates and plots the vertical bias between two datasets.

Parameters:

data (xr.Dataset, optional) – Dataset for analysis.
data_ref (xr.Dataset, optional) – Reference dataset for comparison.
var_name (str, optional) – Variable name to analyze.
plev_min (float, optional) – Minimum pressure level.
plev_max (float, optional) – Maximum pressure level.
vmin (float, optional) – Minimum colorbar value.
vmax (float, optional) – Maximum colorbar value.

Returns:

Matplotlib figure, axis objects, and xarray Dataset of the calculated bias.

Return type:

tuple
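The plev_min and plev_max arguments restrict the vertical range of the analysis. A minimal sketch of such a filter is given below. The helper levels_in_range is hypothetical, the bounds are assumed to be inclusive, and the pressure levels are assumed to be expressed in Pa; the actual behaviour of plot_vertical_bias may differ.

```python
def levels_in_range(levels, plev_min=None, plev_max=None):
    """Keep the pressure levels with plev_min <= level <= plev_max.

    Assumptions: bounds are inclusive and None leaves a bound open.
    """
    lower = float("-inf") if plev_min is None else plev_min
    upper = float("inf") if plev_max is None else plev_max
    return [lev for lev in levels if lower <= lev <= upper]


# Illustrative pressure levels in Pa, ordered from the surface upwards
plev = [100000.0, 85000.0, 70000.0, 50000.0, 25000.0, 10000.0]

selected_levels = levels_in_range(plev, plev_min=25000.0, plev_max=85000.0)
print(selected_levels)
```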

static select_pressure_level(data, plev, var_name)

Selects a specified pressure level from the dataset.

Parameters:

data (xr.Dataset) – Dataset to select from.
plev (float) – Desired pressure level.
var_name (str) – Variable name to filter by.

Returns:

Filtered dataset at specified pressure level.

Return type:

xr.Dataset

Raises:

NoDataError – If specified pressure level is not available.
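The selection logic can be sketched without xarray as a lookup that fails loudly when the requested level is missing. The NoDataError class defined here is a local stand-in for the exception named above, select_level is a hypothetical helper, and the profile values are placeholders; this is not the code of select_pressure_level.

```python
class NoDataError(Exception):
    """Local stand-in for the NoDataError named in the API reference."""


def select_level(profile, plev):
    """Return the value stored at plev, raising NoDataError if it is missing."""
    if plev not in profile:
        available = ", ".join(str(p) for p in sorted(profile))
        raise NoDataError(f"Pressure level {plev} not available; choose from: {available}")
    return profile[plev]


# Placeholder specific humidity profile keyed by pressure level (Pa assumed)
q_profile = {85000.0: 0.008, 50000.0: 0.002, 25000.0: 0.0002}

print(select_level(q_profile, 50000.0))

try:
    select_level(q_profile, 30000.0)
except NoDataError as error:
    print(error)
```

Listing the available levels in the error message helps the user correct the plev setting in the configuration.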