Core Functions

A set of functions available for the development of new diagnostics is provided in the aqua.diagnostics.core module.

class aqua.diagnostics.core.Diagnostic(catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, loglevel: str = 'WARNING')

Bases: object

Initialize the diagnostic class. This is a general-purpose class that diagnostic classes can use to retrieve data from a single model and to save data to a NetCDF file. It is not a working diagnostic class by itself.

Parameters:
  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • regrid (str) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str) – The start date of the data to be retrieved. If None, all available data will be retrieved.

  • enddate (str) – The end date of the data to be retrieved. If None, all available data will be retrieved.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

retrieve(var: str = None)

Retrieve the data from the model.

Parameters:

var (str) – The variable to be retrieved. If None, all variables will be retrieved.

self.data

The data retrieved from the model. If return_data is True, the data will be returned.

self.catalog

The catalog used to retrieve the data if no catalog was provided.

save_netcdf(data, diagnostic: str, diagnostic_product: str = None, outdir: str = '.', rebuild: bool = True, **kwargs)

Save the data to a netcdf file.

Parameters:
  • data (xarray Dataset or DataArray) – The data to be saved.

  • diagnostic (str) – The diagnostic name.

  • diagnostic_product (str) – The diagnostic product.

  • outdir (str) – The path to save the data. Default is ‘.’.

  • rebuild (bool) – If True, the netcdf file will be rebuilt. Default is True.

Keyword Arguments:

**kwargs – Additional keyword arguments to be passed to the OutputSaver.save_netcdf method.

select_region(region: str = None, diagnostic: str = None, drop: bool = True)

Selects a geographic region from the dataset and updates self.data accordingly.

If a region name is provided, the method filters the data using the region’s predefined latitude and longitude bounds. The selected region name is stored in the dataset attributes.

Parameters:
  • region (str, optional) – Name of the region to select. If None, no filtering is applied.

  • diagnostic (str, optional) – Diagnostic category used to determine region bounds.

  • drop (bool, optional) – Whether to drop coordinates outside the selected region. Default is True.

Returns:

(region, lon_limits, lat_limits)

Return type:

tuple

class aqua.diagnostics.core.OutputSaver(diagnostic: str, catalog: str = None, model: str = None, exp: str = None, catalog_ref: str = None, model_ref: str = None, exp_ref: str = None, outdir: str = '.', rebuild: bool = True, loglevel: str = 'WARNING')

Bases: object

Class to manage saving outputs, including NetCDF, PDF, and PNG files, with customized naming based on provided parameters and metadata.

Initialize the OutputSaver with diagnostic parameters and output directory. The catalog, model, and experiment can each be either a string or a list of strings.

Parameters:
  • diagnostic (str) – Name of the diagnostic.

  • catalog (str, optional) – Catalog name.

  • model (str, optional) – Model name.

  • exp (str, optional) – Experiment name.

  • catalog_ref (str, optional) – Reference catalog name.

  • model_ref (str, optional) – Reference model name.

  • exp_ref (str, optional) – Reference experiment name.

  • outdir (str, optional) – Output directory. Defaults to current directory.

  • loglevel (str, optional) – Logging level. Defaults to ‘WARNING’.

create_metadata(diagnostic_product: str, extra_keys: dict = None, metadata: dict = None) dict

Create metadata dictionary for a plot or output file.

Parameters:
  • diagnostic_product (str) – Product of the diagnostic analysis.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

  • metadata (dict, optional) – Additional metadata to include in the PNG file.

generate_name(diagnostic_product: str, catalog: str = None, model: str = None, exp: str = None, catalog_ref: str = None, model_ref: str = None, exp_ref: str = None, extra_keys: dict = None) str

Generate a filename based on the provided parameters and additional user-defined keywords.

Parameters:
  • diagnostic_product (str) – Product of the diagnostic analysis.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

Returns:

A string representing the generated filename.

Return type:

str
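
The filename layout itself is not described in this reference. As a rough illustration only, the sketch below joins the diagnostic name, the diagnostic product, the dataset identifiers, and any extra keys with dots. The helper name sketch_generate_name, the separator, the field order, and the placeholder values are all assumptions, not the OutputSaver implementation.

```python
def sketch_generate_name(diagnostic, diagnostic_product, catalog=None,
                         model=None, exp=None, extra_keys=None):
    """Hypothetical helper: build a filename stem from the non-empty fields."""
    parts = [diagnostic, diagnostic_product]
    # Each identifier may be a string or a list of strings; lists are joined with '_'.
    for value in (catalog, model, exp):
        if value is None:
            continue
        parts.append(value if isinstance(value, str) else "_".join(value))
    # Extra keys are appended as key + value in sorted key order (an assumption).
    for key, value in sorted((extra_keys or {}).items()):
        parts.append(f"{key}{value}")
    return ".".join(parts)


# Placeholder identifiers, chosen only for the example.
name = sketch_generate_name("timeseries", "mean", catalog="my_catalog",
                            model="my_model", exp="my_exp",
                            extra_keys={"var": "2t"})
print(name)
```

Building the name from a list of parts keeps optional fields easy to skip, which is one plausible way to support both single and multiple catalogs, models, and experiments.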

save_netcdf(dataset: Dataset, diagnostic_product: str, rebuild: bool = True, extra_keys: dict = None, metadata: dict = None)

Save an xarray Dataset as a NetCDF file with a generated filename.

Parameters:
  • dataset (xr.Dataset) – The xarray Dataset to save.

  • diagnostic_product (str) – Product of the diagnostic analysis.

  • rebuild (bool, optional) – Whether to rebuild the output file if it already exists. Defaults to True.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

  • metadata (dict, optional) – Additional metadata to include in the NetCDF file.

save_pdf(fig: Figure, diagnostic_product: str, rebuild: bool = True, extra_keys: dict = None, metadata: dict = None)

Save a Matplotlib figure as a PDF file with a generated filename.

Parameters:
  • fig (plt.Figure) – The Matplotlib figure to save.

  • diagnostic_product (str) – Product of the diagnostic analysis.

  • rebuild (bool, optional) – Whether to rebuild the output file if it already exists. Defaults to True.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

  • metadata (dict, optional) – Additional metadata to include in the PDF file.

save_png(fig: Figure, diagnostic_product: str, rebuild: bool = True, extra_keys: dict = None, metadata: dict = None, dpi: int = 300)

Save a Matplotlib figure as a PNG file with a generated filename.

Parameters:
  • fig (plt.Figure) – The Matplotlib figure to save.

  • diagnostic_product (str) – Product of the diagnostic analysis.

  • rebuild (bool, optional) – Whether to rebuild the output file if it already exists. Defaults to True.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

  • metadata (dict, optional) – Additional metadata to include in the PNG file.

  • dpi (int, optional) – Dots per inch for the PNG file.

aqua.diagnostics.core.close_cluster(client, cluster, private_cluster, loglevel: str = 'WARNING')

Close the dask cluster and client.

Parameters:
  • client (dask.distributed.Client) – dask client

  • cluster (dask.distributed.LocalCluster) – dask cluster

  • private_cluster (bool) – whether the cluster is private

  • loglevel (str) – logging level

aqua.diagnostics.core.convert_data_units(data, var: str, units: str, loglevel: str = 'WARNING')

Make sure that the data is in the correct units.

Parameters:
  • data (xarray Dataset or DataArray) – The data to be checked.

  • var (str) – The variable to be checked.

  • units (str) – The units to be checked.

aqua.diagnostics.core.load_diagnostic_config(diagnostic: str, config: str = None, default_config: str = 'config.yaml', loglevel: str = 'WARNING')

Load the diagnostic configuration file and return the configuration dictionary.

Parameters:
  • diagnostic (str) – diagnostic name

  • config (str) – configuration file to use instead of the default one.

  • default_config (str) – default configuration file name (YAML format)

  • loglevel (str) – logging level. Default is ‘WARNING’.

Returns:

configuration dictionary

Return type:

dict

aqua.diagnostics.core.merge_config_args(config: dict, args: Namespace, loglevel: str = 'WARNING') dict

Merge the configuration dictionary with the arguments of the CLI.

Parameters:
  • config (dict) – configuration dictionary

  • args (argparse.Namespace) – arguments of the CLI

  • loglevel (str) – logging level. Default is ‘WARNING’.

Returns:

merged configuration dictionary

Return type:

dict
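
This reference does not state which arguments are merged or how conflicts are resolved. The following minimal sketch assumes that any CLI argument that was actually set (not None) overrides the entry with the same key in the configuration dictionary. The function name sketch_merge_config_args and the keys used are hypothetical.

```python
import argparse
import copy


def sketch_merge_config_args(config, args):
    """Hypothetical merge: CLI values that are not None override config entries."""
    merged = copy.deepcopy(config)  # leave the caller's dictionary untouched
    for key, value in vars(args).items():
        if value is not None:
            merged[key] = value
    return merged


# Illustrative keys only; the real configuration layout may be nested.
config = {"outdir": "./output", "loglevel": "WARNING"}
args = argparse.Namespace(outdir=None, loglevel="DEBUG")
merged = sketch_merge_config_args(config, args)
print(merged)
```

Copying the dictionary before updating it avoids surprising the caller if the same configuration is reused for several diagnostics.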

aqua.diagnostics.core.open_cluster(nworkers, cluster, loglevel: str = 'WARNING')

Open a dask cluster if nworkers is provided, otherwise connect to an existing cluster.

Parameters:
  • nworkers (int) – number of workers

  • cluster (str) – cluster address

  • loglevel (str) – logging level

Returns:
  • client (dask.distributed.Client) – dask client

  • cluster (dask.distributed.LocalCluster) – dask cluster

  • private_cluster (bool) – whether the cluster is private

Return type:

tuple

aqua.diagnostics.core.round_enddate(enddate)

Round the end date to the end of the month.

Parameters:

enddate (str or pandas.Timestamp) – end date for the data retrieval

Returns:

end date rounded to the end of the month

Return type:

pandas.Timestamp

aqua.diagnostics.core.round_startdate(startdate)

Round the start date to the beginning of the month.

Parameters:

startdate (str or pandas.Timestamp) – start date for the data retrieval

Returns:

start date rounded to the beginning of the month

Return type:

pandas.Timestamp
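
To make the month rounding of round_startdate and round_enddate concrete, here is a minimal standard-library sketch. It uses datetime objects instead of pandas.Timestamp, and the exact time of day assigned to the rounded end date (23:59:59) is an assumption. The sketch_ function names are hypothetical and are not part of aqua.diagnostics.core.

```python
import calendar
from datetime import datetime


def sketch_round_startdate(date):
    """Return the first instant of the month containing `date`."""
    return date.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def sketch_round_enddate(date):
    """Return the last second of the month containing `date`."""
    # monthrange returns (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(date.year, date.month)[1]
    return date.replace(day=last_day, hour=23, minute=59, second=59, microsecond=0)


mid_february = datetime(2020, 2, 15, 12)  # 2020 is a leap year
print(sketch_round_startdate(mid_february))
print(sketch_round_enddate(mid_february))
```

Using calendar.monthrange handles months of different lengths and leap years without hard-coded tables.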

aqua.diagnostics.core.start_end_dates(startdate=None, enddate=None, start_std=None, end_std=None)

Evaluate the start and end dates for the reference data retrieval when both are provided, to minimize the number of Reader calls. Dates should be of the form ‘YYYY-MM-DD’ or ‘YYYYMMDD’. The function translates them to the form ‘YYYY-MM-DD’ and then uses pandas Timestamp to evaluate the minimum and maximum dates.

Parameters:
  • startdate (str) – start date for the data retrieval

  • enddate (str) – end date for the data retrieval

  • start_std (str) – start date for the standard deviation data retrieval

  • end_std (str) – end date for the standard deviation data retrieval

Returns:

start and end dates for the data retrieval

Return type:

tuple (str, str)
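
The minimum/maximum evaluation described above can be sketched with the standard library alone. This version uses datetime rather than pandas Timestamp, and it assumes that a bound is returned as None whenever one of its two inputs is missing; the behaviour of the actual function in that case is not documented here. The names sketch_start_end_dates and _parse are hypothetical.

```python
from datetime import datetime


def _parse(date):
    """Parse 'YYYY-MM-DD' or 'YYYYMMDD' into a datetime."""
    fmt = "%Y-%m-%d" if "-" in date else "%Y%m%d"
    return datetime.strptime(date, fmt)


def sketch_start_end_dates(startdate=None, enddate=None,
                           start_std=None, end_std=None):
    """Hypothetical: earliest start and latest end as 'YYYY-MM-DD' strings."""
    start = end = None
    if startdate is not None and start_std is not None:
        start = min(_parse(startdate), _parse(start_std)).strftime("%Y-%m-%d")
    if enddate is not None and end_std is not None:
        end = max(_parse(enddate), _parse(end_std)).strftime("%Y-%m-%d")
    return start, end


dates = sketch_start_end_dates("20000101", "2010-12-31",
                               start_std="1990-01-01", end_std="20051231")
print(dates)
```

Taking the earliest start and the latest end yields one period that covers both requests, so a single retrieval can serve both.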

aqua.diagnostics.core.template_parse_arguments(parser: ArgumentParser)

Add the default arguments to the parser.

Parameters:

parser (argparse.ArgumentParser) – The parser to which the default arguments are added.

Returns:

argparse.ArgumentParser
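
The set of default arguments is not listed in this reference. As an illustration of the pattern only, the sketch below adds a few options whose names mirror parameters that appear elsewhere on this page (catalog, model, exp, loglevel, nworkers). The flag names, defaults, and the function name sketch_template_parse_arguments are assumptions.

```python
import argparse


def sketch_template_parse_arguments(parser):
    """Hypothetical: add a few common options and return the same parser."""
    parser.add_argument("--catalog", type=str, default=None, help="catalog name")
    parser.add_argument("--model", type=str, default=None, help="model name")
    parser.add_argument("--exp", type=str, default=None, help="experiment name")
    parser.add_argument("--loglevel", type=str, default="WARNING", help="logging level")
    parser.add_argument("--nworkers", type=int, default=None, help="number of dask workers")
    return parser


parser = sketch_template_parse_arguments(argparse.ArgumentParser(description="demo"))
# Parse an explicit list so the example does not depend on sys.argv.
args = parser.parse_args(["--model", "my_model", "--nworkers", "4"])
print(args.model, args.nworkers, args.loglevel)
```

Returning the parser lets each diagnostic add its own specific options after the shared ones.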