Core Functions

The aqua.diagnostics.core module provides a set of classes and functions for the development of new diagnostics.

class aqua.diagnostics.core.Diagnostic(catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, loglevel: str = 'WARNING')

Bases: object

Initialize the diagnostic class. This is a general-purpose class that diagnostic classes can use to retrieve data from a single model and to save data to a NetCDF file. It is not a working diagnostic class by itself.

Parameters:
  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • regrid (str) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str) – The start date of the data to be retrieved. If None, all available data will be retrieved.

  • enddate (str) – The end date of the data to be retrieved. If None, all available data will be retrieved.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

retrieve(var: str = None)

Retrieve the data from the model.

Parameters:

var (str) – The variable to be retrieved. If None, all variables will be retrieved.

self.data

The data retrieved from the model. If return_data is True, the data will be returned.

self.catalog

The catalog used to retrieve the data. It is set by the Reader if no catalog was provided.

save_netcdf(data, diagnostic: str, diagnostic_product: str = None, default_path: str = '.', rebuild: bool = True, **kwargs)

Save the data to a NetCDF file.

Parameters:
  • data (xarray Dataset or DataArray) – The data to be saved.

  • diagnostic (str) – The diagnostic name.

  • diagnostic_product (str) – The diagnostic product.

  • default_path (str) – The default path to save the data. Default is ‘.’.

  • rebuild (bool) – If True, the NetCDF file will be rebuilt. Default is True.

Keyword Arguments:

**kwargs – Additional keyword arguments to be passed to the OutputSaver.save_netcdf method.

class aqua.diagnostics.core.OutputSaver(diagnostic: str, catalog: str = None, model: str = None, exp: str = None, catalog_ref: str = None, model_ref: str = None, exp_ref: str = None, outdir: str = '.', rebuild: bool = True, loglevel: str = 'WARNING')

Bases: object

Class to manage saving outputs, including NetCDF, PDF, and PNG files, with customized naming based on provided parameters and metadata.

Initialize the OutputSaver with diagnostic parameters and an output directory. The catalog, model, and experiment arguments can each be either a string or a list of strings.

Parameters:
  • diagnostic (str) – Name of the diagnostic.

  • catalog (str, optional) – Catalog name.

  • model (str, optional) – Model name.

  • exp (str, optional) – Experiment name.

  • catalog_ref (str, optional) – Reference catalog name.

  • model_ref (str, optional) – Reference model name.

  • exp_ref (str, optional) – Reference experiment name.

  • outdir (str, optional) – Output directory. Defaults to current directory.

  • rebuild (bool, optional) – Whether to rebuild the output directory. Defaults to True.

  • loglevel (str, optional) – Logging level. Defaults to ‘WARNING’.

create_metadata(diagnostic_product: str, extra_keys: dict = None, metadata: dict = None) dict

Create metadata dictionary for a plot or output file.

Parameters:
  • diagnostic_product (str) – Product of the diagnostic analysis.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

  • metadata (dict, optional) – Additional metadata to include in the output file.

generate_name(diagnostic_product: str, catalog: str = None, model: str = None, exp: str = None, catalog_ref: str = None, model_ref: str = None, exp_ref: str = None, extra_keys: dict = None) str

Generate a filename based on provided parameters and additional user-defined keywords, including precise time intervals.

Parameters:
  • diagnostic_product (str) – Product of the diagnostic analysis.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

Returns:

A string representing the generated filename.

Return type:

str
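The exact filename layout is not spelled out in this reference, so the following is only a minimal sketch of the idea behind generate_name, assuming a dot-separated name built from the parts that were provided. The function sketch_generate_name, the separator, and the key_value form of the extra keys are all hypothetical and are not taken from the OutputSaver implementation.

```python
def sketch_generate_name(diagnostic, diagnostic_product, catalog=None,
                         model=None, exp=None, extra_keys=None):
    """Hypothetical stand-in: join the naming parts that were provided."""
    parts = [diagnostic, diagnostic_product, catalog, model, exp]
    # Append user-defined keys in a stable (sorted) order as key_value pairs.
    for key, value in sorted((extra_keys or {}).items()):
        parts.append(f"{key}_{value}")
    # Parts left as None are skipped (assumed behaviour).
    return ".".join(str(part) for part in parts if part)


# All values below are placeholders chosen for illustration.
name = sketch_generate_name("timeseries", "mean", catalog="mycatalog",
                            model="mymodel", exp="myexp",
                            extra_keys={"var": "2t"})
print(name)
```

Keeping the extra keys sorted makes the generated name reproducible regardless of the order in which the dictionary was filled.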

save_netcdf(dataset: Dataset, diagnostic_product: str, extra_keys: dict = None)

Save an xarray Dataset as a NetCDF file with a generated filename.

Parameters:
  • dataset (xr.Dataset) – The xarray Dataset to save.

  • diagnostic_product (str) – Product of the diagnostic analysis.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

save_pdf(fig: Figure, diagnostic_product: str, extra_keys: dict = None, metadata: dict = None)

Save a Matplotlib figure as a PDF file with a generated filename.

Parameters:
  • fig (plt.Figure) – The Matplotlib figure to save.

  • diagnostic_product (str) – Product of the diagnostic analysis.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

  • metadata (dict, optional) – Additional metadata to include in the PDF file.

save_png(fig: Figure, diagnostic_product: str, extra_keys: dict = None, metadata: dict = None, dpi: int = 300)

Save a Matplotlib figure as a PNG file with a generated filename.

Parameters:
  • fig (plt.Figure) – The Matplotlib figure to save.

  • diagnostic_product (str) – Product of the diagnostic analysis.

  • extra_keys (dict, optional) – Dictionary of additional keys to include in the filename.

  • metadata (dict, optional) – Additional metadata to include in the PNG file.

  • dpi (int, optional) – Dots per inch for the PNG file.

aqua.diagnostics.core.close_cluster(client, cluster, private_cluster, loglevel: str = 'WARNING')

Close the dask cluster and client.

Parameters:
  • client (dask.distributed.Client) – dask client

  • cluster (dask.distributed.LocalCluster) – dask cluster

  • private_cluster (bool) – whether the cluster is private

  • loglevel (str) – logging level

aqua.diagnostics.core.convert_data_units(data, var: str, units: str, loglevel: str = 'WARNING')

Make sure that the data is in the correct units.

Parameters:
  • data (xarray Dataset or DataArray) – The data to be checked.

  • var (str) – The variable to be checked.

  • units (str) – The units the data should be in.

aqua.diagnostics.core.load_diagnostic_config(diagnostic: str, args: Namespace, default_config: str = 'config.yaml', loglevel: str = 'WARNING')

Load the diagnostic configuration file and return the configuration dictionary.

Parameters:
  • diagnostic (str) – diagnostic name

  • args (argparse.Namespace) – arguments of the CLI. The “config” argument can override the default configuration file.

  • default_config (str) – default configuration file name (YAML format)

  • loglevel (str) – logging level. Default is ‘WARNING’.

Returns:

configuration dictionary

Return type:

dict

aqua.diagnostics.core.merge_config_args(config: dict, args: Namespace, loglevel: str = 'WARNING') dict

Merge the configuration dictionary with the arguments of the CLI.

Parameters:
  • config (dict) – configuration dictionary

  • args (argparse.Namespace) – arguments of the CLI

  • loglevel (str) – logging level. Default is ‘WARNING’.

Returns:

merged configuration dictionary

Return type:

dict
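The merge rules are not described here, so the following is a minimal sketch of one plausible behaviour, assuming that a CLI argument overrides the configuration value only when it was actually set (is not None). The function sketch_merge_config_args and the keys in the example are hypothetical and are not the aqua implementation.

```python
import argparse
import copy


def sketch_merge_config_args(config, args):
    """Hypothetical stand-in: CLI values that were set override the config."""
    merged = copy.deepcopy(config)  # leave the caller's dictionary untouched
    for key, value in vars(args).items():
        if value is not None:  # an unset CLI option keeps the file value
            merged[key] = value
    return merged


# Placeholder configuration and CLI arguments, for illustration only.
config = {"model": "mymodel", "exp": "hist", "outputdir": "./out"}
args = argparse.Namespace(model=None, exp="scenario", outputdir=None)
merged = sketch_merge_config_args(config, args)
print(merged)
```

Copying the dictionary before merging keeps the configuration loaded from file available for comparison or logging.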

aqua.diagnostics.core.open_cluster(nworkers, cluster, loglevel: str = 'WARNING')

Open a dask cluster if nworkers is provided, otherwise connect to an existing cluster.

Parameters:
  • nworkers (int) – number of workers

  • cluster (str) – cluster address

  • loglevel (str) – logging level

Returns:
  • client (dask.distributed.Client) – dask client

  • cluster (dask.distributed.LocalCluster) – dask cluster

  • private_cluster (bool) – whether the cluster is private

Return type:

tuple

aqua.diagnostics.core.start_end_dates(startdate=None, enddate=None, start_std=None, end_std=None)

Evaluate the start and end dates to use when retrieving the reference data. When both the data dates and the standard deviation dates are provided, a single range covering both is computed, which minimizes the number of Reader calls. Dates should be of the form ‘YYYY-MM-DD’ or ‘YYYYMMDD’. The function translates them to the form ‘YYYY-MM-DD’ and then uses pandas Timestamp to evaluate the minimum and maximum dates.

Parameters:
  • startdate (str) – start date for the data retrieval

  • enddate (str) – end date for the data retrieval

  • start_std (str) – start date for the standard deviation data retrieval

  • end_std (str) – end date for the standard deviation data retrieval

Returns:

start and end dates for the data retrieval

Return type:

tuple (str, str)
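The date handling described above can be sketched with the standard library alone. This is a minimal illustration, not the aqua implementation: it swaps pandas Timestamp for datetime, the names to_iso and sketch_start_end_dates are hypothetical, and returning None for a bound when either of its two dates is missing is an assumption.

```python
from datetime import datetime


def to_iso(date):
    """Normalise 'YYYYMMDD' or 'YYYY-MM-DD' to 'YYYY-MM-DD'."""
    fmt = "%Y-%m-%d" if "-" in date else "%Y%m%d"
    return datetime.strptime(date, fmt).strftime("%Y-%m-%d")


def sketch_start_end_dates(startdate=None, enddate=None,
                           start_std=None, end_std=None):
    """Hypothetical stand-in: widest window covering both requests."""
    start = None
    end = None
    # ISO-formatted dates sort chronologically as plain strings.
    if startdate is not None and start_std is not None:
        start = min(to_iso(startdate), to_iso(start_std))
    if enddate is not None and end_std is not None:
        end = max(to_iso(enddate), to_iso(end_std))
    return start, end


dates = sketch_start_end_dates("20000101", "2010-12-31",
                               "1990-01-01", "20051231")
print(dates)
```

Taking the earliest start and the latest end means one retrieval can serve both the data and the standard deviation periods.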

aqua.diagnostics.core.template_parse_arguments(parser: ArgumentParser)

Add the default arguments to the parser.

Parameters:

parser – argparse.ArgumentParser

Returns:

argparse.ArgumentParser
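The pattern behind template_parse_arguments is a shared helper that adds common options to a parser, after which each diagnostic adds its own. The following is a minimal sketch of that pattern using only argparse. The function sketch_template_parse_arguments and the option names are hypothetical; the actual defaults added by aqua are not listed in this reference.

```python
import argparse


def sketch_template_parse_arguments(parser):
    """Hypothetical stand-in: add a few shared options and return the parser."""
    parser.add_argument("--config", type=str, default=None,
                        help="configuration file")
    parser.add_argument("--loglevel", type=str, default="WARNING",
                        help="logging level")
    parser.add_argument("--nworkers", type=int, default=None,
                        help="number of dask workers")
    return parser


parser = argparse.ArgumentParser(description="Example diagnostic CLI")
parser = sketch_template_parse_arguments(parser)
# Diagnostic-specific options are added after the shared ones.
parser.add_argument("--var", type=str, default=None, help="variable to analyse")

args = parser.parse_args(["--config", "my_config.yaml", "--nworkers", "4"])
print(args.config, args.loglevel, args.nworkers, args.var)
```

Returning the parser lets the helper be chained with further add_argument calls before parsing.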