The AQUA console
What is the AQUA console?
The AQUA console is a command line interface that has two main purposes:
A central access point to install the configuration (including fixes and grids) and catalog files and to manage where they are stored.
A tool for more complex operations:
DROP (see aqua drop -c <config_file> <drop-options> and DROP - Data Reduction OPerator)
FDB catalog generator (see Catalog Generator).
Diagnostics wrapper for a complete experiment analysis (see AQUA analysis wrapper).
Here we give a brief overview of the features. If you are a developer, you may want to read the Developer notes section.
The entry point for the console is the command aqua.
Its subcommands are described in the sections below.
The main command has some options listed below:
- --version
To show the AQUA version.
- --path
To show the path where the source code is installed. This is particularly useful if you’re running a script that uses AQUA.
Warning
Some of the CLI commands (see Command Line Interface tools) still rely on the existence
of an environment variable AQUA pointing to the main AQUA folder.
This is deprecated in favor of the new console command.
- --help, -h
To show the help message.
It is possible to set the level of verbosity with two options:
- --verbose, -v
It increases the verbosity level, setting it to INFO.
- --very-verbose, -vv
It increases the verbosity level, setting it to DEBUG.
In both cases the level of verbosity has to be specified before the subcommand.
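The ordering constraint above can be illustrated with a minimal argparse sketch. This is not the actual AQUA parser: the parser layout, the helper names build_parser and loglevel, and the WARNING fallback level are assumptions made for illustration. Only the option names and the INFO and DEBUG levels come from the text.

```python
import argparse


def build_parser():
    """Toy parser: the verbosity flags are defined on the main command."""
    parser = argparse.ArgumentParser(prog="aqua")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-vv", "--very-verbose", action="store_true")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("list")  # stand-in for any subcommand
    return parser


def loglevel(argv):
    """Return the log level implied by a list of command line arguments."""
    args = build_parser().parse_args(argv)
    if args.very_verbose:
        return "DEBUG"
    if args.verbose:
        return "INFO"
    return "WARNING"  # assumed fallback, not stated in the text


print(loglevel(["-v", "list"]))   # INFO
print(loglevel(["-vv", "list"]))  # DEBUG
```

In this sketch, passing the flag after the subcommand, as in loglevel(["list", "-v"]), makes argparse exit with an "unrecognized arguments" error, because the flag is known only to the main parser.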
aqua install
With this command the configuration file and the default data models, grids and fixes are copied to the destination folder.
If the aqua-diagnostics package is found in the current environment,
the configuration files for the diagnostics are copied as well (see aqua-diagnostics documentation).
By default, the destination folder will be $HOME/.aqua.
It is possible to specify from where to copy and where to store.
It is also possible to ask for an editable installation, so that only links are created. This is ideal for developers,
who can keep their catalog or fixes files under version control.
Note
A configuration file is necessary to run AQUA.
In the AQUA release a template is provided.
Even if the aqua install is done in editable mode, this configuration file will be copied and customized in the destination folder.
Mandatory arguments are:
- machine-name
The name of the machine where you are installing. It is a mandatory argument. Even if you are working on your local machine, always define it (even a random name would suffice). Setting the machine to lumi, levante or MN5 is essential to use AQUA on these platforms.
Optional arguments are:
- --core, --core <path/to/aqua-core/repo>
If used without specifying a path, it will copy the configuration files from the AQUA core package installed in the current environment. If a path is specified, the folders containing the configuration files will be linked from the specified path, allowing developers to work on their local copy of AQUA core.
Warning
In version v0.19 and earlier, the path needed to point to the config folder inside the AQUA core repository.
This is no longer necessary, as the command determines the correct path automatically.
- --diagnostics, --diagnostics <path/to/aqua-diagnostics/repo>
If used without specifying a path, it will copy the configuration files from the AQUA diagnostics package installed in the current environment. If a path is specified, the folders containing the configuration files will be linked from the specified path, allowing developers to work on their local copy of AQUA diagnostics.
Note
The default behaviour, when no extra arguments are specified, is to implicitly try both --core and --diagnostics options,
depending on the presence of the corresponding packages in the current environment.
- --path, -p <path>
The folder where the configuration file is copied to. Default is $HOME/.aqua. If this option is used, the tool will ask the user if they want a link in the default folder $HOME/.aqua. If this link is not created, the environment variable AQUA_CONFIG has to be set to the folder specified in order to expose it to AQUA.
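The relation between the default folder and AQUA_CONFIG can be pictured with a short Python sketch. It assumes, purely for illustration, that AQUA_CONFIG takes precedence when set and that $HOME/.aqua is used otherwise. The function name resolve_config_dir is hypothetical and the actual AQUA lookup may differ.

```python
import os
from pathlib import Path


def resolve_config_dir(environ=None, home=None):
    """Guess the configuration folder (assumed precedence, see lead-in)."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    custom = environ.get("AQUA_CONFIG")
    if custom:
        # Custom folder exposed through the environment variable
        return Path(custom)
    # Default installation folder
    return home / ".aqua"


print(resolve_config_dir())
```

A script can call such a helper to report which folder would be used before running any AQUA command.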
aqua avail
This simple command will print all the available catalogs to be installed from a repository. By default this will be the Climate-DT-catalog.
- -r, --repository <user/repo>
It is possible to specify a different repository to explore. The format is user/repo. For example, DestinE-Climate-DT/Climate-DT-catalog. If this option is not specified, the default repository will be used.
aqua add <catalog>
This command adds a catalog to the list of available catalogs. It will copy the catalog folder and files to the destination folder. As before, it is possible to specify if symbolic links have to be created and it is possible to install extra catalogs not present in the AQUA release.
Note
The default catalog is detached from the AQUA repository and is available in the external Climate-DT catalog repository. It is possible to use other catalogs as well, provided that their folder structure is the same as that of the default catalog.
Multiple catalogs can be installed with multiple calls to aqua add.
By default the catalog will be downloaded from the external Climate-DT catalog repository,
if a matching catalog is found. It is possible to specify a different repository.
As shown below, it is also possible to specify a local path and install the catalog from there.
Similarly to the installation command, this will create symbolic links to the local folder,
ideal for developers.
- catalog
The name of the catalog to be added. It is a mandatory argument. If the installation is done in editable mode, this name can be customized.
- --editable, -e <path>
It installs the catalog based on the path given. It will create a symbolic link to the catalog folder. This is highly recommended for developers. Please read the Developer notes section.
- --repository, -r <user/repo>
It is possible to specify a different repository to explore. The format is user/repo. For example, DestinE-Climate-DT/Climate-DT-catalog. If this option is not specified, the default repository will be used.
Warning
Adding a catalog in non-editable mode makes use of the GitHub API.
Requests are limited to 60 per hour for unauthenticated users, so the limit can easily be hit.
If you encounter this issue, you can generate a personal access token and set it as an environment variable
GITHUB_TOKEN, together with a GITHUB_USER variable with your GitHub username.
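Before adding several catalogs in a row, it can be handy to check whether both variables are visible to the current process. The following is a minimal sketch: the helper name github_credentials is hypothetical, and how AQUA itself consumes these variables is not shown here.

```python
import os


def github_credentials(environ=None):
    """Return (user, token) if both variables are set, otherwise None."""
    environ = os.environ if environ is None else environ
    user = environ.get("GITHUB_USER")
    token = environ.get("GITHUB_TOKEN")
    if user and token:
        return user, token
    return None


if github_credentials() is None:
    print("GITHUB_USER and GITHUB_TOKEN are not both set: "
          "GitHub requests will be unauthenticated")
```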
aqua remove <catalog>
It removes a catalog from the list of available catalogs. This means that the catalog folder will be removed from the installation folder or the link will be deleted if the catalog is installed in editable mode.
- catalog
The name of the catalog to be removed. It is a mandatory argument.
aqua set <catalog>
This command sets the default main catalog to be used.
Since it is possible to have multiple catalogs installed and accessible at the same time,
if more than one catalog is present it will move the selected catalog to the top of the list.
If the same triplet of model, exp, source is found in multiple
catalogs, the Reader will then use the one found in the selected catalog.
- catalog
The name of the catalog to be set as default. It is a mandatory argument.
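The priority rule can be modelled in a few lines of Python. This is only an illustrative model of the behaviour described above, not the AQUA implementation: the function names, the catalog names catalog-a and catalog-b, and the experiment and source names are all hypothetical.

```python
def set_default(catalogs, name):
    """Return a new catalog list with `name` moved to the top."""
    if name not in catalogs:
        raise ValueError(f"catalog {name!r} is not installed")
    return [name] + [c for c in catalogs if c != name]


def resolve(catalogs, contents, triplet):
    """Return the first catalog, in list order, that contains the triplet."""
    for name in catalogs:
        if triplet in contents.get(name, set()):
            return name
    return None


# Hypothetical catalogs that both provide the same (model, exp, source) triplet
contents = {
    "catalog-a": {("IFS", "exp1", "monthly")},
    "catalog-b": {("IFS", "exp1", "monthly")},
}
installed = ["catalog-a", "catalog-b"]
triplet = ("IFS", "exp1", "monthly")

print(resolve(installed, contents, triplet))                            # catalog-a
print(resolve(set_default(installed, "catalog-b"), contents, triplet))  # catalog-b
```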
aqua list
This command lists the available catalogs in the installation folder. It will also show if a catalog is installed in editable mode.
- -a, --all
This will also show all the fixes, grids and data models installed.
aqua uninstall
This command removes the configuration and catalog files from the installation folder. If the installation was done in editable mode, only the links will be removed.
Note
If you need to reinstall aqua, the command aqua install will ask if you want to overwrite the existing files.
aqua update
This command will update all the configuration files, both from AQUA core and AQUA diagnostics, if installed in the current environment.
- -c, --catalog
This option checks whether a new version of the catalog is available and updates it by overwriting the current installation. It works only for catalogs installed from the Climate-DT repository and does not work for catalogs installed in editable mode. It is possible to specify ‘all’ as the catalog name to update all the catalogs that are not installed in editable mode.
aqua fixes {add,remove} <fixes-file>
This subcommand adds a fixes YAML file to, or removes it from, the list of available installed fixes. When adding, it copies the fixes file to the destination folder, or creates a symbolic link if the editable mode is used. This is useful if a new external fix is created and needs to be added to the list of available fixes.
- <fixes-file>
The path of the file to be added. This is a mandatory field.
- -e, --editable
It will create a symbolic link to the fix folder. Valid only for aqua fixes add.
aqua grids {add,remove} <grid-file>
This subcommand adds a grids YAML file to, or removes it from, the list of available installed grids. When adding, it copies the grids file to the destination folder, or creates a symbolic link if the editable mode is used. This is useful if new external grids are created and need to be added to the list of available grids.
- <grid-file>
The path of the file to be added. This is a mandatory field.
- -e, --editable
It will create a symbolic link to the grid folder. Valid only for aqua grids add.
aqua grids set <path>
This subcommand sets in the configuration file the path to the grids, areas and weights folders.
If you need to deploy grids in this path, you can use the aqua grids deploy command (see aqua grids deploy <grid-name>).
- <path>
The path to the grids, areas and weights folders. This is a mandatory field. The code will create the subfolders grids, areas and weights in the specified path.
Note
By default, there is no need to set the path to the grids, areas and weights folders. AQUA will determine the path automatically based on the machine in the configuration file. This command is useful on new machines or if you don’t have access to the default folders. In a local installation, for example, catalogs will not be able to find the grids, areas and weights unless this command is used to set the correct path.
aqua grids deploy <grid-name>
This subcommand is used to deploy a grid from a bucket to the local file system. It is useful to set up AQUA on a new machine, where the grids are not available yet. A match for the grid name (wildcards are supported) will be searched in the grids file in the configuration folder. If a match is found, the grid will be deployed from the bucket to the local file system.
Note
In order to avoid unwanted overwriting of existing grids in a shared system,
the command will work only if the user has set a custom path for the grids, areas and weights folders with the aqua grids set command.
See the aqua grids set <path> section for more details.
- <grid-name>
The name of the grid to be deployed. This is a mandatory field.
Example usage:
To deploy the atmospheric healpix10 grid, you can run the following command:
aqua grids deploy hpz10-nested
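To get a feeling for how a wildcard in the grid name could select several entries, here is a minimal sketch based on shell-style patterns from the Python standard library. It assumes that the wildcard semantics are similar to those of fnmatch, which is not confirmed by the text. Apart from hpz10-nested, the grid names are hypothetical.

```python
from fnmatch import fnmatchcase


def match_grids(pattern, names):
    """Return the grid names matching a shell-style pattern, sorted."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Hypothetical grid names, except hpz10-nested which appears in the example above
available = ["hpz10-nested", "hpz7-nested", "r100"]

print(match_grids("hpz*-nested", available))  # ['hpz10-nested', 'hpz7-nested']
print(match_grids("hpz10-nested", available))  # ['hpz10-nested']
```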
aqua grids build
This subcommand is used to build grids from sources. Given a specific Reader() source, it tries to build a grid file based on the data available.
This is available for regular, healpix and curvilinear grids. Partial support for unstructured grids is also available, while gaussian grids are not supported yet.
It also creates the corresponding grid entry in the grid file in the config/grids folder.
The following options are available for aqua grids build:
- -c, --config <file>
YAML configuration file for the builder. If not specified, options must be provided via CLI.
- --catalog <catalog>
Catalog identifying the source for the Reader() call.
- -m, --model <model>
Model name (e.g. “IFS”) for the Reader() call. (Required)
- -e, --exp <experiment>
Experiment name for the Reader() call. (Required)
- -s, --source <source>
Data source for the Reader() call. (Required)
- -l, --loglevel <level>
Log level for the builder. Default is WARNING.
- --rebuild
Rebuild the grid even if it already exists.
- --version <version>
Version number for the grid file. Currently integer versioning is supported. Useful for multiple versions of the same grid.
- --outdir <directory>
Output directory for the grid file. Default is the current directory.
- --original <resolution>
Original resolution of the grid. Useful for masked grids which have been remapped to a different resolution.
- --modelname <name>
Alternative name for the model for grid naming. Useful for coupled models sources.
- --gridname <name>
Alternative name for the grid for grid naming. Required for Curvilinear and Unstructured grids, where the CDO grids cannot be guessed.
- --vert_coord <name>
Name of the vertical coordinate in the dataset. Useful for 3D datasets where the vertical coordinate is not automatically detected by GridInspector.
- --fix
Apply fixes and data model to the original source before building the grid. Useful for models with very specific coordinates/dimensions. Recommended as the default setting; disable it if issues arise.
- --verify
Verify the grid file after creation. This is done by calling CDO via smmregrid to check if the weights generation is valid.
- --yaml
Create the grid entry in the grid file after building. This has to be added to catalog source_grid_name manually to be used by the Reader. Please keep in mind that this is not verified yet.
- --force_unstructured
Force the grid detection to use unstructured grid type. Useful for datasets with ambiguous grid types (e.g. gaussian regular with inverted lon/lat dimensions).
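When the builder is driven from a script, the command line can be assembled programmatically. The sketch below uses only option names listed above. The helper name grids_build_command and the experiment and source names are hypothetical, and the command is only printed, not executed.

```python
import shlex


def grids_build_command(model, exp, source, catalog=None, outdir=None,
                        gridname=None, fix=False, verify=False, rebuild=False):
    """Assemble an `aqua grids build` command as a list of arguments."""
    cmd = ["aqua", "grids", "build",
           "--model", model, "--exp", exp, "--source", source]
    if catalog:
        cmd += ["--catalog", catalog]
    if outdir:
        cmd += ["--outdir", outdir]
    if gridname:
        cmd += ["--gridname", gridname]
    if fix:
        cmd.append("--fix")
    if verify:
        cmd.append("--verify")
    if rebuild:
        cmd.append("--rebuild")
    return cmd


# "my-exp" and "my-source" are placeholder names
command = grids_build_command("IFS", "my-exp", "my-source", fix=True, verify=True)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run in an environment where the aqua command is installed.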
aqua drop -c <config_file> <drop-options>
This subcommand launches DROP (Data Reduction OPerator). For a full description of the DROP functionalities, please refer to the DROP - Data Reduction OPerator section. In most cases, it is better to embed this tool within a batch job.
aqua analysis <analysis-options>
This subcommand launches the analysis tool, which is a flexible wrapper for the diagnostics. It allows you to run a set of diagnostics on a specific experiment. For a complete description of the analysis tool, please refer to the AQUA analysis wrapper section.