pycontrails.datalib.ecmwf.arco_era5

Support for ARCO ERA5.

This module supports:

  • Downloading ARCO ERA5 model level data for specific times and pressure level variables.

  • Downloading ARCO ERA5 single level data for specific times and single level variables.

  • Interpolating model level data to a target lat-lon grid and pressure levels.

  • Local caching of the downloaded and interpolated data as netCDF files.

  • Opening cached data as a pycontrails.MetDataset object.

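This module requires additional dependencies: model level access relies on the metview package, which must be installed manually (see open_arco_era5_model_level_data below).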

Functions

open_arco_era5_model_level_data(t, ...)

Open ARCO ERA5 model level data for a specific time and variables.

open_arco_era5_single_level(t, variables)

Open ARCO ERA5 single level data for a specific date and variables.

pressure_levels_at_model_levels(alt_ft_min, ...)

Return the pressure levels at each model level assuming a constant surface pressure.

Classes

ARCOERA5(time, variables[, pressure_levels, ...])

ARCO ERA5 data accessed remotely through Google Cloud Storage.

class pycontrails.datalib.ecmwf.arco_era5.ARCOERA5(time, variables, pressure_levels=None, grid=0.25, cachestore=<object object>, n_jobs=1, cleanup_metview_tempfiles=True)

Bases: ECMWFAPI

ARCO ERA5 data accessed remotely through Google Cloud Storage.

This is a high-level interface to access and cache ARCO ERA5 for a predefined set of times, variables, and pressure levels.

Added in version 0.50.0.

Parameters:
  • time (TimeInput) – Time of the data to open.

  • variables (VariableInput) – List of variables to open.

  • pressure_levels (PressureLevelInput, optional) – Target pressure levels, [\(hPa\)]. For pressure level data, this should be a sorted (increasing or decreasing) list of integers. For single level data, this should be -1. By default, the pressure levels are set to the pressure levels at each model level between 20,000 and 50,000 ft assuming a constant surface pressure.

  • grid (float, optional) – Target grid resolution, [\(\deg\)]. Default is 0.25.

  • cachestore (CacheStore, optional) – Cache store to use. By default, a new disk cache store is used. If None, no caching is done.

  • n_jobs (int, optional) – EXPERIMENTAL: Number of parallel jobs to use for downloading data. By default, 1.

  • cleanup_metview_tempfiles (bool, optional) – If True, clean up all TEMP_DIRECTORY/tmp*.grib files. The implementation is brittle and may not work on all systems. By default, True.

References

[Carver and Merose, 2023]
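
As a rough illustration of the constructor documented above, the sketch below collects the arguments in a plain dictionary and then builds the data source. The time range, variable names, and pressure levels are illustrative assumptions (in particular, it is assumed that VariableInput accepts standard names such as "air_temperature"); the import is done lazily so the helper that builds the request can be inspected without pycontrails or metview installed.

```python
from datetime import datetime


def build_arco_request():
    """Collect constructor arguments for ARCOERA5 (all values are illustrative)."""
    return {
        # Assumption: a (start, end) pair of datetimes is a valid TimeInput
        "time": (datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 2)),
        # Assumption: variables may be given by standard name
        "variables": ["air_temperature", "specific_humidity"],
        # Target pressure levels in hPa, sorted (here decreasing)
        "pressure_levels": [300, 250, 200],
        # Target grid resolution in degrees (the documented default)
        "grid": 0.25,
    }


def open_arco_met(request):
    """Construct the data source and open it as a MetDataset."""
    # Imported lazily: requires pycontrails, and metview for model level data
    from pycontrails.datalib.ecmwf.arco_era5 import ARCOERA5

    source = ARCOERA5(**request)
    return source.open_metdataset()
```

Calling open_arco_met(build_arco_request()) would trigger the download, interpolation, and caching steps described at the top of this module, so it needs network access and a working metview installation.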

create_cachepath(t)

Return cachepath to local data file based on datetime.

Parameters:

t (datetime) – Datetime of datafile

Returns:

str – Path to cached data file

download_dataset(times)

Download data from data source for input times.

Parameters:

times (list[datetime]) – List of datetimes to download and store in cache
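
The n_jobs parameter suggests that per-time downloads can run in parallel. The sketch below shows one generic way to fan a per-time task out over a worker pool; it is not the pycontrails implementation. Both download_all and the download_one callable are hypothetical names, and a thread pool is chosen here purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(times, download_one, n_jobs=1):
    """Apply ``download_one`` to each time, optionally using a worker pool.

    ``download_one`` is a hypothetical callable that handles a single time.
    """
    if n_jobs <= 1:
        # Serial path: process the times one by one
        return [download_one(t) for t in times]
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # Executor.map returns results in the same order as ``times``
        return list(pool.map(download_one, times))
```

Keeping a serial path for n_jobs=1 makes the default behaviour easy to debug, since no pool is created at all.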

grid

Lat / Lon grid spacing

open_metdataset(dataset=None, xr_kwargs=None, **kwargs)

Open MetDataset from data source.

This method should download / load any required datafiles and return a MetDataset of the multi-file dataset opened by xarray.

Parameters:
  • dataset (xr.Dataset | None, optional) – Input xr.Dataset loaded manually. The dataset must have the same format as the original data source API or files.

  • xr_kwargs (dict[str, Any] | None, optional) – Dictionary of keyword arguments passed into xarray.open_mfdataset() when opening files. Examples include “chunks”, “engine”, “parallel”, etc. Ignored if dataset is input.

  • **kwargs (Any) – Keyword arguments passed through directly into MetDataset constructor.

Returns:

MetDataset – Meteorology dataset
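
A small sketch of passing xr_kwargs through open_metdataset is shown below. The helper name open_with_chunks and the default chunking are assumptions made for illustration; the only call used on the data source is open_metdataset(xr_kwargs=...), as documented above.

```python
def open_with_chunks(source, chunks=None):
    """Open a MetDataset, forwarding a chunking hint to xarray.

    ``source`` is any object exposing ``open_metdataset(xr_kwargs=...)``,
    for example an ARCOERA5 instance.
    """
    # Assumption: one chunk per time step is a reasonable illustrative default
    xr_kwargs = {"chunks": chunks if chunks is not None else {"time": 1}}
    return source.open_metdataset(xr_kwargs=xr_kwargs)
```

Because xr_kwargs is ignored when a dataset is passed in directly, this helper only makes sense when the files are opened from the cache.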

paths

Path to local source files to load. Set to the paths of files cached in cachestore if no paths input is provided on init.

property pressure_level_variables

Variables available in the ARCO ERA5 model level data.

Returns:

list[MetVariable] | None – List of MetVariable available in datasource

pressure_levels

List of pressure levels. Set to [-1] for data without level coordinate. Use parse_pressure_levels() to handle PressureLevelInput.

set_metadata(ds)

Set met source metadata on ds.attrs.

This is called within the open_metdataset() method to set metadata on the returned MetDataset instance.

Parameters:

ds (xr.Dataset | MetDataset) – Dataset to set metadata on. Mutated in place.

property single_level_variables

Variables available in the ARCO ERA5 single level data.

Returns:

list[MetVariable] | None – List of MetVariable available in datasource

timesteps

List of individual timesteps from data source derived from time. Use parse_time() to handle TimeInput.
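
To make the relationship between a time range and individual timesteps concrete, the sketch below expands a (start, end) pair into hourly steps. ERA5 is an hourly product, but the exact rounding rule is an assumption here (start floored and end ceiled to the whole hour), and hourly_timesteps is a hypothetical helper rather than the library's parser.

```python
from datetime import datetime, timedelta


def hourly_timesteps(start, end):
    """Expand a (start, end) pair into whole-hour timesteps (inclusive)."""
    step = timedelta(hours=1)
    # Floor the start to the whole hour
    first = start.replace(minute=0, second=0, microsecond=0)
    # Ceil the end to the whole hour
    last = end.replace(minute=0, second=0, microsecond=0)
    if last < end:
        last += step
    n_steps = int((last - first) / step)
    return [first + i * step for i in range(n_steps + 1)]
```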

variables

Variables requested from data source. Use parse_variables() to handle VariableInput.

pycontrails.datalib.ecmwf.arco_era5.open_arco_era5_model_level_data(t, variables, pressure_levels, grid)

Open ARCO ERA5 model level data for a specific time and variables.

This function downloads moisture, wind, and surface data from the ARCO ERA5 Zarr stores and interpolates the data to a target grid and pressure levels.

This function requires the metview package to be installed. It is not available as an optional pycontrails dependency, and instead must be installed manually.

Parameters:
  • t (datetime.datetime) – Time of the data to open.

  • variables (list[met_var.MetVariable]) – List of variables to open. Unsupported variables are ignored.

  • pressure_levels (list[int]) – Target pressure levels, [\(hPa\)]. For metview compatibility, this should be a sorted (increasing or decreasing) list of integers. Floating point values are treated as integers in metview.

  • grid (float) – Target grid resolution, [\(\deg\)]. A value of 0.25 is recommended.

Returns:

xarray.Dataset – Dataset with the requested variables on the target grid and pressure levels. Data is reformatted for MetDataset conventions. Data is not cached.
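
Since pressure_levels should be a sorted list of integers, it can help to normalize the levels before calling the function. The sketch below is one way to do that; normalize_pressure_levels is a hypothetical helper, and it rounds explicitly so the result does not depend on how metview converts floating point values.

```python
def normalize_pressure_levels(levels, decreasing=False):
    """Round levels to integer hPa, drop duplicates, and sort them."""
    # A set removes levels that collapse to the same integer after rounding
    unique = {int(round(p)) for p in levels}
    return sorted(unique, reverse=decreasing)
```

For example, normalize_pressure_levels([250.4, 300, 249.6, 200]) yields [200, 250, 300], because both 250.4 and 249.6 round to 250.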

pycontrails.datalib.ecmwf.arco_era5.open_arco_era5_single_level(t, variables)

Open ARCO ERA5 single level data for a specific date and variables.

Parameters:
  • t (datetime.date) – Date of the data to open.

  • variables (list[met_var.MetVariable]) – List of variables to open.

Returns:

xarray.Dataset – Dataset with the requested variables. Data is reformatted for MetDataset conventions. Data is not cached.

Raises:

FileNotFoundError – If the variable is not found at the requested date. This could indicate that the variable is not available in the ARCO ERA5 dataset, or that the time requested is outside the available range.
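
Because a missing variable or an out-of-range date surfaces as FileNotFoundError, callers may want to handle that case explicitly. The sketch below wraps an opener callable so the failure becomes a None result; open_or_none is a hypothetical helper, and the opener is passed in as an argument so the wrapper itself does not require pycontrails.

```python
from datetime import date


def open_or_none(day, variables, opener):
    """Call ``opener(day, variables)`` and return None if the data is missing.

    ``opener`` is expected to behave like open_arco_era5_single_level,
    raising FileNotFoundError when the variable or date is unavailable.
    """
    try:
        return opener(day, variables)
    except FileNotFoundError:
        # Variable not in the dataset, or date outside the available range
        return None
```

In practice the opener would be open_arco_era5_single_level, called with a datetime.date and a list of MetVariable objects.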

pycontrails.datalib.ecmwf.arco_era5.pressure_levels_at_model_levels(alt_ft_min, alt_ft_max)

Return the pressure levels at each model level assuming a constant surface pressure.

The pressure levels are rounded to the nearest hPa.

Parameters:
  • alt_ft_min (float) – Minimum altitude, [\(ft\)].

  • alt_ft_max (float) – Maximum altitude, [\(ft\)].

Returns:

list[int] – List of pressure levels, [\(hPa\)].
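
To show how altitude bounds in feet translate into pressure bounds in hPa, the sketch below uses the International Standard Atmosphere (valid here up to about 20 km) and then keeps the candidate pressures that fall between the two bounds, rounded to the nearest hPa. This is not the library's implementation: isa_pressure_hpa and select_levels are hypothetical helpers, and the candidate pressures are supplied by the caller rather than derived from the ECMWF model level definition.

```python
import math

FT_TO_M = 0.3048          # metres per foot
P0_HPA = 1013.25          # ISA sea level pressure, hPa
T0_K = 288.15             # ISA sea level temperature, K
LAPSE_K_PER_M = 0.0065    # ISA tropospheric lapse rate, K/m
G = 9.80665               # gravitational acceleration, m/s^2
R_D = 287.05287           # specific gas constant of dry air, J/(kg K)
H_TROPO_M = 11_000.0      # ISA tropopause height, m
T_TROPO_K = T0_K - LAPSE_K_PER_M * H_TROPO_M  # 216.65 K


def isa_pressure_hpa(alt_ft):
    """ISA pressure in hPa at an altitude in feet (below about 20 km)."""
    h = alt_ft * FT_TO_M
    exponent = G / (R_D * LAPSE_K_PER_M)
    if h <= H_TROPO_M:
        # Troposphere: temperature decreases linearly with height
        return P0_HPA * (1.0 - LAPSE_K_PER_M * h / T0_K) ** exponent
    # Lower stratosphere: isothermal layer above the tropopause
    p_tropo = P0_HPA * (T_TROPO_K / T0_K) ** exponent
    return p_tropo * math.exp(-G * (h - H_TROPO_M) / (R_D * T_TROPO_K))


def select_levels(candidates_hpa, alt_ft_min, alt_ft_max):
    """Keep candidate pressures between the altitude bounds, as integer hPa."""
    p_max = isa_pressure_hpa(alt_ft_min)  # lower altitude, higher pressure
    p_min = isa_pressure_hpa(alt_ft_max)  # higher altitude, lower pressure
    return sorted({round(p) for p in candidates_hpa if p_min <= p <= p_max})
```

Under these assumptions, the default range of 20,000 to 50,000 ft used by ARCOERA5 corresponds to roughly 466 hPa down to 116 hPa.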