pycontrails.datalib.ecmwf.era5_model_level

Model-level ERA5 data access.

This module supports

  • Retrieving model-level ERA5 data by submitting MARS requests through the Copernicus CDS.

  • Processing retrieved model-level files to produce netCDF files on target pressure levels.

  • Local caching of processed netCDF files.

  • Opening processed and cached files as a pycontrails.MetDataset object.

Consider using pycontrails.datalib.ecmwf.ARCOERA5 to access model-level data from the nominal ERA5 reanalysis between 1959 and 2022. pycontrails.datalib.ecmwf.ARCOERA5 accesses data through Google’s Analysis-Ready, Cloud Optimized ERA5 dataset and has lower latency than this module, which retrieves data from the Copernicus Climate Data Store. This module must be used to retrieve model-level data from ERA5 ensemble members or for more recent dates.

Classes

ERA5ModelLevel(time, variables, *[, ...])

Class to support model-level ERA5 data access, download, and organization.

class pycontrails.datalib.ecmwf.era5_model_level.ERA5ModelLevel(time, variables, *, pressure_levels=None, timestep_freq=None, product_type='reanalysis', grid=None, model_levels=None, ensemble_members=None, cachestore=<object object>, cache_download=False, url=None, key=None)

Bases: ECMWFAPI

Class to support model-level ERA5 data access, download, and organization.

The interface is similar to pycontrails.datalib.ecmwf.ERA5, which downloads pressure-level data with much lower vertical resolution.

Requires an account with the Copernicus Data Portal and local credentials.

API credentials can be stored in a ~/.cdsapirc file or as CDSAPI_URL and CDSAPI_KEY environment variables.

export CDSAPI_URL=…

export CDSAPI_KEY=…

Credentials can also be provided directly through the url and key keyword arguments.

See cdsapi documentation for more information.
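
The lookup order described here and in the url and key parameter descriptions can be sketched with the standard library alone. The helper name resolve_cds_setting is hypothetical and is not part of pycontrails or cdsapi. It only illustrates that an explicit keyword argument takes precedence over the environment variable, and that a value of None is left for the cdsapi package to resolve. The URL and key values are placeholders.

```python
import os


def resolve_cds_setting(explicit, env_name, environ=None):
    """Return the explicit value if given, else the environment variable, else None.

    Hypothetical helper for illustration only. A result of None means the
    decision is left to the cdsapi package.
    """
    env = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    return env.get(env_name)


# Placeholder values, not real credentials or endpoints.
fake_env = {"CDSAPI_URL": "https://example.invalid/api", "CDSAPI_KEY": "env-key"}

url = resolve_cds_setting(None, "CDSAPI_URL", fake_env)  # taken from the environment
key = resolve_cds_setting("kwarg-key", "CDSAPI_KEY", fake_env)  # keyword argument wins
missing = resolve_cds_setting(None, "CDSAPI_KEY", {})  # None: cdsapi decides
```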

Parameters:
  • time (metsource.TimeInput | None) – The time range for data retrieval, either a single datetime or (start, end) datetime range. Input must be datetime-like or tuple of datetime-like (datetime.datetime, pandas.Timestamp, numpy.datetime64) specifying the (start, end) of the date range, inclusive. NetCDF files will be downloaded from CDS in chunks no larger than 1 month for the nominal reanalysis and no larger than 1 day for ensemble members. This ensures that exactly one request is submitted per file on tape accessed. If None, paths must be defined and all time coordinates will be loaded from files.

  • variables (metsource.VariableInput) – Variable name or list of variable names (e.g. “t”, “air_temperature”, [“air_temperature”, “specific_humidity”])

  • pressure_levels (metsource.PressureLevelInput, optional) – Pressure levels for data, in hPa (mbar). To download surface-level parameters, use pycontrails.datalib.ecmwf.ERA5. Defaults to pressure levels that match model levels at a nominal surface pressure.

  • timestep_freq (str, optional) – Manually set the timestep interval within the bounds defined by time. Supports any string that can be passed to pd.date_range(freq=...). By default, this is set to “1h” for reanalysis products and “3h” for ensemble products.

  • product_type (str, optional) – Product type, one of “reanalysis” and “ensemble_members”. Unlike pycontrails.datalib.ecmwf.ERA5, this class does not support direct access to the ensemble mean and spread, which are not available on model levels.

  • grid (float, optional) – Specify latitude/longitude grid spacing in data. By default, this is set to 0.25 for reanalysis products and 0.5 for ensemble products.

  • model_levels (list[int], optional) – Specify ECMWF model levels to include in MARS requests. By default, this is set to include all model levels.

  • ensemble_members (list[int], optional) – Specify ensemble members to include. Valid only when the product type is “ensemble_members”. By default, includes every available ensemble member.

  • cachestore (cache.CacheStore | None, optional) – Cache data store for staging processed netCDF files. Defaults to pycontrails.core.cache.DiskCacheStore. If None, cache is turned off.

  • cache_download (bool, optional) – If True, cache downloaded model-level files rather than storing them in a temporary file. By default, False.

  • url (str | None) – Override the default cdsapi url. As of August 2024, the url for the CDS-Beta is “https://cds-beta.climate.copernicus.eu/api”, and the url for the legacy server is “https://cds.climate.copernicus.eu/api/v2”. If None, the url is set by the CDSAPI_URL environment variable. If this is not defined, the cdsapi package will determine the url.

  • key (str | None) – Override default cdsapi key. If None, the key is set by the CDSAPI_KEY environment variable. If this is not defined, the cdsapi package will determine the key.
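
A minimal usage sketch under stated assumptions: pycontrails is installed, CDS credentials are configured, and the time range, variables, and pressure levels below are arbitrary illustrative choices. The keyword names come from the class signature above. open_era5_model_level is a hypothetical wrapper, and calling it may submit a real request to CDS, so it is defined here but not invoked.

```python
from datetime import datetime

# Arbitrary illustrative request: three hours of nominal reanalysis data.
era5_kwargs = {
    "time": (datetime(2024, 3, 1, 0), datetime(2024, 3, 1, 3)),
    "variables": ["air_temperature", "specific_humidity"],
    "pressure_levels": [300, 250, 200],  # hPa
    "product_type": "reanalysis",
    "timestep_freq": "1h",  # documented default for reanalysis
    "grid": 0.25,  # documented default for reanalysis
}


def open_era5_model_level(**kwargs):
    """Build an ERA5ModelLevel source and open it as a MetDataset.

    Hypothetical wrapper. The import is deferred so this snippet can be
    loaded without pycontrails; calling the function requires pycontrails,
    CDS credentials, and network access.
    """
    from pycontrails.datalib.ecmwf.era5_model_level import ERA5ModelLevel

    era5 = ERA5ModelLevel(**kwargs)
    return era5.open_metdataset()


# met = open_era5_model_level(**era5_kwargs)  # uncomment to download and open
```

For ensemble data, the same sketch would set product_type to "ensemble_members" and could pass ensemble_members to restrict the request.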

create_cachepath(t)

Return cachepath to local ERA5 data file based on datetime.

This uniquely defines a cached data file with class parameters.

Parameters:

t (datetime | pd.Timestamp) – Datetime of datafile

Returns:

str – Path to local ERA5 data file
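
One way a path can be made unique to a timestep and the class parameters is to hash them into the file name. The sketch below is an assumption for illustration, not necessarily the scheme pycontrails uses: sketch_cache_name, the choice of fields, and the "era5ml-" prefix are all hypothetical.

```python
import hashlib
from datetime import datetime


def sketch_cache_name(t, variables, pressure_levels, grid, product_type):
    """Derive a deterministic file name from a timestep and request settings.

    Hypothetical scheme for illustration; the actual naming used by
    pycontrails may differ.
    """
    parts = [
        t.strftime("%Y%m%d-%H"),
        ",".join(sorted(variables)),
        ",".join(str(p) for p in sorted(pressure_levels)),
        str(grid),
        product_type,
    ]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return f"era5ml-{digest}.nc"


t = datetime(2024, 3, 1, 0)
a = sketch_cache_name(t, ["t", "q"], [250, 300], 0.25, "reanalysis")
b = sketch_cache_name(t, ["q", "t"], [300, 250], 0.25, "reanalysis")  # same settings, reordered
c = sketch_cache_name(t, ["t", "q"], [250, 300], 0.5, "reanalysis")  # different grid
```

Sorting the inputs before hashing means that a and b are identical, while changing any setting, as in c, yields a different name.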

property dataset

Select dataset for downloading model-level data.

Always returns “reanalysis-era5-complete”.

Returns:

str – Model-level ERA5 dataset name in CDS

download_dataset(times)

Download data from data source for input times.

Parameters:

times (list[datetime]) – List of datetimes to download and store in cache
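
The time parameter description states that downloads are split into chunks no larger than one month for the nominal reanalysis and no larger than one day for ensemble members. A minimal sketch of such a grouping follows, using only the standard library. group_request_times is a hypothetical name, grouping by calendar month and calendar day is an assumption, and the real implementation may organize requests differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def group_request_times(times, product_type):
    """Group timesteps by calendar month (reanalysis) or calendar day (ensemble members)."""
    if product_type == "reanalysis":
        key_format = "%Y-%m"
    elif product_type == "ensemble_members":
        key_format = "%Y-%m-%d"
    else:
        raise ValueError(f"Unknown product_type: {product_type!r}")

    groups = defaultdict(list)
    for t in sorted(times):
        groups[t.strftime(key_format)].append(t)
    return dict(groups)


# 12-hourly timesteps spanning a month boundary: Jan 30 00:00 through Feb 1 12:00.
start = datetime(2024, 1, 30, 0)
times = [start + timedelta(hours=12 * i) for i in range(6)]

monthly = group_request_times(times, "reanalysis")  # two groups, one per month
daily = group_request_times(times, "ensemble_members")  # three groups, one per day
```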

grid

Lat / Lon grid spacing

mars_request(times)

Generate MARS request for specific list of times.

Parameters:

times (list[datetime]) – Times included in MARS request.

Returns:

dict[str, str] – MARS request for submission to Copernicus CDS.
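
For orientation, a MARS request is a flat mapping of keywords to strings, with multiple values joined by "/". The sketch below assembles such a mapping from a list of times using standard MARS keywords (class, levtype, date, time, levelist, param, grid). It is an illustration under assumptions, not the exact dictionary this method returns: sketch_mars_request is a hypothetical name, and the caller is assumed to supply ECMWF parameter IDs (here 130 for temperature and 133 for specific humidity) and model levels.

```python
from datetime import datetime


def sketch_mars_request(times, param_ids, model_levels, grid):
    """Assemble an illustrative MARS-style request with string values."""
    dates = sorted({t.strftime("%Y-%m-%d") for t in times})
    hours = sorted({t.strftime("%H:%M:%S") for t in times})
    return {
        "class": "ea",  # ERA5
        "levtype": "ml",  # model levels
        "date": "/".join(dates),
        "time": "/".join(hours),
        "levelist": "/".join(str(level) for level in sorted(model_levels)),
        "param": "/".join(str(p) for p in sorted(param_ids)),
        "grid": f"{grid}/{grid}",
    }


times = [datetime(2024, 3, 1, 0), datetime(2024, 3, 1, 1)]
request = sketch_mars_request(times, param_ids=[133, 130], model_levels=[80, 79], grid=0.25)
```

Collecting dates and times into sets before joining keeps each value listed once, even when many timesteps fall on the same day.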

open_metdataset(dataset=None, xr_kwargs=None, **kwargs)

Open MetDataset from data source.

This method should download / load any required datafiles and return a MetDataset of the multi-file dataset opened by xarray.

Parameters:
  • dataset (xr.Dataset | None, optional) – Input xr.Dataset loaded manually. The dataset must have the same format as the original data source API or files.

  • xr_kwargs (dict[str, Any] | None, optional) – Dictionary of keyword arguments passed into xarray.open_mfdataset() when opening files. Examples include “chunks”, “engine”, “parallel”, etc. Ignored if dataset is input.

  • **kwargs (Any) – Keyword arguments passed through directly into MetDataset constructor.

Returns:

MetDataset – Meteorology dataset

paths

Path to local source files to load. Set to the paths of files cached in cachestore if no paths input is provided on init.

property pressure_level_variables

ECMWF pressure level parameters available on model levels.

Returns:

list[MetVariable] – List of MetVariable available in datasource

pressure_levels

List of pressure levels. Set to [-1] for data without level coordinate. Use parse_pressure_levels() to handle PressureLevelInput.

set_metadata(ds)

Set met source metadata on ds.attrs.

This is called within the open_metdataset() method to set metadata on the returned MetDataset instance.

Parameters:

ds (xr.Dataset | MetDataset) – Dataset to set metadata on. Mutated in place.

property single_level_variables

ECMWF single-level parameters available on model levels.

Returns:

list[MetVariable] – Always returns an empty list. To access single-level variables, use pycontrails.datalib.ecmwf.ERA5.

timesteps

List of individual timesteps from data source derived from time. Use parse_time() to handle TimeInput.

variables

Variables requested from data source. Use parse_variables() to handle VariableInput.