pycontrails.datalib.ecmwf.era5_model_level
Model-level ERA5 data access.
This module supports:
- Retrieving model-level ERA5 data by submitting MARS requests through the Copernicus CDS.
- Processing retrieved model-level files to produce netCDF files on target pressure levels.
- Local caching of processed netCDF files.
- Opening processed and cached files as a pycontrails.MetDataset object.
Consider using pycontrails.datalib.ecmwf.ERA5ARCO
to access model-level data from the nominal ERA5 reanalysis between 1959 and 2022.
pycontrails.datalib.ecmwf.ERA5ARCO accesses data through Google’s
Analysis-Ready, Cloud Optimized ERA5 dataset
and has lower latency than this module, which retrieves data from the
Copernicus Climate Data Store.
This module must be used to retrieve model-level data from ERA5 ensemble members
or for more recent dates.
Classes
ERA5ModelLevel | Class to support model-level ERA5 data access, download, and organization.
- class pycontrails.datalib.ecmwf.era5_model_level.ERA5ModelLevel(time, variables, *, pressure_levels=None, timestep_freq=None, product_type='reanalysis', grid=None, model_levels=None, ensemble_members=None, cachestore=<object object>, cache_download=False, url=None, key=None)
Bases: ECMWFAPI
Class to support model-level ERA5 data access, download, and organization.
The interface is similar to pycontrails.datalib.ecmwf.ERA5, which downloads pressure-level data with much lower vertical resolution.
Requires an account with the Copernicus Data Portal and local credentials.
API credentials can be stored in a ~/.cdsapirc file or as CDSAPI_URL and CDSAPI_KEY environment variables.
export CDSAPI_URL=…
export CDSAPI_KEY=…
Credentials can also be provided directly via the url and key keyword args.
See cdsapi documentation for more information.
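A minimal sketch of one plausible credential lookup order (explicit keyword arguments, then environment variables, then the rc file), using only the standard library. The helper name `resolve_cds_credentials` and its arguments are hypothetical and are not part of pycontrails or cdsapi; the rc-file parsing assumes the simple `url: ...` / `key: ...` layout.

```python
import os
from pathlib import Path


def resolve_cds_credentials(url=None, key=None, rc_path=None, environ=None):
    """Hypothetical helper: resolve (url, key) from kwargs, env vars, then an rc file."""
    environ = os.environ if environ is None else environ
    rc_path = Path.home() / ".cdsapirc" if rc_path is None else Path(rc_path)

    # Parse "name: value" lines from the rc file, if it exists.
    rc = {}
    if rc_path.is_file():
        for line in rc_path.read_text().splitlines():
            name, sep, value = line.partition(":")
            if sep:
                rc[name.strip()] = value.strip()

    # Explicit keyword arguments win, then environment variables, then the rc file.
    url = url or environ.get("CDSAPI_URL") or rc.get("url")
    key = key or environ.get("CDSAPI_KEY") or rc.get("key")
    return url, key
```

Passing `environ` and `rc_path` explicitly keeps the helper easy to test without touching real credentials.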
- Parameters:
time (metsource.TimeInput | None) – The time range for data retrieval, either a single datetime or (start, end) datetime range. Input must be datetime-like or tuple of datetime-like (datetime.datetime, pandas.Timestamp, numpy.datetime64) specifying the (start, end) of the date range, inclusive. NetCDF files will be downloaded from CDS in chunks no larger than 1 month for the nominal reanalysis and no larger than 1 day for ensemble members. This ensures that exactly one request is submitted per file on tape accessed. If None, paths must be defined and all time coordinates will be loaded from files.
variables (metsource.VariableInput) – Variable name (e.g. “t”, “air_temperature”, [“air_temperature”, “specific_humidity”])
pressure_levels (metsource.PressureLevelInput, optional) – Pressure levels for data, in hPa (mbar). To download surface-level parameters, use pycontrails.datalib.ecmwf.ERA5. Defaults to pressure levels that match model levels at a nominal surface pressure.
timestep_freq (str, optional) – Manually set the timestep interval within the bounds defined by time. Supports any string that can be passed to pd.date_range(freq=...). By default, this is set to “1h” for reanalysis products and “3h” for ensemble products.
product_type (str, optional) – Product type, one of “reanalysis” or “ensemble_members”. Unlike pycontrails.datalib.ecmwf.ERA5, this class does not support direct access to the ensemble mean and spread, which are not available on model levels.
grid (float, optional) – Specify latitude/longitude grid spacing in data. By default, this is set to 0.25 for reanalysis products and 0.5 for ensemble products.
model_levels (list[int], optional) – Specify ECMWF model levels to include in MARS requests. By default, this is set to include all model levels.
ensemble_members (list[int], optional) – Specify ensemble members to include. Valid only when the product type is “ensemble_members”. By default, includes every available ensemble member.
cachestore (cache.CacheStore | None, optional) – Cache data store for staging processed netCDF files. Defaults to pycontrails.core.cache.DiskCacheStore. If None, cache is turned off.
cache_download (bool, optional) – If True, cache downloaded model-level files rather than storing them in a temporary file. By default, False.
url (str | None) – Override the default cdsapi url. As of January 2025, the url for the CDS Server is “https://cds.climate.copernicus.eu/api”. If None, the url is set by the CDSAPI_URL environment variable. If this is not defined, the cdsapi package will determine the url.
key (str | None) – Override default cdsapi key. If None, the key is set by the CDSAPI_KEY environment variable. If this is not defined, the cdsapi package will determine the key.
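The time parameter states that downloads are split into chunks of at most one month for the nominal reanalysis and at most one day for ensemble members. The grouping can be sketched as follows; `chunk_times` is a hypothetical helper written for illustration and is not the pycontrails implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def chunk_times(times, product_type="reanalysis"):
    """Hypothetical helper: group timesteps into per-request chunks."""
    chunks = defaultdict(list)
    for t in sorted(times):
        if product_type == "reanalysis":
            key = (t.year, t.month)         # at most one month per request
        elif product_type == "ensemble_members":
            key = (t.year, t.month, t.day)  # at most one day per request
        else:
            raise ValueError(f"Unknown product_type: {product_type!r}")
        chunks[key].append(t)
    return list(chunks.values())


# Four timesteps, 12 hours apart, crossing the January/February boundary
start = datetime(2022, 1, 30, 12)
times = [start + timedelta(hours=12 * i) for i in range(4)]
monthly = chunk_times(times, "reanalysis")        # two chunks: January, February
daily = chunk_times(times, "ensemble_members")    # three chunks: Jan 30, Jan 31, Feb 1
```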
- cache_dataset(dataset)
Cache data from data source.
- Parameters:
dataset (xarray.Dataset) – Dataset loaded from remote API or local files. The dataset must have the same format as the original data source API or files.
- cachestore
Cache store for intermediates while processing data source. If None, cache is turned off.
- create_cachepath(t)
Return cachepath to local ERA5 data file based on datetime.
This uniquely defines a cached data file with class parameters.
- Parameters:
t (datetime | pd.Timestamp) – Datetime of datafile
- Returns:
str – Path to local ERA5 data file
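To illustrate how a datetime plus the class parameters can define a unique cache file, here is a rough sketch that hashes those inputs into a file name. The function `sketch_cachepath`, the choice of inputs, the `era5ml` prefix, and the use of SHA-1 are all assumptions made for this example; the actual naming scheme used by create_cachepath may differ.

```python
import hashlib
from datetime import datetime


def sketch_cachepath(t, variables, pressure_levels, grid, prefix="era5ml"):
    """Hypothetical helper: deterministic cache file name for one timestep."""
    # Sort inputs so that argument order does not change the resulting name.
    parts = [
        t.strftime("%Y%m%d%H"),
        ".".join(str(p) for p in sorted(pressure_levels)),
        ".".join(sorted(variables)),
        str(grid),
    ]
    digest = hashlib.sha1("-".join(parts).encode("utf-8")).hexdigest()
    return f"{prefix}-{digest}.nc"


path_a = sketch_cachepath(datetime(2022, 3, 1, 12), ["t", "q"], [300, 250], 0.25)
path_b = sketch_cachepath(datetime(2022, 3, 1, 12), ["q", "t"], [250, 300], 0.25)
path_c = sketch_cachepath(datetime(2022, 3, 1, 13), ["t", "q"], [300, 250], 0.25)
```

Here `path_a` and `path_b` are identical because the inputs are sorted before hashing, while `path_c` differs because the hour changed.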
- property dataset
Select dataset for downloading model-level data.
Always returns “reanalysis-era5-complete”.
- Returns:
str – Model-level ERA5 dataset name in CDS
- download(**xr_kwargs)
Confirm all data files are downloaded and available locally in the cachestore.
- Parameters:
**xr_kwargs – Passed into xarray.open_dataset() via is_datafile_cached().
- download_dataset(times)
Download data from data source for input times.
- Parameters:
times (list[datetime]) – List of datetimes to download and store in cache
- grid
Lat / Lon grid spacing
- property hash
Generate a unique hash for this datasource.
- Returns:
str – Unique hash for met instance (sha1)
- is_datafile_cached(t, **xr_kwargs)
Check datafile defined by datetime for variables and pressure levels in class.
If using a cloud cache store (e.g. cache.GCPCacheStore), this is where the datafile will be mirrored to a local file for access.
- Parameters:
t (datetime) – Datetime of datafile
**xr_kwargs (Any) – Additional kwargs passed directly to xarray.open_mfdataset() when opening files. By default, the following values are used if not specified:
chunks: {“time”: 1}
engine: “netcdf4”
parallel: False
- Returns:
bool – True if data file exists for datetime with all variables and pressure levels, False otherwise
- property is_single_level
Return True if the datasource is single level data.
Added in version 0.50.0.
- list_timesteps_cached(**xr_kwargs)
Get a list of data files available locally in the cachestore.
- Parameters:
**xr_kwargs – Passed into xarray.open_dataset() via is_datafile_cached().
- list_timesteps_not_cached(**xr_kwargs)
Get a list of data files not available locally in the cachestore.
- Parameters:
**xr_kwargs – Passed into xarray.open_dataset() via is_datafile_cached().
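Both listing methods rely on a per-timestep cache check. A minimal sketch of that partitioning, assuming a caller-supplied predicate in place of is_datafile_cached(); the helper `partition_timesteps` and the `available` set are hypothetical and only stand in for a real cache lookup.

```python
from datetime import datetime


def partition_timesteps(timesteps, is_cached):
    """Hypothetical helper: split timesteps into (cached, not_cached) lists."""
    cached, not_cached = [], []
    for t in timesteps:
        (cached if is_cached(t) else not_cached).append(t)
    return cached, not_cached


# Stand-in for a cache lookup: pretend only the 00:00 file exists locally.
available = {datetime(2022, 3, 1, 0)}
timesteps = [datetime(2022, 3, 1, h) for h in (0, 1, 2)]
cached, missing = partition_timesteps(timesteps, lambda t: t in available)
```

Only the timesteps in `missing` would then need to be requested from the remote source.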
- mars_request(times)
Generate MARS request for specific list of times.
- Parameters:
times (list[datetime]) – Times included in MARS request.
- Returns:
dict[str, str] – MARS request for submission to Copernicus CDS.
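For orientation, the sketch below assembles a dictionary of strings in the general shape of a MARS request for ERA5 model-level analysis fields. The function `sketch_mars_request` is hypothetical, and the specific keys and values (for example `stream` set to `oper`) are assumptions based on common ERA5 MARS usage; the dictionary actually returned by mars_request() may contain different entries.

```python
from datetime import datetime


def sketch_mars_request(times, model_levels, param_ids, grid=0.25):
    """Hypothetical helper: assemble a MARS-style request as a dict of strings."""
    dates = sorted({t.strftime("%Y-%m-%d") for t in times})
    hours = sorted({t.strftime("%H:%M:%S") for t in times})
    return {
        "class": "ea",      # ERA5
        "expver": "1",
        "stream": "oper",   # assumed: nominal reanalysis stream
        "type": "an",       # analysis fields
        "levtype": "ml",    # model levels
        "levelist": "/".join(str(lev) for lev in sorted(model_levels)),
        "param": "/".join(str(p) for p in sorted(param_ids)),
        "date": "/".join(dates),
        "time": "/".join(hours),
        "grid": f"{grid}/{grid}",
        "format": "netcdf",
    }


times = [datetime(2022, 3, 1, h) for h in (0, 1)]
# ECMWF parameter ids 130 (temperature) and 133 (specific humidity)
request = sketch_mars_request(times, model_levels=[137, 136], param_ids=[130, 133])
```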
- open_dataset(disk_paths, **xr_kwargs)
Open multi-file dataset in xarray.
- Parameters:
disk_paths (str | list[str] | pathlib.Path | list[pathlib.Path]) – Path or list of paths to local files to open
**xr_kwargs (Any) – Additional kwargs passed directly to xarray.open_mfdataset() when opening files. By default, the following values are used if not specified:
chunks: {“time”: 1}
engine: “netcdf4”
parallel: False
lock: False
- Returns:
xarray.Dataset – Open xarray dataset
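The phrase "used if not specified" means caller-supplied keyword arguments take precedence over the listed defaults. A minimal sketch of that merging behaviour, using a hypothetical helper `with_default_xr_kwargs` rather than the library's own code:

```python
DEFAULT_XR_KWARGS = {
    "chunks": {"time": 1},
    "engine": "netcdf4",
    "parallel": False,
    "lock": False,
}


def with_default_xr_kwargs(**xr_kwargs):
    """Hypothetical helper: fill in defaults only for keys the caller did not set."""
    merged = dict(xr_kwargs)
    for name, value in DEFAULT_XR_KWARGS.items():
        merged.setdefault(name, value)
    return merged


# The caller overrides the engine and adds an extra option; other defaults are kept.
kwargs = with_default_xr_kwargs(engine="h5netcdf", decode_times=True)
```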
- open_metdataset(dataset=None, xr_kwargs=None, **kwargs)
Open MetDataset from data source.
This method should download / load any required datafiles and return a MetDataset of the multi-file dataset opened by xarray.
- Parameters:
dataset (xr.Dataset | None, optional) – Input xr.Dataset loaded manually. The dataset must have the same format as the original data source API or files.
xr_kwargs (dict[str, Any] | None, optional) – Dictionary of keyword arguments passed into xarray.open_mfdataset() when opening files. Examples include “chunks”, “engine”, “parallel”, etc. Ignored if dataset is input.
**kwargs (Any) – Keyword arguments passed through directly into MetDataset constructor.
- Returns:
MetDataset – Meteorology dataset
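A typical workflow is to construct the class and then call open_metdataset(). The sketch below uses only the constructor arguments and method documented on this page; the specific values, and the wrapper function `open_model_level_met`, are illustrative assumptions. The pycontrails import is deferred so the snippet can be loaded without the package, and the call itself is left commented out because it may submit requests to CDS and requires credentials.

```python
from datetime import datetime

# Constructor arguments taken from the parameter list on this page; values are illustrative.
era5ml_kwargs = {
    "time": (datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 2)),
    "variables": ["air_temperature", "specific_humidity"],
    "pressure_levels": [300, 250, 200],
    "grid": 1.0,
}


def open_model_level_met(**kwargs):
    """Download (if needed) and open model-level ERA5 data as a MetDataset."""
    # Imported lazily so this snippet can be loaded without pycontrails installed.
    from pycontrails.datalib.ecmwf.era5_model_level import ERA5ModelLevel

    era5ml = ERA5ModelLevel(**kwargs)
    # May submit MARS requests to CDS, which requires credentials and can be slow.
    return era5ml.open_metdataset()


# met = open_model_level_met(**era5ml_kwargs)
```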
- paths
Path to local source files to load. Set to the paths of files cached in cachestore if no paths input is provided on init.
- property pressure_level_variables
ECMWF pressure level parameters available on model levels.
- Returns:
list[MetVariable] – List of MetVariable available in datasource
- pressure_levels
List of pressure levels. Set to [-1] for data without level coordinate. Use parse_pressure_levels() to handle PressureLevelInput.
- set_metadata(ds)
Set met source metadata on ds.attrs.
This is called within the open_metdataset() method to set metadata on the returned MetDataset instance.
- Parameters:
ds (xr.Dataset | MetDataset) – Dataset to set metadata on. Mutated in place.
- property single_level_variables
ECMWF single-level parameters available on model levels.
- Returns:
list[MetVariable] – Always returns an empty list. To access single-level variables, use pycontrails.datalib.ecmwf.ERA5.
- property supported_pressure_levels
Pressure levels available from datasource.
- Returns:
list[int] | None – List of integer pressure levels for class. If None, no pressure level information available for class.
- property supported_variables
Parameters available from data source.
- Returns:
list[MetVariable] | None – List of MetVariable available in datasource
- timesteps
List of individual timesteps from data source derived from time. Use parse_time() to handle TimeInput.
- property variable_ecmwfids
Return a list of variable ecmwf_ids.
- Returns:
list[int] – List of int ECMWF param ids.
- property variable_shortnames
Return a list of variable short names.
- Returns:
list[str] – List of variable short names.
- property variable_standardnames
Return a list of variable standard names.
- Returns:
list[str] – List of variable standard names.
- variables
Variables requested from data source. Use parse_variables() to handle VariableInput.