pycontrails.datalib.ecmwf.ERA5¶
- class pycontrails.datalib.ecmwf.ERA5(time, variables, pressure_levels=-1, paths=None, timestep_freq=None, product_type='reanalysis', grid=None, cachestore=<object object>, url=None, key=None)¶
Bases:
ECMWFAPI
Class to support ERA5 data access, download, and organization.
Requires an account with the Copernicus Data Portal and local credentials.
API credentials can be stored in a ~/.cdsapirc file or as CDSAPI_URL and CDSAPI_KEY environment variables.
export CDSAPI_URL=…
export CDSAPI_KEY=…
Credentials can also be provided directly through the url and key keyword args.
See cdsapi documentation for more information.
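The three credential sources can be read as a precedence chain: an explicit keyword argument, then the environment variable, then whatever cdsapi finds on its own (for example in ~/.cdsapirc). This precedence follows the description of the url and key parameters. The sketch below only illustrates that order; resolve_credential is a hypothetical helper, not part of pycontrails or cdsapi, and the URL is a placeholder.

```python
import os


def resolve_credential(explicit, env_name, environ=None):
    """Hypothetical helper: return (value, source) for one CDS credential."""
    environ = os.environ if environ is None else environ
    if explicit is not None:
        # A value passed as a keyword argument wins.
        return explicit, "keyword argument"
    if env_name in environ:
        # Otherwise fall back to CDSAPI_URL / CDSAPI_KEY.
        return environ[env_name], "environment variable"
    # Nothing supplied: cdsapi is left to do its own lookup.
    return None, "cdsapi default"


# Placeholder environment used only for this illustration.
fake_env = {"CDSAPI_URL": "https://example.invalid/api"}
print(resolve_credential(None, "CDSAPI_URL", fake_env))
print(resolve_credential(None, "CDSAPI_KEY", fake_env))
print(resolve_credential("my-key", "CDSAPI_KEY", fake_env))
```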
- Parameters:
time (metsource.TimeInput | None) – The time range for data retrieval, either a single datetime or a (start, end) datetime range. Input must be datetime-like or a tuple of datetime-like values (datetime, pd.Timestamp, np.datetime64) specifying the (start, end) of the date range, inclusive. Datafiles will be downloaded from CDS for each day to reduce requests. If None, paths must be defined and all time coordinates will be loaded from files.
variables (metsource.VariableInput) – Variable name or list of variable names (e.g. “t”, “air_temperature”, [“air_temperature”, “relative_humidity”]).
pressure_levels (metsource.PressureLevelInput, optional) – Pressure levels for data, in hPa (mbar). Set to -1 to download surface level parameters. Defaults to -1.
paths (str | list[str] | pathlib.Path | list[pathlib.Path] | None, optional) – Path to CDS NetCDF files to load manually. Can include glob patterns to load specific files. Defaults to None, which looks for files in the cachestore or CDS.
timestep_freq (str, optional) – Manually set the timestep interval within the bounds defined by time. Supports any string that can be passed to pd.date_range(freq=…). By default, this is set to “1h” for reanalysis products and “3h” for ensemble products.
product_type (str, optional) – Product type, one of “reanalysis”, “ensemble_mean”, “ensemble_members”, “ensemble_spread”.
grid (float, optional) – Specify latitude/longitude grid spacing in data. By default, this is set to 0.25 for reanalysis products and 0.5 for ensemble products.
cachestore (cache.CacheStore | None, optional) – Cache data store for staging ECMWF ERA5 files. Defaults to cache.DiskCacheStore. If None, cache is turned off.
url (str | None) – Override the default cdsapi url. As of August 2024, the url for the CDS-Beta is “https://cds-beta.climate.copernicus.eu/api”, and the url for the legacy server is “https://cds.climate.copernicus.eu/api/v2”. If None, the url is set by the CDSAPI_URL environment variable. If this is not defined, the cdsapi package will determine the url.
key (str | None) – Override the default cdsapi key. If None, the key is set by the CDSAPI_KEY environment variable. If this is not defined, the cdsapi package will determine the key.
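To make the interaction between time and timestep_freq concrete, the sketch below expands an inclusive (start, end) range into individual timesteps at a fixed interval. It uses only the standard library, and expand_timesteps is a hypothetical helper for illustration rather than the function pycontrails uses internally.

```python
from datetime import datetime, timedelta


def expand_timesteps(start, end, step=timedelta(hours=1)):
    """Hypothetical helper: inclusive list of timesteps from start to end."""
    if end < start:
        raise ValueError("end must not precede start")
    steps = []
    t = start
    while t <= end:
        steps.append(t)
        t += step
    return steps


# Hourly steps, matching the documented "1h" default for reanalysis products.
hourly = expand_timesteps(datetime(2020, 6, 1, 12), datetime(2020, 6, 1, 15))

# Three-hourly steps, matching the documented "3h" default for ensemble products.
three_hourly = expand_timesteps(
    datetime(2020, 6, 1, 0), datetime(2020, 6, 1, 12), timedelta(hours=3)
)

print(len(hourly), len(three_hourly))
```

With pandas available, pd.date_range(start, end, freq="1h") produces the same inclusive sequence for the hourly case.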
Notes
ERA5 parameter list: https://confluence.ecmwf.int/pages/viewpage.action?pageId=82870405#ERA5:datadocumentation-Parameterlistings
All radiative quantities are accumulated. See https://www.ecmwf.int/sites/default/files/elibrary/2015/18490-radiation-quantities-ecmwf-model-and-mars.pdf for more information.
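Because radiative quantities are accumulated, they arrive as energy per unit area (J m-2) rather than as a flux (W m-2). A minimal sketch of the conversion follows. The one-hour accumulation window is an assumption made for illustration: the actual window depends on the product and should be checked against the ECMWF note linked above. accumulated_to_mean_flux is a hypothetical helper, not a pycontrails function.

```python
def accumulated_to_mean_flux(accumulated_j_per_m2, accumulation_seconds=3600.0):
    """Convert accumulated energy (J m-2) to a mean flux (W m-2).

    The default window of 3600 s is an assumption for this example.
    """
    if accumulation_seconds <= 0:
        raise ValueError("accumulation_seconds must be positive")
    return accumulated_j_per_m2 / accumulation_seconds


# 1.8 MJ m-2 accumulated over one hour is a mean flux of 500 W m-2.
print(accumulated_to_mean_flux(1_800_000.0))
```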
Local paths are loaded using xarray.open_mfdataset(). Pass xr_kwargs inputs to open_metdataset() to customize file loading.
Examples
>>> from datetime import datetime
>>> from pycontrails.datalib.ecmwf import ERA5
>>> from pycontrails import GCPCacheStore

>>> # Store data files from CDS to local disk (default behavior)
>>> era5 = ERA5(
...     "2020-06-01 12:00:00",
...     variables=["air_temperature", "relative_humidity"],
...     pressure_levels=[350, 300]
... )

>>> # cache files to google cloud storage
>>> gcp_cache = GCPCacheStore(
...     bucket="contrails-301217-unit-test",
...     cache_dir="ecmwf",
... )
>>> era5 = ERA5(
...     "2020-06-01 12:00:00",
...     variables=["air_temperature", "relative_humidity"],
...     pressure_levels=[350, 300],
...     cachestore=gcp_cache
... )
- __init__(time, variables, pressure_levels=-1, paths=None, timestep_freq=None, product_type='reanalysis', grid=None, cachestore=<object object>, url=None, key=None)¶
Methods
__init__(time, variables[, pressure_levels, ...])
cache_dataset(dataset) – Cache data from data source.
create_cachepath(t) – Return cachepath to local ERA5 data file based on datetime.
download(**xr_kwargs) – Confirm all data files are downloaded and available locally in the cachestore.
download_dataset(times) – Download data from data source for input times.
is_datafile_cached(t, **xr_kwargs) – Check datafile defined by datetime for variables and pressure levels in class.
list_timesteps_cached(**xr_kwargs) – Get a list of data files available locally in the cachestore.
list_timesteps_not_cached(**xr_kwargs) – Get a list of data files not available locally in the cachestore.
open_dataset(disk_paths, **xr_kwargs) – Open multi-file dataset in xarray.
open_metdataset([dataset, xr_kwargs]) – Open MetDataset from data source.
set_metadata(ds) – Set met source metadata on ds.attrs.
Attributes
product_type – Product type, one of "reanalysis", "ensemble_mean", "ensemble_members", "ensemble_spread"
cds – Handle to cdsapi.Client
url – User provided cdsapi.Client url
key – User provided cdsapi.Client key
dataset – Select dataset for download based on pressure_levels.
grid – Lat / Lon grid spacing
hash – Generate a unique hash for this datasource.
is_single_level – Return True if the datasource is single level data.
paths – Path to local source files to load.
pressure_level_variables – ECMWF pressure level parameters.
pressure_levels – List of pressure levels.
single_level_variables – ECMWF surface level parameters.
supported_pressure_levels – Get pressure levels available from ERA5 pressure level dataset.
supported_variables – Parameters available from data source.
timesteps – List of individual timesteps from data source derived from time. Use parse_time() to handle TimeInput.
variable_ecmwfids – Return a list of variable ecmwf_ids.
variable_shortnames – Return a list of variable short names.
variable_standardnames – Return a list of variable standard names.
variables – Variables requested from data source. Use parse_variables() to handle VariableInput.
cachestore – Cache store for intermediates while processing data source. If None, cache is turned off.
- cds¶
Handle to
cdsapi.Client
- create_cachepath(t)¶
Return cachepath to local ERA5 data file based on datetime.
This uniquely defines a cached data file with class parameters.
- Parameters:
t (datetime | pd.Timestamp) – Datetime of datafile
- Returns:
str – Path to local ERA5 data file
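One way to picture a cache path that is unique per datetime and per set of class parameters is to hash the request parameters into the file name. The sketch below is an illustration only: sketch_cachepath, its arguments, and the file name layout are hypothetical and do not reproduce the naming scheme pycontrails actually uses.

```python
import hashlib
from datetime import datetime


def sketch_cachepath(t, variables, pressure_levels, product_type="reanalysis", grid=0.25):
    """Hypothetical: deterministic file name from a datetime and request parameters."""
    # Sort list inputs so that argument order does not change the name.
    key = "|".join(
        [
            t.strftime("%Y%m%d-%H"),
            ",".join(sorted(variables)),
            ",".join(str(p) for p in sorted(pressure_levels)),
            product_type,
            str(grid),
        ]
    )
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return f"era5-{t:%Y%m%d%H}-{digest[:12]}.nc"


t = datetime(2020, 6, 1, 12)
path_a = sketch_cachepath(t, ["air_temperature", "relative_humidity"], [350, 300])
path_b = sketch_cachepath(t, ["relative_humidity", "air_temperature"], [300, 350])
path_c = sketch_cachepath(t, ["air_temperature", "relative_humidity"], [250])
print(path_a == path_b, path_a == path_c)
```

The same request always maps to the same file, while changing any parameter yields a different name, so a cached file is never reused for a request it does not match.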
- property dataset¶
Select dataset for download based on pressure_levels.
One of “reanalysis-era5-pressure-levels” or “reanalysis-era5-single-levels”
- Returns:
str – ERA5 dataset name in CDS
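The selection rule can be sketched as below, assuming that a pressure_levels value of -1 (or [-1]) marks a single-level request, as the pressure_levels parameter description states. select_dataset is a hypothetical stand-in for the property, not the library's implementation.

```python
def select_dataset(pressure_levels):
    """Hypothetical: pick the CDS dataset name from the requested levels."""
    # Accept either a bare integer or a list of integers.
    if isinstance(pressure_levels, int):
        levels = [pressure_levels]
    else:
        levels = list(pressure_levels)
    if levels == [-1]:
        return "reanalysis-era5-single-levels"
    return "reanalysis-era5-pressure-levels"


print(select_dataset(-1))
print(select_dataset([350, 300]))
```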
- download_dataset(times)¶
Download data from data source for input times.
- Parameters:
times (list[datetime]) – List of datetimes to download and store in cache
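The class description notes that datafiles are downloaded from CDS for each day to reduce requests. A minimal sketch of that batching idea is shown below; group_by_day is a hypothetical helper, and the real download logic may organize requests differently.

```python
from collections import defaultdict
from datetime import datetime


def group_by_day(times):
    """Hypothetical: bucket datetimes by calendar day, one bucket per request."""
    groups = defaultdict(list)
    for t in sorted(times):
        groups[t.date()].append(t)
    return dict(groups)


times = [
    datetime(2020, 6, 1, 23),
    datetime(2020, 6, 2, 0),
    datetime(2020, 6, 1, 22),
]
for day, day_times in group_by_day(times).items():
    # Each day would correspond to one download request in this sketch.
    print(day, [t.hour for t in day_times])
```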
- property hash¶
Generate a unique hash for this datasource.
- Returns:
str
– Unique hash for met instance (sha1)
- key¶
User provided cdsapi.Client key
- open_metdataset(dataset=None, xr_kwargs=None, **kwargs)¶
Open MetDataset from data source.
This method downloads or loads any required datafiles and returns a MetDataset of the multi-file dataset opened by xarray.
- Parameters:
dataset (xr.Dataset | None, optional) – Input xr.Dataset loaded manually. The dataset must have the same format as the original data source API or files.
xr_kwargs (dict[str, Any] | None, optional) – Dictionary of keyword arguments passed into xarray.open_mfdataset() when opening files. Examples include “chunks”, “engine”, “parallel”, etc. Ignored if dataset is input.
**kwargs (Any) – Keyword arguments passed through directly into the MetDataset constructor.
- Returns:
MetDataset – Meteorology dataset
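A usage sketch for passing xr_kwargs follows. The call is wrapped in a function that is defined but not invoked, since running it needs pycontrails installed, valid CDS credentials, and network access. The time range and the xr_kwargs values are illustrative assumptions, and load_era5_met is a hypothetical name.

```python
# Keyword arguments forwarded to xarray.open_mfdataset(); values are illustrative.
XR_KWARGS = {"chunks": {"time": 1}, "parallel": False}


def load_era5_met():
    """Not called here: needs pycontrails and CDS credentials to run."""
    from pycontrails.datalib.ecmwf import ERA5

    era5 = ERA5(
        time=("2020-06-01 12:00:00", "2020-06-01 14:00:00"),
        variables=["air_temperature", "relative_humidity"],
        pressure_levels=[350, 300],
    )
    # xr_kwargs is ignored if a dataset is passed in directly.
    return era5.open_metdataset(xr_kwargs=XR_KWARGS)
```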
See also
- property pressure_level_variables¶
ECMWF pressure level parameters.
- Returns:
list[MetVariable] | None
– List of MetVariable available in datasource
- product_type¶
Product type, one of “reanalysis”, “ensemble_mean”, “ensemble_members”, “ensemble_spread”
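The product type also drives two defaults described in the class parameters: timestep_freq ("1h" for reanalysis, "3h" for ensemble products) and grid (0.25 for reanalysis, 0.5 for ensemble products). The sketch below encodes just that mapping; default_resolution is a hypothetical helper, not a pycontrails function.

```python
PRODUCT_TYPES = ("reanalysis", "ensemble_mean", "ensemble_members", "ensemble_spread")


def default_resolution(product_type):
    """Hypothetical: return (timestep_freq, grid) defaults for a product type."""
    if product_type not in PRODUCT_TYPES:
        raise ValueError(f"Unknown product_type: {product_type!r}")
    if product_type == "reanalysis":
        return "1h", 0.25
    # All ensemble products share the coarser defaults.
    return "3h", 0.5


for name in PRODUCT_TYPES:
    print(name, default_resolution(name))
```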
- set_metadata(ds)¶
Set met source metadata on ds.attrs.
This is called within the open_metdataset() method to set metadata on the returned MetDataset instance.
- Parameters:
ds (xr.Dataset | MetDataset) – Dataset to set metadata on. Mutated in place.
- property single_level_variables¶
ECMWF surface level parameters.
- Returns:
list[MetVariable] | None
– List of MetVariable available in datasource
- property supported_pressure_levels¶
Get pressure levels available from ERA5 pressure level dataset.
- Returns:
list[int]
– List of integer pressure level values
- url¶
User provided
cdsapi.Client
url