pycontrails.datalib.ecmwf.HRES¶
- class pycontrails.datalib.ecmwf.HRES(time, variables, pressure_levels=-1, paths=None, cachepath=None, grid=0.25, stream='oper', field_type='fc', forecast_time=None, cachestore=<object object>, url=None, key=None, email=None)¶
Bases:
ECMWFAPI
Class to support HRES data access, download, and organization.
Requires an account with ECMWF and an API key.
API credentials are set in a local ~/.ecmwfapirc file:
{ "url": "https://api.ecmwf.int/v1", "email": "<email>", "key": "<key>" }
Credentials can also be provided directly with the url, key, and email keyword args.
See ecmwf-api-client documentation for more information.
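As a minimal sketch of the credentials file described above, the snippet below writes the three fields as JSON. The helper name `write_ecmwf_credentials` and the placeholder values are illustrative assumptions, not part of pycontrails or ecmwf-api-client, and the file is written to a temporary directory rather than the real home directory.

```python
import json
import tempfile
from pathlib import Path


def write_ecmwf_credentials(path, url, email, key):
    """Write ECMWF API credentials as JSON (hypothetical helper, not part of pycontrails)."""
    credentials = {"url": url, "email": email, "key": key}
    path = Path(path)
    path.write_text(json.dumps(credentials, indent=4))
    return path


# Written to a temporary directory for illustration only;
# the file described above lives at ~/.ecmwfapirc.
tmp_dir = Path(tempfile.mkdtemp())
rc_path = write_ecmwf_credentials(
    tmp_dir / ".ecmwfapirc",
    url="https://api.ecmwf.int/v1",
    email="<email>",
    key="<key>",
)

# Read the file back to confirm it parses as JSON
loaded = json.loads(rc_path.read_text())
```

Replace the placeholders with the values from your ECMWF account before using the file with a real client.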
- Parameters:
time (metsource.TimeInput | None) – The time range for data retrieval, either a single datetime or (start, end) datetime range. Input must be a datetime-like or tuple of datetime-like (datetime, pandas.Timestamp, numpy.datetime64) specifying the (start, end) of the date range, inclusive. If forecast_time is unspecified, the forecast time will be assumed to be the nearest synoptic hour: 00, 06, 12, 18. All subsequent times will be downloaded relative to forecast_time. If None, paths must be defined and all time coordinates will be loaded from files.
variables (metsource.VariableInput) – Variable name (e.g. “air_temperature”, [“air_temperature”, “relative_humidity”]). See pressure_level_variables for the list of available variables.
pressure_levels (metsource.PressureLevelInput, optional) – Pressure levels for data, in hPa (mbar). Set to -1 to download surface level parameters. Defaults to -1.
paths (str | list[str] | pathlib.Path | list[pathlib.Path] | None, optional) – Path to CDS NetCDF files to load manually. Can include glob patterns to load specific files. Defaults to None, which looks for files in the cachestore or CDS.
grid (float, optional) – Specify latitude/longitude grid spacing in data. Defaults to 0.25.
stream (str, optional) – “oper” = atmospheric model/HRES, “enfo” = ensemble forecast. Defaults to “oper” (HRES).
field_type (str, optional) – Field type can be e.g. forecast (fc), perturbed forecast (pf), control forecast (cf), analysis (an). Defaults to “fc”.
forecast_time (DatetimeLike, optional) – Specify forecast run by runtime. Defaults to None.
cachestore (cache.CacheStore | None, optional) – Cache data store for staging data files. Defaults to cache.DiskCacheStore. If None, cache is turned off.
url (str) – Override ecmwf-api-client url
key (str) – Override ecmwf-api-client key
email (str) – Override ecmwf-api-client email
Notes
class: in most cases this will be operational data, or “od”
stream: “enfo” = ensemble forecast, “oper” = atmospheric model/HRES
expver: experimental version, production data is 1 or 2
date: there are numerous acceptable date formats
time: forecast base time, always in synoptic time (0,6,12,18 UTC)
type: forecast (oper), perturbed or control forecast (enfo only), or analysis
levtype: options include surface, pressure levels, or model levels
levelist: list of levels in format specified by levtype levelist
param: list of variables in catalog number, long name or short name
step: hourly time steps from base forecast time
number: for ensemble forecasts, ensemble numbers
format: specify netcdf instead of default grib, DEPRECATED format
grid: specify model return grid spacing
Local paths are loaded using xarray.open_mfdataset(). Pass xr_kwargs inputs to open_metdataset() to customize file loading.
Examples
>>> from datetime import datetime
>>> from pycontrails import GCPCacheStore
>>> from pycontrails.datalib.ecmwf import HRES

>>> # Store data files to local disk (default behavior)
>>> times = (datetime(2021, 5, 1, 2), datetime(2021, 5, 1, 3))
>>> hres = HRES(times, variables="air_temperature", pressure_levels=[300, 250])

>>> # Cache files to google cloud storage
>>> gcp_cache = GCPCacheStore(
...     bucket="contrails-301217-unit-test",
...     cache_dir="ecmwf",
... )
>>> hres = HRES(
...     times,
...     variables="air_temperature",
...     pressure_levels=[300, 250],
...     cachestore=gcp_cache
... )
- __init__(time, variables, pressure_levels=-1, paths=None, cachepath=None, grid=0.25, stream='oper', field_type='fc', forecast_time=None, cachestore=<object object>, url=None, key=None, email=None)¶
Methods
__init__(time, variables[, pressure_levels, ...])
cache_dataset(dataset) – Cache data from data source.
create_cachepath(t) – Return cachepath to local data file based on datetime.
create_synoptic_time_ranges(timesteps) – Create synoptic time bounds encompassing date range.
download(**xr_kwargs) – Confirm all data files are downloaded and available locally in the cachestore.
download_dataset(times) – Download data from data source for input times.
generate_mars_request([forecast_time, ...]) – Generate MARS request in MARS request syntax.
is_datafile_cached(t, **xr_kwargs) – Check datafile defined by datetime for variables and pressure levels in class.
list_from_mars() – List metadata on query from MARS.
list_timesteps_cached(**xr_kwargs) – Get a list of data files available locally in the cachestore.
list_timesteps_not_cached(**xr_kwargs) – Get a list of data files not available locally in the cachestore.
open_dataset(disk_paths, **xr_kwargs) – Open multi-file dataset in xarray.
open_metdataset([dataset, xr_kwargs]) – Open MetDataset from data source.
set_metadata(ds) – Set met source metadata on ds.attrs.
Attributes
field_type – Field type, forecast ("fc"), perturbed forecast ("pf"), control forecast ("cf"), analysis ("an").
forecast_time – Forecast run time, either specified or assigned by the closest previous forecast run.
server – Handle to ECMWFService client.
stream – Stream type, "oper" = atmospheric model/HRES, "enfo" = ensemble forecast.
grid – Lat / Lon grid spacing.
hash – Generate a unique hash for this datasource.
is_single_level – Return True if the datasource is single level data.
paths – Path to local source files to load.
pressure_level_variables – ECMWF pressure level parameters.
pressure_levels – List of pressure levels.
single_level_variables – ECMWF surface level parameters.
step_offset – Difference between forecast_time and first timestep.
steps – Forecast steps from forecast_time corresponding to the input time.
supported_pressure_levels – Get pressure levels available from MARS.
supported_variables – Parameters available from data source.
timesteps – List of individual timesteps from data source derived from time. Use parse_time() to handle TimeInput.
variable_ecmwfids – Return a list of variable ecmwf_ids.
variable_shortnames – Return a list of variable short names.
variable_standardnames – Return a list of variable standard names.
variables – Variables requested from data source. Use parse_variables() to handle VariableInput.
cachestore – Cache store for intermediates while processing data source. If None, cache is turned off.
- cache_dataset(dataset)¶
Cache data from data source.
- Parameters:
dataset (xarray.Dataset) – Dataset loaded from remote API or local files. The dataset must have the same format as the original data source API or files.
- cachestore¶
Cache store for intermediates while processing data source. If None, cache is turned off.
- create_cachepath(t)¶
Return cachepath to local data file based on datetime.
- Parameters:
t (datetime) – Datetime of datafile
- Returns:
str – Path to cached data file
- static create_synoptic_time_ranges(timesteps)¶
Create synoptic time bounds encompassing date range.
Extracts time bounds for synoptic time range ([00:00, 11:59], [12:00, 23:59]) for a list of input timesteps.
- Parameters:
timesteps (list[pd.Timestamp]) – List of timesteps formatted as pd.Timestamps. Often this is the output from pd.date_range()
- Returns:
list[tuple[pd.Timestamp, pd.Timestamp]] – List of tuple time bounds that can be used as inputs to HRES(time=...)
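To illustrate the 12-hour windows described above, here is a minimal sketch of the idea using standard-library datetime objects. The function name `synoptic_bounds` is a hypothetical stand-in and this is not the pycontrails implementation; it only shows how each timestep maps to a [00:00, 11:59] or [12:00, 23:59] window.

```python
from datetime import datetime, timedelta


def synoptic_bounds(timesteps):
    """Return sorted, de-duplicated 12-hour windows covering each timestep.

    Hypothetical helper for illustration; not the pycontrails implementation.
    """
    bounds = set()
    for t in timesteps:
        # Start of the 12-hour window containing t: 00:00 or 12:00
        start = t.replace(hour=12 * (t.hour // 12), minute=0, second=0, microsecond=0)
        # End of the window: 11:59 or 23:59 on the same day
        end = start + timedelta(hours=11, minutes=59)
        bounds.add((start, end))
    return sorted(bounds)


# Hourly timesteps from 02:00 to 14:00 span both windows of the day
first = datetime(2021, 5, 1, 2)
timesteps = [first + timedelta(hours=h) for h in range(13)]
windows = synoptic_bounds(timesteps)
```

Each resulting tuple has the (start, end) shape that the time parameter accepts.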
- download(**xr_kwargs)¶
Confirm all data files are downloaded and available locally in the cachestore.
- Parameters:
**xr_kwargs – Passed into xarray.open_dataset() via is_datafile_cached().
- download_dataset(times)¶
Download data from data source for input times.
- Parameters:
times (list[datetime]) – List of datetimes to download and store in cache datastore
- email¶
- field_type¶
Field type, forecast (“fc”), perturbed forecast (“pf”), control forecast (“cf”), analysis (“an”).
- forecast_time¶
Forecast run time, either specified or assigned by the closest previous forecast run
- generate_mars_request(forecast_time=None, steps=None, request_type='retrieve', request_format='mars')¶
Generate MARS request in MARS request syntax.
- Parameters:
forecast_time (datetime, optional) – Base datetime for the forecast. Defaults to forecast_time.
steps (list[int], optional) – List of steps. Defaults to steps.
request_type (str, optional) – “retrieve” for download request or “list” for metadata request. Defaults to “retrieve”.
request_format (str, optional) – “mars” for MARS string format, or “dict” for dict version. Defaults to “mars”.
- Returns:
str | dict[str, Any] – Returns MARS query string if request_format is “mars”. Returns dict query if request_format is “dict”.
Notes
Brief overview of MARS request syntax
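For orientation, the sketch below shows how a dict of MARS keywords can be rendered in MARS request syntax: a verb followed by comma-separated keyword=value pairs, with list values joined by "/". The helper `format_mars_request` and the keyword values are illustrative assumptions; the exact string or dict returned by generate_mars_request() may differ.

```python
def format_mars_request(verb, params):
    """Render a verb and keyword dict in MARS request syntax.

    Hypothetical helper for illustration; not the pycontrails implementation.
    """
    parts = [verb]
    for key, value in params.items():
        # MARS separates multiple values for one keyword with "/"
        if isinstance(value, (list, tuple)):
            value = "/".join(str(v) for v in value)
        parts.append(f"{key}={value}")
    return ",\n    ".join(parts)


# Illustrative values only, using keywords from the class Notes
request = {
    "class": "od",
    "stream": "oper",
    "expver": 1,
    "date": "2021-05-01",
    "time": "00",
    "type": "fc",
    "levtype": "pl",
    "levelist": [300, 250],
    "param": ["t"],
    "step": [2, 3],
    "grid": "0.25/0.25",
}
mars_string = format_mars_request("retrieve", request)
print(mars_string)
```

Swapping the verb to "list" mirrors the request_type="list" option, which asks for metadata instead of data.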
- grid¶
Lat / Lon grid spacing
- property hash¶
Generate a unique hash for this datasource.
- Returns:
str – Unique hash for met instance (sha1)
- is_datafile_cached(t, **xr_kwargs)¶
Check datafile defined by datetime for variables and pressure levels in class.
If using a cloud cache store (e.g. cache.GCPCacheStore), this is where the datafile will be mirrored to a local file for access.
- Parameters:
t (datetime) – Datetime of datafile
**xr_kwargs (Any) – Additional kwargs passed directly to xarray.open_mfdataset() when opening files. By default, the following values are used if not specified:
chunks: {“time”: 1}
engine: “netcdf4”
parallel: False
- Returns:
bool – True if data file exists for datetime with all variables and pressure levels, False otherwise
- property is_single_level¶
Return True if the datasource is single level data.
Added in version 0.50.0.
- key¶
- list_from_mars()¶
List metadata on query from MARS.
- Returns:
str – Metadata for MARS request. Note this is queued the same as data requests.
- list_timesteps_cached(**xr_kwargs)¶
Get a list of data files available locally in the cachestore.
- Parameters:
**xr_kwargs – Passed into xarray.open_dataset() via is_datafile_cached().
- list_timesteps_not_cached(**xr_kwargs)¶
Get a list of data files not available locally in the cachestore.
- Parameters:
**xr_kwargs – Passed into xarray.open_dataset() via is_datafile_cached().
- open_dataset(disk_paths, **xr_kwargs)¶
Open multi-file dataset in xarray.
- Parameters:
disk_paths (str | list[str] | pathlib.Path | list[pathlib.Path]) – List of string paths to local files to open
**xr_kwargs (Any) – Additional kwargs passed directly to xarray.open_mfdataset() when opening files. By default, the following values are used if not specified:
chunks: {“time”: 1}
engine: “netcdf4”
parallel: False
lock: False
- Returns:
xarray.Dataset – Open xarray dataset
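The "used if not specified" behavior of xr_kwargs can be pictured as a dict merge in which user-supplied values win over the defaults. The sketch below is an assumption about the pattern, not the pycontrails source; the helper name `with_default_xr_kwargs` is hypothetical, and the defaults are the ones listed for open_dataset() above.

```python
def with_default_xr_kwargs(**xr_kwargs):
    """Merge user kwargs over the documented defaults (hypothetical helper)."""
    defaults = {
        "chunks": {"time": 1},
        "engine": "netcdf4",
        "parallel": False,
        "lock": False,
    }
    # Later keys override earlier ones, so user-supplied values take precedence
    return {**defaults, **xr_kwargs}


# Override only the engine; the other defaults are kept
merged = with_default_xr_kwargs(engine="h5netcdf")
```

The merged dict would then be the keyword arguments handed to xarray.open_mfdataset().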
- open_metdataset(dataset=None, xr_kwargs=None, **kwargs)¶
Open MetDataset from data source.
This method should download / load any required datafiles and return a MetDataset of the multi-file dataset opened by xarray.
- Parameters:
dataset (xr.Dataset | None, optional) – Input xr.Dataset loaded manually. The dataset must have the same format as the original data source API or files.
xr_kwargs (dict[str, Any] | None, optional) – Dictionary of keyword arguments passed into xarray.open_mfdataset() when opening files. Examples include “chunks”, “engine”, “parallel”, etc. Ignored if dataset is input.
**kwargs (Any) – Keyword arguments passed through directly into MetDataset constructor.
- Returns:
MetDataset – Meteorology dataset
- paths¶
Path to local source files to load. Set to the paths of files cached in cachestore if no paths input is provided on init.
- property pressure_level_variables¶
ECMWF pressure level parameters.
- Returns:
list[MetVariable] | None – List of MetVariable available in datasource
- pressure_levels¶
List of pressure levels. Set to [-1] for data without level coordinate. Use parse_pressure_levels() to handle PressureLevelInput.
- server¶
Handle to ECMWFService client
- set_metadata(ds)¶
Set met source metadata on ds.attrs.
This is called within the open_metdataset() method to set metadata on the returned MetDataset instance.
- Parameters:
ds (xr.Dataset | MetDataset) – Dataset to set metadata on. Mutated in place.
- property single_level_variables¶
ECMWF surface level parameters.
- Returns:
list[MetVariable] | None – List of MetVariable available in datasource
- property step_offset¶
Difference between forecast_time and first timestep.
- property steps¶
Forecast steps from forecast_time corresponding to the input time.
- Returns:
list[int] – List of forecast steps relative to forecast_time
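The relationship between forecast_time, steps, and step_offset can be sketched as follows. This is a simplified illustration under stated assumptions: the forecast run is taken to be the closest previous synoptic hour (00, 06, 12, 18) and steps are whole hours. The helper names are hypothetical and this is not the pycontrails implementation.

```python
from datetime import datetime


def previous_synoptic_hour(t):
    """Round a datetime down to the closest previous synoptic hour (hypothetical helper)."""
    return t.replace(hour=6 * (t.hour // 6), minute=0, second=0, microsecond=0)


def hourly_steps(forecast_time, timesteps):
    """Whole-hour offsets of each timestep from the forecast base time (hypothetical helper)."""
    return [int((t - forecast_time).total_seconds() // 3600) for t in timesteps]


# Same time range as the class example: 02:00 to 03:00 on 2021-05-01
timesteps = [datetime(2021, 5, 1, 2), datetime(2021, 5, 1, 3)]
forecast_time = previous_synoptic_hour(timesteps[0])
steps = hourly_steps(forecast_time, timesteps)

# Offset between the forecast base time and the first requested timestep
step_offset = steps[0]
```

Here the 02:00 and 03:00 timesteps fall 2 and 3 hours after the 00:00 run, so the first step doubles as the offset.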
- stream¶
stream type, “oper” = atmospheric model/HRES, “enfo” = ensemble forecast.
- property supported_pressure_levels¶
Get pressure levels available from MARS.
- Returns:
list[int] – List of integer pressure level values
- property supported_variables¶
Parameters available from data source.
- Returns:
list[MetVariable] | None – List of MetVariable available in datasource
- timesteps¶
List of individual timesteps from data source derived from time. Use parse_time() to handle TimeInput.
- url¶
- property variable_ecmwfids¶
Return a list of variable ecmwf_ids.
- Returns:
list[int] – List of int ECMWF param ids.
- property variable_shortnames¶
Return a list of variable short names.
- Returns:
list[str] – List of variable short names.
- property variable_standardnames¶
Return a list of variable standard names.
- Returns:
list[str] – List of variable standard names.
- variables¶
Variables requested from data source. Use parse_variables() to handle VariableInput.