Load ECMWF data

Requires [ecmwf] optional dependencies:

$ pip install pycontrails[ecmwf]

Support provided for ERA5 (reanalysis), HRES (forecast), and IFS (in development).

For both ERA5 and HRES, we provide interfaces for accessing “pressure-level data” (fields pre-interpolated to a fixed set of pressure levels) or “model-level data” (fields retrieved on the native vertical grid and interpolated after retrieval to an arbitrary set of pressure levels). We recommend using model-level data when possible, as the resolution of pressure-level data is coarse relative to the vertical scale of ice-supersaturated regions.
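To make the "interpolated after retrieval" step concrete, here is a minimal sketch of interpolating one vertical column from model levels to a target pressure level. It assumes linear interpolation in log-pressure, which is a common choice but is not confirmed to be what pycontrails does internally. The function name and the column values are hypothetical.

```python
import math


def interpolate_to_pressure(model_pressures, model_values, target_pressure):
    """Interpolate linearly in log-pressure between the two bracketing levels.

    model_pressures must be sorted in increasing order (hPa).
    """
    if not model_pressures[0] <= target_pressure <= model_pressures[-1]:
        raise ValueError("target pressure is outside the model-level range")
    for i in range(len(model_pressures) - 1):
        p_lo, p_hi = model_pressures[i], model_pressures[i + 1]
        if p_lo <= target_pressure <= p_hi:
            weight = (math.log(target_pressure) - math.log(p_lo)) / (
                math.log(p_hi) - math.log(p_lo)
            )
            return model_values[i] + weight * (model_values[i + 1] - model_values[i])


# Hypothetical column: pressure (hPa) and temperature (K) on four model levels
pressures = [180.0, 220.0, 260.0, 310.0]
temperatures = [212.0, 216.0, 222.0, 229.0]

# Temperature at 250 hPa, which lies between the 220 and 260 hPa model levels
t_250 = interpolate_to_pressure(pressures, temperatures, 250.0)
```

In real model-level data the pressure on each model level varies with location and time, so this calculation would be repeated for every grid column.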

Note that tools for accessing ECMWF data are not thoroughly tested in CI because they are vulnerable to upstream failures in external APIs. If you think you have found a problem, please open an issue!

ERA5

Access

Reference

ERA5 Pressure Levels

[1]:
from pycontrails.datalib.ecmwf import ERA5
[2]:
# get a single time
era5 = ERA5(
    time="2022-03-01 00:00:00",
    variables=["t", "q", "u", "v", "w", "ciwc", "z", "cc"],  # supports CF names or short names
    pressure_levels=[200, 250, 300],
    # url="https://cds.climate.copernicus.eu/api",
    # key="<key>"
)
era5
[2]:
ERA5
        Timesteps: ['2022-03-01 00']
        Variables: ['t', 'q', 'u', 'v', 'w', 'ciwc', 'z', 'cc']
        Pressure levels: [200, 250, 300]
        Grid: 0.25
        Dataset: reanalysis-era5-pressure-levels
        Product type: reanalysis
[3]:
# get a range of times on selected pressure levels
era5 = ERA5(
    time=("2022-03-01 00:00:00", "2022-03-01 03:00:00"),
    variables=[
        "air_temperature",
        "q",
        "u",
        "v",
        "w",
        "ciwc",
        "z",
        "cc",
    ],  # supports CF names or short names
    pressure_levels=[300, 250, 200],
    # url="https://cds.climate.copernicus.eu/api",
    # key="<key>"
)
era5
[3]:
ERA5
        Timesteps: ['2022-03-01 00', '2022-03-01 01', '2022-03-01 02', '2022-03-01 03']
        Variables: ['t', 'q', 'u', 'v', 'w', 'ciwc', 'z', 'cc']
        Pressure levels: [200, 250, 300]
        Grid: 0.25
        Dataset: reanalysis-era5-pressure-levels
        Product type: reanalysis
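The output above shows the (start, end) tuple expanded into hourly timesteps with both endpoints included. A minimal sketch of that expansion using only the standard library follows. The helper name is hypothetical, and this is not the pycontrails implementation.

```python
from datetime import datetime, timedelta


def hourly_timesteps(start, end):
    """Return every whole hour from start to end, inclusive."""
    fmt = "%Y-%m-%d %H:%M:%S"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    n_hours = int((t1 - t0) / timedelta(hours=1))
    return [t0 + timedelta(hours=h) for h in range(n_hours + 1)]


steps = hourly_timesteps("2022-03-01 00:00:00", "2022-03-01 03:00:00")

# Format the timesteps the same way they appear in the ERA5 repr
labels = [t.strftime("%Y-%m-%d %H") for t in steps]
```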
[4]:
# this triggers a download from CDS if the file isn't in the cache store
met_pl = era5.open_metdataset()
met_pl
[4]:
MetDataset with data:

<xarray.Dataset> Size: 797MB
Dimensions:                              (longitude: 1440, latitude: 721,
                                          level: 3, time: 4)
Coordinates:
  * latitude                             (latitude) float64 6kB -90.0 ... 90.0
  * level                                (level) float64 24B 200.0 250.0 300.0
  * time                                 (time) datetime64[ns] 32B 2022-03-01...
  * longitude                            (longitude) float64 12kB -180.0 ... ...
    air_pressure                         (level) float64 24B 2e+04 2.5e+04 3e+04
    altitude                             (level) float64 24B 1.178e+04 ... 9....
Data variables:
    air_temperature                      (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
    specific_humidity                    (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
    eastward_wind                        (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
    northward_wind                       (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
    lagrangian_tendency_of_air_pressure  (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
    specific_cloud_ice_water_content     (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
    geopotential                         (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
    fraction_of_cloud_cover              (longitude, latitude, level, time) float64 100MB dask.array<chunksize=(1440, 721, 3, 1), meta=np.ndarray>
Attributes:
    Conventions:          CF-1.6
    history:              2024-04-17 23:08:13 GMT by grib_to_netcdf-2.25.1: /...
    pycontrails_version:  0.50.3.dev18
    provider:             ECMWF
    dataset:              ERA5
    product:              reanalysis

ERA5 Single Level

[5]:
era5 = ERA5(
    time=("2022-03-01 00:00:00", "2022-03-01 03:00:00"),
    variables=["tsr", "ttr"],
    # url="https://cds.climate.copernicus.eu/api",
    # key="<key>"
)
era5
[5]:
ERA5
        Timesteps: ['2022-03-01 00', '2022-03-01 01', '2022-03-01 02', '2022-03-01 03']
        Variables: ['tsr', 'ttr']
        Pressure levels: [-1]
        Grid: 0.25
        Dataset: reanalysis-era5-single-levels
        Product type: reanalysis
[6]:
met = era5.open_metdataset()
met
[6]:
MetDataset with data:

<xarray.Dataset> Size: 66MB
Dimensions:                    (level: 1, time: 4, latitude: 721,
                                longitude: 1440)
Coordinates:
  * level                      (level) float64 8B -1.0
  * latitude                   (latitude) float64 6kB -90.0 -89.75 ... 90.0
  * time                       (time) datetime64[ns] 32B 2022-03-01 ... 2022-...
  * longitude                  (longitude) float64 12kB -180.0 -179.8 ... 179.8
Data variables:
    top_net_solar_radiation    (longitude, latitude, level, time) float64 33MB dask.array<chunksize=(1440, 721, 1, 1), meta=np.ndarray>
    top_net_thermal_radiation  (longitude, latitude, level, time) float64 33MB dask.array<chunksize=(1440, 721, 1, 1), meta=np.ndarray>
Attributes:
    Conventions:          CF-1.6
    history:              2024-04-24 21:36:27 GMT by grib_to_netcdf-2.28.1: /...
    pycontrails_version:  0.50.3.dev18
    provider:             ECMWF
    dataset:              ERA5
    product:              reanalysis

ERA5 Model Levels

[7]:
from pycontrails.datalib.ecmwf import ERA5ModelLevel

Model-level data has much higher vertical resolution than pressure-level data, so we download at coarser horizontal resolution to decrease data volume.

If target pressure levels are not explicitly provided, ERA5ModelLevel defaults to pressure levels near model levels between 20,000 and 50,000 feet. These levels are determined by reading a static file based on https://confluence.ecmwf.int/display/UDOC/L137+model+level+definitions.
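As a rough check on the 20,000 to 50,000 foot range, the sketch below converts those altitudes to pressure with the International Standard Atmosphere (ISA). Using ISA here is an assumption; the conversion used inside pycontrails may differ. The list of default levels is copied from the ERA5ModelLevel output in this section, and the function name is hypothetical.

```python
import math

# Standard ISA constants
T0 = 288.15  # sea-level temperature, K
P0 = 101325.0  # sea-level pressure, Pa
LAPSE = 0.0065  # tropospheric lapse rate, K/m
G = 9.80665  # gravitational acceleration, m/s^2
R = 287.05287  # specific gas constant of dry air, J/(kg K)
H_TROPOPAUSE = 11000.0  # top of the ISA troposphere, m


def isa_pressure_hpa(altitude_ft):
    """Return ISA pressure in hPa at an altitude given in feet (valid up to 20 km)."""
    h = altitude_ft * 0.3048
    exponent = G / (R * LAPSE)
    if h <= H_TROPOPAUSE:
        return P0 * (1 - LAPSE * h / T0) ** exponent / 100
    # Isothermal layer above the tropopause
    t_trop = T0 - LAPSE * H_TROPOPAUSE
    p_trop = P0 * (t_trop / T0) ** exponent
    return p_trop * math.exp(-G * (h - H_TROPOPAUSE) / (R * t_trop)) / 100


# Default target levels (hPa) as shown in the ERA5ModelLevel repr
default_levels = [
    121, 127, 134, 141, 148, 155, 163, 171, 180, 188,
    197, 207, 217, 227, 237, 248, 260, 272, 284, 297,
    310, 323, 337, 352, 367, 383, 399, 416, 433, 451,
]

p_top = isa_pressure_hpa(50_000)  # roughly 116 hPa
p_bottom = isa_pressure_hpa(20_000)  # roughly 466 hPa
all_inside = all(p_top < level < p_bottom for level in default_levels)
```

Under these assumptions, every default level falls between the two bounding pressures.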

[8]:
era5 = ERA5ModelLevel(
    time=("2022-03-01 00:00:00", "2022-03-01 03:00:00"),
    variables=["t", "q", "u", "v", "w", "ciwc"],
    grid=1.0,
)
era5
[8]:
ERA5ModelLevel
        Timesteps: ['2022-03-01 00', '2022-03-01 01', '2022-03-01 02', '2022-03-01 03']
        Variables: ['t', 'q', 'u', 'v', 'w', 'ciwc']
        Pressure levels: [121, 127, 134, 141, 148, 155, 163, 171, 180, 188, 197, 207, 217, 227, 237, 248, 260, 272, 284, 297, 310, 323, 337, 352, 367, 383, 399, 416, 433, 451]
        Grid: 1.0
        Dataset: reanalysis-era5-complete
        Product type: reanalysis
[9]:
met_ml = era5.open_metdataset()
met_ml
[9]:
MetDataset with data:

<xarray.Dataset> Size: 188MB
Dimensions:                              (longitude: 360, latitude: 181,
                                          level: 30, time: 4)
Coordinates:
  * time                                 (time) datetime64[ns] 32B 2022-03-01...
    step                                 timedelta64[ns] 8B 00:00:00
  * level                                (level) float64 240B 121.0 ... 451.0
  * latitude                             (latitude) float64 1kB -90.0 ... 90.0
    valid_time                           (time) datetime64[ns] 32B 2022-03-01...
  * longitude                            (longitude) float64 3kB -180.0 ... 1...
    air_pressure                         (level) float32 120B 1.21e+04 ... 4....
    altitude                             (level) float32 120B 1.497e+04 ... 6...
Data variables:
    air_temperature                      (longitude, latitude, level, time) float32 31MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
    specific_humidity                    (longitude, latitude, level, time) float32 31MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
    eastward_wind                        (longitude, latitude, level, time) float32 31MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
    northward_wind                       (longitude, latitude, level, time) float32 31MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
    lagrangian_tendency_of_air_pressure  (longitude, latitude, level, time) float32 31MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
    specific_cloud_ice_water_content     (longitude, latitude, level, time) float32 31MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
Attributes:
    GRIB_edition:            2
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-04-24T23:56 GRIB to CDM+CF via cfgrib-0.9.1...
    pycontrails_version:     0.50.3.dev24
    provider:                ECMWF
    dataset:                 ERA5
    product:                 reanalysis

HRES

Access

Users within ECMWF Member and Co-operating States may contact their Computing Representative to obtain access to MARS. All other users may request a username and password and then get an API key.

Provide the url, key, and email credentials as inputs, or see the ECMWF API Client documentation to configure a local ~/.ecmwfapirc file:

{
    "url": "https://api.ecmwf.int/v1",
    "email": "<email>",
    "key": "<key>"
}
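A minimal sketch of writing this file from Python is shown below. The helper name is hypothetical, and the example writes to a temporary directory rather than the home directory so that it does not overwrite an existing configuration. Replace the placeholder values with real credentials and point the path at ~/.ecmwfapirc for actual use.

```python
import json
import tempfile
from pathlib import Path


def write_ecmwfapirc(path, url, email, key):
    """Write ECMWF API credentials as JSON and restrict file permissions."""
    config = {"url": url, "email": email, "key": key}
    path = Path(path)
    path.write_text(json.dumps(config, indent=4) + "\n")
    path.chmod(0o600)  # readable and writable by the owner only
    return path


with tempfile.TemporaryDirectory() as tmp:
    rc_path = write_ecmwfapirc(
        Path(tmp) / ".ecmwfapirc",
        url="https://api.ecmwf.int/v1",
        email="<email>",
        key="<key>",
    )
    # Read the file back to confirm it parses as JSON
    loaded = json.loads(rc_path.read_text())
```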

Reference

  • HRES High resolution forecast

  • ENS Ensemble forecast

HRES Pressure Levels

[10]:
from datetime import datetime

from pycontrails.datalib.ecmwf import HRES
[11]:
# NOTE / TODO: When the "ciwc" variable is included here, the HRES request
# fails on historic data. However, the request seems to go through
# when the time field is recent (within the last 48 hours?)
time = datetime(2022, 3, 26, 0), datetime(2022, 3, 26, 2)
hres = HRES(
    time=time,
    variables=["t", "q", "u", "v", "w", "z"],
    pressure_levels=[300, 250, 200],
    grid=1,
    # url="https://api.ecmwf.int/v1",
    # key="<key>"
    # email="<email>"
)
hres
[11]:
HRES
        Timesteps: ['2022-03-26 00', '2022-03-26 01', '2022-03-26 02']
        Variables: ['t', 'q', 'u', 'v', 'w', 'z']
        Pressure levels: [200, 250, 300]
        Grid: 1
        Forecast time: 2022-03-26 00:00:00
        Steps: [0, 1, 2]
[12]:
# convenience method to see the underlying MARS request
print(hres.generate_mars_request())
retrieve,
        class=od,
        stream=oper,
        expver=1,
        date=20220326,
        time=00,
        type=fc,
        param=t/q/u/v/w/z,
        step=0/1/2,
        grid=1/1,
        levtype=pl,
        levelist=200/250/300
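To illustrate the layout of the request printed above, the following sketch assembles the same text from a dictionary of fields, joining list values with "/". The helper is hypothetical and only mimics the printed format; it is not how generate_mars_request is implemented.

```python
def format_mars_request(fields):
    """Format a dict of MARS keywords as a 'retrieve' request string."""
    lines = ["retrieve,"]
    items = list(fields.items())
    for i, (key, value) in enumerate(items):
        # MARS separates multiple values with a forward slash
        if isinstance(value, (list, tuple)):
            value = "/".join(str(v) for v in value)
        separator = "," if i < len(items) - 1 else ""
        lines.append(f"        {key}={value}{separator}")
    return "\n".join(lines)


# Field values copied from the request printed above
request = format_mars_request(
    {
        "class": "od",
        "stream": "oper",
        "expver": 1,
        "date": "20220326",
        "time": "00",
        "type": "fc",
        "param": ["t", "q", "u", "v", "w", "z"],
        "step": [0, 1, 2],
        "grid": [1, 1],
        "levtype": "pl",
        "levelist": [200, 250, 300],
    }
)
print(request)
```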
[13]:
# this triggers a download if the file isn't in the cache store
met_pl = hres.open_metdataset()
met_pl
[13]:
MetDataset with data:

<xarray.Dataset> Size: 14MB
Dimensions:                              (longitude: 360, latitude: 181,
                                          level: 3, time: 3)
Coordinates:
    forecast_time                        datetime64[ns] 8B 2022-03-26
  * level                                (level) float64 24B 200.0 250.0 300.0
  * latitude                             (latitude) float64 1kB -90.0 ... 90.0
  * time                                 (time) datetime64[ns] 24B 2022-03-26...
  * longitude                            (longitude) float64 3kB -180.0 ... 1...
    air_pressure                         (level) float32 12B 2e+04 2.5e+04 3e+04
    altitude                             (level) float32 12B 1.178e+04 ... 9....
Data variables:
    air_temperature                      (longitude, latitude, level, time) float32 2MB dask.array<chunksize=(360, 181, 3, 1), meta=np.ndarray>
    specific_humidity                    (longitude, latitude, level, time) float32 2MB dask.array<chunksize=(360, 181, 3, 1), meta=np.ndarray>
    eastward_wind                        (longitude, latitude, level, time) float32 2MB dask.array<chunksize=(360, 181, 3, 1), meta=np.ndarray>
    northward_wind                       (longitude, latitude, level, time) float32 2MB dask.array<chunksize=(360, 181, 3, 1), meta=np.ndarray>
    lagrangian_tendency_of_air_pressure  (longitude, latitude, level, time) float32 2MB dask.array<chunksize=(360, 181, 3, 1), meta=np.ndarray>
    geopotential                         (longitude, latitude, level, time) float32 2MB dask.array<chunksize=(360, 181, 3, 1), meta=np.ndarray>
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-04-24T23:57 GRIB to CDM+CF via cfgrib-0.9.1...
    pycontrails_version:     0.50.3.dev24
    provider:                ECMWF
    dataset:                 HRES
    product:                 forecast
    radiation_accumulated:   True

HRES Single Level

Note that accumulated parameters (e.g. top_net_thermal_radiation, toa_incident_solar_radiation, and other radiation parameters) are accumulated from the start of the forecast.
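Because these fields are running totals, a mean flux over an interval comes from differencing consecutive forecast steps and dividing by the elapsed time. The sketch below shows this for a single grid point, assuming accumulated values in J m-2 and forecast steps in hours. The helper name and the sample values are hypothetical.

```python
def mean_flux_between_steps(accumulated, step_hours):
    """Convert accumulated energy (J m-2) to mean flux (W m-2) per interval."""
    fluxes = []
    for i in range(1, len(accumulated)):
        dt_seconds = (step_hours[i] - step_hours[i - 1]) * 3600
        fluxes.append((accumulated[i] - accumulated[i - 1]) / dt_seconds)
    return fluxes


# Hypothetical accumulated radiation (J m-2) at forecast steps 0, 1, and 2 hours
accumulated = [0.0, 900_000.0, 1_980_000.0]
flux = mean_flux_between_steps(accumulated, [0, 1, 2])
```

The result has one fewer entry than the input because each flux describes the interval between two steps.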

[14]:
hres = HRES(
    time=time,
    variables=["tsr", "ttr"],
    grid=1,
    # url="https://api.ecmwf.int/v1",
    # key="<key>"
    # email="<email>"
)
[15]:
met = hres.open_metdataset()
met
[15]:
MetDataset with data:

<xarray.Dataset> Size: 2MB
Dimensions:                    (level: 1, time: 3, latitude: 181, longitude: 360)
Coordinates:
  * level                      (level) float64 8B -1.0
    forecast_time              datetime64[ns] 8B 2022-03-26
    surface                    float64 8B 0.0
  * latitude                   (latitude) float64 1kB -90.0 -89.0 ... 89.0 90.0
  * time                       (time) datetime64[ns] 24B 2022-03-26 ... 2022-...
  * longitude                  (longitude) float64 3kB -180.0 -179.0 ... 179.0
Data variables:
    top_net_solar_radiation    (longitude, latitude, level, time) float32 782kB dask.array<chunksize=(360, 181, 1, 1), meta=np.ndarray>
    top_net_thermal_radiation  (longitude, latitude, level, time) float32 782kB dask.array<chunksize=(360, 181, 1, 1), meta=np.ndarray>
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-04-24T23:57 GRIB to CDM+CF via cfgrib-0.9.1...
    pycontrails_version:     0.50.3.dev24
    provider:                ECMWF
    dataset:                 HRES
    product:                 forecast
    radiation_accumulated:   True

Specify forecast by runtime

Select data from a specific forecast run with forecast_time.

[16]:
hres = HRES(
    time=("2022-03-26 01:00:00", "2022-03-26 02:00:00"),
    variables=["t", "q"],
    pressure_levels=[300, 250, 200],
    forecast_time="2022-03-25 12:00:00",
    # url="https://api.ecmwf.int/v1",
    # key="<key>"
    # email="<email>"
)
hres
[16]:
HRES
        Timesteps: ['2022-03-26 01', '2022-03-26 02']
        Variables: ['t', 'q']
        Pressure levels: [200, 250, 300]
        Grid: 0.25
        Forecast time: 2022-03-25 12:00:00
        Steps: [13, 14]
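The Steps values in the output above are consistent with the number of whole hours between forecast_time and each requested time. A minimal sketch of that arithmetic follows; the helper name is hypothetical and this is not the pycontrails implementation.

```python
from datetime import datetime


def forecast_steps(forecast_time, times):
    """Return hours elapsed from the forecast run time to each requested time."""
    fmt = "%Y-%m-%d %H:%M:%S"
    t0 = datetime.strptime(forecast_time, fmt)
    steps = []
    for t in times:
        delta = datetime.strptime(t, fmt) - t0
        steps.append(int(delta.total_seconds() // 3600))
    return steps


steps = forecast_steps(
    "2022-03-25 12:00:00",
    ["2022-03-26 01:00:00", "2022-03-26 02:00:00"],
)
```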

HRES Model Levels

[17]:
from pycontrails.datalib.ecmwf import HRESModelLevel

Similar to the model-level ERA5 demo, we download at a relatively coarse horizontal resolution to decrease data volume.

[18]:
hres = HRESModelLevel(
    time=("2022-03-26 01:00:00", "2022-03-26 02:00:00"),
    variables=["t", "q"],
    forecast_time="2022-03-25 12:00:00",
    grid=1.0,
)
hres
[18]:
HRESModelLevel
        Timesteps: ['2022-03-26 01', '2022-03-26 02']
        Variables: ['t', 'q']
        Pressure levels: [121, 127, 134, 141, 148, 155, 163, 171, 180, 188, 197, 207, 217, 227, 237, 248, 260, 272, 284, 297, 310, 323, 337, 352, 367, 383, 399, 416, 433, 451]
        Grid: 1.0
        Forecast time: 2022-03-25 12:00:00
        Steps: [13, 14]
[19]:
met_ml = hres.open_metdataset()
met_ml
[19]:
MetDataset with data:

<xarray.Dataset> Size: 31MB
Dimensions:              (longitude: 360, latitude: 181, level: 30, time: 2)
Coordinates:
    initialization_time  datetime64[ns] 8B 2022-03-25T12:00:00
  * time                 (time) datetime64[ns] 16B 2022-03-26T01:00:00 2022-0...
  * level                (level) float64 240B 121.0 127.0 134.0 ... 433.0 451.0
  * latitude             (latitude) float64 1kB -90.0 -89.0 -88.0 ... 89.0 90.0
    valid_time           (time) datetime64[ns] 16B 2022-03-26T01:00:00 2022-0...
  * longitude            (longitude) float64 3kB -180.0 -179.0 ... 178.0 179.0
    air_pressure         (level) float32 120B 1.21e+04 1.27e+04 ... 4.51e+04
    altitude             (level) float32 120B 1.497e+04 1.466e+04 ... 6.328e+03
Data variables:
    air_temperature      (longitude, latitude, level, time) float32 16MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
    specific_humidity    (longitude, latitude, level, time) float32 16MB dask.array<chunksize=(360, 181, 30, 1), meta=np.ndarray>
Attributes:
    GRIB_edition:            2
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-04-24T23:59 GRIB to CDM+CF via cfgrib-0.9.1...
    pycontrails_version:     0.50.3.dev24
    provider:                ECMWF
    dataset:                 HRES
    product:                 forecast
    radiation_accumulated:   True

IFS

In development

Integrated Forecasting System from ECMWF

Access

IFS files must be downloaded to a local directory before accessing.

Reference

[20]:
from pycontrails.datalib.ecmwf import IFS
[21]:
ifs = IFS(
    time=("2021-10-02 00:00:00", "2021-10-02 14:00:00"),
    variables=["air_temperature"],
    forecast_path="ifs",
    forecast_date="2021-10-01",
)

ECMWF Variables

The ECMWF_VARIABLES attribute lists the supported parameters from the ECMWF Parameter DB as a list[MetVariable].

[22]:
from pycontrails.datalib.ecmwf import ECMWF_VARIABLES
[23]:
[met_var.standard_name for met_var in ECMWF_VARIABLES]
[23]:
['air_temperature',
 'specific_humidity',
 'geopotential',
 'eastward_wind',
 'northward_wind',
 'lagrangian_tendency_of_air_pressure',
 'relative_humidity',
 'atmosphere_upward_relative_vorticity',
 'fraction_of_cloud_cover',
 'specific_cloud_ice_water_content',
 'specific_cloud_liquid_water_content',
 'potential_vorticity',
 'surface_air_pressure',
 'toa_incident_solar_radiation',
 'top_net_solar_radiation',
 'top_net_thermal_radiation',
 'total_cloud_cover',
 'surface_solar_downward_radiation']
[24]:
from pycontrails.datalib.ecmwf import TopNetSolarRadiation
[25]:
# ECMWF variables contain a link to the param-db entry
TopNetSolarRadiation.ecmwf_link
[25]:
'https://apps.ecmwf.int/codes/grib/param-db?id=178'

Cache Data Files to GCP

Requires [gcp] optional dependencies:

$ pip install pycontrails[gcp]

By default, data files are cached to the local disk in the user's Caches directory.

To cache files to a remote Google Cloud Storage bucket, use the GCPCacheStore.

ERA5

[26]:
from pycontrails import GCPCacheStore
[27]:
variables = ["air_temperature", "relative_humidity"]

gcp = GCPCacheStore(bucket="contrails-301217-unit-test", cache_dir="test/era5", read_only=False)

era5 = ERA5(
    time=(datetime(2019, 1, 1, 0), datetime(2019, 1, 1, 2)),
    variables=variables,
    pressure_levels=[300, 250, 150],
    cachestore=gcp,
    # url="https://cds.climate.copernicus.eu/api",
    # key="<key>"
)
[28]:
# download data to cache - uncomment to run
# met = era5.open_metdataset()