pycontrails.GeoVectorDataset

class pycontrails.GeoVectorDataset(data=None, *, longitude=None, latitude=None, altitude=None, altitude_ft=None, level=None, time=None, attrs=None, copy=True, **attrs_kwargs)

Bases: VectorDataset

Base class to hold 1D geospatial arrays of consistent size.

GeoVectorDataset is required to have geospatial coordinate keys defined in required_keys.

Expect latitude-longitude CRS in WGS 84. Expect altitude in [\(m\)]. Expect level in [\(hPa\)].

Each spatial variable is expected to have “float32” or “float64” dtype. The time variable is expected to have “datetime64[ns]” dtype.

Use the attribute attrs["crs"] to specify the coordinate reference system using PROJ or EPSG syntax.

Parameters:
  • data (dict[str, npt.ArrayLike] | pd.DataFrame | VectorDataDict | VectorDataset | None, optional) – Data dictionary or pandas.DataFrame. Must include keys/columns time, latitude, longitude, and either altitude or level. Keyword arguments for time, latitude, longitude, altitude or level override data inputs. Expects altitude in meters and time as a DatetimeLike (or an array that can be processed with pd.to_datetime()). Additional waypoint-specific data can be included as additional keys/columns.

  • longitude (npt.ArrayLike, optional) – Longitude data. Defaults to None.

  • latitude (npt.ArrayLike, optional) – Latitude data. Defaults to None.

  • altitude (npt.ArrayLike, optional) – Altitude data, [\(m\)]. Defaults to None.

  • altitude_ft (npt.ArrayLike, optional) – Altitude data, [\(ft\)]. Defaults to None.

  • level (npt.ArrayLike, optional) – Level data, [\(hPa\)]. Defaults to None.

  • time (npt.ArrayLike, optional) – Time data. Expects an array of DatetimeLike values, or an array that can be processed with pd.to_datetime(). Defaults to None.

  • attrs (dict[Hashable, Any] | AttrDict, optional) – Additional properties as a dictionary. Defaults to {}.

  • copy (bool, optional) – Copy data on class creation. Defaults to True.

  • **attrs_kwargs (Any) – Additional properties passed as keyword arguments.

Raises:

KeyError – Raises if data input does not contain at least time, latitude, longitude, (altitude or level).
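The key rule behind this KeyError can be sketched in plain Python. The helper name check_geo_keys is hypothetical and this is not the library's implementation. It only mirrors the documented rule: every key in required_keys must be present, together with at least one key from vertical_keys.

```python
REQUIRED_KEYS = ("longitude", "latitude", "time")
VERTICAL_KEYS = ("altitude", "level", "altitude_ft")


def check_geo_keys(data):
    """Raise KeyError unless data has all required keys and one vertical key.

    Hypothetical helper for illustration only, not part of pycontrails.
    """
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise KeyError(f"Missing required keys: {missing}")
    if not any(key in data for key in VERTICAL_KEYS):
        raise KeyError(f"Need at least one vertical key from {VERTICAL_KEYS}")


# A mapping with the three required keys and one vertical key passes silently.
ok = {
    "longitude": [0.0],
    "latitude": [0.0],
    "time": ["2022-03-01T00:00:00"],
    "level": [250.0],
}
check_geo_keys(ok)
```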

__init__(data=None, *, longitude=None, latitude=None, altitude=None, altitude_ft=None, level=None, time=None, attrs=None, copy=True, **attrs_kwargs)

Methods

T_isa()

Calculate the ICAO standard atmosphere temperature at each point.

__init__([data, longitude, latitude, ...])

broadcast_attrs(keys[, overwrite, raise_error])

Attach values from keys in attrs onto data.

broadcast_numeric_attrs([ignore_keys, overwrite])

Attach numeric values in attrs onto data.

coords_intersect_met(met)

Return boolean mask of data inside the bounding box defined by met.

copy(**kwargs)

Return a copy of this VectorDatasetType class.

create_empty([keys, attrs])

Create instance with variables defined by keys and size 0.

downselect_met(met, *[, longitude_buffer, ...])

Downselect met to encompass a spatiotemporal region of the data.

ensure_vars(vars[, raise_error])

Ensure variables exist in column of data or attrs.

filter(mask[, copy])

Filter data according to a boolean array mask.

from_dict(obj[, copy])

Create instance from dict representation containing data and attrs.

generate_splits(n_splits[, copy])

Split instance into n_splits sub-vectors.

get(key[, default_value])

Get values from data with default_value if key not in data.

get_data_or_attr(key[, default])

Get value from data or attrs.

intersect_met(mda, *[, longitude, latitude, ...])

Intersect waypoints with MetDataArray.

select(keys[, copy])

Return new class instance only containing specified keys.

setdefault(key[, default])

Shortcut to VectorDataDict.setdefault().

sort(by)

Sort data by key(s).

sum(vectors[, infer_attrs, fill_value])

Sum a list of VectorDataset instances.

to_dataframe([copy])

Create pd.DataFrame in which each key-value pair in data is a column.

to_dict()

Create dictionary with data and attrs.

to_geojson_points()

Return dataset as GeoJSON FeatureCollection of Points.

to_lon_lat_grid(agg, *[, spatial_bbox, ...])

Convert vectors to a longitude-latitude grid.

to_pseudo_mercator([copy])

Convert data from attrs["crs"] to Pseudo Mercator (EPSG:3857).

transform_crs(crs[, copy])

Transform trajectory data from one coordinate reference system (CRS) to another.

update([other])

Update values in data dict without warning if overwriting.

Attributes

air_pressure

Get air_pressure values for points.

altitude

Get altitude.

altitude_ft

Get altitude in feet.

attrs

Generic dataset attributes

constants

Return a dictionary of constant attributes and data values.

coords

Get geospatial coordinates for compatibility with MetDataArray.

data

Vector data with labels as keys and numpy.ndarray as values

dataframe

Shorthand property to access to_dataframe() with copy=False.

hash

Generate a unique hash for this class instance.

level

Get pressure level values for points.

required_keys

Required keys for creating GeoVectorDataset

shape

Shape of each array in data.

size

Length of each array in data.

vertical_keys

At least one of these vertical-coordinate keys must also be included

T_isa()

Calculate the ICAO standard atmosphere temperature at each point.

Returns:

npt.NDArray[np.float64] – ISA temperature, [\(K\)]
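As a rough guide to what an ISA temperature looks like, the standard atmosphere below 20 km can be written in a few lines. This is a minimal sketch using the published ISA constants (288.15 K at sea level, a lapse rate of 6.5 K per km, and an isothermal layer at 216.65 K above 11 km). The function name isa_temperature is hypothetical, and the sketch is not taken from the pycontrails source.

```python
def isa_temperature(altitude_m):
    """ISA temperature in K for altitudes below 20 km (illustrative helper)."""
    t_sea_level = 288.15   # K, ISA sea-level temperature
    lapse_rate = 0.0065    # K per m, tropospheric lapse rate
    t_tropopause = 216.65  # K, constant between 11 km and 20 km
    # Temperature falls linearly with height until it reaches the
    # isothermal layer that starts at the tropopause.
    return max(t_sea_level - lapse_rate * altitude_m, t_tropopause)


cruise_temperature = isa_temperature(12000.0)  # inside the isothermal layer
```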

property air_pressure

Get air_pressure values for points.

Returns:

npt.NDArray[np.float64] – Point air pressure values, [\(Pa\)]

property altitude

Get altitude.

If the altitude key is not in data, altitude is calculated automatically from the level key using units.pl_to_m().

Note that if altitude key exists in data, the data at the altitude key will be returned. This allows an override of the default calculation of altitude from pressure level.

Returns:

npt.NDArray[np.float64] – Altitude, [\(m\)]
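The fallback described above can be sketched with a plain dictionary. Both helpers below are hypothetical: level_to_altitude stands in for units.pl_to_m() and uses the textbook ISA barometric formulas with standard constants, which may differ slightly from the values pycontrails uses internally.

```python
import math

P_SURFACE_HPA = 1013.25
T_SURFACE_K = 288.15
LAPSE_K_PER_M = 0.0065
G = 9.80665            # m s^-2
R_DRY_AIR = 287.05287  # J kg^-1 K^-1 (ISA value, an assumption here)
H_TROPOPAUSE_M = 11000.0
T_TROPOPAUSE_K = T_SURFACE_K - LAPSE_K_PER_M * H_TROPOPAUSE_M
EXPONENT = G / (R_DRY_AIR * LAPSE_K_PER_M)
P_TROPOPAUSE_HPA = P_SURFACE_HPA * (T_TROPOPAUSE_K / T_SURFACE_K) ** EXPONENT


def level_to_altitude(level_hpa):
    """Convert a pressure level in hPa to ISA altitude in m (below 20 km)."""
    if level_hpa >= P_TROPOPAUSE_HPA:
        # Troposphere: invert the power-law pressure profile.
        ratio = (level_hpa / P_SURFACE_HPA) ** (1.0 / EXPONENT)
        return (T_SURFACE_K / LAPSE_K_PER_M) * (1.0 - ratio)
    # Isothermal layer: invert the exponential pressure profile.
    scale_height = R_DRY_AIR * T_TROPOPAUSE_K / G
    return H_TROPOPAUSE_M + scale_height * math.log(P_TROPOPAUSE_HPA / level_hpa)


def get_altitude(data):
    """Return stored altitude if present, otherwise derive it from level."""
    if "altitude" in data:
        return list(data["altitude"])
    return [level_to_altitude(p) for p in data["level"]]


derived = get_altitude({"level": [250.0, 200.0]})
override = get_altitude({"level": [250.0], "altitude": [5000.0]})
```

Here override keeps the stored 5000.0 m value even though the 250 hPa level would map to a much higher altitude, which is the override behavior the note describes.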

property altitude_ft

Get altitude in feet.

Returns:

npt.NDArray[np.float64] – Altitude, [\(ft\)]

property constants

Return a dictionary of constant attributes and data values.

Includes attrs and values from columns in data with a unique value.

Returns:

dict[str, Any] – Properties and their constant values
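One way to picture this property is to merge attrs with every data column that holds a single unique value. The sketch below uses plain dictionaries and lists; the function name constant_values is hypothetical, and details such as NaN handling or precedence when a key appears in both places are assumptions not confirmed by this page.

```python
def constant_values(data, attrs):
    """Collect attrs plus data columns that hold exactly one unique value."""
    constants = dict(attrs)
    for key, values in data.items():
        unique = set(values)
        if len(unique) == 1:
            # The column is constant, so report its single value.
            constants[key] = unique.pop()
    return constants


data = {"longitude": [0.0, 1.0, 2.0], "altitude": [11000.0, 11000.0, 11000.0]}
attrs = {"flight_id": "example"}
constants = constant_values(data, attrs)
```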

property coords

Get geospatial coordinates for compatibility with MetDataArray.

Returns:

pandas.DataFrame – pd.DataFrame with columns longitude, latitude, level, and time.

coords_intersect_met(met)

Return boolean mask of data inside the bounding box defined by met.

Parameters:

met (MetDataset | MetDataArray) – MetDataset or MetDataArray to compare.

Returns:

npt.NDArray[np.bool_] – True if point is inside the bounding box defined by met.
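A bounding-box mask of this kind can be sketched without any library. The helper inside_bounding_box is hypothetical, and treating the box edges as inclusive is an assumption. In the real method the bounds come from the coordinates of met rather than from a hand-written dictionary.

```python
def inside_bounding_box(points, bounds):
    """Return one boolean per point: True if every coordinate is in range.

    points maps coordinate names to equal-length lists.
    bounds maps coordinate names to (low, high) tuples.
    """
    n_points = len(next(iter(points.values())))
    return [
        all(bounds[key][0] <= points[key][i] <= bounds[key][1] for key in bounds)
        for i in range(n_points)
    ]


points = {
    "longitude": [0.0, 25.0, 60.0],
    "latitude": [0.0, 5.0, 10.0],
    "level": [250.0, 250.0, 250.0],
}
bounds = {
    "longitude": (-10.0, 50.0),
    "latitude": (-5.0, 15.0),
    "level": (200.0, 300.0),
}
mask = inside_bounding_box(points, bounds)
```

The third point fails only the longitude check, so its mask entry is False while the first two are True.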

classmethod create_empty(keys=None, attrs=None, **attrs_kwargs)

Create instance with variables defined by keys and size 0.

If instance requires additional variables to be defined, these keys will automatically be attached to returned instance.

Parameters:
  • keys (Iterable[str]) – Keys to include in empty VectorDataset instance.

  • attrs (dict[str, Any] | None, optional) – Attributes to attach to the instance.

  • **attrs_kwargs (Any) – Define attributes as keyword arguments.

Returns:

VectorDatasetType – Empty VectorDataset instance.

downselect_met(met, *, longitude_buffer=(0.0, 0.0), latitude_buffer=(0.0, 0.0), level_buffer=(0.0, 0.0), time_buffer=(np.timedelta64(0, 'h'), np.timedelta64(0, 'h')), copy=True)

Downselect met to encompass a spatiotemporal region of the data.

Parameters:
  • met (MetDataset | MetDataArray) – MetDataset or MetDataArray to downselect.

  • longitude_buffer (tuple[float, float], optional) – Extend the longitude domain by longitude_buffer[0] on the low side and longitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.

  • latitude_buffer (tuple[float, float], optional) – Extend the latitude domain by latitude_buffer[0] on the low side and latitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.

  • level_buffer (tuple[float, float], optional) – Extend the level domain by level_buffer[0] on the low side and level_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) [\(hPa\)].

  • time_buffer (tuple[np.timedelta64, np.timedelta64], optional) – Extend the time domain by time_buffer[0] on the low side and time_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (np.timedelta64(0, "h"), np.timedelta64(0, "h")).

  • copy (bool) – Whether the returned object is a copy or a view of the original. Defaults to True.

Returns:

MetDataset | MetDataArray – Copy of downselected MetDataset or MetDataArray.
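The effect of each buffer tuple can be illustrated by computing the padded range that the met data would need to cover. This is a sketch only: buffered_domain is a hypothetical helper, and it uses datetime.timedelta from the standard library where the method itself expects np.timedelta64.

```python
from datetime import datetime, timedelta


def buffered_domain(values, buffer):
    """Return (low, high) covering values, padded by buffer[0] and buffer[1]."""
    return min(values) - buffer[0], max(values) + buffer[1]


longitudes = [0.0, 25.0, 50.0]
times = [datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 2)]

# Pad one degree to the west and two degrees to the east.
lon_low, lon_high = buffered_domain(longitudes, (1.0, 2.0))

# Pad nothing before the first waypoint and one hour after the last.
time_low, time_high = buffered_domain(times, (timedelta(hours=0), timedelta(hours=1)))
```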

intersect_met(mda, *, longitude=None, latitude=None, level=None, time=None, use_indices=False, **interp_kwargs)

Intersect waypoints with MetDataArray.

Parameters:
  • mda (MetDataArray) – MetDataArray containing a meteorological variable at spatio-temporal coordinates.

  • longitude (npt.NDArray[np.float64], optional) – Override existing coordinates for met interpolation

  • latitude (npt.NDArray[np.float64], optional) – Override existing coordinates for met interpolation

  • level (npt.NDArray[np.float64], optional) – Override existing coordinates for met interpolation

  • time (npt.NDArray[np.datetime64], optional) – Override existing coordinates for met interpolation

  • use_indices (bool, optional) – Experimental.

  • **interp_kwargs (Any) – Additional keyword arguments to pass to MetDataArray.intersect_met(). Examples include method, bounds_error, and fill_value. If an error such as

    ValueError: One of the requested xi is out of bounds in dimension 2
    

    occurs, try calling this function with bounds_error=False. In addition, setting fill_value=0.0 will replace NaN values with 0.0.

Returns:

npt.NDArray[np.float64] – Interpolated values
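The roles of bounds_error and fill_value are easier to see in one dimension. The function below is a hypothetical, pure-Python analogue of linear interpolation along a single axis. The real method interpolates over longitude, latitude, level, and time, and its exact behavior is defined by the underlying interpolator, so treat this only as an illustration of the two keyword arguments.

```python
import math


def interp_linear_1d(grid, values, x, bounds_error=True, fill_value=math.nan):
    """Linearly interpolate values on an ascending grid at the point x."""
    if x < grid[0] or x > grid[-1]:
        if bounds_error:
            raise ValueError("One of the requested xi is out of bounds")
        # Out-of-bounds points receive fill_value instead of raising.
        return fill_value
    for i in range(len(grid) - 1):
        x0, x1 = grid[i], grid[i + 1]
        if x0 <= x <= x1:
            weight = (x - x0) / (x1 - x0)
            return values[i] + weight * (values[i + 1] - values[i])


levels = [200.0, 250.0, 300.0]        # hPa, illustrative grid
temperatures = [218.0, 222.0, 228.0]  # K, made-up values for the sketch

inside = interp_linear_1d(levels, temperatures, 225.0)
outside = interp_linear_1d(levels, temperatures, 350.0, bounds_error=False, fill_value=0.0)
```

With the default bounds_error=True, the second call would raise ValueError instead of returning the fill value.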

Examples

>>> from datetime import datetime
>>> import pandas as pd
>>> import numpy as np
>>> from pycontrails.datalib.ecmwf import ERA5
>>> from pycontrails import Flight
>>> # Get met data
>>> times = (datetime(2022, 3, 1, 0),  datetime(2022, 3, 1, 3))
>>> variables = ["air_temperature", "specific_humidity"]
>>> levels = [300, 250, 200]
>>> era5 = ERA5(time=times, variables=variables, pressure_levels=levels)
>>> met = era5.open_metdataset()
>>> # Example flight
>>> df = pd.DataFrame()
>>> df['longitude'] = np.linspace(0, 50, 10)
>>> df['latitude'] = np.linspace(0, 10, 10)
>>> df['altitude'] = 11000
>>> df['time'] = pd.date_range("2022-03-01T00", "2022-03-01T02", periods=10)
>>> fl = Flight(df)
>>> # Intersect
>>> fl.intersect_met(met['air_temperature'], method='nearest')
array([231.62969892, 230.72604651, 232.24318771, 231.88338483,
       231.06429438, 231.59073409, 231.65125393, 231.93064004,
       232.03344087, 231.65954432])
>>> fl.intersect_met(met['air_temperature'], method='linear')
array([225.77794552, 225.13908414, 226.231218  , 226.31831528,
       225.56102321, 225.81192149, 226.03192642, 226.22056121,
       226.03770174, 225.63226188])
>>> # Interpolate and attach to `Flight` instance
>>> for key in met:
...     fl[key] = fl.intersect_met(met[key])
>>> # Show the final three columns of the dataframe
>>> fl.dataframe.iloc[:, -3:].head()
                 time  air_temperature  specific_humidity
0 2022-03-01 00:00:00       225.777946           0.000132
1 2022-03-01 00:13:20       225.139084           0.000132
2 2022-03-01 00:26:40       226.231218           0.000107
3 2022-03-01 00:40:00       226.318315           0.000171
4 2022-03-01 00:53:20       225.561022           0.000109

property level

Get pressure level values for points.

If the level key is not in data, pressure level is calculated automatically from the altitude key using units.m_to_pl().

Note that if level key exists in data, the data at the level key will be returned. This allows an override of the default calculation of pressure level from altitude.

Returns:

npt.NDArray[np.float64] – Point pressure level values, [\(hPa\)]
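For orientation, the altitude-to-pressure conversion under the ISA can be sketched as follows. The helper altitude_to_level is hypothetical and stands in for units.m_to_pl(). It uses standard ISA constants, which are assumptions here and may not match the library's values to the last digit.

```python
import math


def altitude_to_level(altitude_m):
    """Convert ISA altitude in m to pressure level in hPa (below 20 km)."""
    p_surface, t_surface, lapse = 1013.25, 288.15, 0.0065
    g, r_dry_air = 9.80665, 287.05287
    h_tropopause = 11000.0
    t_tropopause = t_surface - lapse * h_tropopause
    exponent = g / (r_dry_air * lapse)
    if altitude_m <= h_tropopause:
        # Troposphere: pressure follows a power law in temperature.
        return p_surface * (1.0 - lapse * altitude_m / t_surface) ** exponent
    # Isothermal layer: pressure decays exponentially with height.
    p_tropopause = p_surface * (t_tropopause / t_surface) ** exponent
    return p_tropopause * math.exp(
        -g * (altitude_m - h_tropopause) / (r_dry_air * t_tropopause)
    )


cruise_level = altitude_to_level(11000.0)  # roughly 226 hPa
```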

required_keys = ('longitude', 'latitude', 'time')

Required keys for creating GeoVectorDataset

to_geojson_points()

Return dataset as GeoJSON FeatureCollection of Points.

Each Feature has a properties attribute that includes time and other data besides latitude, longitude, and altitude in data.

Returns:

dict[str, Any] – Python representation of GeoJSON FeatureCollection
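The shape of such a FeatureCollection can be sketched with the standard library alone. The helper to_feature_collection is hypothetical, and details such as how time values are serialized are assumptions. The sketch only follows the description above: coordinates go into each Point geometry, and all other columns go into properties.

```python
import json


def to_feature_collection(data):
    """Build a GeoJSON FeatureCollection of Points from column-style data."""
    coordinate_keys = ("longitude", "latitude", "altitude")
    features = []
    for i in range(len(data["longitude"])):
        # Everything that is not a coordinate becomes a feature property.
        properties = {
            key: values[i] for key, values in data.items() if key not in coordinate_keys
        }
        features.append(
            {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    # GeoJSON positions are ordered longitude, latitude, elevation.
                    "coordinates": [data[key][i] for key in coordinate_keys],
                },
                "properties": properties,
            }
        )
    return {"type": "FeatureCollection", "features": features}


data = {
    "longitude": [0.0, 10.0],
    "latitude": [0.0, 5.0],
    "altitude": [11000.0, 11000.0],
    "time": ["2022-03-01T00:00:00", "2022-03-01T01:00:00"],
}
collection = to_feature_collection(data)
geojson_text = json.dumps(collection)
```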

to_lon_lat_grid(agg, *, spatial_bbox=(-180.0, -90.0, 180.0, 90.0), spatial_grid_res=0.5)

Convert vectors to a longitude-latitude grid.

See also

vector_to_lon_lat_grid

to_pseudo_mercator(copy=True)

Convert data from attrs["crs"] to Pseudo Mercator (EPSG:3857).

Parameters:

copy (bool, optional) – Copy data on transformation. Defaults to True.

Returns:

GeoVectorDatasetType
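For longitude and latitude given in WGS 84 degrees, the Pseudo Mercator projection has a simple closed form. The sketch below applies that formula directly and does not use pyproj. The function name lon_lat_to_pseudo_mercator is hypothetical, and the code is not the library's implementation.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis used by EPSG:3857


def lon_lat_to_pseudo_mercator(longitude, latitude):
    """Project WGS 84 degrees to EPSG:3857 x, y coordinates in meters."""
    x = EARTH_RADIUS_M * math.radians(longitude)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(latitude) / 2.0)
    )
    return x, y


x_origin, y_origin = lon_lat_to_pseudo_mercator(0.0, 0.0)
x_edge, _ = lon_lat_to_pseudo_mercator(180.0, 0.0)
```

The formula diverges at the poles, so it is only meaningful for latitudes strictly between -90 and 90 degrees.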

transform_crs(crs, copy=True)

Transform trajectory data from one coordinate reference system (CRS) to another.

Parameters:
  • crs (str) – Target CRS. Passed to pyproj.Transformer. The source CRS is inferred from the attrs["crs"] attribute.

  • copy (bool, optional) – Copy data on transformation. Defaults to True.

Returns:

GeoVectorDatasetType – Converted dataset with new coordinate reference system. attrs["crs"] reflects new crs.

vertical_keys = ('altitude', 'level', 'altitude_ft')

At least one of these vertical-coordinate keys must also be included