pycontrails.core.vector¶

Lightweight data structures for vector paths.

Functions

vector_to_lon_lat_grid(vector, agg, *[, ...])

Convert vectors to a longitude-latitude grid.

Classes

`AttrDict`	Thin wrapper around dict to warn when setting a key that already exists.
`GeoVectorDataset`([data, longitude, ...])	Base class to hold 1D geospatial arrays of consistent size.
`VectorDataDict`([data])	Thin wrapper around `dict[str, np.ndarray]` to ensure consistency.
`VectorDataset`([data, attrs, copy])	Base class to hold 1D arrays of consistent size.

class pycontrails.core.vector.AttrDict¶

Bases: dict[str, Any]

Thin wrapper around dict to warn when setting a key that already exists.
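
The warn-on-overwrite behaviour can be pictured with a small stand-in. This is only a sketch, not the pycontrails implementation: the class name `WarnOnOverwriteDict` and the warning text are hypothetical, and it assumes the warning is issued from item assignment.

```python
import warnings


class WarnOnOverwriteDict(dict):
    """Hypothetical stand-in: warn when an existing key is assigned again."""

    def __setitem__(self, key, value):
        if key in self:
            warnings.warn(f"Overwriting existing key '{key}'", UserWarning, stacklevel=2)
        super().__setitem__(key, value)


attrs = WarnOnOverwriteDict()
attrs["flight_id"] = "abc"  # first assignment: silent

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    attrs["flight_id"] = "xyz"  # second assignment: warns, value is still replaced

n_warnings = len(caught)
```

The assignment still goes through after the warning, so the wrapper flags accidental overwrites without blocking them.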

clear()¶: Remove all items from the dict.

copy()¶: Return a shallow copy of the dict.

fromkeys(value=None, /)¶: Create a new dictionary with keys from iterable and values set to value.

get(key, default=None, /)¶: Return the value for key if key is in the dictionary, else default.

items()¶: Return a set-like object providing a view on the dict’s items.

keys()¶: Return a set-like object providing a view on the dict’s keys.

pop(k[, d])¶: Remove the specified key and return the corresponding value. If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem()¶

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

setdefault(k, default=None)¶

Thin wrapper around dict.setdefault.

Unlike dict.setdefault, an existing value of None is overwritten with default.

Parameters:

k (str) – Key
default (Any, optional) – Default value for key k

Returns:

Any – Value at k
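
To make the difference from the built-in `dict.setdefault` concrete, here is a minimal sketch on a plain dict. The helper name `setdefault_overwriting_none` is hypothetical, and the logic is inferred from the description above rather than copied from the library.

```python
def setdefault_overwriting_none(d, key, default=None):
    """Return d[key] if it holds a non-None value; otherwise store and return default."""
    current = d.get(key)
    if current is not None:
        return current
    d[key] = default
    return default


attrs = {"engine": None, "aircraft_type": "B737"}

# A stored None counts as "not set" and is replaced by the default
engine = setdefault_overwriting_none(attrs, "engine", "CFM56")

# An existing non-None value is kept
aircraft_type = setdefault_overwriting_none(attrs, "aircraft_type", "A320")

# For comparison, the built-in keeps the stored None
builtin_result = {"engine": None}.setdefault("engine", "CFM56")
```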

update([E, ]**F)¶: Update D from mapping/iterable E and F. If E is present and has a .keys() method, then does: for k in E.keys(): D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].

values()¶: Return an object providing a view on the dict’s values.

class pycontrails.core.vector.GeoVectorDataset(data=None, *, longitude=None, latitude=None, altitude=None, altitude_ft=None, level=None, time=None, attrs=None, copy=True, **attrs_kwargs)¶

Bases: VectorDataset

Base class to hold 1D geospatial arrays of consistent size.

GeoVectorDataset is required to have geospatial coordinate keys defined in required_keys.

Latitude and longitude are expected in the WGS 84 CRS, altitude in [\(m\)], and level in [\(hPa\)].

Each spatial variable is expected to have “float32” or “float64” dtype. The time variable is expected to have “datetime64[ns]” dtype.

Parameters:

data (dict[str, npt.ArrayLike] | pd.DataFrame | VectorDataset | None, optional) – Data dictionary or pandas.DataFrame. Must include keys/columns time, latitude, longitude, and altitude or level. Keyword arguments for time, latitude, longitude, altitude or level override data inputs. Expects altitude in meters and time as a DatetimeLike (or an array that can be processed with pd.to_datetime()). Additional waypoint-specific data can be included as additional keys/columns.
longitude (npt.ArrayLike | None, optional) – Longitude data. Defaults to None.
latitude (npt.ArrayLike | None, optional) – Latitude data. Defaults to None.
altitude (npt.ArrayLike | None, optional) – Altitude data, [\(m\)]. Defaults to None.
altitude_ft (npt.ArrayLike | None, optional) – Altitude data, [\(ft\)]. Defaults to None.
level (npt.ArrayLike | None, optional) – Level data, [\(hPa\)]. Defaults to None.
time (npt.ArrayLike | None, optional) – Time data. Expects an array of DatetimeLike values, or an array that can be processed with pd.to_datetime(). Defaults to None.
attrs (dict[str, Any] | None, optional) – Additional properties as a dictionary. Defaults to {}.
copy (bool, optional) – Copy data on class creation. Defaults to True.
**attrs_kwargs (Any) – Additional properties passed as keyword arguments.

Raises:

KeyError – If the data input does not contain at least time, latitude, longitude, and one of altitude or level.
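
The key requirement can be sketched as a standalone check. This is an illustration only: the helper `check_geo_keys` is hypothetical, and the two tuples simply mirror the `required_keys` and `vertical_keys` class attributes documented for this class.

```python
REQUIRED_KEYS = ("longitude", "latitude", "time")
VERTICAL_KEYS = ("altitude", "level", "altitude_ft")


def check_geo_keys(keys):
    """Hypothetical helper: raise KeyError unless the coordinate keys are complete."""
    keys = set(keys)
    missing = [k for k in REQUIRED_KEYS if k not in keys]
    if missing:
        raise KeyError(f"Missing required keys: {missing}")
    if not keys.intersection(VERTICAL_KEYS):
        raise KeyError(f"Need at least one vertical key from {VERTICAL_KEYS}")


check_geo_keys(["longitude", "latitude", "time", "level"])  # passes silently

try:
    check_geo_keys(["longitude", "latitude", "time"])  # no vertical coordinate
    raised = False
except KeyError:
    raised = True
```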

T_isa()¶

Calculate the ICAO standard atmosphere temperature at each point.

Returns:: npt.NDArray[np.floating] – ISA temperature, [\(K\)]
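
For orientation, the ICAO standard atmosphere temperature profile below 20 km can be written in a few lines. The sketch uses the standard ISA constants and a hypothetical function name. It is not the library's code and ignores the layers above 20 km.

```python
T_MSL = 288.15         # ISA sea-level temperature, K
LAPSE_RATE = 0.0065    # ISA tropospheric lapse rate, K/m
T_TROPOPAUSE = 216.65  # ISA temperature of the isothermal layer (11 km to 20 km), K


def isa_temperature(altitude_m):
    """ISA temperature in K for altitudes up to 20 km."""
    # Linear decrease in the troposphere, constant once the tropopause value is reached
    return max(T_MSL - LAPSE_RATE * altitude_m, T_TROPOPAUSE)


t_sea_level = isa_temperature(0.0)
t_cruise = isa_temperature(10000.0)
t_stratosphere = isa_temperature(12000.0)
```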

property air_pressure¶

Get air_pressure values for points.

Returns:: npt.NDArray[np.floating] – Point air pressure values, [\(Pa\)]

property altitude¶

Get altitude.

Automatically calculates altitude from the level key using units.pl_to_m().

Note that if altitude key exists in data, the data at the altitude key will be returned. This allows an override of the default calculation of altitude from pressure level.

Returns:: npt.NDArray[np.floating] – Altitude, [\(m\)]
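
As a rough illustration of what a pressure-level-to-altitude conversion involves, the sketch below applies the ISA barometric formulas for the troposphere and the isothermal layer above it. The function name and constants are assumptions chosen for this example. The actual conversion is whatever units.pl_to_m() implements.

```python
import math

G = 9.80665             # gravitational acceleration, m/s^2
R_D = 287.05287         # specific gas constant of dry air, J/(kg K)
P_MSL = 1013.25         # ISA sea-level pressure, hPa
T_MSL = 288.15          # ISA sea-level temperature, K
LAPSE_RATE = 0.0065     # ISA tropospheric lapse rate, K/m
H_TROPOPAUSE = 11000.0  # ISA tropopause altitude, m

T_TROPOPAUSE = T_MSL - LAPSE_RATE * H_TROPOPAUSE
P_TROPOPAUSE = P_MSL * (T_TROPOPAUSE / T_MSL) ** (G / (R_D * LAPSE_RATE))


def isa_level_to_altitude(level_hpa):
    """Altitude in m for a pressure level in hPa (valid up to about 20 km)."""
    if level_hpa >= P_TROPOPAUSE:
        # Troposphere: temperature decreases linearly with height
        return (T_MSL / LAPSE_RATE) * (1.0 - (level_hpa / P_MSL) ** (R_D * LAPSE_RATE / G))
    # Isothermal layer: pressure decays exponentially with height
    return H_TROPOPAUSE + (R_D * T_TROPOPAUSE / G) * math.log(P_TROPOPAUSE / level_hpa)


altitude_m = isa_level_to_altitude(200.0)
altitude_ft = altitude_m / 0.3048  # 1 ft is exactly 0.3048 m
```

For a level of 200 hPa this gives an altitude of roughly 38,660 ft, in line with the altitude_ft values shown in the to_dict() example on this page.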

property altitude_ft¶

Get altitude in feet.

Returns:: npt.NDArray[np.floating] – Altitude, [\(ft\)]

attrs¶: Generic dataset attributes

broadcast_attrs(keys, overwrite=False, raise_error=True)¶

Attach values from keys in attrs onto data.

If possible, use dtype = np.float32 when broadcasting. If not possible, use whatever dtype is inferred from the data by numpy.full().

Parameters:

keys (str | Iterable[str]) – Keys to broadcast
overwrite (bool, optional) – If True, overwrite existing values in data. By default False.
raise_error (bool, optional) – Raise KeyError if self.attrs does not contain some of keys.

Raises:

KeyError – Not all keys found in attrs.
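
Broadcasting here means repeating a scalar attribute once per data point. The sketch below shows the idea on plain dicts of lists. The function is a hypothetical stand-in, it assumes existing data keys are left untouched when overwrite is False, and it omits the dtype handling described above.

```python
def broadcast_attrs(data, attrs, keys, size, overwrite=False, raise_error=True):
    """Hypothetical stand-in operating on plain dicts of lists."""
    if isinstance(keys, str):
        keys = [keys]
    missing = [k for k in keys if k not in attrs]
    if missing and raise_error:
        raise KeyError(f"Keys {missing} not found in attrs")
    for key in keys:
        if key in missing:
            continue
        if key in data and not overwrite:
            continue  # keep the existing per-point values
        data[key] = [attrs[key]] * size


data = {"longitude": [0.0, 1.0, 2.0], "wingspan": [1.0, 1.0, 1.0]}
attrs = {"wingspan": 34.3, "n_engine": 2}
broadcast_attrs(data, attrs, ["wingspan", "n_engine"], size=3)
```

After the call, `n_engine` is a new column of length 3, while the pre-existing `wingspan` column is unchanged because overwrite is False.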

broadcast_numeric_attrs(ignore_keys=None, overwrite=False)¶

Attach numeric values in attrs onto data.

Iterate through values in attrs and attach float and int values to data.

This method modifies the object in place.

Parameters:

ignore_keys (str | Iterable[str] | None, optional) – Do not broadcast selected keys. Defaults to None.
overwrite (bool, optional) – If True, overwrite existing values in data. By default False.

property constants¶

Return a dictionary of constant attributes and data values.

Includes attrs and values from columns in data with a unique value.

Returns:: dict[str, Any] – Properties and their constant values

property coords¶

Get geospatial coordinates for compatibility with MetDataArray.

Returns:: dict[str, np.ndarray] – A dictionary with fields longitude, latitude, level, and time.

coords_intersect_met(met)¶

Return boolean mask of data inside the bounding box defined by met.

Parameters:: met (met_module.MetDataset | met_module.MetDataArray) – MetDataset or MetDataArray to compare.
Returns:: npt.NDArray[np.bool_] – True if point is inside the bounding box defined by met.
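
A bounding-box mask of this kind reduces to one interval test per dimension, combined with a logical AND. The following sketch uses plain lists and a hypothetical helper. Treating the bounds as inclusive is an assumption of the example.

```python
def inside_bounding_box(points, bounds):
    """points: dict of equal-length lists; bounds: dict of (low, high) per dimension."""
    size = len(next(iter(points.values())))
    mask = [True] * size
    for dim, (low, high) in bounds.items():
        for i, value in enumerate(points[dim]):
            # A point stays True only if it lies inside the interval of every dimension
            mask[i] = mask[i] and (low <= value <= high)
    return mask


points = {"longitude": [-5.0, 10.0, 40.0], "latitude": [50.0, 52.0, 54.0]}
bounds = {"longitude": (0.0, 30.0), "latitude": (45.0, 60.0)}
mask = inside_bounding_box(points, bounds)
```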

copy(**kwargs)¶

Return a copy of this instance.

Parameters:: **kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.
Returns:: Self – Copy of class

classmethod create_empty(keys=None, attrs=None, **attrs_kwargs)¶

Create instance with variables defined by keys and size 0.

If instance requires additional variables to be defined, these keys will automatically be attached to returned instance.

Parameters:

keys (Iterable[str] | None, optional) – Keys to include in empty VectorDataset instance.
attrs (dict[str, Any] | None, optional) – Attributes to attach to the instance.
**attrs_kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.

Returns:

Self – Empty VectorDataset instance.

data¶: Vector data with labels as keys and numpy.ndarray as values

property dataframe¶

Shorthand property to access to_dataframe() with copy=False.

Returns:: pandas.DataFrame – Equivalent to the output from to_dataframe()

downselect_met(met, *, longitude_buffer=(0.0, 0.0), latitude_buffer=(0.0, 0.0), level_buffer=(0.0, 0.0), time_buffer=(np.timedelta64(0, 'h'), np.timedelta64(0, 'h')))¶

Downselect met to encompass a spatiotemporal region of the data.

Changed in version 0.54.5: Returned object is no longer copied.

Parameters:

met (met_module.MetDataType) – MetDataset or MetDataArray to downselect.
longitude_buffer (tuple[float, float], optional) – Extend the longitude domain by longitude_buffer[0] on the low side and longitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.
latitude_buffer (tuple[float, float], optional) – Extend the latitude domain by latitude_buffer[0] on the low side and latitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.
level_buffer (tuple[float, float], optional) – Extend the level domain by level_buffer[0] on the low side and level_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) [\(hPa\)].
time_buffer (tuple[np.timedelta64, np.timedelta64], optional) – Extend the time domain by time_buffer[0] on the low side and time_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (np.timedelta64(0, "h"), np.timedelta64(0, "h")).

Returns:

met_module.MetDataType – Downselected MetDataset or MetDataArray.
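
How each buffer widens the selected domain can be sketched as follows. The helper `buffered_domain` is hypothetical and only illustrates the low-side and high-side convention described in the parameters. It works for any values that support subtraction and addition, including datetimes.

```python
from datetime import datetime, timedelta


def buffered_domain(values, buffer):
    """Return (low, high): the extent of values, widened by buffer[0] and buffer[1]."""
    return min(values) - buffer[0], max(values) + buffer[1]


longitude = [10.0, 12.5, 15.0]
time = [datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 2)]

# One degree of padding on the low side, two degrees on the high side
lon_bounds = buffered_domain(longitude, (1.0, 2.0))

# One hour of padding on both sides of the time range
time_bounds = buffered_domain(time, (timedelta(hours=1), timedelta(hours=1)))
```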

ensure_vars(vars, raise_error=True)¶

Ensure variables exist in column of data or attrs.

Parameters:

vars (str | Iterable[str]) – A single string variable name or a sequence of string variable names.
raise_error (bool, optional) – Raise KeyError if data does not contain variables. Defaults to True.

Returns:

bool – True if all variables exist. False otherwise.

Raises:

KeyError – If the dataset does not contain a variable in vars.

filter(mask, copy=True, **kwargs)¶

Filter data according to a boolean array mask.

Entries corresponding to mask == True are kept.

Parameters:

mask (npt.NDArray[np.bool_]) – Boolean array with compatible shape.
copy (bool, optional) – Copy data on filter. Defaults to True. See numpy best practices for insight into whether copy is appropriate.
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.

Returns:

Self – Containing filtered data

Raises:

TypeError – If mask is not a boolean array.
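
The filtering semantics can be illustrated on a dict of lists. The helper below is a hypothetical stand-in. It always builds new lists, which loosely corresponds to copy=True, and it raises TypeError for non-boolean masks as documented above.

```python
def filter_columns(data, mask):
    """Keep the entries of every column where the boolean mask is True."""
    if not all(isinstance(m, bool) for m in mask):
        raise TypeError("mask must contain only booleans")
    return {
        key: [value for value, keep in zip(values, mask) if keep]
        for key, values in data.items()
    }


data = {"longitude": [0.0, 1.0, 2.0], "latitude": [10.0, 11.0, 12.0]}
filtered = filter_columns(data, [True, False, True])

try:
    filter_columns(data, [1, 0, 1])  # integers are not accepted as a mask
    raised = False
except TypeError:
    raised = True
```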

classmethod from_dict(obj, copy=True, **obj_kwargs)¶

Create instance from dict representation containing data and attrs.

Parameters:

obj (dict[str, Any]) – Dict representation of VectorDataset (e.g. to_dict())
copy (bool, optional) – Passed to VectorDataset constructor. Defaults to True.
**obj_kwargs (Any) – Additional properties passed as keyword arguments.

Returns:

Self – VectorDataset instance.

See also

to_dict()

generate_splits(n_splits, copy=True)¶

Split instance into n_splits sub-vectors.

Parameters:

n_splits (int) – Number of splits.
copy (bool, optional) – Passed into filter(). Defaults to True. Keeping this as True is recommended, based on numpy best practices.

Yields:

Self – Generator of split vectors.

See also

numpy.array_split()
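
Since the method points to numpy.array_split(), the sizes of the resulting sub-vectors presumably follow its rule: the first `size % n_splits` chunks receive one extra element. A pure-Python sketch of that sizing rule, with a hypothetical helper name:

```python
def split_indices(size, n_splits):
    """Return (start, stop) index pairs following the numpy.array_split sizing rule."""
    base, extra = divmod(size, n_splits)
    bounds = []
    start = 0
    for i in range(n_splits):
        # The first `extra` chunks are one element longer than the rest
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


chunks = split_indices(10, 3)
```

Ten points split three ways yield chunks of length 4, 3 and 3.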

get(key, default_value=None)¶

Get values from data with default_value if key not in data.

Parameters:

key (str) – Key to get from data
default_value (Any, optional) – Return default_value if key not in data, by default None

Returns:

Any – Values at data[key] or default_value

get_constant(key, default=<object object>)¶

Get a constant value from attrs or data.

If key is found in attrs, the value is returned.
If key is found in data, the common value is returned if all values are equal.
If key is not found in attrs or data and a default is provided, the default is returned.
Otherwise, a KeyError is raised.

Parameters:

key (str) – Key to look for.
default (Any, optional) – Default value to return if key is not found in attrs or data.

Returns:

Any – The constant value for key.

Raises:

KeyError – If key is not found in attrs or the values in data are not equal and default is not provided.

Examples

>>> from pycontrails import VectorDataset
>>> vector = VectorDataset({"a": [1, 1, 1], "b": [2, 2, 3]})
>>> vector.get_constant("a")
np.int64(1)
>>> vector.get_constant("b")
Traceback (most recent call last):
...
KeyError: "A constant key 'b' not found in attrs or data"
>>> vector.get_constant("b", 3)
3

get_data_or_attr(key, default=<object object>)¶

Get value from data or attrs.

This method first checks if key is in data and returns the value if so. If key is not in data, then this method checks if key is in attrs and returns the value if so. If key is not in data or attrs, then the default value is returned if provided. Otherwise a KeyError is raised.

Parameters:

key (str) – Key to get from data or attrs
default (Any, optional) – Default value to return if key is not in data or attrs.

Returns:

Any – Value at data[key] or attrs[key]

Raises:

KeyError – If key is not in data or attrs and default is not provided.

Examples

>>> from pycontrails import VectorDataset
>>> vector = VectorDataset({"a": [1, 2, 3]}, attrs={"b": 4})
>>> vector.get_data_or_attr("a")
array([1, 2, 3])

>>> vector.get_data_or_attr("b")
4

>>> vector.get_data_or_attr("c")
Traceback (most recent call last):
...
KeyError: "Key 'c' not found in data or attrs."

>>> vector.get_data_or_attr("c", default=5)
5

See also

get_constant

property hash¶

Generate a unique hash for this class instance.

Returns:: str – Unique hash for flight instance (sha1)

intersect_met(mda, *, longitude=None, latitude=None, level=None, time=None, use_indices=False, **interp_kwargs)¶

Intersect waypoints with MetDataArray.

Parameters:

mda (met_module.MetDataArray) – MetDataArray containing a meteorological variable at spatio-temporal coordinates.
longitude (npt.NDArray[np.floating] | None, optional) – Override existing coordinates for met interpolation
latitude (npt.NDArray[np.floating] | None, optional) – Override existing coordinates for met interpolation
level (npt.NDArray[np.floating] | None, optional) – Override existing coordinates for met interpolation
time (npt.NDArray[np.datetime64] | None, optional) – Override existing coordinates for met interpolation
use_indices (bool, optional) – Experimental.
**interp_kwargs (Any) – Additional keyword arguments to pass to MetDataArray.intersect_met(). Examples include method, bounds_error, and fill_value. If an error such as
```
ValueError: One of the requested xi is out of bounds in dimension 2
```
occurs, try calling this function with bounds_error=False. In addition, setting fill_value=0.0 will replace NaN values with 0.0.

Returns:

npt.NDArray[np.floating] – Interpolated values

Examples

>>> from datetime import datetime
>>> import pandas as pd
>>> import numpy as np
>>> from pycontrails.datalib.ecmwf import ERA5
>>> from pycontrails import Flight

>>> # Get met data
>>> times = (datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 3))
>>> variables = ["air_temperature", "specific_humidity"]
>>> levels = [300, 250, 200]
>>> era5 = ERA5(time=times, variables=variables, pressure_levels=levels)
>>> met = era5.open_metdataset()

>>> # Example flight
>>> df = pd.DataFrame()
>>> df['longitude'] = np.linspace(0, 50, 10)
>>> df['latitude'] = np.linspace(0, 10, 10)
>>> df['altitude'] = 11000
>>> df['time'] = pd.date_range("2022-03-01T00", "2022-03-01T02", periods=10)
>>> fl = Flight(df)

>>> # Intersect
>>> fl.intersect_met(met['air_temperature'], method='nearest')
array([231.62969892, 230.72604651, 232.24318771, 231.88338483,
       231.06429438, 231.59073409, 231.65125393, 231.93064004,
       232.03344087, 231.65954432])

>>> fl.intersect_met(met['air_temperature'], method='linear')
array([225.77794552, 225.13908414, 226.231218  , 226.31831528,
       225.56102321, 225.81192149, 226.03192642, 226.22056121,
       226.03770174, 225.63226188])

>>> # Interpolate and attach to `Flight` instance
>>> for key in met:
...     fl[key] = fl.intersect_met(met[key])

>>> # Show the final three columns of the dataframe
>>> fl.dataframe.iloc[:, -3:].head()
                 time  air_temperature  specific_humidity
0 2022-03-01 00:00:00       225.777946           0.000132
1 2022-03-01 00:13:20       225.139084           0.000132
2 2022-03-01 00:26:40       226.231218           0.000107
3 2022-03-01 00:40:00       226.318315           0.000171
4 2022-03-01 00:53:20       225.561022           0.000109

property level¶

Get pressure level values for points.

Automatically calculates pressure level from the altitude key using units.m_to_pl().

Note that if level key exists in data, the data at the level key will be returned. This allows an override of the default calculation of pressure level from altitude.

Returns:: npt.NDArray[np.floating] – Point pressure level values, [\(hPa\)]

required_keys = ('longitude', 'latitude', 'time')¶: Required keys for creating GeoVectorDataset

select(keys, copy=True)¶

Return new class instance only containing specified keys.

Parameters:

keys (Iterable[str]) – An iterable of keys to filter by.
copy (bool, optional) – Copy data on selection. Defaults to True.

Returns:

VectorDataset – VectorDataset containing only data associated to keys. Note that this method always returns a VectorDataset, even if the calling class is a proper subclass of VectorDataset.

setdefault(key, default=None)¶

Shortcut to VectorDataDict.setdefault().

Parameters:

key (str) – Key in data dict.
default (npt.ArrayLike, optional) – Values to use as default, if key is not defined

Returns:

numpy.ndarray – Values at key

property shape¶

Shape of each array in data.

Returns:: tuple[int] – Shape of each array in data.

property size¶

Length of each array in data.

Returns:: int – Length of each array in data.

sort(by)¶

Sort data by key(s).

This method always creates a copy of the data by calling pandas.DataFrame.sort_values().

Parameters:: by (str | list[str]) – Key or list of keys to sort by.
Returns:: Self – Instance with sorted data.

classmethod sum(vectors, infer_attrs=True, fill_value=None)¶

Sum a list of VectorDataset instances.

Parameters:

vectors (Sequence[VectorDataset]) – List of VectorDataset instances to concatenate.
infer_attrs (bool, optional) – If True, infer attributes from the first element in the sequence.
fill_value (float | None, optional) – Fill value to use when concatenating arrays. By default None, which raises an error if incompatible keys are found.

Returns:

Self – Sum of all instances in vectors.

Raises:

KeyError – If incompatible data keys are found among vectors.

Examples

>>> from pycontrails import VectorDataset
>>> v1 = VectorDataset({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> v2 = VectorDataset({"a": [7, 8, 9], "b": [10, 11, 12]})
>>> v3 = VectorDataset({"a": [13, 14, 15], "b": [16, 17, 18]})
>>> v = VectorDataset.sum([v1, v2, v3])
>>> v.dataframe
    a   b
0   1   4
1   2   5
2   3   6
3   7  10
4   8  11
5   9  12
6  13  16
7  14  17
8  15  18

to_dataframe(copy=True)¶

Create pd.DataFrame in which each key-value pair in data is a column.

With the default copy=True, data values are copied when the DataFrame is created. Pass copy=False to avoid the copy.

Parameters:: copy (bool, optional) – Copy data on DataFrame creation.
Returns:: pandas.DataFrame – DataFrame holding key-values as columns.

to_dict()¶

Create dictionary with data and attrs.

If geo-spatial coordinates (e.g. "latitude", "longitude", "altitude") are present, round to a reasonable precision. If a "time" variable is present, round to unix seconds. When the instance is a GeoVectorDataset, disregard any "altitude" or "level" coordinate and only include "altitude_ft" in the output.

Returns:: dict[str, Any] – Dictionary with data and attrs.

See also

from_dict()

Examples

>>> import pprint
>>> import numpy as np
>>> from pycontrails import Flight
>>> fl = Flight(
...     longitude=[-100, -110],
...     latitude=[40, 50],
...     level=[200, 200],
...     time=[np.datetime64("2020-01-01T09"), np.datetime64("2020-01-01T09:30")],
...     aircraft_type="B737",
... )
>>> fl = fl.resample_and_fill("5min")
>>> pprint.pprint(fl.to_dict())
{'aircraft_type': 'B737',
 'altitude_ft': [38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0],
 'latitude': [40.0, 41.724, 43.428, 45.111, 46.769, 48.399, 50.0],
 'longitude': [-100.0,
               -101.441,
               -102.959,
               -104.563,
               -106.267,
               -108.076,
               -110.0],
 'time': [1577869200,
          1577869500,
          1577869800,
          1577870100,
          1577870400,
          1577870700,
          1577871000]}
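
The integer times in this output are unix seconds. As a quick cross-check using only the standard library, the sketch below rebuilds them from the 09:00 start time and the 5-minute resampling step, assuming the naive timestamps are interpreted as UTC.

```python
from datetime import datetime, timedelta, timezone

# Start of the example flight, taken as UTC
start = datetime(2020, 1, 1, 9, 0, tzinfo=timezone.utc)

# Seven waypoints spaced five minutes apart, expressed as unix seconds
unix_seconds = [int((start + timedelta(minutes=5 * i)).timestamp()) for i in range(7)]
```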

to_geojson_points()¶

Return dataset as GeoJSON FeatureCollection of Points.

Each Feature has a properties attribute that includes time and other data besides latitude, longitude, and altitude in data.

Returns:: dict[str, Any] – Python representation of GeoJSON FeatureCollection
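
The general shape of such a FeatureCollection can be sketched with plain dicts. The helper is hypothetical, and the exact property names and coordinate ordering produced by the library may differ. The example only follows the GeoJSON convention of [longitude, latitude, altitude] positions, with the remaining columns placed in properties.

```python
import json


def to_feature_collection(data):
    """Build a GeoJSON FeatureCollection of Points from a dict of equal-length lists."""
    coord_keys = ("longitude", "latitude", "altitude")
    size = len(data["longitude"])
    features = []
    for i in range(size):
        features.append(
            {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    "coordinates": [data[k][i] for k in coord_keys],
                },
                # Everything that is not a coordinate becomes a property
                "properties": {k: v[i] for k, v in data.items() if k not in coord_keys},
            }
        )
    return {"type": "FeatureCollection", "features": features}


data = {
    "longitude": [-100.0, -110.0],
    "latitude": [40.0, 50.0],
    "altitude": [11784.0, 11784.0],
    "time": [1577869200, 1577871000],
}
collection = to_feature_collection(data)
geojson_text = json.dumps(collection)
```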

to_lon_lat_grid(agg, *, spatial_bbox=(-180.0, -90.0, 180.0, 90.0), spatial_grid_res=0.5)¶: Convert vectors to a longitude-latitude grid.

See also

vector_to_lon_lat_grid

transform_crs(crs)¶

Transform trajectory data from one coordinate reference system (CRS) to another.

Parameters:: crs (str) – Target CRS. Passed into pyproj.Transformer. The source CRS is assumed to be EPSG:4326.
Returns:: tuple[npt.NDArray[np.floating], npt.NDArray[np.floating]] – New x and y coordinates in the target CRS.

update(other=None, **kwargs)¶

Update values in data dict without warning if overwriting.

Parameters:

other (dict[str, npt.ArrayLike] | None, optional) – Fields to update as dict
**kwargs (npt.ArrayLike) – Fields to update as kwargs

vertical_keys = ('altitude', 'level', 'altitude_ft')¶: At least one of these vertical-coordinate keys must also be included

class pycontrails.core.vector.VectorDataDict(data=None)¶

Bases: dict[str, ndarray]

Thin wrapper around dict[str, np.ndarray] to ensure consistency.

Parameters:: data (dict[str, np.ndarray] | None, optional) – Dictionary input. A shallow copy is always made.

clear()¶: Remove all items from the dict.

copy()¶: Return a shallow copy of the dict.

fromkeys(value=None, /)¶: Create a new dictionary with keys from iterable and values set to value.

get(key, default=None, /)¶: Return the value for key if key is in the dictionary, else default.

items()¶: Return a set-like object providing a view on the dict’s items.

keys()¶: Return a set-like object providing a view on the dict’s keys.

pop(k[, d])¶: Remove the specified key and return the corresponding value. If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem()¶

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

setdefault(k, default=None)¶

Thin wrapper around dict.setdefault.

The main purpose of overriding is to run _validate_array() on set.

Parameters:

k (str) – Key
default (npt.ArrayLike | None, optional) – Default value for key k

Returns:

numpy.ndarray – Value at k

update(other=None, **kwargs)¶

Update values without warning if overwriting.

This method casts values in other to numpy.ndarray and ensures that the array sizes are consistent with the instance.

Parameters:

other (dict[str, npt.ArrayLike] | None, optional) – Fields to update as dict
**kwargs (npt.ArrayLike) – Fields to update as kwargs
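
The consistency guarantee amounts to casting every value on assignment and rejecting arrays whose size differs from the existing ones. The sketch below mimics this with lists instead of numpy arrays. The class name is hypothetical and the use of ValueError is an assumption of the example.

```python
class SizedColumns(dict):
    """Hypothetical stand-in: every value is stored as a list of one common length."""

    def __setitem__(self, key, values):
        values = list(values)  # cast the input (the real class casts to numpy.ndarray)
        other_sizes = {len(v) for k, v in self.items() if k != key}
        if other_sizes and len(values) not in other_sizes:
            raise ValueError(f"Size {len(values)} of '{key}' does not match existing columns")
        super().__setitem__(key, values)

    def update(self, other=None, **kwargs):
        # Route every item through __setitem__ so the size check always runs
        for key, values in {**(other or {}), **kwargs}.items():
            self[key] = values


cols = SizedColumns()
cols["a"] = (1, 2, 3)
cols.update({"b": [4, 5, 6]}, a=[7, 8, 9])  # overwriting "a" is allowed

try:
    cols["c"] = [1, 2]  # wrong length
    raised = False
except ValueError:
    raised = True
```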

values()¶: Return an object providing a view on the dict’s values.

class pycontrails.core.vector.VectorDataset(data=None, *, attrs=None, copy=True, **attrs_kwargs)¶

Bases: object

Base class to hold 1D arrays of consistent size.

Parameters:

data (dict[str, npt.ArrayLike] | pd.DataFrame | VectorDataset | None, optional) – Initial data, by default None. A shallow copy is always made. Use the copy parameter to copy the underlying array data.
attrs (dict[str, Any] | None, optional) – Dictionary of attributes, by default None. A shallow copy is always made.
copy (bool, optional) – Copy individual arrays on instantiation, by default True.
**attrs_kwargs (Any) – Additional attributes passed as keyword arguments.

Raises:

ValueError – If “time” variable cannot be converted to numpy array.

attrs¶: Generic dataset attributes

broadcast_attrs(keys, overwrite=False, raise_error=True)¶

Attach values from keys in attrs onto data.

If possible, use dtype = np.float32 when broadcasting. If not possible, use whatever dtype is inferred from the data by numpy.full().

Parameters:

keys (str | Iterable[str]) – Keys to broadcast
overwrite (bool, optional) – If True, overwrite existing values in data. By default False.
raise_error (bool, optional) – Raise KeyError if self.attrs does not contain some of keys.

Raises:

KeyError – Not all keys found in attrs.

broadcast_numeric_attrs(ignore_keys=None, overwrite=False)¶

Attach numeric values in attrs onto data.

Iterate through values in attrs and attach float and int values to data.

This method modifies the object in place.

Parameters:

ignore_keys (str | Iterable[str] | None, optional) – Do not broadcast selected keys. Defaults to None.
overwrite (bool, optional) – If True, overwrite existing values in data. By default False.

copy(**kwargs)¶

Return a copy of this instance.

Parameters:: **kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.
Returns:: Self – Copy of class

classmethod create_empty(keys, attrs=None, **kwargs)¶

Create instance with variables defined by keys and size 0.

If instance requires additional variables to be defined, these keys will automatically be attached to returned instance.

Parameters:

keys (Iterable[str]) – Keys to include in empty VectorDataset instance.
attrs (dict[str, Any] | None, optional) – Attributes to attach to the instance.
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.

Returns:

Self – Empty VectorDataset instance.

data¶: Vector data with labels as keys and numpy.ndarray as values

property dataframe¶

Shorthand property to access to_dataframe() with copy=False.

Returns:: pandas.DataFrame – Equivalent to the output from to_dataframe()

ensure_vars(vars, raise_error=True)¶

Ensure variables exist in column of data or attrs.

Parameters:

vars (str | Iterable[str]) – A single string variable name or a sequence of string variable names.
raise_error (bool, optional) – Raise KeyError if data does not contain variables. Defaults to True.

Returns:

bool – True if all variables exist. False otherwise.

Raises:

KeyError – If the dataset does not contain a variable in vars.

filter(mask, copy=True, **kwargs)¶

Filter data according to a boolean array mask.

Entries corresponding to mask == True are kept.

Parameters:

mask (npt.NDArray[np.bool_]) – Boolean array with compatible shape.
copy (bool, optional) – Copy data on filter. Defaults to True. See numpy best practices for insight into whether copy is appropriate.
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.

Returns:

Self – Containing filtered data

Raises:

TypeError – If mask is not a boolean array.

classmethod from_dict(obj, copy=True, **obj_kwargs)¶

Create instance from dict representation containing data and attrs.

Parameters:

obj (dict[str, Any]) – Dict representation of VectorDataset (e.g. to_dict())
copy (bool, optional) – Passed to VectorDataset constructor. Defaults to True.
**obj_kwargs (Any) – Additional properties passed as keyword arguments.

Returns:

Self – VectorDataset instance.

See also

to_dict()

generate_splits(n_splits, copy=True)¶

Split instance into n_splits sub-vectors.

Parameters:

n_splits (int) – Number of splits.
copy (bool, optional) – Passed into filter(). Defaults to True. Keeping this as True is recommended, based on numpy best practices.

Yields:

Self – Generator of split vectors.

See also

numpy.array_split()

get(key, default_value=None)¶

Get values from data with default_value if key not in data.

Parameters:

key (str) – Key to get from data
default_value (Any, optional) – Return default_value if key not in data, by default None

Returns:

Any – Values at data[key] or default_value

get_constant(key, default=<object object>)¶

Get a constant value from attrs or data.

If key is found in attrs, the value is returned.
If key is found in data, the common value is returned if all values are equal.
If key is not found in attrs or data and a default is provided, the default is returned.
Otherwise, a KeyError is raised.

Parameters:

key (str) – Key to look for.
default (Any, optional) – Default value to return if key is not found in attrs or data.

Returns:

Any – The constant value for key.

Raises:

KeyError – If key is not found in attrs or the values in data are not equal and default is not provided.

Examples

>>> from pycontrails import VectorDataset
>>> vector = VectorDataset({"a": [1, 1, 1], "b": [2, 2, 3]})
>>> vector.get_constant("a")
np.int64(1)
>>> vector.get_constant("b")
Traceback (most recent call last):
...
KeyError: "A constant key 'b' not found in attrs or data"
>>> vector.get_constant("b", 3)
3

get_data_or_attr(key, default=<object object>)¶

Get value from data or attrs.

Parameters:

key (str) – Key to get from data or attrs
default (Any, optional) – Default value to return if key is not in data or attrs.

Returns:

Any – Value at data[key] or attrs[key]

Raises:

KeyError – If key is not in data or attrs and default is not provided.

Examples

>>> from pycontrails import VectorDataset
>>> vector = VectorDataset({"a": [1, 2, 3]}, attrs={"b": 4})
>>> vector.get_data_or_attr("a")
array([1, 2, 3])

>>> vector.get_data_or_attr("b")
4

>>> vector.get_data_or_attr("c")
Traceback (most recent call last):
...
KeyError: "Key 'c' not found in data or attrs."

>>> vector.get_data_or_attr("c", default=5)
5

See also

get_constant

property hash¶

Generate a unique hash for this class instance.

Returns:: str – Unique hash for flight instance (sha1)

select(keys, copy=True)¶

Return new class instance only containing specified keys.

Parameters:

keys (Iterable[str]) – An iterable of keys to filter by.
copy (bool, optional) – Copy data on selection. Defaults to True.

Returns:

VectorDataset – VectorDataset containing only data associated to keys. Note that this method always returns a VectorDataset, even if the calling class is a proper subclass of VectorDataset.

setdefault(key, default=None)¶

Shortcut to VectorDataDict.setdefault().

Parameters:

key (str) – Key in data dict.
default (npt.ArrayLike, optional) – Values to use as default, if key is not defined

Returns:

numpy.ndarray – Values at key

property shape¶

Shape of each array in data.

Returns:: tuple[int] – Shape of each array in data.

property size¶

Length of each array in data.

Returns:: int – Length of each array in data.

sort(by)¶

Sort data by key(s).

This method always creates a copy of the data by calling pandas.DataFrame.sort_values().

Parameters:: by (str | list[str]) – Key or list of keys to sort by.
Returns:: Self – Instance with sorted data.

classmethod sum(vectors, infer_attrs=True, fill_value=None)¶

Sum a list of VectorDataset instances.

Parameters:

vectors (Sequence[VectorDataset]) – List of VectorDataset instances to concatenate.
infer_attrs (bool, optional) – If True, infer attributes from the first element in the sequence.
fill_value (float | None, optional) – Fill value to use when concatenating arrays. By default None, which raises an error if incompatible keys are found.

Returns:

Self – Sum of all instances in vectors.

Raises:

KeyError – If incompatible data keys are found among vectors.

Examples

>>> from pycontrails import VectorDataset
>>> v1 = VectorDataset({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> v2 = VectorDataset({"a": [7, 8, 9], "b": [10, 11, 12]})
>>> v3 = VectorDataset({"a": [13, 14, 15], "b": [16, 17, 18]})
>>> v = VectorDataset.sum([v1, v2, v3])
>>> v.dataframe
    a   b
0   1   4
1   2   5
2   3   6
3   7  10
4   8  11
5   9  12
6  13  16
7  14  17
8  15  18

to_dataframe(copy=True)¶

Create pd.DataFrame in which each key-value pair in data is a column.

With the default copy=True, data values are copied when the DataFrame is created. Pass copy=False to avoid the copy.

Parameters:: copy (bool, optional) – Copy data on DataFrame creation.
Returns:: pandas.DataFrame – DataFrame holding key-values as columns.

to_dict()¶

Create dictionary with data and attrs.

Returns:: dict[str, Any] – Dictionary with data and attrs.

See also

from_dict()

Examples

>>> import pprint
>>> import numpy as np
>>> from pycontrails import Flight
>>> fl = Flight(
...     longitude=[-100, -110],
...     latitude=[40, 50],
...     level=[200, 200],
...     time=[np.datetime64("2020-01-01T09"), np.datetime64("2020-01-01T09:30")],
...     aircraft_type="B737",
... )
>>> fl = fl.resample_and_fill("5min")
>>> pprint.pprint(fl.to_dict())
{'aircraft_type': 'B737',
 'altitude_ft': [38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0],
 'latitude': [40.0, 41.724, 43.428, 45.111, 46.769, 48.399, 50.0],
 'longitude': [-100.0,
               -101.441,
               -102.959,
               -104.563,
               -106.267,
               -108.076,
               -110.0],
 'time': [1577869200,
          1577869500,
          1577869800,
          1577870100,
          1577870400,
          1577870700,
          1577871000]}

update(other=None, **kwargs)¶

Update values in data dict without warning if overwriting.

Parameters:

other (dict[str, npt.ArrayLike] | None, optional) – Fields to update as dict
**kwargs (npt.ArrayLike) – Fields to update as kwargs

pycontrails.core.vector.vector_to_lon_lat_grid(vector, agg, *, spatial_bbox=(-180.0, -90.0, 180.0, 90.0), spatial_grid_res=0.5)¶

Convert vectors to a longitude-latitude grid.

Parameters:

vector (GeoVectorDataset) – Contains the longitude, latitude and variables for aggregation.
agg (dict[str, str]) – Variable name and the function selected for aggregation, e.g. {"segment_length": "sum"}.
spatial_bbox (tuple[float, float, float, float]) – Spatial bounding box, (lon_min, lat_min, lon_max, lat_max), [\(\deg\)]. By default, the entire globe is used.
spatial_grid_res (float) – Spatial grid resolution, [\(\deg\)]

Returns:

xarray.Dataset – Aggregated variables in a longitude-latitude grid.

Examples

>>> import numpy as np
>>> from pycontrails.core.vector import GeoVectorDataset
>>> rng = np.random.default_rng(234)
>>> vector = GeoVectorDataset(
...     longitude=rng.uniform(-10, 10, 10000),
...     latitude=rng.uniform(-10, 10, 10000),
...     altitude=np.zeros(10000),
...     time=np.zeros(10000).astype("datetime64[ns]"),
... )
>>> vector["foo"] = rng.uniform(0, 1, 10000)
>>> ds = vector.to_lon_lat_grid({"foo": "sum"}, spatial_bbox=(-10, -10, 9.5, 9.5))
>>> da = ds["foo"]
>>> da.coords
Coordinates:
  * longitude  (longitude) float64 320B -10.0 -9.5 -9.0 -8.5 ... 8.0 8.5 9.0 9.5
  * latitude   (latitude) float64 320B -10.0 -9.5 -9.0 -8.5 ... 8.0 8.5 9.0 9.5

>>> da.values.round(2)
array([[2.23, 0.67, 1.29, ..., 4.66, 3.91, 1.93],
       [4.1 , 3.84, 1.34, ..., 3.24, 1.71, 4.55],
       [0.78, 3.25, 2.33, ..., 3.78, 2.93, 2.33],
       ...,
       [1.97, 3.02, 1.84, ..., 2.37, 3.87, 2.09],
       [3.74, 1.6 , 4.01, ..., 4.6 , 4.27, 3.4 ],
       [2.97, 0.12, 1.33, ..., 3.54, 0.74, 2.59]], shape=(40, 40))

>>> da.sum().item() == vector["foo"].sum()
np.True_
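
The final check in the example holds because a "sum" aggregation only redistributes values into cells. A pure-Python sketch of that binning step is shown below. The helper is hypothetical, and labelling each cell by its lower-left corner is an assumption of this sketch, not necessarily the library's convention.

```python
import math


def grid_sum(lons, lats, values, bbox=(-180.0, -90.0, 180.0, 90.0), res=0.5):
    """Sum values per grid cell; cells are keyed by their lower-left corner."""
    lon_min, lat_min, lon_max, lat_max = bbox
    cells = {}
    for lon, lat, value in zip(lons, lats, values):
        if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
            continue  # points outside the bounding box are dropped
        i = math.floor((lon - lon_min) / res)
        j = math.floor((lat - lat_min) / res)
        key = (lon_min + i * res, lat_min + j * res)
        cells[key] = cells.get(key, 0.0) + value
    return cells


lons = [0.1, 0.4, 0.6, -0.2]
lats = [0.1, 0.3, 0.2, -0.4]
values = [1.0, 2.0, 3.0, 4.0]
grid = grid_sum(lons, lats, values, bbox=(-1.0, -1.0, 1.0, 1.0))
```

The first two points share the cell starting at (0.0, 0.0), so their values are added together. As long as all points fall inside the bounding box, the total over all cells equals the total of the input values.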