openghg_inversions.inversion_inputs
Functions for creating the inputs needed by PyMC.
- openghg_inversions.inversion_inputs.add_min_error(ds: Dataset, fp_data: dict[str, Any], min_error: str | dict[str, float] | float = 0.0, min_error_per_site: bool = True) → Dataset
Add min_error to combined Dataset.
- openghg_inversions.inversion_inputs.add_site_indicator(ds: Dataset, sort: bool = False) → Dataset
Add site_indicator and site_names data variables to the Dataset.
- openghg_inversions.inversion_inputs.concat_gather_data_arrays(da_dict: Mapping[Hashable, DataArray], key_dim: str, ragged_dim: str, stack_dim: str | None = None, **concat_kwargs) → DataArray
Concatenate DataArrays by gathering along ragged coordinate.
For example, if the keys are site codes and the ragged dimension is time, then the “stacked dimension” will be the usual nmeasure coordinate.
- Parameters:
da_dict – dictionary of DataArrays
key_dim – dimension name for the keys of the dictionary
ragged_dim – name of the ragged dimension
stack_dim – name for the “stacked” multi-index dimension
**concat_kwargs – arguments to pass to xr.concat
- Returns:
Combined DataArray with new stacked dimension.
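A rough illustration of the gathering step described above, in plain Python rather than xarray. The function name gather_ragged, the "site"/"time"/"value" keys and the numbers are all made up for this sketch. It is not the library's implementation, and only shows how per-site series of different lengths can end up on one flat, nmeasure-style axis.

```python
from typing import Any, Hashable, Mapping, Sequence


def gather_ragged(
    series_by_key: Mapping[Hashable, Sequence[tuple[Any, float]]],
) -> dict[str, list]:
    """Flatten per-key series of (ragged coordinate, value) pairs.

    Hypothetical helper: each key (e.g. a site code) may have a different
    number of entries along the ragged coordinate (e.g. time).
    """
    keys: list[Hashable] = []
    ragged: list[Any] = []
    values: list[float] = []
    for key, series in series_by_key.items():
        for coord, value in series:
            # One entry on the flat axis per (key, ragged coordinate) pair
            keys.append(key)
            ragged.append(coord)
            values.append(value)
    return {"site": keys, "time": ragged, "value": values}


# Made-up observations: TAC has two time points, MHD has one
obs = {
    "TAC": [("2019-01-01", 1900.0), ("2019-01-02", 1905.0)],
    "MHD": [("2019-01-02", 1890.0)],
}
gathered = gather_ragged(obs)
nmeasure = len(gathered["value"])
```

Each position along the flat axis keeps its original (site, time) pair, which is the information a stacked multi-index dimension would carry.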
- openghg_inversions.inversion_inputs.concat_gather_datasets(ds_dict: Mapping[Hashable, Dataset], key_dim: str, ragged_dim: str, stack_dim: str | None = None, **concat_kwargs) → Dataset
Concatenate dictionary of xr.Datasets by gathering ragged coordinates.
This assumes that all datasets have the same data variables.
TODO: need to handle missing data variables.
- openghg_inversions.inversion_inputs.concat_gather_datatree(dt: DataTree, key_dim: str, ragged_dim: str, stack_dim: str | None = None, **concat_kwargs) → Dataset
Concatenate xr.DataTree children by gathering ragged coordinates.
This assumes that all children have the same data variables.
- openghg_inversions.inversion_inputs.make_freq_indicator(time: DataArray, freq: Literal['monthly'] | str, *, anchor_time: str | datetime | datetime64 | Timestamp | None = None) → DataArray
- openghg_inversions.inversion_inputs.make_inv_inputs(fp_data: dict[str, Any], sites: list[str] | None = None, bc_freq: Literal['monthly'] | str | None = None, sigma_freq: Literal['monthly'] | str | None = None, min_error: str | dict[str, float] | float = 0.0, min_error_per_site: bool = True, start_date: str | datetime | datetime64 | Timestamp | None = None) → Dataset
- openghg_inversions.inversion_inputs.make_sigma_freq(time: DataArray, freq: Literal['monthly'] | str | None = None, anchor_time: str | datetime | datetime64 | Timestamp | None = None) → DataArray
- openghg_inversions.inversion_inputs.make_site_indicator(site_coord: DataArray) → DataArray
Make site_indicator from DataArray of site names.
For instance, the values ["TAC", "TAC", "MHD"] would be converted to [0, 0, 1].
- openghg_inversions.inversion_inputs.make_site_names(site_coord: DataArray) → DataArray
Make site names DataArray corresponding to site indicator.
- openghg_inversions.inversion_inputs.transform_bc(ds: Dataset, freq: Literal['monthly'] | str | None = None, anchor_time: str | datetime | datetime64 | Timestamp | None = None) → Dataset
Transform ds so that ds.H_bc is expressed in (curtain, period) coordinates.
- openghg_inversions.inversion_inputs.xr_factorize(da: DataArray, indicator_name: str, label_name: str, label_dim: str, sort: bool = False) → Dataset
Create Dataset with integer indicators and labels for DataArray.
- Parameters:
da – DataArray to find indicator for.
indicator_name – name for indicator data variable
label_name – name for label data variable
label_dim – dimension for labels
sort – if True, the labels are sorted and the indicator values are remapped accordingly
- Returns:
Dataset with indicator and label data variables.
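The relationship between an indicator and its labels, and the effect of sort, can be sketched in plain Python. The factorize helper below is hypothetical and is not the code behind xr_factorize. It assumes that, without sorting, labels are numbered in order of first appearance, which matches the ["TAC", "TAC", "MHD"] to [0, 0, 1] example given for make_site_indicator.

```python
from typing import Hashable, Sequence


def factorize(
    values: Sequence[Hashable], sort: bool = False
) -> tuple[list[int], list[Hashable]]:
    """Return (indicator, labels) such that labels[indicator[i]] == values[i].

    Hypothetical stand-in used only to illustrate the idea.
    """
    # dict preserves insertion order, so labels follow first appearance
    labels = list(dict.fromkeys(values))
    if sort:
        # Sorting the labels changes which integer each label maps to
        labels = sorted(labels)
    position = {label: i for i, label in enumerate(labels)}
    indicator = [position[v] for v in values]
    return indicator, labels


sites = ["TAC", "TAC", "MHD"]
indicator, labels = factorize(sites)
sorted_indicator, sorted_labels = factorize(sites, sort=True)

print(indicator, labels)                # [0, 0, 1] ['TAC', 'MHD']
print(sorted_indicator, sorted_labels)  # [1, 1, 0] ['MHD', 'TAC']
```

In both cases the original values can be recovered by indexing the labels with the indicator; sorting only changes which integer each label receives.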