openghg_inversions.postprocessing.inversion_output#

class openghg_inversions.postprocessing.inversion_output.InversionOutput(obs: DataArray, obs_err: DataArray, obs_repeatability: DataArray, obs_variability: DataArray, flux: DataArray, basis: DataArray, trace: InferenceData, site_indicators: DataArray, times: DataArray, start_date: str, end_date: str, species: str, domain: str, site_names: DataArray | None = None, model: Model | None = None)#

Bases: object

Outputs of inversion needed for post-processing.

basis: DataArray#
domain: str#
end_date: str#
property end_time: Timestamp#

End date of inversion.

flux: DataArray#
classmethod from_datatree(dt: DataTree) Self#

Construct InversionOutput from serialised InversionOutput xr.DataTree.

This method is the inverse of to_datatree.

Parameters:

dt – xr.DataTree constructed using InversionOutput.to_datatree

Returns:

InversionOutput reconstructed from the DataTree

Return type:

InversionOutput

get_flat_basis() DataArray#

Return 2D DataArray encoding basis regions.

The InversionOutput.basis matrix is sparse, with three dimensions: latitude, longitude, and region (which corresponds to a basis function). A sparse matrix cannot be saved directly to disk, or compared with other matrices of this type, and converting directly to a dense matrix could use a very large amount of memory.

This function converts the basis matrix to a 2D array with latitude and longitude coordinates, and basis regions encoded by numbers in this 2D array.

Returns:

xr.DataArray encoding basis functions, with latitude and longitude coordinates
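
As a rough illustration of the encoding described above, the sketch below collapses a small (region, latitude, longitude) indicator array into a 2D grid of region numbers. It uses plain nested lists rather than sparse xarray data. The helper name `flatten_basis`, the 1-based region numbering, and the use of 0 for uncovered cells are assumptions made for this example, not the library's implementation.

```python
def flatten_basis(basis):
    """Collapse a (region, lat, lon) 0/1 indicator array into a 2D grid of region numbers.

    Hypothetical helper: regions are numbered from 1; cells covered by no region get 0.
    """
    n_lat = len(basis[0])
    n_lon = len(basis[0][0])
    flat = [[0] * n_lon for _ in range(n_lat)]
    for region_number, region in enumerate(basis, start=1):
        for i in range(n_lat):
            for j in range(n_lon):
                if region[i][j]:
                    flat[i][j] = region_number
    return flat


# Two regions on a 2 x 3 grid: region 1 covers the left column, region 2 the rest.
basis = [
    [[1, 0, 0], [1, 0, 0]],
    [[0, 1, 1], [0, 1, 1]],
]
flat_basis = flatten_basis(basis)
print(flat_basis)  # [[1, 2, 2], [1, 2, 2]]
```

The flat form stores one number per grid cell instead of one value per (region, cell) pair, which is why it is cheaper to save and compare than a dense 3D array.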

get_model_data(var_names: str | list[str] | None = None) Dataset#

Return an xarray Dataset containing the data input to the model.

This is the data registered with the model using pm.Data, together with any observed data.

Parameters:

var_names – (list of) variables to select. For instance, “hx” or “min_error”.

Returns:

xarray Dataset containing model data

get_model_err() DataArray#

Return the model error.

The model error is calculated by subtracting the square of the obs error from the square of the total error, and then taking a square root.

Returns:

xr.DataArray containing model error
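
The calculation described above can be sketched for a single measurement as follows. This is a scalar illustration only; the function name `model_error` and the numbers are assumptions chosen for the example, and the method itself works on xr.DataArray inputs.

```python
import math


def model_error(total_err, obs_err):
    """Remove the obs error from the total error in quadrature (hypothetical helper)."""
    return math.sqrt(total_err**2 - obs_err**2)


# Illustrative numbers only: total error of 5.0 and obs error of 3.0.
err = model_error(5.0, 3.0)
print(err)  # 4.0
```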

get_obs_and_errors() Dataset#

Return dataset containing observations and related error terms.

The returned dataset contains: obs, obs error (as used by the inversion), obs repeatability and variability, model error, and total error (i.e. the model-data mismatch error).

Returns:

xr.Dataset containing obs and error data

get_total_err(take_mean: bool = True) DataArray#

Return the posterior model-data mismatch error.

This is the variable epsilon in the RHIME model. It can be thought of as sqrt(repeatability**2 + variability**2 + model_error**2), although the actual definition is more complicated.

Parameters:

take_mean – if True, return the mean of the error term over the trace; otherwise return the full trace.

Returns:

xr.DataArray containing total error
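
A minimal sketch of the two ideas above, using plain Python lists: the heuristic quadrature sum, and the effect of take_mean on a set of draws. The helper names `quadrature_sum` and `summarise_total_err` and all numbers are assumptions for illustration; as noted above, the actual definition of epsilon in RHIME is more complicated.

```python
import math
from statistics import fmean


def quadrature_sum(*terms):
    """Combine error terms in quadrature (the heuristic reading of epsilon)."""
    return math.sqrt(sum(t**2 for t in terms))


def summarise_total_err(samples, take_mean=True):
    """Average over draws for each measurement, or return the draws unchanged.

    `samples` is a list of draws; each draw holds one value per measurement.
    """
    if not take_mean:
        return samples
    return [fmean(column) for column in zip(*samples)]


# Repeatability 1.0, variability 2.0, model error 2.0 for one measurement.
print(quadrature_sum(1.0, 2.0, 2.0))  # 3.0

# Two draws of epsilon for three measurements.
draws = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(summarise_total_err(draws))  # [2.0, 3.0, 4.0]
```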

get_trace_dataset(var_names: str | list[str] | None = None) Dataset#

Return an xarray Dataset containing prior/posterior parameter and predictive samples.

Parameters:

var_names – (list of) variables to select. For instance, “x” will return “x_prior” and “x_posterior”.

Returns:

xarray Dataset containing prior/posterior parameter and predictive samples.

classmethod load(file_path: str | Path) Self#

Load InversionOutput from file.

Use this to load InversionOutput that was previously saved using InversionOutput.save.

Parameters:

file_path – path to saved InversionOutput

Returns:

InversionOutput loaded from saved file

model: Model | None = None#
nmeasure_to_site_time(data: XrDataArrayOrSet) XrDataArrayOrSet#

Convert nmeasure coordinate of dataset to stacked (site, time) coordinate.

Parameters:

data – xr.DataArray or xr.Dataset

Returns:

data with nmeasure converted to a stacked (site, time) coordinate.

obs: DataArray#
obs_err: DataArray#
obs_repeatability: DataArray#
obs_variability: DataArray#
property period_midpoint: Timestamp#

Midpoint of inversion period.

sample_predictive_distributions(ndraw: int | None = None) None#

Sample prior and posterior predictive distributions.

This creates prior samples as a side-effect.

Parameters:

ndraw – optional number of prior samples to draw; defaults to the number of posterior samples.

save(output_file: str | Path, output_format: Literal['netcdf', 'zarr'] | None = None) None#

Save InversionOutput to netCDF or Zarr.

There is a corresponding load method to recover the InversionOutput from a saved version.

Parameters:
  • output_file – path to file where the InversionOutput should be saved

  • output_format – format to save to; if None, this will be inferred from the extension of output_file

Raises:

ValueError – If output_format is not specified and cannot be inferred from the output file extension.
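
The inference rule and the ValueError can be sketched as below. The mapping of ".nc" to netCDF and ".zarr" to Zarr, and the helper name `infer_output_format`, are assumptions for illustration; the extensions actually recognised by save may differ.

```python
from pathlib import Path

# Assumed mapping from file extension to format; the real rules may differ.
EXTENSION_TO_FORMAT = {".nc": "netcdf", ".zarr": "zarr"}


def infer_output_format(output_file, output_format=None):
    """Return the explicit format if given, otherwise guess it from the extension."""
    if output_format is not None:
        return output_format
    suffix = Path(output_file).suffix.lower()
    try:
        return EXTENSION_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(
            f"Cannot infer output format from {str(output_file)!r}; pass output_format."
        ) from None


print(infer_output_format("inv_out.nc"))  # netcdf
print(infer_output_format("inv_out.zarr"))  # zarr
print(infer_output_format("inv_out.dat", output_format="netcdf"))  # netcdf
```

Passing output_format explicitly sidesteps the inference entirely, which is the safer choice when the file name has an unusual extension.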

site_indicators: DataArray#
site_names: DataArray | None = None#
species: str#
start_date: str#
property start_time: Timestamp#

Start date of inversion.

times: DataArray#
to_datatree() DataTree#

Convert InversionOutput to xarray DataTree.

The output of this method can be saved to netCDF or zarr.

To make it possible to save the data, the nmeasure multi-index needs to be removed. The multi-index is restored by the from_datatree method.

Returns:

xr.DataTree containing the trace (as a sub-DataTree), obs and errors, the flat basis functions, and the flux, as well as the start/end dates, species, and domain in its attributes.

trace: InferenceData#
openghg_inversions.postprocessing.inversion_output.convert_idata_to_dataset(idata: InferenceData, group_filters=['prior', 'posterior'], add_suffix=True) Dataset#

Merge all groups in an arviz InferenceData object into a single xr.Dataset.

Parameters:
  • idata – arviz InferenceData containing traces (and other data)

  • group_filters – Filters for the groups of the InferenceData. A group will be selected if a filter is a substring of the group name. So the groups “prior” and “prior_predictive” will both match the filter “prior”. The default filters select the “prior”, “prior_predictive”, “posterior”, and “posterior_predictive” groups.

  • add_suffix – if True, rename the data variables so that they end in the name of the group they came from.

Returns:

xr.Dataset containing all data variables in the selected groups of the InferenceData
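
The substring matching and suffix renaming described above can be sketched with plain dictionaries standing in for the InferenceData groups. The helper name `merge_groups` and the sample values are assumptions for illustration; this is not the library's implementation.

```python
def merge_groups(groups, group_filters=("prior", "posterior"), add_suffix=True):
    """Merge variables from matching groups into one flat dict.

    `groups` maps group name -> {variable name -> values}, standing in for InferenceData.
    """
    merged = {}
    for group_name, variables in groups.items():
        # A group is selected if any filter is a substring of its name.
        if not any(f in group_name for f in group_filters):
            continue
        for var_name, values in variables.items():
            key = f"{var_name}_{group_name}" if add_suffix else var_name
            merged[key] = values
    return merged


groups = {
    "prior": {"x": [1.0]},
    "posterior": {"x": [0.9]},
    "posterior_predictive": {"y": [410.2]},
    "sample_stats": {"diverging": [False]},
}
merged = merge_groups(groups)
print(sorted(merged))  # ['x_posterior', 'x_prior', 'y_posterior_predictive']
```

The suffix matters because the same variable name (here "x") appears in more than one group; without it, one group's values would overwrite the other's in the merged result.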

openghg_inversions.postprocessing.inversion_output.filter_data_vars_by_prefix(ds: Dataset, var_name_prefixes: str | list[str], sep: str = '_') Dataset#

Select data variables that match the specified filters.

For instance, if var_name_prefixes = ‘prior’, then any data variable whose name begins with ‘prior_’ will be selected. The underscore ‘_’ is added by default, but can be changed by specifying sep.

Parameters:
  • ds – Dataset to filter.

  • var_name_prefixes – (List of) prefix(es) to filter data variables by.

  • sep – Separator for prefix; default is “_”.

Returns:

Dataset restricted to data variables whose names match the filter.

Return type:

xr.Dataset
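
The prefix rule can be sketched on a plain list of variable names. The helper `filter_names_by_prefix` is hypothetical and operates on names only, whereas the documented function returns a filtered xr.Dataset.

```python
def filter_names_by_prefix(names, var_name_prefixes, sep="_"):
    """Keep names that start with any of the prefixes followed by `sep`."""
    if isinstance(var_name_prefixes, str):
        var_name_prefixes = [var_name_prefixes]
    prefixes = tuple(prefix + sep for prefix in var_name_prefixes)
    return [name for name in names if name.startswith(prefixes)]


names = ["prior_x", "prior_predictive_y", "posterior_x", "priority"]
print(filter_names_by_prefix(names, "prior"))  # ['prior_x', 'prior_predictive_y']
print(filter_names_by_prefix(names, ["posterior"]))  # ['posterior_x']
print(filter_names_by_prefix(names, "prior", sep=""))  # ['prior_x', 'prior_predictive_y', 'priority']
```

Appending the separator before matching is what keeps a name such as "priority" from matching the prefix "prior".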

openghg_inversions.postprocessing.inversion_output.make_inv_out_for_fixed_basis_mcmc(fp_data: dict, Y: ndarray, Ytime: ndarray, error: ndarray, obs_repeatability: ndarray, obs_variability: ndarray, site_indicator: ndarray, site_names: ndarray | list[str], mcmc_results: dict, start_date: str, end_date: str, species: str, domain: str) InversionOutput#

Create InversionOutput in fixedbasisMCMC.

openghg_inversions.postprocessing.inversion_output.make_inv_out_from_rhime_outputs(ds: Dataset, species: str, domain: str, start_date: str | None = None, end_date: str | None = None) InversionOutput#

Create inversion output from RHIME outputs.

This can be used to re-run flux and country total outputs using the PARIS postprocessing. However, this doesn’t recover enough information to re-compute concentration outputs.