openghg_inversions.postprocessing.inversion_output
- class openghg_inversions.postprocessing.inversion_output.InversionOutput(obs: DataArray, obs_err: DataArray, obs_repeatability: DataArray, obs_variability: DataArray, flux: DataArray, basis: DataArray, trace: InferenceData, site_indicators: DataArray, times: DataArray, start_date: str, end_date: str, species: str, domain: str, site_names: DataArray | None = None, model: Model | None = None)
Bases: object
Outputs of inversion needed for post-processing.
- property end_time: Timestamp
End date of inversion.
- classmethod from_datatree(dt: DataTree) -> Self
Construct InversionOutput from serialised InversionOutput xr.DataTree.
This method is the inverse of to_datatree.
- Parameters:
dt – xr.DataTree constructed using InversionOutput.to_datatree
- Returns:
InversionOutput reconstructed from datatree
- Return type:
InversionOutput
- get_flat_basis() -> DataArray
Return 2D DataArray encoding basis regions.
The InversionOutput.basis matrix is sparse, with three dimensions: latitude, longitude, and region (which corresponds to a basis function). A sparse matrix cannot be saved directly to disk, or compared with other matrices of this type, and converting directly to a dense matrix could use a very large amount of memory.
This function converts the basis matrix to a 2D array with latitude and longitude coordinates, and basis regions encoded by numbers in this 2D array.
- Returns:
xr.DataArray encoding basis functions, with latitude and longitude coordinates
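The encoding described above can be illustrated with a minimal pure-Python sketch. It assumes the sparse basis is available as one (lat index, lon index, region index) triple per non-zero entry, and that regions are numbered from 1 in the flat array; both are assumptions made for illustration, and `flatten_basis` is a hypothetical helper, not part of the library.

```python
# Hypothetical sparse basis: one (lat_index, lon_index, region_index) triple
# per non-zero entry, so each grid cell belongs to exactly one region.
nlat, nlon = 2, 3
nonzero_entries = [
    (0, 0, 0), (0, 1, 0), (0, 2, 1),
    (1, 0, 2), (1, 1, 1), (1, 2, 2),
]


def flatten_basis(entries, nlat, nlon):
    """Encode regions as integers on a 2D (lat, lon) grid; 0 means unassigned."""
    flat = [[0] * nlon for _ in range(nlat)]
    for lat_idx, lon_idx, region_idx in entries:
        # Numbering regions from 1 is an assumption made for this sketch.
        flat[lat_idx][lon_idx] = region_idx + 1
    return flat


flat_basis = flatten_basis(nonzero_entries, nlat, nlon)
print(flat_basis)
```

Only nlat * nlon values are allocated here, rather than the nlat * nlon * nregion values a dense 3D array would need.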
- get_model_data(var_names: str | list[str] | None = None) -> Dataset
Return an xarray Dataset containing the data input to the model.
This data is captured using pm.Data, or when data is observed.
- Parameters:
convert_nmeasure – if True, convert nmeasure coordinate to multi-index comprising time and site.
var_names – (list of) variables to select. For instance, “hx” or “min_error”
- Returns:
xarray Dataset containing model data
- get_model_err() -> DataArray
Return model_error.
The model error is calculated by subtracting the square of the obs error from the square of the total error, and then taking a square root.
- Returns:
xr.DataArray containing model error
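As a scalar illustration of the calculation described above (the method itself works on DataArrays), a minimal sketch might look like this. The function name `model_error` and the input values are hypothetical.

```python
import math


def model_error(total_err, obs_err):
    """Model error as sqrt(total_err**2 - obs_err**2), for scalar inputs."""
    return math.sqrt(total_err**2 - obs_err**2)


# Hypothetical values: total error 5.0 and obs error 3.0.
err = model_error(5.0, 3.0)
print(err)

# Adding obs and model error back in quadrature recovers the total error.
recovered_total = math.sqrt(3.0**2 + err**2)
```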
- get_obs_and_errors() -> Dataset
Return dataset containing observations and related error terms.
The returned dataset contains: obs, obs error (as used by the inversion), obs repeatability and variability, model error, and total error (i.e. the model-data mismatch error).
- Returns:
xr.Dataset containing obs and error data
- get_total_err(take_mean: bool = True) -> DataArray
Return the posterior model-data mismatch error.
This is the variable epsilon in the RHIME model. It can be thought of as sqrt(repeatability**2 + variability**2 + model_error**2), although the actual definition is more complicated.
- Parameters:
take_mean – if True, take mean over trace of error term, otherwise return the full trace.
- Returns:
xr.DataArray containing total error
- get_trace_dataset(var_names: str | list[str] | None = None) -> Dataset
Return an xarray Dataset containing prior/posterior parameter/predictive samples.
- Parameters:
convert_nmeasure – if True, convert nmeasure coordinate to multi-index comprising time and site.
var_names – (list of) variables to select. For instance, “x” will return “x_prior” and “x_posterior”.
- Returns:
xarray Dataset containing prior/posterior parameter/predictive samples.
- classmethod load(file_path: str | Path) -> Self
Load InversionOutput from file.
Use this to load InversionOutput that was previously saved using InversionOutput.save.
- Parameters:
file_path – path to saved InversionOutput
- Returns:
InversionOutput loaded from saved file
- nmeasure_to_site_time(data: XrDataArrayOrSet) -> XrDataArrayOrSet
Convert nmeasure coordinate of dataset to stacked (site, time) coordinate.
- Parameters:
data – xr.DataArray or xr.Dataset
- Returns:
data with nmeasure converted to a stacked (site, time) coordinate.
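The idea behind the conversion can be sketched in plain Python. This assumes that each position along nmeasure carries an integer site indicator (an index into the site names) and a time; that layout is an assumption made for illustration, and `nmeasure_to_pairs` and the sample values are hypothetical.

```python
# Hypothetical inputs: one entry per measurement along "nmeasure".
site_names = ["MHD", "TAC"]
site_indicators = [0, 0, 1, 1, 1]  # index into site_names (assumption)
times = ["2019-01-01", "2019-01-02", "2019-01-01", "2019-01-02", "2019-01-03"]


def nmeasure_to_pairs(site_indicators, times, site_names):
    """Pair each flat measurement index with its (site, time) label."""
    return [(site_names[int(i)], t) for i, t in zip(site_indicators, times)]


site_time_index = nmeasure_to_pairs(site_indicators, times, site_names)
print(site_time_index[2])
```

Each (site, time) pair plays the role of one entry in the stacked coordinate, so a measurement can be looked up by site and time instead of by its flat position.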
- property period_midpoint: Timestamp
Midpoint of inversion period.
- sample_predictive_distributions(ndraw: int | None = None) -> None
Sample prior and posterior predictive distributions.
This creates prior samples as a side-effect.
- Parameters:
ndraw – optional number of prior samples to draw; defaults to the number of posterior samples.
- save(output_file: str | Path, output_format: Literal['netcdf', 'zarr'] | None = None) -> None
Save InversionOutput to netCDF or Zarr.
There is a corresponding load method to recover the InversionOutput from a saved version.
- Parameters:
output_file – path to file where the InversionOutput should be saved
output_format – format to save to; if None, this will be inferred by the extension of output_file
- Raises:
ValueError – If output_format is not specified and cannot be inferred from the output file extension.
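The format inference and the ValueError described above could work roughly as in the sketch below. The extension mapping (".nc" and ".zarr") and the helper name `infer_output_format` are assumptions for illustration; the extensions the real method recognises may differ.

```python
from pathlib import Path

# Assumed mapping from file extension to format; illustrative only.
_EXTENSION_TO_FORMAT = {".nc": "netcdf", ".zarr": "zarr"}


def infer_output_format(output_file, output_format=None):
    """Return output_format if given, otherwise infer it from the extension."""
    if output_format is not None:
        return output_format
    suffix = Path(output_file).suffix.lower()
    if suffix not in _EXTENSION_TO_FORMAT:
        raise ValueError(
            f"Cannot infer output format from extension {suffix!r}; "
            "pass output_format explicitly."
        )
    return _EXTENSION_TO_FORMAT[suffix]


print(infer_output_format("results/inv_out.nc"))
print(infer_output_format("results/inv_out.dat", output_format="zarr"))
```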
- property start_time: Timestamp
Start date of inversion.
- to_datatree() -> DataTree
Convert InversionOutput to xarray DataTree.
The output of this method can be saved to netCDF or zarr.
To make it possible to save the data, the nmeasure multi-index needs to be removed. The multi-index is restored by the from_datatree method.
- Returns:
xr.DataTree containing the trace (as a sub-DataTree), obs and errors, the flat basis functions, and the flux, as well as the start/end dates, species, and domain in its attributes.
- trace: InferenceData
- openghg_inversions.postprocessing.inversion_output.convert_idata_to_dataset(idata: InferenceData, group_filters=['prior', 'posterior'], add_suffix=True) -> Dataset
Merge all groups in an arviz InferenceData object into a single xr.Dataset.
- Parameters:
idata – arviz InferenceData containing traces (and other data)
group_filters – Filters for the groups of the InferenceData. A group will be selected if a filter is a substring of the group name. So the groups “prior” and “prior_predictive” will both match the filter “prior”. The default filters select the “prior”, “prior_predictive”, “posterior”, and “posterior_predictive” groups.
add_suffix – if True, rename the data variables so that they end in the name of the group they came from.
- Returns:
xr.Dataset containing all data variables in the selected groups of the InferenceData
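The substring matching and suffix renaming described above can be sketched on plain strings. The helpers `select_groups` and `add_group_suffix` are hypothetical, and the "_" used to join variable and group names is an assumption based on names such as "x_prior".

```python
def select_groups(group_names, group_filters=("prior", "posterior")):
    """Keep groups whose name contains at least one filter as a substring."""
    return [g for g in group_names if any(f in g for f in group_filters)]


def add_group_suffix(var_name, group_name):
    """Rename a variable so that it ends in the name of its group."""
    return f"{var_name}_{group_name}"


# Group names commonly found in an InferenceData object.
groups = [
    "posterior",
    "posterior_predictive",
    "prior",
    "prior_predictive",
    "observed_data",
    "sample_stats",
]

selected = select_groups(groups)
print(selected)
print(add_group_suffix("x", "prior"))
```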
- openghg_inversions.postprocessing.inversion_output.filter_data_vars_by_prefix(ds: Dataset, var_name_prefixes: str | list[str], sep: str = '_') -> Dataset
Select data variables that match the specified filters.
For instance, if var_name_prefixes = ‘prior’, then any data variable whose name begins with ‘prior_’ will be selected. The underscore ‘_’ is added by default, but can be changed by specifying sep.
- Parameters:
ds – Dataset to filter.
var_name_prefixes – (List of) prefix(es) to filter data variables by.
sep – Separator for prefix; default is “_”.
- Returns:
Dataset restricted to data variables whose names match the filter.
- Return type:
xr.Dataset
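The prefix rule can be sketched on a plain list of variable names. The helper `filter_names_by_prefix` and the sample names are hypothetical; the real function operates on the data variables of an xr.Dataset.

```python
def filter_names_by_prefix(names, var_name_prefixes, sep="_"):
    """Keep names that start with any given prefix followed by sep."""
    if isinstance(var_name_prefixes, str):
        var_name_prefixes = [var_name_prefixes]
    prefixes = tuple(prefix + sep for prefix in var_name_prefixes)
    return [name for name in names if name.startswith(prefixes)]


# Hypothetical variable names; "xouts_prior" shows why the separator matters.
names = ["x_prior", "x_posterior", "xouts_prior", "mu_bc_prior"]

print(filter_names_by_prefix(names, "x"))
print(filter_names_by_prefix(names, ["x", "mu_bc"]))
```

Appending the separator before matching means the prefix "x" selects "x_prior" and "x_posterior" but not "xouts_prior".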
- openghg_inversions.postprocessing.inversion_output.make_inv_out_for_fixed_basis_mcmc(fp_data: dict, Y: ndarray, Ytime: ndarray, error: ndarray, obs_repeatability: ndarray, obs_variability: ndarray, site_indicator: ndarray, site_names: ndarray | list[str], mcmc_results: dict, start_date: str, end_date: str, species: str, domain: str) -> InversionOutput
Create InversionOutput in fixedbasisMCMC.
- openghg_inversions.postprocessing.inversion_output.make_inv_out_from_rhime_outputs(ds: Dataset, species: str, domain: str, start_date: str | None = None, end_date: str | None = None) -> InversionOutput
Create inversion output from RHIME outputs.
This can be used to re-run flux and country total outputs using the PARIS postprocessing. However, this doesn’t recover enough information to re-compute concentration outputs.