openghg_inversions.postprocessing.diagnostics

class openghg_inversions.postprocessing.diagnostics.Diagnostic(func, params)

Bases: tuple

func

Alias for field number 0

params

Alias for field number 1
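The "Bases: tuple" and "Alias for field number" entries are what a named tuple produces. As a minimal sketch of how such a two-field record behaves (this rebuilds a stand-in with collections.namedtuple rather than importing the real class, and it assumes that params holds keyword arguments for func, which this page does not state):

```python
from collections import namedtuple

# Stand-in with the same field names as the documented class.
Diagnostic = namedtuple("Diagnostic", ["func", "params"])

# ASSUMPTION: params is a dict of keyword arguments for func.
diag = Diagnostic(func=max, params={"default": 0})

# Fields can be read by name or by position (field numbers 0 and 1).
same_func = diag[0] is diag.func
same_params = diag[1] is diag.params

# Being a tuple, the record also unpacks like one.
func, params = diag
result = func([], **params)
```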

openghg_inversions.postprocessing.diagnostics.bayes_r2_by_site(inv_out: InversionOutput, report_prior: bool = False) → Dataset

Compute Bayesian R2 scores grouped by site.

Scores are computed for posterior predictive traces (compared against true obs).

Prior R2 scores can also be computed, but they cannot necessarily be compared with the posterior scores, since Bayesian R2 scores are normalised to always fall between 0 and 1.

Parameters:
  • inv_out – InversionOutput object containing obs and trace

  • report_prior – if True, return prior R2 in addition to posterior R2

Returns:

Dataset containing posterior (and, optionally, prior) Bayesian R2 values, with uncertainties.

Return type:

xr.Dataset
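To make the normalisation concrete, here is a minimal sketch of one common definition of the Bayesian R2 score: for each predictive draw, the variance of the predictions divided by that variance plus the variance of the residuals. It is an illustration under stated assumptions, not the implementation used by this module: the function name bayes_r2 and the toy data are made up, and plain lists stand in for the xarray objects the real function works on.

```python
from statistics import mean, pstdev, pvariance


def bayes_r2(y_obs, y_pred_draws):
    """Return (mean, std) of per-draw Bayesian R2 scores.

    y_obs is a list of observations; y_pred_draws is a list of
    predictive draws, each the same length as y_obs.
    Raises ZeroDivisionError if a draw and its residuals are both constant.
    """
    scores = []
    for draw in y_pred_draws:
        var_fit = pvariance(draw)  # variance explained by the prediction
        var_res = pvariance([o - p for o, p in zip(y_obs, draw)])  # residual variance
        # Both variances are non-negative, so each score lies in [0, 1].
        scores.append(var_fit / (var_fit + var_res))
    # The spread across draws gives the uncertainty of the score.
    return mean(scores), pstdev(scores)


y_obs = [1.0, 2.0, 3.0, 4.0]
draws = [
    [1.0, 2.0, 3.0, 4.0],  # reproduces the observations exactly
    [1.5, 1.5, 3.5, 3.5],  # captures the trend, with some residual error
]
r2_mean, r2_std = bayes_r2(y_obs, draws)
```

Because the denominator always contains the numerator, a poorly fitting prior can still score well above 0, which is one reason prior and posterior scores are hard to compare directly.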

openghg_inversions.postprocessing.diagnostics.bayes_r2_by_site_resample(inv_out: InversionOutput, freq: str = 'MS', report_prior: bool = False) → Dataset

Compute Bayesian R2 scores grouped by site and time.

Scores are computed for posterior predictive traces (compared against true obs).

Prior R2 scores can also be computed, but they cannot necessarily be compared with the posterior scores, since Bayesian R2 scores are normalised to always fall between 0 and 1.

Parameters:
  • inv_out – InversionOutput object containing obs and trace

  • freq – frequency to resample to (should be a pandas freq. str that can be passed to xr.Dataset.resample)

  • report_prior – if True, return prior R2 in addition to posterior R2

Returns:

Dataset containing posterior (and, optionally, prior) Bayesian R2 values, with uncertainties.

Return type:

xr.Dataset
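The default freq of 'MS' is the pandas alias for month-start frequency, so each score covers one site and one calendar month. The grouping this implies can be sketched with the standard library alone; the helper name, the record layout, and the site codes below are placeholders for illustration, and the real function resamples xarray data rather than lists of tuples.

```python
from collections import defaultdict
from datetime import datetime


def group_by_site_and_month(records):
    """Group (site, time, value) records by site and month start."""
    groups = defaultdict(list)
    for site, time, value in records:
        # Label each record with the first instant of its month,
        # which is how a month-start frequency labels its bins.
        month_start = time.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        groups[(site, month_start)].append(value)
    return dict(groups)


# Placeholder site codes and values, for illustration only.
records = [
    ("MHD", datetime(2019, 1, 3, 12), 1.0),
    ("MHD", datetime(2019, 1, 20), 2.0),
    ("MHD", datetime(2019, 2, 1), 3.0),
    ("TAC", datetime(2019, 1, 5), 4.0),
]
groups = group_by_site_and_month(records)
```

A score would then be computed separately within each (site, month) group.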

openghg_inversions.postprocessing.diagnostics.register_diagnostic(diagnostic: Callable) → Callable

Decorator to register diagnostic functions.

Parameters:

diagnostic – diagnostics function to register

Returns:

diagnostic, the input function (no modifications made)

Return type:

Callable

openghg_inversions.postprocessing.diagnostics.summary(inv_out: InversionOutput) → Dataset

Return diagnostics summary computed by arviz.

Diagnostics reported:
  • mcse_mean: Monte Carlo standard error of the posterior mean

  • mcse_sd: Monte Carlo standard error of the posterior standard deviation

  • ess_bulk: effective sample size (see e.g. Gelman et al., “Bayesian Data Analysis”, equation (11.8)), computed after “rank normalising” the draws

  • ess_tail: minimum effective sample size for 5% and 95% quantiles.

  • r_hat: the “potential scale reduction”, which compares variance within chains to pooled variance across chains. If all chains have converged, these will be the same and r_hat will be 1. Otherwise, r_hat will be greater than 1. Ideally, all r_hat values should be below 1.01.

Parameters:

inv_out – InversionOutput to summarise.

Returns:

Dataset with diagnostic summary.

Return type:

xr.Dataset
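The comparison behind r_hat can be illustrated with the classic potential scale reduction formula. Note the assumption: arviz reports a rank-normalised, split-chain variant by default, so the sketch below shows the underlying idea rather than reproducing the values summary returns. The function name and the toy chains are made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def classic_r_hat(chains):
    """Classic (not rank-normalised) r_hat for equal-length chains."""
    n = len(chains[0])
    # W: average of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance([mean(chain) for chain in chains])
    # Pooled variance estimate combining both sources of spread.
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)


base = [float(i) for i in range(100)]

# Two chains covering the same values: pooled and within-chain variance agree.
mixed = classic_r_hat([base, base[::-1]])

# Second chain stuck in a different region: pooled variance is much larger.
stuck = classic_r_hat([base, [x + 100.0 for x in base]])
```

Here mixed comes out very close to 1, while stuck is well above it, which is the signature of chains that have not converged to the same distribution.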