multiple_inference.bayes.normal

Empirical Bayes with a normal prior.

References

@inproceedings{stein1956inadmissibility,
    title={Inadmissibility of the usual estimator for the mean of a multivariate normal distribution},
    author={Stein, Charles and others},
    booktitle={Proceedings of the Third Berkeley symposium on mathematical statistics and probability},
    volume={1},
    number={1},
    pages={197--206},
    year={1956}
}

@incollection{james1992estimation,
    title={Estimation with quadratic loss},
    author={James, William and Stein, Charles},
    booktitle={Breakthroughs in statistics},
    pages={443--460},
    year={1992},
    publisher={Springer}
}

@article{bock1975minimax,
    title={Minimax estimators of the mean of a multivariate normal distribution},
    author={Bock, Mary Ellen},
    journal={The Annals of Statistics},
    pages={209--218},
    year={1975},
    publisher={JSTOR}
}

@inproceedings{dimmery2019shrinkage,
    title={Shrinkage estimators in online experiments},
    author={Dimmery, Drew and Bakshy, Eytan and Sekhon, Jasjeet},
    booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
    pages={2914--2922},
    year={2019}
}

@article{armstrong2020robust,
    title={Robust empirical bayes confidence intervals},
    author={Armstrong, Timothy B and Koles{\'a}r, Michal and Plagborg-M{\o}ller, Mikkel},
    journal={arXiv preprint arXiv:2004.03448},
    year={2020}
}

Notes

The James-Stein method of fitting the normal prior relies on my own fully Bayesian derivation, which extends the derivation in Dimmery et al. (2019) by (1) accounting for correlated errors and (2) allowing the prior mean vector to depend on a feature matrix X.
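For orientation, here is a minimal sketch of the textbook positive-part James-Stein estimator that shrinks toward the grand mean. It is not the derivation described above: it assumes independent errors with a common known variance and no feature matrix, and the function name `james_stein` is hypothetical rather than part of this package.

```python
def james_stein(estimates, variance=1.0):
    """Positive-part James-Stein shrinkage toward the grand mean.

    Illustrative only: assumes independent errors that share one known variance.
    """
    k = len(estimates)
    if k < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 estimates")
    grand_mean = sum(estimates) / k
    sum_sq = sum((x - grand_mean) ** 2 for x in estimates)
    if sum_sq == 0:
        # All estimates coincide, so there is nothing to shrink.
        return list(estimates)
    # Shrinkage factor 1 - (k - 3) * variance / sum_sq, floored at zero.
    shrink = max(0.0, 1.0 - (k - 3) * variance / sum_sq)
    return [grand_mean + shrink * (x - grand_mean) for x in estimates]


estimates = [float(i) for i in range(10)]
shrunk = james_stein(estimates)
```

Each estimate moves toward the grand mean by the same proportion, so the average of the shrunk values equals the average of the inputs.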

Classes

Normal(*args[, fit_method, prior_mean, ...])

Bayesian model with a normal prior.

NormalResults(*args, **kwargs)

Functions

compute_robust_critical_value(m2[, ...])

Compute the critical value for robust confidence intervals.

class multiple_inference.bayes.normal.Normal(*args: Any, fit_method: str | Callable[[], None] = 'mle', prior_mean: float | ndarray | None = None, prior_cov: float | ndarray | None = None, **kwargs: Any)

Bayesian model with a normal prior.

Parameters:
  • fit_method (Union[str, Callable[[], None]], optional) – Specifies how to fit the prior (“mle”, “bock”, or “james_stein”). You can also use a custom function that sets the prior_mean, prior_cov, posterior_mean and posterior_cov attributes. Defaults to “mle”.

  • prior_mean (Union[float, np.ndarray], optional) – (# params,) prior mean vector. Defaults to None.

  • prior_cov (Union[float, np.ndarray], optional) – (# params, # params) prior covariance matrix. Defaults to None.

Examples

import numpy as np
from multiple_inference.bayes import Normal

model = Normal(np.arange(10), np.identity(10))
results = model.fit()
print(results.summary())
           Bayesian estimates
=======================================
    coef pvalue (1-sided) [0.025 0.975]
---------------------------------------
x0 0.545            0.282 -1.305  2.395
x1 1.424            0.066 -0.426  3.274
x2 2.303            0.007  0.453  4.153
x3 3.182            0.000  1.332  5.032
x4 4.061            0.000  2.211  5.911
x5 4.939            0.000  3.089  6.789
x6 5.818            0.000  3.968  7.668
x7 6.697            0.000  4.847  8.547
x8 7.576            0.000  5.726  9.426
x9 8.455            0.000  6.605 10.305
===============
Dep. Variable y
---------------
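The coef column above can be understood through the standard normal-normal model. The following is a minimal sketch, assuming independent unit-variance errors and a prior fitted by maximizing the marginal likelihood of the estimates; the helper names `fit_normal_prior` and `posterior_means` are hypothetical, and this is not the package's implementation.

```python
def fit_normal_prior(estimates, variance=1.0):
    """Marginal maximum likelihood fit of a scalar normal prior (illustrative)."""
    k = len(estimates)
    prior_mean = sum(estimates) / k
    # Marginally, each estimate has variance prior_var + variance.
    marginal_var = sum((x - prior_mean) ** 2 for x in estimates) / k
    prior_var = max(marginal_var - variance, 0.0)
    return prior_mean, prior_var


def posterior_means(estimates, variance=1.0):
    """Posterior means under the fitted prior."""
    prior_mean, prior_var = fit_normal_prior(estimates, variance)
    # Weight placed on the data; the remainder goes to the prior mean.
    weight = prior_var / (prior_var + variance)
    return [prior_mean + weight * (x - prior_mean) for x in estimates]


post = posterior_means([float(i) for i in range(10)])
print([round(m, 3) for m in post])
```

Under these assumptions the posterior means agree with the coef column to three decimals. The sketch does not attempt to reproduce the interval widths or p-values.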
class multiple_inference.bayes.normal.NormalResults(*args, **kwargs)

conf_int(alpha: float = 0.05, columns: Sequence[int] | Sequence[str] | Sequence[bool] | None = None, robust: bool = False, fast: bool = False, **kwargs: Any) → ndarray

Compute the (1 - alpha) confidence interval.

Parameters:
  • alpha (float, optional) – Significance level. Defaults to .05.

  • columns (ColumnsType, optional) – Selected columns. Defaults to None.

  • robust (bool, optional) – Use robust confidence intervals. These will have correct average coverage even if the normal prior assumption is violated. Defaults to False.

  • fast (bool, optional) – Use a “fast” computation method for the robust confidence intervals. This assumes the critical value is the same for all parameters. Defaults to False.

Returns:

(# params, 2) array of confidence intervals.

Return type:

np.ndarray
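As a rough illustration of the non-robust case, an equal-tailed (1 - alpha) interval for a normal posterior can be sketched as follows. The function name `normal_intervals` and the inputs are hypothetical; the sketch assumes each parameter's posterior is normal with a known mean and standard deviation, and it says nothing about how the robust intervals are computed.

```python
from statistics import NormalDist


def normal_intervals(means, sds, alpha=0.05):
    """Equal-tailed (1 - alpha) intervals, one [lower, upper] row per parameter."""
    # Two-sided critical value, about 1.96 when alpha = 0.05.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return [[m - z * s, m + z * s] for m, s in zip(means, sds)]


intervals = normal_intervals([0.0, 2.0], [1.0, 0.5])
```

The result has one row per parameter and two columns, mirroring the (# params, 2) shape described above.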

multiple_inference.bayes.normal.compute_robust_critical_value(m2: float, kurtosis: float = inf, alpha: float = 0.05) → tuple[float, ndarray, ndarray]

Compute the critical value for robust confidence intervals.

Parameters:
  • m2 (float) – Equality constraint on \(E[b^2]\).

  • kurtosis (float, optional) – Estimated kurtosis of the prior distribution. Defaults to np.inf.

  • alpha (float, optional) – Significance level. Defaults to .05.

Returns:

Critical value, array of \(x\) values for the least favorable mass function, array of probabilities for the least favorable mass function.

Return type:

tuple[float, np.ndarray, np.ndarray]

Notes

See Armstrong et al. (2020) for mathematical details. This function is equivalent to the cva function in the ebci R package and uses the same variable names and tolerance thresholds.
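To give some intuition for why a robust critical value exceeds the usual one, the sketch below evaluates the non-coverage of an interval with critical value chi when the normalized bias is b, namely P(|Z + b| > chi) for standard normal Z, and solves for the critical value that gives average non-coverage alpha under a fixed bias mass function. This is not the library function: that function searches for the least favorable mass function subject to the moment constraints, whereas here the mass function is supplied by hand. The function names, and the choice to express the support in bias units, are assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def noncoverage(bias, critical_value):
    """P(|Z + bias| > critical_value) for Z ~ N(0, 1)."""
    return (_STD_NORMAL.cdf(-critical_value - bias)
            + _STD_NORMAL.cdf(-critical_value + bias))


def average_noncoverage(support, probs, critical_value):
    """Expected non-coverage under a discrete distribution of the bias."""
    return sum(p * noncoverage(b, critical_value) for b, p in zip(support, probs))


def solve_critical_value(support, probs, alpha=0.05, tol=1e-10):
    """Bisection on the critical value; non-coverage decreases as it grows."""
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if average_noncoverage(support, probs, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


cv_no_bias = solve_critical_value([0.0], [1.0])
# Hypothetical bias distribution with E[b^2] = 0.5 * 1.5**2 = 1.125.
cv_biased = solve_critical_value([0.0, 1.5], [0.5, 0.5])
```

With no bias the solution is the familiar two-sided normal critical value; any mass on nonzero bias pushes the critical value higher.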