multiple_inference.stats

Statistical distributions.

Classes

joint_distribution(marginal_distributions)

Joint distribution based on independent marginal distributions.

mixture(distributions[, weights])

Mixture distribution.

nonparametric(values[, kind])

Nonparametric distribution.

quantile_unbiased(y[, projection_interval, ...])

Conditional quantile-unbiased distribution.

truncnorm([truncation_set, loc, scale, ...])

Truncated normal distribution.

class multiple_inference.stats.joint_distribution(marginal_distributions: Sequence[rv_continuous])

Joint distribution based on independent marginal distributions.

Parameters

marginal_distributions (Sequence[rv_continuous]) – Marginal distributions.

logpdf(x: ndarray) -> ndarray

Log of the probability density function evaluated at x.

Parameters

x (np.ndarray) – (n, # marginals) matrix of values at which to evaluate the density function.

Returns

(n,) array of log density.

Return type

np.ndarray

pdf(x: ndarray) -> ndarray

Probability density function evaluated at x.

Parameters

x (np.ndarray) – (n, # marginals) matrix of values at which to evaluate the density function.

Returns

(n,) array of densities.

Return type

np.ndarray

rvs(size: int = 1) -> ndarray

Sample random values.

Parameters

size (int, optional) – Number of samples to draw. Defaults to 1.

Returns

(size, # marginals) matrix of samples.

Return type

np.ndarray
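
Because the marginals are independent, the joint log density of a row is the sum of the marginal log densities, and the joint density is its exponential. The following is a minimal standard-library sketch of that relationship, not the package's implementation; the helper name joint_logpdf and the use of statistics.NormalDist as stand-in marginals are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def joint_logpdf(rows, marginals):
    """Sum the marginal log densities for each row (hypothetical helper)."""
    return [
        sum(math.log(m.pdf(value)) for m, value in zip(marginals, row))
        for row in rows
    ]


# Two independent normal marginals stand in for rv_continuous objects.
marginals = [NormalDist(0, 1), NormalDist(2, 0.5)]

# (n, # marginals) matrix of evaluation points, here n = 2.
x = [[0.0, 2.0], [1.0, 1.5]]

log_density = joint_logpdf(x, marginals)       # (n,) log densities
density = [math.exp(v) for v in log_density]   # (n,) densities
print(log_density, density)
```

Working on the log scale and exponentiating at the end keeps the product of many small marginal densities from underflowing.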

class multiple_inference.stats.mixture(distributions: list[rv_continuous], weights: Numeric1DArray = None, **kwargs: Any)

Mixture distribution.

Parameters
  • distributions (list[rv_continuous]) – List of n distributions to mix over.

  • weights (Numeric1DArray, optional) – (n,) array of mixture weights. Defaults to None.

distributions

Distributions to mix over.

Type

list[rv_continuous]

weights

Mixture weights.

Type

np.ndarray

mean()

Mean of the distribution.

Parameters
  • arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • ... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • loc (array_like, optional) – location parameter (default=0)

  • scale (array_like, optional) – scale parameter (default=1)

Returns

mean – the mean of the distribution

Return type

float

std()

Standard deviation of the distribution.

Parameters
  • arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • ... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • loc (array_like, optional) – location parameter (default=0)

  • scale (array_like, optional) – scale parameter (default=1)

Returns

std – standard deviation of the distribution

Return type

float

var()

Variance of the distribution.

Parameters
  • arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • ... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)

  • loc (array_like, optional) – location parameter (default=0)

  • scale (array_like, optional) – scale parameter (default=1)

Returns

var – the variance of the distribution

Return type

float
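
For a mixture, the mean is the weighted average of the component means, and the variance follows from the weighted second moments. The sketch below illustrates those identities with plain Python. It is not the package's code: the helper name mixture_moments is hypothetical, and falling back to equal weights when weights is None is an assumption about the default.

```python
import math


def mixture_moments(means, stds, weights=None):
    """Mean, variance and standard deviation of a mixture (hypothetical helper)."""
    n = len(means)
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0 / n] * n
    total = sum(weights)
    weights = [w / total for w in weights]  # normalize to sum to 1

    mean = sum(w * m for w, m in zip(weights, means))
    # E[X^2] of each component is var + mean^2.
    second_moment = sum(
        w * (s**2 + m**2) for w, m, s in zip(weights, means, stds)
    )
    var = second_moment - mean**2
    return mean, var, math.sqrt(var)


# Equal mix of N(0, 1) and N(4, 1).
mean, var, std = mixture_moments([0.0, 4.0], [1.0, 1.0], [0.5, 0.5])
print(mean, var, std)
```

The variance exceeds the component variances here because the spread between the component means contributes as well.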

class multiple_inference.stats.nonparametric(values, kind=None, *args, **kwargs)

Nonparametric distribution.

Parameters
  • values (tuple[np.array, np.array]) – Tuple of an (n,) array of x values and an (n,) array of the probability mass function evaluated at those x values.

  • kind (str, optional) – Type of interpolation to use. Passed to scipy.interpolate.interp1d. Defaults to None.

xk

(n,) array of x values.

Type

np.ndarray

pk

(n,) array of the probability mass function evaluated at x.

Type

np.ndarray

Notes

This distribution interpolates between the points of the probability mass function to “continuize” the discrete function.

mean() -> float

Compute the mean.

Returns

Mean.

Return type

float

moment(func: Callable[[ndarray], ndarray]) -> float

Compute a moment.

Parameters

func (Callable[[np.ndarray], np.ndarray]) – Moment function that takes self.xk and returns an array of the same shape.

Returns

Moment.

Return type

float

std() -> float

Compute the standard deviation.

Returns

Standard deviation.

Return type

float

var() -> float

Compute the variance.

Returns

Variance.

Return type

float
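
One way to read moment, mean, var and std together is as probability-weighted sums over the support points. The sketch below assumes exactly that (a moment is the sum of func(xk) * pk) and ignores any interpolation; the function name discrete_moment and the sample values are hypothetical, and the package may compute these quantities differently.

```python
import math


def discrete_moment(xk, pk, func):
    """Probability-weighted sum of func(xk) (hypothetical helper)."""
    return sum(f * p for f, p in zip(func(xk), pk))


xk = [0.0, 1.0, 2.0]      # support points
pk = [0.25, 0.5, 0.25]    # probability mass at each point, sums to 1

# The mean uses the identity function as the moment function.
mean = discrete_moment(xk, pk, lambda x: x)

# The variance uses squared deviations from the mean.
var = discrete_moment(xk, pk, lambda x: [(v - mean) ** 2 for v in x])
std = math.sqrt(var)
print(mean, var, std)
```

As in the moment description above, the moment function receives the whole array of support points and returns an array of the same shape.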

class multiple_inference.stats.quantile_unbiased(y: float, projection_interval: Union[float, Tuple[float, float]] = (-inf, inf), bounds: Tuple[float, float] = (-inf, inf), dx: Optional[float] = None, **truncnorm_kwargs: Any)

Conditional quantile-unbiased distribution.

Inherits from scipy.stats.rv_continuous and handles standard public methods (pdf, cdf, etc.).

Parameters
  • y (float) – Value at which the truncated CDF is evaluated.

  • projection_interval (Union[float, Tuple[float, float]], optional) – Lower and upper bounds of the projection confidence interval. Defaults to (-np.inf, np.inf).

  • bounds (Tuple[float, float], optional) – Lower and upper bounds of the support of the distribution. Defaults to (-np.inf, np.inf).

  • dx (float, optional) – Used to numerically approximate the PDF. Defaults to None.

  • **truncnorm_kwargs (Any) – Keyword arguments for truncnorm.

y

Value at which the truncated CDF is evaluated.

Type

float

bounds

Lower and upper bound of the support of the distribution.

Type

Tuple[float, float]

dx

Used to numerically approximate the PDF.

Type

float

truncnorm_kwargs

Keyword arguments for truncnorm.

Type

dict

Examples

Compute a median-unbiased estimate of a normally distributed variable given that its observed value is 1 and falls between 0 and 3.

>>> from multiple_inference.stats import quantile_unbiased
>>> dist = quantile_unbiased(1, truncation_set=[(0, 3)])
>>> dist.ppf(.5)
0.7108033900602351

ppf(q: Union[float, Sequence[float]]) -> ndarray

Percent point function.

Parameters

q (np.ndarray) – (n,) array of quantiles at which to evaluate the PPF.

Returns

(n,) array of evaluations.

Return type

np.ndarray
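
To make the median-unbiased example above concrete: one reading is that the q-quantile is the mean mu at which the truncated normal CDF, evaluated at the observed y, equals 1 - q. The sketch below solves that equation by bisection using only the standard library. It is an illustration under stated assumptions, not the package's algorithm: the function names, the unit scale, truncation bounds given in the units of y, and the search bracket of five standard deviations around y are all assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def truncated_cdf(y, mu, lower, upper, sigma=1.0):
    """CDF at y of N(mu, sigma^2) truncated to [lower, upper]."""
    def cdf(v):
        return _STD_NORMAL.cdf((v - mu) / sigma)

    return (cdf(y) - cdf(lower)) / (cdf(upper) - cdf(lower))


def quantile_unbiased_estimate(y, q, lower, upper, sigma=1.0, tol=1e-10):
    """Find mu with truncated_cdf(y; mu) == 1 - q by bisection (hypothetical)."""
    # Assumption: the solution lies within five standard deviations of y.
    lo, hi = y - 5 * sigma, y + 5 * sigma
    target = 1 - q
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The truncated CDF at y decreases as mu increases.
        if truncated_cdf(y, mid, lower, upper, sigma) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Observed value 1, known to fall between 0 and 3, median (q = 0.5).
median_estimate = quantile_unbiased_estimate(1, 0.5, 0, 3)
print(median_estimate)
```

Under these assumptions the result should land close to the 0.7108 shown in the example above; small differences are expected because the package relies on its own numerical approximations.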

class multiple_inference.stats.truncnorm(truncation_set: Optional[List[Tuple[float, float]]] = None, loc: float = 0, scale: float = 1, n_samples: int = 10000, seed: int = 0)

Truncated normal distribution.

Inherits from scipy.stats.rv_continuous and handles standard public methods (pdf, cdf, etc.).

This uses the exponential tilting approximation method.

Parameters
  • truncation_set (List[Tuple[float, float]], optional) – List of truncation intervals, e.g., [(-1, 0), (1, 2)] truncates the distribution to [-1, 0] union [1, 2]. Defaults to None.

  • loc (float, optional) – Location. Defaults to 0.

  • scale (float, optional) – Scale parameter. Defaults to 1.

  • n_samples (int, optional) – Number of samples to draw for approximation. Defaults to 10000.

  • seed (int, optional) – Random seed. Defaults to 0.

loc

Location parameter.

Type

float

scale

Scale parameter.

Type

float

lower_bound

(# intervals,) array of lower bounds of the truncation intervals.

Type

np.array

upper_bound

(# intervals,) array of upper bounds of the truncation intervals.

Type

np.array

interval_masses

(# intervals,) array of the amount of mass in each truncation interval.

Type

np.array

n_samples

Number of samples to draw for approximation. Defaults to 10000.

Type

int

Examples

Let’s evaluate the CDF of a standard normal truncated to the interval (-1, 0) at -0.5.

>>> from multiple_inference.stats import truncnorm
>>> print(truncnorm([(-1, 0)]).cdf(-.5))
0.4390935748119969

Let’s evaluate the CDF of a standard normal truncated to the union of (-1, 0) and (1, 2) at -0.5.

>>> from multiple_inference.stats import truncnorm
>>> print(truncnorm([(-1, 0), (1, 2)]).cdf(-.5))
0.3140541146849627
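
As a cross-check on the two examples above, the CDF of a standard normal truncated to a union of disjoint intervals can also be written in closed form: the normal mass at or below x inside the intervals, divided by the total mass of the intervals. The sketch below uses that closed form with the standard library rather than the exponential tilting approximation mentioned earlier; the function name truncated_normal_cdf is hypothetical.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def truncated_normal_cdf(x, truncation_set):
    """CDF at x of a standard normal truncated to disjoint intervals."""
    # Mass the untruncated normal places on each interval.
    masses = [_PHI(b) - _PHI(a) for a, b in truncation_set]
    # Mass at or below x within each interval (x is clipped to the interval).
    below = [
        _PHI(min(max(x, a), b)) - _PHI(a)
        for a, b in truncation_set
    ]
    return sum(below) / sum(masses)


single = truncated_normal_cdf(-0.5, [(-1, 0)])
union = truncated_normal_cdf(-0.5, [(-1, 0), (1, 2)])
print(single, union)
```

Both values should agree with the outputs printed above to several decimal places. Adding the interval (1, 2) lowers the CDF at -0.5 because the extra mass sits entirely above that point.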

Note

The truncation set is defined over the domain of the standard normal. To convert the truncation set for a specific mean and standard deviation, use:

>>> truncation_set = [((myclip_a - my_mean) / my_std, (myclip_b - my_mean) / my_std)]
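
The same conversion extends to several intervals by standardizing each pair of bounds. A small sketch, with a hypothetical helper name and made-up numbers:

```python
def standardize_truncation_set(intervals, mean, std):
    """Map (lower, upper) bounds in data units to standard normal units."""
    return [((a - mean) / std, (b - mean) / std) for a, b in intervals]


# Bounds in data units for a variable with mean 3 and standard deviation 2.
truncation_set = standardize_truncation_set(
    [(1.0, 3.0), (5.0, 7.0)], mean=3.0, std=2.0
)
print(truncation_set)  # [(-1.0, 0.0), (1.0, 2.0)]
```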