multiple_inference.stats

Statistical distributions.

Classes

`joint_distribution`(marginal_distributions)	Joint distribution based on independent marginal distributions.
`mixture`(distributions[, weights])	Mixture distribution.
`quantile_unbiased`(y[, projection_interval, ...])	Conditional quantile-unbiased distribution.
`truncnorm`([truncation_set, loc, scale, ...])	Truncated normal distribution.

class multiple_inference.stats.joint_distribution(marginal_distributions: Sequence[rv_continuous])

Joint distribution based on independent marginal distributions.

Parameters:: marginal_distributions (Sequence[rv_continuous]) – Marginal distributions.

logpdf(x: ndarray) → ndarray

Log of the probability density function evaluated at x.

Parameters:: x (np.ndarray) – (n, # marginals) matrix of values at which to evaluate the density function.
Returns:: (n,) array of log density.
Return type:: np.ndarray

pdf(x: ndarray) → ndarray

Probability density function evaluated at x.

Parameters:: x (np.ndarray) – (n, # marginals) matrix of values at which to evaluate the density function.
Returns:: (n,) array of densities.
Return type:: np.ndarray

rvs(size: int = 1) → ndarray

Sample random values.

Parameters:: size (int, optional) – Number of samples to draw. Defaults to 1.
Returns:: (size, # marginals) matrix of samples.
Return type:: np.ndarray
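
Because the marginals are independent, the joint log density at a point is the sum of the marginal log densities, and the joint density is its exponential. The following is a minimal standard-library sketch of that relationship, not the package's implementation; `joint_logpdf` and `joint_pdf` are hypothetical helper names, and `statistics.NormalDist` stands in for the `rv_continuous` marginals.

```python
import math
from statistics import NormalDist


def joint_logpdf(marginals, x):
    """Log density of independent marginals at each row of x.

    marginals: sequence of objects exposing pdf(value); NormalDist is a stand-in.
    x: n rows, each holding one value per marginal.
    Returns a list of n log densities.
    """
    return [
        sum(math.log(m.pdf(value)) for m, value in zip(marginals, row))
        for row in x
    ]


def joint_pdf(marginals, x):
    """Density of independent marginals at each row of x."""
    return [math.exp(v) for v in joint_logpdf(marginals, x)]


# Two independent normal marginals, evaluated at two points (n = 2).
marginals = [NormalDist(0, 1), NormalDist(2, 0.5)]
x = [[0.0, 2.0], [1.0, 1.5]]
log_density = joint_logpdf(marginals, x)
density = joint_pdf(marginals, x)
print(log_density)
print(density)
```

The input has one row per evaluation point and one column per marginal, mirroring the (n, # marginals) shape described above, and the output has one entry per row.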

class multiple_inference.stats.mixture(distributions: list[rv_continuous], weights: Sequence[float] | None = None, **kwargs: Any)

Mixture distribution.

Parameters:

distributions (list[rv_continuous]) – List of n distributions to mix over.
weights (Numeric1DArray, optional) – (n,) array of mixture weights. Defaults to None.

distributions

Distributions to mix over.

Type:: list[rv_continuous]

weights

Mixture weights.

Type:: np.ndarray

mean()

Mean of the distribution.

Parameters:

arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
loc (array_like, optional) – location parameter (default=0)
scale (array_like, optional) – scale parameter (default=1)

Returns:

mean – the mean of the distribution

Return type:

float

std()

Standard deviation of the distribution.

Parameters:

arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
loc (array_like, optional) – location parameter (default=0)
scale (array_like, optional) – scale parameter (default=1)

Returns:

std – standard deviation of the distribution

Return type:

float

var()

Variance of the distribution.

Parameters:

arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
loc (array_like, optional) – location parameter (default=0)
scale (array_like, optional) – scale parameter (default=1)

Returns:

var – the variance of the distribution

Return type:

float
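
For reference, the moments of a mixture follow from the component moments: the mean is the weighted average of the component means, and the variance is the weighted average of the component second moments minus the squared mixture mean. The sketch below illustrates these identities with the standard library only. `mixture_moments` is a hypothetical helper, and both the equal-weight default when `weights` is None and the weight normalization are assumptions made for illustration, not documented behavior of `mixture`.

```python
import math
from statistics import NormalDist


def mixture_moments(components, weights=None):
    """Return (mean, variance, std) of a mixture of distributions.

    components: objects exposing .mean and .variance (NormalDist stand-ins).
    weights: mixture weights; equal weights are ASSUMED when None.
    """
    n = len(components)
    if weights is None:
        weights = [1.0 / n] * n  # assumption: equal weights by default
    total = sum(weights)
    weights = [w / total for w in weights]  # assumption: normalize to sum to 1

    mean = sum(w * c.mean for w, c in zip(weights, components))
    # E[X^2] of each component is variance + mean^2.
    second_moment = sum(
        w * (c.variance + c.mean ** 2) for w, c in zip(weights, components)
    )
    var = second_moment - mean ** 2
    return mean, var, math.sqrt(var)


mean, var, std = mixture_moments([NormalDist(0, 1), NormalDist(2, 1)])
print(mean, var, std)
```

With two unit-variance components centered at 0 and 2, the mixture variance exceeds either component variance because the spread between the component means contributes as well.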

class multiple_inference.stats.quantile_unbiased(y: float, projection_interval: float | Tuple[float, float] = (-inf, inf), bounds: Tuple[float, float] = (-inf, inf), dx: float | None = None, **truncnorm_kwargs: Any)

Conditional quantile-unbiased distribution.

Inherits from scipy.stats.rv_continuous and handles standard public methods (pdf, cdf, etc.).

Parameters:

y (float) – Value at which the truncated CDF is evaluated.
projection_interval (Union[float, Tuple[float, float]], optional) – Lower and upper bounds of the projection confidence interval. Defaults to (-np.inf, np.inf).
bounds (Tuple[float, float], optional) – Lower and upper bounds of the support of the distribution. Defaults to (-np.inf, np.inf).
dx (float, optional) – Used to numerically approximate the PDF. Defaults to None.
**truncnorm_kwargs (Any) – Keyword arguments for truncnorm.

y

Value at which the truncated CDF is evaluated.

Type:: float

bounds

Lower and upper bound of the support of the distribution.

Type:: Tuple[float, float]

dx

Used to numerically approximate the PDF.

Type:: float

truncnorm_kwargs

Keyword arguments for truncnorm.

Type:: dict

Examples

Compute a median-unbiased estimate of a normally distributed variable, given that its observed value is 1 and that it falls between 0 and 3.

>>> from multiple_inference.stats import quantile_unbiased
>>> dist = quantile_unbiased(1, truncation_set=[(0, 3)])
>>> dist.ppf(.5)
0.7108033900602351
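
One way to read this example: the median-unbiased estimate is the mean `mu` for which a unit-scale normal centered at `mu` and truncated to [0, 3] places half of its mass below the observed value 1. The sketch below solves for that `mu` by bisection using only the standard library. It is an illustration under stated assumptions, not the package's algorithm: the helper names are hypothetical, the scale is fixed at 1, and the convention that the q-quantile-unbiased estimate solves CDF(y; mu) = 1 - q is assumed (for q = 0.5 the convention does not matter).

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def normal_mass(lo, hi):
    """P(lo < Z <= hi) for standard normal Z, using symmetry in the upper tail."""
    if lo > 0:
        return std_normal_cdf(-lo) - std_normal_cdf(-hi)
    return std_normal_cdf(hi) - std_normal_cdf(lo)


def truncated_cdf(y, mu, lower, upper):
    """CDF at y of N(mu, 1) truncated to [lower, upper]."""
    return normal_mass(lower - mu, y - mu) / normal_mass(lower - mu, upper - mu)


def quantile_unbiased_estimate(y, lower, upper, q=0.5, n_iter=100):
    """Solve truncated_cdf(y; mu) = 1 - q for mu by bisection.

    The truncated CDF is decreasing in mu, so a CDF value above the
    target means mu must increase.
    """
    lo, hi = y - 10.0, y + 10.0  # assumed bracket, wide enough for this example
    target = 1.0 - q
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if truncated_cdf(y, mid, lower, upper) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


estimate = quantile_unbiased_estimate(1.0, 0.0, 3.0)
print(estimate)  # close to the 0.7108 shown in the example above
```

The estimate falls below the observed value of 1 because conditioning on the interval [0, 3], whose midpoint lies above 1, shifts the median of the observation upward relative to the untruncated mean.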

ppf(q: float | Sequence[float]) → ndarray

Percent point function.

Parameters:: q (float | Sequence[float]) – Quantile, or (n,) sequence of quantiles, at which to evaluate the PPF.
Returns:: (n,) array of evaluations.
Return type:: np.ndarray

class multiple_inference.stats.truncnorm(truncation_set: List[Tuple[float, float]] | None = None, loc: float = 0, scale: float = 1, n_samples: int = 10000, seed: int = 0)

Truncated normal distribution.

Inherits from scipy.stats.rv_continuous and handles standard public methods (pdf, cdf, etc.).

This uses the exponential tilting approximation method.

Parameters:

truncation_set (List[Tuple[float, float]], optional) – List of truncation intervals, e.g., [(-1, 0), (1, 2)] truncates the distribution to [-1, 0] union [1, 2]. Defaults to None.
loc (float, optional) – Location. Defaults to 0.
scale (float, optional) – Scale parameter. Defaults to 1.
n_samples (int, optional) – Number of samples to draw for approximation. Defaults to 10000.
seed (int, optional) – Random seed. Defaults to 0.

loc

Location parameter.

Type:: float

scale

Scale parameter.

Type:: float

lower_bound

(# intervals,) array of lower bounds of the truncation intervals.

Type:: np.ndarray

upper_bound

(# intervals,) array of upper bounds of the truncation intervals.

Type:: np.ndarray

interval_masses

(# intervals,) array of the amount of mass in each truncation interval.

Type:: np.ndarray

n_samples

Number of samples to draw for approximation. Defaults to 10000.

Type:: int

Examples

Let’s evaluate the CDF of a standard normal truncated to the interval (-1, 0) at -0.5.

>>> from multiple_inference.stats import truncnorm
>>> print(truncnorm([(-1, 0)]).cdf(-.5))
0.4390935748119969

Let’s evaluate the CDF of a standard normal truncated to the union of (-1, 0) and (1, 2) at -0.5.

>>> from multiple_inference.stats import truncnorm
>>> print(truncnorm([(-1, 0), (1, 2)]).cdf(-.5))
0.3140541146849627
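
Both values can be cross-checked in closed form: the CDF of a standard normal truncated to a union of disjoint intervals is the normal mass lying below x inside the intervals, divided by the total normal mass of the intervals. The sketch below uses only the standard library and is a direct calculation for comparison, not the exponential tilting approximation the class uses; the function names are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def truncated_std_normal_cdf(x, truncation_set):
    """CDF at x of a standard normal truncated to a union of disjoint intervals."""
    # Total normal mass of the truncation set.
    total = sum(std_normal_cdf(hi) - std_normal_cdf(lo) for lo, hi in truncation_set)
    # Mass of each interval that lies at or below x (x is clamped to the interval).
    below = sum(
        std_normal_cdf(min(max(x, lo), hi)) - std_normal_cdf(lo)
        for lo, hi in truncation_set
    )
    return below / total


single = truncated_std_normal_cdf(-0.5, [(-1, 0)])
union = truncated_std_normal_cdf(-0.5, [(-1, 0), (1, 2)])
print(single, union)  # approximately 0.4391 and 0.3141
```

Adding the interval (1, 2) leaves the mass below -0.5 unchanged but enlarges the normalizing constant, which is why the second value is smaller.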

Note

The truncation set is defined over the domain of the standard normal. To convert the truncation set for a specific mean and standard deviation, use:

>>> truncation_set = [((myclip_a - my_mean) / my_std, (myclip_b - my_mean) / my_std)]
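
The same conversion extends to several intervals by standardizing each bound in turn. A small sketch, with `standardize_truncation_set` as a hypothetical helper name that is not part of the package:

```python
def standardize_truncation_set(intervals, mean, std):
    """Map (lower, upper) bounds in data units onto the standard normal scale."""
    if std <= 0:
        raise ValueError("std must be positive")
    return [((lo - mean) / std, (hi - mean) / std) for lo, hi in intervals]


# Bounds 0 and 3 for a variable with mean 1 and standard deviation 2.
standardized = standardize_truncation_set([(0.0, 3.0)], mean=1.0, std=2.0)
print(standardized)  # [(-0.5, 1.0)]
```

Infinite bounds pass through unchanged, so one-sided truncation needs no special handling.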