multiple_inference.stats

Statistical distributions.

Classes

`joint_distribution`(marginal_distributions)	Joint distribution based on independent marginal distributions.
`mixture`(distributions[, weights])	Mixture distribution.
`nonparametric`(values[, kind])	Nonparametric distribution.
`quantile_unbiased`(y[, projection_interval, ...])	Conditional quantile-unbiased distribution.
`truncnorm`([truncation_set, loc, scale, ...])	Truncated normal distribution.

class multiple_inference.stats.joint_distribution(marginal_distributions: Sequence[rv_continuous])

Joint distribution based on independent marginal distributions.

Parameters: marginal_distributions (Sequence[rv_continuous]) – Marginal distributions.

logpdf(x: ndarray) → ndarray

Log of the probability density function evaluated at x.

Parameters: x (np.ndarray) – (n, # marginals) matrix of values at which to evaluate the density function.
Returns: (n,) array of log density.
Return type: np.ndarray

pdf(x: ndarray) → ndarray

Probability density function evaluated at x.

Parameters: x (np.ndarray) – (n, # marginals) matrix of values at which to evaluate the density function.
Returns: (n,) array of densities.
Return type: np.ndarray

rvs(size: int = 1) → ndarray

Sample random values.

Parameters: size (int, optional) – Number of samples to draw. Defaults to 1.
Returns: (size, # marginals) matrix of samples.
Return type: np.ndarray
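
For independent marginals the joint density factorizes, so the joint log density is the sum of the marginal log densities. The following is a minimal standard-library sketch of that identity, not the package's implementation. The helper `joint_logpdf` and the normal marginals built with `statistics.NormalDist` are illustrative assumptions.

```python
import math
from statistics import NormalDist


def joint_logpdf(marginals, rows):
    """Log density of independent marginals at each row of values.

    `marginals` is a sequence of objects with a `pdf` method and
    `rows` is a sequence of points, one value per marginal.
    Hypothetical helper, for illustration only.
    """
    return [
        sum(math.log(m.pdf(value)) for m, value in zip(marginals, row))
        for row in rows
    ]


# Two independent normal marginals (assumed for the example).
marginals = [NormalDist(0.0, 1.0), NormalDist(2.0, 0.5)]
rows = [(0.0, 2.0), (1.0, 1.5)]

# One log density per row, mirroring the (n,) output shape.
log_densities = joint_logpdf(marginals, rows)
```

Exponentiating each entry recovers the product of the marginal densities at that row.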

class multiple_inference.stats.mixture(distributions: list[rv_continuous], weights: Numeric1DArray = None, **kwargs: Any)

Mixture distribution.

Parameters

distributions (list[rv_continuous]) – List of n distributions to mix over.
weights (Numeric1DArray, optional) – (n,) array of mixture weights. Defaults to None.

distributions

Distributions to mix over.

Type: list[rv_continuous]

weights

Mixture weights.

Type: np.ndarray

mean()

Mean of the distribution.

Parameters

arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
loc (array_like, optional) – location parameter (default=0)
scale (array_like, optional) – scale parameter (default=1)

Returns

mean – the mean of the distribution

Return type

float

std()

Standard deviation of the distribution.

Parameters

arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
loc (array_like, optional) – location parameter (default=0)
scale (array_like, optional) – scale parameter (default=1)

Returns

std – standard deviation of the distribution

Return type

float

var()

Variance of the distribution.

Parameters

arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
arg3 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information)
loc (array_like, optional) – location parameter (default=0)
scale (array_like, optional) – scale parameter (default=1)

Returns

var – the variance of the distribution

Return type

float
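
The moments of a mixture follow from standard identities: the mean is the weighted average of the component means, and the variance is the weighted average of the component second moments minus the squared mixture mean. The sketch below illustrates those identities with the standard library only. It assumes the weights sum to 1, the helper names are hypothetical, and it is not taken from the package's source.

```python
import math


def mixture_mean(means, weights):
    """Weighted average of the component means."""
    return sum(w * m for w, m in zip(weights, means))


def mixture_var(means, variances, weights):
    """Weighted second moment minus the squared mixture mean."""
    mean = mixture_mean(means, weights)
    second_moment = sum(
        w * (v + m ** 2) for w, m, v in zip(weights, means, variances)
    )
    return second_moment - mean ** 2


# Equal-weight mixture of N(0, 1) and N(4, 1) (assumed for the example).
means = [0.0, 4.0]
variances = [1.0, 1.0]
weights = [0.5, 0.5]

mix_mean = mixture_mean(means, weights)
mix_var = mixture_var(means, variances, weights)
mix_std = math.sqrt(mix_var)
```

The mixture variance exceeds both component variances because the spread between the component means adds to it.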

class multiple_inference.stats.nonparametric(values, kind=None, *args, **kwargs)

Nonparametric distribution.

Parameters

values (tuple[np.array, np.array]) – (n,) array of x values, (n,) array of the probability mass function evaluated at x.
kind (str, optional) – Type of interpolation to use. Passed to scipy.interpolate.interp1d. Defaults to None.

xk

(n,) array of x values.

Type: np.ndarray

pk

(n,) array of the probability mass function evaluated at x.

Type: np.ndarray

Notes

This distribution interpolates between the points of the probability mass function to “continuize” the discrete distribution.
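
One way to picture the interpolation is a piecewise-linear density drawn through the (xk, pk) points. The sketch below uses only the standard library and assumes linear interpolation with the result rescaled to integrate to 1. The class itself delegates to scipy.interpolate.interp1d, so its behavior for other `kind` values, and its normalization, may differ. `interpolated_density` is a hypothetical helper.

```python
from bisect import bisect_right


def interpolated_density(xk, pk):
    """Build a piecewise-linear density through the points (xk, pk).

    Assumes xk is sorted and has at least two points. The density is
    rescaled by the trapezoid-rule area so that it integrates to 1.
    """
    area = sum(
        (xk[i + 1] - xk[i]) * (pk[i] + pk[i + 1]) / 2
        for i in range(len(xk) - 1)
    )

    def density(x):
        # Zero outside the range of observed x values.
        if x < xk[0] or x > xk[-1]:
            return 0.0
        # Index of the segment containing x (last segment at the right edge).
        i = min(bisect_right(xk, x) - 1, len(xk) - 2)
        t = (x - xk[i]) / (xk[i + 1] - xk[i])
        return ((1 - t) * pk[i] + t * pk[i + 1]) / area

    return density


xk = [0.0, 1.0, 2.0]
pk = [0.25, 0.5, 0.25]
density = interpolated_density(xk, pk)
midpoint_density = density(0.5)
```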

mean() → float

Compute the mean.

Returns: Mean.
Return type: float

moment(func: Callable[[ndarray], ndarray]) → float

Compute a moment.

Parameters: func (Callable[[np.ndarray], np.ndarray]) – Moment function that takes self.xk and returns an array of the same shape.
Returns: Moment.
Return type: float

std() → float

Compute the standard deviation.

Returns: Standard deviation.
Return type: float

var() → float

Compute the variance.

Returns: Variance.
Return type: float
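
The mean, variance, and standard deviation can all be read as special cases of a moment function applied to the x values. The sketch below assumes a moment is the pk-weighted sum of func(xk), normalized by the total mass. That is an assumption about the computation, not a statement of how the class computes it, and `moment` here is a hypothetical stand-alone helper using plain lists instead of arrays.

```python
import math


def moment(xk, pk, func):
    """Weighted average of func(xk) under the masses pk.

    `func` takes the full sequence of x values and returns a sequence
    of the same length, mirroring the documented signature.
    """
    values = func(xk)
    return sum(v * p for v, p in zip(values, pk)) / sum(pk)


xk = [0.0, 1.0, 2.0]
pk = [0.25, 0.5, 0.25]

# The identity function gives the mean.
mean = moment(xk, pk, lambda xs: list(xs))
# Squared deviations from the mean give the variance.
var = moment(xk, pk, lambda xs: [(x - mean) ** 2 for x in xs])
std = math.sqrt(var)
```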

class multiple_inference.stats.quantile_unbiased(y: float, projection_interval: Union[float, Tuple[float, float]] = (-inf, inf), bounds: Tuple[float, float] = (-inf, inf), dx: Optional[float] = None, **truncnorm_kwargs: Any)

Conditional quantile-unbiased distribution.

Inherits from scipy.stats.rv_continuous and handles standard public methods (pdf, cdf, etc.).

Parameters

y (float) – Value at which the truncated CDF is evaluated.
projection_interval (Union[float, Tuple[float, float]], optional) – Lower and upper bounds of the projection confidence interval. Defaults to (-np.inf, np.inf).
bounds (Tuple[float, float], optional) – Lower and upper bounds of the support of the distribution. Defaults to (-np.inf, np.inf).
dx (float, optional) – Used to numerically approximate the PDF. Defaults to None.
**truncnorm_kwargs (Any) – Keyword arguments for truncnorm.

y

Value at which the truncated CDF is evaluated.

Type: float

bounds

Lower and upper bounds of the support of the distribution.

Type: Tuple[float, float]

dx

Used to numerically approximate the PDF.

Type: float

truncnorm_kwargs

Keyword arguments for truncnorm.

Type: dict

Examples

Compute a median-unbiased estimate for a normally distributed variable, given that its observed value is 1 and that it falls between 0 and 3.

>>> from multiple_inference.stats import quantile_unbiased
>>> dist = quantile_unbiased(1, truncation_set=[(0, 3)])
>>> dist.ppf(.5)
0.7108033900602351
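
The idea behind the example can be sketched without the package: a quantile-unbiased estimate at level q is the mean mu at which the truncated normal CDF, evaluated at the observed value y, equals 1 - q. The code below solves that equation by bisection with the standard library. It assumes unit variance, a single truncation interval, and a bracket of y ± 5 that contains the root. The function names are hypothetical, and the package's own numerical method may differ.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_cdf(y, mu, lower, upper):
    """CDF at y of a N(mu, 1) variable truncated to [lower, upper]."""
    below_y = STD_NORMAL.cdf(y - mu) - STD_NORMAL.cdf(lower - mu)
    total = STD_NORMAL.cdf(upper - mu) - STD_NORMAL.cdf(lower - mu)
    return below_y / total


def quantile_unbiased_estimate(y, lower, upper, q=0.5, tol=1e-10):
    """Find mu such that truncated_cdf(y, mu, lower, upper) = 1 - q."""
    target = 1 - q
    lo, hi = y - 5.0, y + 5.0  # assumed bracket around the root
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The truncated CDF at y decreases as mu increases.
        if truncated_cdf(y, mid, lower, upper) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Observed value 1, truncated to [0, 3], median-unbiased (q = 0.5).
estimate = quantile_unbiased_estimate(1.0, 0.0, 3.0)
```

Under these assumptions the estimate comes out at roughly 0.71, in line with the value printed in the example above.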

ppf(q: Union[float, Sequence[float]]) → ndarray

Percent point function.

Parameters: q (np.ndarray) – (n,) array of quantiles at which to evaluate the PPF.
Returns: (n,) array of evaluations.
Return type: np.ndarray

class multiple_inference.stats.truncnorm(truncation_set: Optional[List[Tuple[float, float]]] = None, loc: float = 0, scale: float = 1, n_samples: int = 10000, seed: int = 0)

Truncated normal distribution.

Inherits from scipy.stats.rv_continuous and handles standard public methods (pdf, cdf, etc.).

This uses the exponential tilting approximation method.

Parameters

truncation_set (List[Tuple[float, float]], optional) – List of truncation intervals, e.g., [(-1, 0), (1, 2)] truncates the distribution to [-1, 0] union [1, 2]. Defaults to None.
loc (float, optional) – Location. Defaults to 0.
scale (float, optional) – Scale parameter. Defaults to 1.
n_samples (int, optional) – Number of samples to draw for approximation. Defaults to 10000.
seed (int, optional) – Random seed. Defaults to 0.

loc

Location parameter.

Type: float

scale

Scale parameter.

Type: float

lower_bound

(# intervals,) array of lower bounds of the truncation intervals.

Type: np.ndarray

upper_bound

(# intervals,) array of upper bounds of the truncation intervals.

Type: np.ndarray

interval_masses

(# intervals,) array of the amount of mass in each truncation interval.

Type: np.ndarray

n_samples

Number of samples to draw for approximation. Defaults to 10000.

Type: int

Examples

Let’s evaluate the CDF of a standard normal truncated to the interval (-1, 0) at -0.5.

from multiple_inference.stats import truncnorm
print(truncnorm([(-1, 0)]).cdf(-.5))

0.4390935748119969

Let’s evaluate the CDF of a standard normal truncated to the union of (-1, 0) and (1, 2) at -0.5.

from multiple_inference.stats import truncnorm
print(truncnorm([(-1, 0), (1, 2)]).cdf(-.5))

0.3140541146849627
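
For a standard normal, both values above can be cross-checked in closed form: the truncated CDF at x is the normal mass below x inside the truncation intervals, divided by the total mass of all intervals. The sketch below computes this with the standard library. It is an exact calculation for comparison, not the approximation method used by the class, and `truncated_std_normal_cdf` is a hypothetical helper.

```python
from statistics import NormalDist


def truncated_std_normal_cdf(x, truncation_set):
    """CDF at x of a standard normal truncated to a union of intervals."""
    phi = NormalDist().cdf
    # Mass of each truncation interval under the standard normal.
    interval_masses = [phi(b) - phi(a) for a, b in truncation_set]
    # Mass of each interval that lies at or below x (x clipped to [a, b]).
    masses_below = [
        phi(min(max(x, a), b)) - phi(a) for a, b in truncation_set
    ]
    return sum(masses_below) / sum(interval_masses)


cdf_single = truncated_std_normal_cdf(-0.5, [(-1, 0)])
cdf_union = truncated_std_normal_cdf(-0.5, [(-1, 0), (1, 2)])
```

Adding the interval (1, 2) leaves the mass below -0.5 unchanged but increases the total mass, which is why the second CDF value is smaller.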

Note

The truncation set is defined over the domain of the standard normal. To convert the truncation set for a specific mean and standard deviation, use:

>>> truncation_set = [((myclip_a - my_mean) / my_std, (myclip_b - my_mean) / my_std)]