lib.statistics

Imports: absolute, array, diag, dot, eye, float64, log, multiply, transpose, inv, qr, exp, pi, sqrt

bucket(values=None, lower=0.0, upper=200.0, inc=100, verbose=False)

Generate a discrete probability distribution for the given values.

Parameters:

values (list of float) - The list of values to convert.
lower (float) - The lower bound of the distribution.
upper (float) - The upper bound of the distribution.
inc (int) - The number of discrete increments for the distribution between the lower and upper bounds.
verbose (bool) - A flag which if True will enable printouts.

Returns: list of lists of float

The discrete probability distribution.
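As a rough illustration of the binning described above, the following is a minimal sketch and not the module's actual code. The function name bucket_sketch is hypothetical. The [bin lower edge, probability] layout of the returned pairs and the handling of out-of-range values are assumptions.

```python
def bucket_sketch(values, lower=0.0, upper=200.0, inc=100):
    """Hypothetical stand-in for bucket(): bin the values and normalise the counts."""
    width = (upper - lower) / inc
    # One [bin lower edge, probability] pair per increment (assumed layout).
    dist = [[lower + i * width, 0.0] for i in range(inc)]
    counted = 0
    for val in values:
        index = int((val - lower) / width)
        # Assumption: values outside [lower, upper) are simply ignored.
        if val >= lower and index < inc:
            dist[index][1] += 1.0
            counted += 1
    # Convert the counts into probabilities that sum to one.
    if counted:
        for pair in dist:
            pair[1] /= counted
    return dist

dist = bucket_sketch([10.0, 20.0, 30.0, 30.0], lower=0.0, upper=40.0, inc=4)
```

Here the four bins start at 0, 10, 20 and 30, and half of the values fall into the last bin.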

std(values=None, skip=None, dof=1)

Calculate the standard deviation of the given values, skipping values if asked.

Parameters:

values (list of float) - The list of values to calculate the standard deviation of.
skip (list of bool or None) - An optional list of booleans specifying which values to skip. Its length must match that of values. An element of True excludes the corresponding value from the calculation.
dof (int) - The degrees of freedom, whereby the sum of squared deviations is divided by (N - dof) before the square root is taken.

Returns: float

The standard deviation.
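The effect of the skip and dof arguments can be sketched as follows. This is an illustrative reimplementation under the parameter descriptions above, not the module's source. The name std_sketch is hypothetical.

```python
from math import sqrt

def std_sketch(values, skip=None, dof=1):
    """Hypothetical stand-in for std(): standard deviation with optional skipping."""
    # Keep only the values whose skip flag is not True.
    kept = [v for i, v in enumerate(values) if skip is None or not skip[i]]
    n = len(kept)
    mean = sum(kept) / n
    sum_sq = sum((v - mean) ** 2 for v in kept)
    # The sum of squared deviations is normalised by (N - dof).
    return sqrt(sum_sq / (n - dof))

# The outlier 100.0 is excluded via the skip list.
sd = std_sketch([1.0, 2.0, 3.0, 100.0], skip=[False, False, False, True])
print(sd)  # 1.0
```

With dof=1 this is the sample standard deviation, while dof=0 would give the population value.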

multifit_covar(J=None, epsrel=0.0, weights=None)

Compute the multifit covariance matrix.

The implementation is inspired by the GNU Scientific Library (GSL).

This function uses the Jacobian matrix J to compute the covariance matrix of the best-fit parameters, covar.

The parameter 'epsrel' is used to remove linearly dependent columns when J is rank deficient.

The weighting matrix 'W' is a square symmetric matrix. For independent measurements, this is a diagonal matrix. Larger values indicate greater significance. It is formed by multiplying an identity matrix by the supplied weights vector 'w':

   W = I.w

The weights should normally be supplied as a vector: 1 / errors^2.
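A minimal sketch of how such a weighting matrix could be built with the NumPy functions listed in the module imports. The error values are made up for illustration.

```python
from numpy import array, eye, multiply

# Hypothetical measurement errors (sigma_i), one per data point.
errors = array([0.1, 0.2, 0.5])

# Weights vector: 1 / errors^2.
weights = 1.0 / errors**2

# Broadcasting the weights over the identity matrix leaves them on the diagonal.
W = multiply(eye(len(weights)), weights)
```

The result is a 3x3 diagonal matrix, so smaller errors yield larger diagonal entries and hence greater significance.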

The covariance matrix is given by:

   covar = (J^T.W.J)^{-1} ,

and is computed by QR decomposition of J with column-pivoting. Any columns of R which satisfy:

   |R_{kk}| <= epsrel |R_{11}| ,

are considered linearly dependent and are excluded from the covariance matrix (the corresponding rows and columns of the covariance matrix are set to zero). If the minimisation uses the weighted least-squares function:

   f_i = (Y(x, t_i) - y_i) / sigma_i ,

then the covariance matrix above gives the statistical error on the best-fit parameters resulting from the Gaussian errors 'sigma_i' on the underlying data 'y_i'.

This can be verified from the relation 'd_f = J d_c' and the fact that the fluctuations in 'f' from the data 'y_i' are normalised by 'sigma_i' and so satisfy:

   <d_f d_f^T> = I .

For an unweighted least-squares function f_i = (Y(x, t_i) - y_i), the covariance matrix above should be multiplied by the variance of the residuals about the best-fit:

   sigma^2 = sum( (y_i - Y(x, t_i))^2 ) / (n-p) ,

where n is the number of data points and p the number of parameters, to give the variance-covariance matrix sigma^2 C, with C the covariance matrix above. This estimates the statistical error on the best-fit parameters from the scatter of the underlying data.
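The two cases above can be illustrated with a small straight-line example. This is a sketch only: covar_sketch is a hypothetical helper that inverts J^T.W.J directly, so it omits the column-pivoted QR decomposition and the epsrel test. The data values are made up.

```python
from numpy import array, diag, dot, eye, multiply, sqrt, transpose
from numpy.linalg import inv

def covar_sketch(J, weights):
    # Hypothetical helper: direct inverse of J^T.W.J, with no QR step and no epsrel check.
    W = multiply(eye(len(weights)), weights)
    return inv(dot(dot(transpose(J), W), J))

# Straight-line model Y = c0 + c1*t, so row i of the Jacobian is [1, t_i].
t = array([0.0, 1.0, 2.0, 3.0])
J = transpose(array([[1.0, 1.0, 1.0, 1.0], t]))

# Weighted case: covar directly gives the parameter covariance.
errors = array([0.5, 0.5, 0.5, 0.5])
covar = covar_sketch(J, 1.0 / errors**2)
param_errors = sqrt(diag(covar))

# Unweighted case: scale C = (J^T.J)^-1 by the residual variance sigma^2.
y = array([0.1, 0.9, 2.1, 2.9])
C = covar_sketch(J, array([1.0, 1.0, 1.0, 1.0]))
params = dot(C, dot(transpose(J), y))  # linear least-squares solution
residuals = y - dot(J, params)
n, p = J.shape
sigma2 = (residuals**2).sum() / (n - p)
var_covar = sigma2 * C
```

The square roots of the diagonal of covar (or of var_covar in the unweighted case) are the parameter error estimates.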

Module statistics

bucket(values=None, lower=0.0, upper=200.0, inc=100, verbose=False)

gaussian(x=None, mu=0.0, sigma=1.0)

geometric_mean(values=None)

geometric_std(values=None, mean=None)

std(values=None, skip=None, dof=1)

multifit_covar(J=None, epsrel=0.0, weights=None)
