Module statistics

Module for calculating simple statistics.

Functions
list of lists of float
bucket(values=None, lower=0.0, upper=200.0, inc=100, verbose=False)
Generate a discrete probability distribution for the given values.
float
gaussian(x=None, mu=0.0, sigma=1.0)
Calculate the probability density of a Gaussian distribution at a given x value.
float
geometric_mean(values=None)
Calculate the geometric mean for the given values.
float
geometric_std(values=None, mean=None)
Calculate the geometric standard deviation for the given values.
float
std(values=None, skip=None, dof=1)
Calculate the standard deviation of the given values, skipping values if asked.
square numpy array
multifit_covar(J=None, epsrel=0.0, weights=None)
Compute the covariance matrix of the best-fit parameters from the Jacobian (the multifit covariance).
Variables
  __package__ = 'lib'

Imports: absolute, array, diag, dot, eye, float64, log, multiply, transpose, inv, qr, exp, pi, sqrt


Function Details

bucket(values=None, lower=0.0, upper=200.0, inc=100, verbose=False)

Generate a discrete probability distribution for the given values.

Parameters:
  • values (list of float) - The list of values to convert.
  • lower (float) - The lower bound of the distribution.
  • upper (float) - The upper bound of the distribution.
  • inc (int) - The number of discrete increments for the distribution between the lower and upper bounds.
  • verbose (bool) - A flag which if True will enable printouts.
Returns: list of lists of float
The discrete probability distribution.
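
The binning described above can be sketched as follows. This is only an illustration, not the module's source: the name bucket_sketch is hypothetical, and the layout of the returned data (each inner list holding a bucket's lower edge and the fraction of values falling in it) is an assumption, as is the choice to ignore values outside the bounds.

```python
def bucket_sketch(values, lower=0.0, upper=200.0, inc=100):
    """Bin values into `inc` equal-width buckets between lower and upper.

    Assumption: each returned entry is [bucket lower edge, fraction of values].
    """
    width = (upper - lower) / float(inc)
    dist = [[lower + i * width, 0.0] for i in range(inc)]

    for val in values:
        # Values outside [lower, upper) are ignored in this sketch.
        if val < lower or val >= upper:
            continue
        # Guard against floating point rounding at the upper edge.
        index = min(int((val - lower) / width), inc - 1)
        dist[index][1] += 1.0

    # Convert the counts to fractions of the total number of supplied values.
    total = float(len(values))
    if total > 0:
        for entry in dist:
            entry[1] /= total
    return dist


print(bucket_sketch([0.5, 1.5, 1.7, 5.0], lower=0.0, upper=2.0, inc=2))
# prints [[0.0, 0.25], [1.0, 0.5]]
```

The value 5.0 lies outside the bounds, so in this sketch the fractions sum to 0.75 rather than 1.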

gaussian(x=None, mu=0.0, sigma=1.0)

Calculate the probability density of a Gaussian distribution at a given x value.

Parameters:
  • x (float) - The x value to calculate the probability for.
  • mu (float) - The mean of the distribution.
  • sigma (float) - The standard deviation of the distribution.
Returns: float
The probability corresponding to x.
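
For reference, the standard Gaussian density can be written as below. This is a sketch of the textbook formula using the exp, pi and sqrt names listed in the module imports; the function name gaussian_sketch is hypothetical, and the module's own source may differ in detail.

```python
from math import exp, pi, sqrt


def gaussian_sketch(x, mu=0.0, sigma=1.0):
    """Textbook normal density at x for mean mu and standard deviation sigma."""
    return exp(-(x - mu)**2 / (2.0 * sigma**2)) / (sigma * sqrt(2.0 * pi))


# The peak of the standard normal is 1/sqrt(2*pi), approximately 0.3989.
print(gaussian_sketch(0.0))
```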

geometric_mean(values=None)

Calculate the geometric mean for the given values.

Parameters:
  • values (list of float) - The list of values to calculate the geometric mean of.
Returns: float
The geometric mean.
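
A minimal sketch of the geometric mean, computed as the exponential of the mean of the logarithms. The name geometric_mean_sketch is hypothetical and the sketch assumes all values are strictly positive; it is not taken from the module's source.

```python
from math import exp, log


def geometric_mean_sketch(values):
    """Geometric mean as exp(mean(log(values))); values must be positive."""
    return exp(sum(log(v) for v in values) / len(values))


# The geometric mean of 1, 4 and 16 is the cube root of 64, close to 4.
print(geometric_mean_sketch([1.0, 4.0, 16.0]))
```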

geometric_std(values=None, mean=None)

Calculate the geometric standard deviation for the given values.

Parameters:
  • values (list of float) - The list of values to calculate the geometric standard deviation of.
  • mean (float) - The pre-calculated geometric mean. If not supplied, the value will be calculated.
Returns: float
The geometric standard deviation.
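
One common definition of the geometric standard deviation is sketched below. The name geometric_std_sketch is hypothetical, and the normalisation by n (rather than n - 1) is an assumption that may not match the module's source.

```python
from math import exp, log, sqrt


def geometric_std_sketch(values, mean=None):
    """Geometric standard deviation: exp(sqrt(mean(log(v / mean)^2)))."""
    n = len(values)
    # Fall back to calculating the geometric mean when it is not supplied.
    if mean is None:
        mean = exp(sum(log(v) for v in values) / n)
    # Assumption: population normalisation, dividing by n.
    log_sq = sum(log(v / mean)**2 for v in values)
    return exp(sqrt(log_sq / n))


print(geometric_std_sketch([1.0, 4.0, 16.0]))
```

The result is a dimensionless multiplicative factor, and it equals 1 when all values are identical.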

std(values=None, skip=None, dof=1)

Calculate the standard deviation of the given values, skipping values if asked.

Parameters:
  • values (list of float) - The list of values to calculate the standard deviation of.
  • skip (list of bool or None) - An optional list of booleans specifying if a value should be skipped. The length of this list must match the values. An element of True will cause the corresponding value to not be included in the calculation.
  • dof (int) - The degrees of freedom, whereby the sum of squared deviations is divided by (N - dof) before the square root is taken.
Returns: float
The standard deviation.
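
The skip and dof parameters can be illustrated with the sketch below. The name std_sketch is hypothetical, and returning 0.0 when too few values remain is an assumption made here for simplicity, not documented behaviour.

```python
from math import sqrt


def std_sketch(values, skip=None, dof=1):
    """Standard deviation of values, leaving out entries flagged True in skip."""
    kept = [v for i, v in enumerate(values) if skip is None or not skip[i]]
    n = len(kept)
    # Assumption: return 0.0 when the normalisation factor would not be positive.
    if n <= dof:
        return 0.0
    mean = sum(kept) / float(n)
    return sqrt(sum((v - mean)**2 for v in kept) / float(n - dof))


# The last value is flagged as skipped, leaving 1, 2 and 3.
print(std_sketch([1.0, 2.0, 3.0, 100.0], skip=[False, False, False, True]))
# prints 1.0
```

With dof=1 this is the usual sample standard deviation, and dof=0 gives the population form.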

multifit_covar(J=None, epsrel=0.0, weights=None)

Compute the covariance matrix of the best-fit parameters from the Jacobian (the multifit covariance).

This is inspired by the GNU Scientific Library (GSL).

This function uses the Jacobian matrix J to compute the covariance matrix of the best-fit parameters, covar.

The parameter 'epsrel' is used to remove linearly dependent columns when J is rank deficient.

The weighting matrix 'W' is a square symmetric matrix. For independent measurements, this is a diagonal matrix. Larger values indicate greater significance. It is formed by multiplying an identity matrix by the supplied weights vector:

   W = I. w

The weights should normally be supplied as a vector: 1 / errors^2.

The covariance matrix is given by:

   covar = (J^T.W.J)^{-1} ,

and is computed by QR decomposition of J with column-pivoting. Any columns of R which satisfy:

   |R_{kk}| <= epsrel |R_{11}| ,

are considered linearly-dependent and are excluded from the covariance matrix (the corresponding rows and columns of the covariance matrix are set to zero). If the minimisation uses the weighted least-squares function:

   f_i = (Y(x, t_i) - y_i) / sigma_i ,

then the covariance matrix above gives the statistical error on the best-fit parameters resulting from the Gaussian errors 'sigma_i' on the underlying data 'y_i'.

This can be verified from the relation 'd_f = J d_c' and the fact that the fluctuations in 'f' from the data 'y_i' are normalised by 'sigma_i' and so satisfy:

   <d_f d_f^T> = I .

For an unweighted least-squares function f_i = (Y(x, t_i) - y_i) the covariance matrix above should be multiplied by the variance of the residuals about the best-fit:

   sigma^2 = sum ( (y_i - Y(x, t_i))^2 / (n-p) ) ,

to give the variance-covariance matrix sigma^2 C. This estimates the statistical error on the best-fit parameters from the scatter of the underlying data.
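
The unweighted scaling can be sketched with NumPy as below. The name scaled_covar_sketch is hypothetical, and the sketch inverts J^T.J directly instead of using the pivoted QR decomposition described above, so it assumes J has full column rank.

```python
import numpy as np


def scaled_covar_sketch(J, residuals):
    """Return sigma^2 * (J^T.J)^{-1} for an unweighted least-squares fit."""
    J = np.asarray(J, dtype=float)
    r = np.asarray(residuals, dtype=float)
    n, p = J.shape
    # Variance of the residuals about the best fit, with n - p degrees of freedom.
    sigma2 = np.sum(r**2) / (n - p)
    # Assumption: J has full column rank, so a direct inverse is acceptable.
    C = np.linalg.inv(J.T @ J)
    return sigma2 * C


# Jacobian of a straight line a + b*t sampled at t = 0, 1, 2, 3.
J = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
print(scaled_covar_sketch(J, [0.1, -0.1, 0.1, -0.1]))
```

The square roots of the diagonal elements then estimate the parameter errors.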

Links

More information can be found here:

Parameters:
  • J (numpy array) - The Jacobian matrix.
  • epsrel (float) - Any columns of R which satisfy |R_{kk}| <= epsrel |R_{11}| are considered linearly-dependent and are excluded from the covariance matrix, where the corresponding rows and columns of the covariance matrix are set to zero.
  • weights (numpy array) - The weights to scale with. Normally supplied as one over the squared standard deviation of the measured intensity values per time point: weights = 1 / sd_i^2.
Returns: square numpy array
The covariance matrix.
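
The weighted formula covar = (J^T.W.J)^{-1} can be illustrated with the sketch below. The name covar_sketch is hypothetical. It uses a direct matrix inverse rather than the column-pivoted QR decomposition, so it ignores epsrel and assumes that J is not rank deficient.

```python
import numpy as np


def covar_sketch(J, weights=None):
    """Return (J^T.W.J)^{-1} with W built from a weights vector."""
    J = np.asarray(J, dtype=float)
    n = J.shape[0]
    if weights is None:
        weights = np.ones(n)
    # W = I . w, a diagonal matrix holding the weights.
    W = np.eye(n) * np.asarray(weights, dtype=float)
    # Assumption: J has full column rank, so no columns need to be excluded.
    return np.linalg.inv(J.T @ W @ J)


# Jacobian of a straight line a + b*t sampled at t = 0, 1, 2, with errors of 0.5.
J = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
errors = np.array([0.5, 0.5, 0.5])
print(covar_sketch(J, weights=1.0 / errors**2))
```

A production implementation would follow the QR route so that linearly dependent columns can be detected and zeroed.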