
# Module statistics

Module for calculating simple statistics.

## Functions

| Return type | Function | Description |
|---|---|---|
| list of lists of float | `bucket(values=None, lower=0.0, upper=200.0, inc=100, verbose=False)` | Generate a discrete probability distribution for the given values. |
| float | `gaussian(x=None, mu=0.0, sigma=1.0)` | Calculate the probability for a Gaussian probability distribution for a given x value. |
| float | `geometric_mean(values=None)` | Calculate the geometric mean for the given values. |
| float | `geometric_std(values=None, mean=None)` | Calculate the geometric standard deviation for the given values. |
| float | `std(values=None, skip=None, dof=1)` | Calculate the standard deviation of the given values, skipping values if asked. |
| square numpy array | `multifit_covar(J=None, epsrel=0.0, weights=None)` | This is the implementation of the multifit covariance. |

## Variables

`__package__ = 'lib'`

Imports: absolute, array, diag, dot, eye, float64, log, multiply, transpose, inv, qr, exp, pi, sqrt

## Function Details

### bucket(values=None, lower=0.0, upper=200.0, inc=100, verbose=False)

Generate a discrete probability distribution for the given values.

Parameters:
• `values` (list of float) - The list of values to convert.
• `lower` (float) - The lower bound of the distribution.
• `upper` (float) - The upper bound of the distribution.
• `inc` (int) - The number of discrete increments for the distribution between the lower and upper bounds.
• `verbose` (bool) - A flag which if True will enable printouts.
Returns: list of lists of float
The discrete probability distribution.
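
As a rough illustration of what such a discretisation can look like, the sketch below bins the values into `inc` equal-width increments between `lower` and `upper` and normalises the counts. The name `bucket_sketch`, the `[bin lower edge, probability]` row layout, and the decision to ignore out-of-range values are all assumptions made for this example, not the documented behaviour of `bucket()`.

```python
def bucket_sketch(values, lower=0.0, upper=200.0, inc=100):
    """Bin values into inc equal-width increments (hypothetical helper)."""
    bin_width = (upper - lower) / inc
    counts = [0] * inc

    for value in values:
        # Assumption: values outside [lower, upper) are simply ignored.
        if value < lower or value >= upper:
            continue
        # Clamp the index to guard against floating-point round-off at the top edge.
        index = min(int((value - lower) / bin_width), inc - 1)
        counts[index] += 1

    total = sum(counts)

    # Assumed layout: one [bin lower edge, probability] pair per increment.
    dist = []
    for i, count in enumerate(counts):
        prob = count / float(total) if total else 0.0
        dist.append([lower + i * bin_width, prob])
    return dist


dist = bucket_sketch([0.5, 1.5, 1.7, 3.2], lower=0.0, upper=4.0, inc=4)
for edge, prob in dist:
    print(edge, prob)
```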

### gaussian(x=None, mu=0.0, sigma=1.0)

Calculate the probability for a Gaussian probability distribution for a given x value.

Parameters:
• `x` (float) - The x value to calculate the probability for.
• `mu` (float) - The mean of the distribution.
• `sigma` (float) - The standard deviation of the distribution.
Returns: float
The probability corresponding to x.
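
For reference, the standard Gaussian probability density can be written in a few lines. This is a minimal sketch of the textbook formula, assuming `gaussian()` evaluates the usual normalised density; the name `gaussian_sketch` is hypothetical.

```python
from math import exp, pi, sqrt


def gaussian_sketch(x, mu=0.0, sigma=1.0):
    """Normal probability density at x (illustrative sketch)."""
    # Normalisation constant 1 / (sigma * sqrt(2*pi)).
    norm = 1.0 / (sigma * sqrt(2.0 * pi))
    # Exponential term exp(-(x - mu)^2 / (2*sigma^2)).
    return norm * exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))


# The density of the standard normal at its mean is 1/sqrt(2*pi), about 0.3989.
print(gaussian_sketch(0.0))
```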

### geometric_mean(values=None)

Calculate the geometric mean for the given values.

Parameters:
• `values` (list of float) - The list of values to calculate the geometric mean of.
Returns: float
The geometric mean.
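
One common way to compute a geometric mean is to average the logarithms and exponentiate the result. The sketch below assumes strictly positive input values; `geometric_mean_sketch` is a hypothetical name and may not match how the module itself computes the value.

```python
from math import exp, log


def geometric_mean_sketch(values):
    """Geometric mean via the mean of the logs (illustrative sketch)."""
    n = len(values)
    # Sum of the natural logs; requires all values to be > 0.
    log_sum = sum(log(value) for value in values)
    return exp(log_sum / n)


# The geometric mean of 1, 4 and 16 is the cube root of 64, close to 4.
print(geometric_mean_sketch([1.0, 4.0, 16.0]))
```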

### geometric_std(values=None, mean=None)

Calculate the geometric standard deviation for the given values.

Parameters:
• `values` (list of float) - The list of values to calculate the geometric standard deviation of.
• `mean` (float) - The pre-calculated geometric mean. If not supplied, the value will be calculated.
Returns: float
The geometric standard deviation.
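
A geometric standard deviation is usually defined as the exponential of the root-mean-square of the log ratios to the geometric mean. The sketch below follows that definition and divides by n; whether the module divides by n or n - 1 is not stated here, so treat the divisor as an assumption. `geometric_std_sketch` is a hypothetical name.

```python
from math import exp, log, sqrt


def geometric_std_sketch(values, mean=None):
    """Geometric standard deviation (illustrative sketch, divisor n assumed)."""
    n = len(values)
    # Calculate the geometric mean if it was not supplied.
    if mean is None:
        mean = exp(sum(log(value) for value in values) / n)
    # Sum of the squared log ratios to the geometric mean.
    sq_sum = sum(log(value / mean) ** 2 for value in values)
    # Exponentiate the root-mean-square log ratio.
    return exp(sqrt(sq_sum / n))


print(geometric_std_sketch([1.0, 4.0, 16.0]))
```

A data set with no spread gives a value of 1.0, since the result is a multiplicative factor rather than an additive one.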

### std(values=None, skip=None, dof=1)

Calculate the standard deviation of the given values, skipping values if asked.

Parameters:
• `values` (list of float) - The list of values to calculate the standard deviation of.
• `skip` (list of bool or None) - An optional list of booleans specifying if a value should be skipped. The length of this list must match that of the values. An element of True will cause the corresponding value to be excluded from the calculation.
• `dof` (int) - The degrees of freedom, whereby the sum of squared deviations is divided by (N - dof) before the square root is taken.
Returns: float
The standard deviation.
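
The parameter descriptions above can be sketched as follows. This is a hedged illustration of the documented semantics (True in `skip` excludes a value, and the divisor is N - dof), not the module's code; `std_sketch` is a hypothetical name, and returning 0.0 when too few values remain is an assumption.

```python
from math import sqrt


def std_sketch(values, skip=None, dof=1):
    """Standard deviation with optional skipping (illustrative sketch)."""
    if skip is None:
        kept = list(values)
    else:
        if len(skip) != len(values):
            raise ValueError("skip must have the same length as values")
        # An element of True excludes the corresponding value.
        kept = [value for value, flag in zip(values, skip) if not flag]

    n = len(kept)
    # Assumption: return 0.0 when there are too few points for the given dof.
    if n - dof <= 0:
        return 0.0

    mean = sum(kept) / float(n)
    sq_sum = sum((value - mean) ** 2 for value in kept)
    return sqrt(sq_sum / (n - dof))


# The outlier 100.0 is skipped, leaving 1, 2 and 3.
print(std_sketch([1.0, 2.0, 3.0, 100.0], skip=[False, False, False, True]))
```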

### multifit_covar(J=None, epsrel=0.0, weights=None)

This is the implementation of the multifit covariance.

This is inspired by the GNU Scientific Library (GSL).

This function uses the Jacobian matrix J to compute the covariance matrix of the best-fit parameters, covar.

The parameter 'epsrel' is used to remove linearly dependent columns when J is rank deficient.

The weighting matrix 'W' is a square symmetric matrix. For independent measurements, this is a diagonal matrix. Larger values indicate greater significance. It is formed by multiplying an identity matrix with the supplied weights vector:

```
   W = I. w
```

The weights should normally be supplied as a vector: 1 / errors^2.
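
As a small illustration of this construction, the snippet below builds W from a made-up vector of measurement errors using NumPy. The error values are hypothetical and serve only to show the shapes involved.

```python
import numpy as np

# Hypothetical measurement errors, one per data point.
errors = np.array([0.1, 0.2, 0.5])

# Weights supplied as a vector: 1 / errors^2.
weights = 1.0 / errors**2

# W = I . w: broadcasting scales column j of the identity matrix by weights[j],
# which leaves the weights on the diagonal and zeros elsewhere.
W = np.eye(len(weights)) * weights

print(W)
```

For a one-dimensional weights vector the result is the same as `np.diag(weights)`.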

The covariance matrix is given by:

```
   covar = (J^T.W.J)^{-1} ,
```

and is computed by QR decomposition of J with column-pivoting. Any columns of R which satisfy:

```
   |R_{kk}| <= epsrel |R_{11}| ,
```

are considered linearly-dependent and are excluded from the covariance matrix (the corresponding rows and columns of the covariance matrix are set to zero). If the minimisation uses the weighted least-squares function:

```
   f_i = (Y(x, t_i) - y_i) / sigma_i ,
```

then the covariance matrix above gives the statistical error on the best-fit parameters resulting from the Gaussian errors 'sigma_i' on the underlying data 'y_i'.

This can be verified from the relation 'd_f = J d_c' and the fact that the fluctuations in 'f' from the data 'y_i' are normalised by 'sigma_i' and so satisfy:

```
   <d_f d_f^T> = I .
```

For an unweighted least-squares function f_i = (Y(x, t_i) - y_i) the covariance matrix above should be multiplied by the variance of the residuals about the best-fit:

```
   sigma^2 = sum ( (y_i - Y(x, t_i))^2 / (n-p) ) ,
```

to give the variance-covariance matrix sigma^2 C. This estimates the statistical error on the best-fit parameters from the scatter of the underlying data.
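
The scaling for the unweighted case can be sketched as below. The helper name `residual_variance` and the sample numbers are hypothetical; `n` is the number of data points and `p` the number of fitted parameters, as in the formula above.

```python
import numpy as np


def residual_variance(y, y_fit, p):
    """Variance of the residuals about the best fit (hypothetical helper)."""
    y = np.asarray(y, dtype=np.float64)
    y_fit = np.asarray(y_fit, dtype=np.float64)
    n = y.size
    # sigma^2 = sum((y_i - Y(x, t_i))^2) / (n - p)
    return np.sum((y - y_fit) ** 2) / (n - p)


# Made-up measured values, back-calculated values, and a two-parameter model.
y = [1.0, 2.1, 2.9, 4.2]
y_fit = [1.0, 2.0, 3.0, 4.0]
sigma2 = residual_variance(y, y_fit, p=2)

# Placeholder covariance matrix C from an unweighted fit (illustrative only).
C = np.array([[0.7, -0.3], [-0.3, 0.2]])
var_covar = sigma2 * C
print(sigma2)
print(var_covar)
```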

Parameters:
• `J` (numpy array) - The Jacobian matrix.
• `epsrel` (float) - Any columns of R which satisfy |R_{kk}| <= epsrel |R_{11}| are considered linearly-dependent and are excluded from the covariance matrix, where the corresponding rows and columns of the covariance matrix are set to zero.
• `weights` (numpy array) - The weights to scale with. These are normally supplied as the inverse square of the standard deviation of the measured intensity values per time point: weights = 1 / sd_i^2.
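
To make the central formula concrete, here is a minimal sketch that evaluates covar = (J^T.W.J)^{-1} directly with NumPy. It assumes J has shape (n, p) with full column rank and therefore omits the `epsrel` handling of linearly dependent columns; `multifit_covar_sketch` and the example Jacobian are hypothetical, not the module's implementation.

```python
import numpy as np


def multifit_covar_sketch(J, weights):
    """Covariance (J^T.W.J)^{-1} for a full-rank Jacobian (illustrative sketch)."""
    J = np.asarray(J, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)

    # Weighting matrix W = I . w (diagonal, one weight per measurement).
    W = np.eye(weights.shape[0]) * weights

    # Build J^T.W.J step by step.
    Jt_W_J = np.dot(np.dot(J.T, W), J)

    # Assumption: J has full column rank, so a plain inverse is used here
    # instead of the column-dropping QR scheme described above.
    return np.linalg.inv(Jt_W_J)


# Hypothetical Jacobian of a straight-line model Y = a + b*t at t = 0, 1, 2.
J = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
errors = np.array([1.0, 1.0, 1.0])
covar = multifit_covar_sketch(J, 1.0 / errors**2)

# The square roots of the diagonal give the parameter standard errors.
print(covar)
print(np.sqrt(np.diag(covar)))
```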