Re: The estimation for R2eff errors via Co-variance matrix. -- August 29, 2014

Hi Troels,

I've had a bit more progress with the simple test data in
test_suite/shared_data/curve_fitting/numeric_topology.  I have created
two scripts for error estimation.  The first uses bootstrapping to
determine that the errors are:

sigma_R:      0.02189348811434621567
sigma_I0:     9.92556021619115114163

This uses 200,000 simulations and, as this is an exact solution, this
is the same as Monte Carlo simulations.  The second script uses the
covar = inv(J^T.W.J) notation where W is the matrix of inverse
intensity errors.  The errors I find with this script are:

sigma_R:      0.02192511647039939796
sigma_I0:     9.91022724387324061013

These are both 100% independent of relax - so are great for debugging.
It is clear that the two techniques are perfectly compatible.
Therefore the factor of 1.9555 problem is a bug from somewhere else!
These tests are isolating the code paths in
specific_analysis.relax_disp.estimate_r2eff where this could be.

Cheers,

Edward







On 29 August 2014 19:01, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi Troels,

I'm still trying to work out what is happening, as R2eff errors being
overestimated by 1.9555 is not good enough for any subsequent analysis
(http://thread.gmane.org/gmane.science.nmr.relax.devel/6929/focus=6935).
It's as bad as doubling your values - the accuracy of the errors and
values are of equal importance.  I've numerically integrated a simple
test system using the numdifftools Python package to obtain the
Jacobian (and pseudo-Jacobian of the chi-squared function).  And I
then created unit tests to check the target_function.relax_fit C
module jacobian() and jacobian_chi2() functions.  This shows that
these functions are 100% bug-free and operate perfectly.  I'm now
searching for an external tool for numerically approximating the
covariance matrix.

To this end, could you maybe shift the
specific_analyses.relax_disp.estimate_r2eff.multifit_covar() function
into the lib.statistics package?  This is a very useful, stand-alone
function which is extremely well documented and uses a very robust
algorithm.  It would be beneficial for future uses including using it
in other analysis types.  Using this to create an analysis independent
error_analysis.covariance_matrix user function is rather trivial and
would be of great use for all.  By debugging, I can see that
multifit_covar() returns exactly the same result as inv(J^T.W.J).  So
it should be correct.  However I really would like to have this basic
synthetic data numerical approximation to compare it to via unit
testing.

As for the error scaling factors of [1.9555, 0.9804] for the
parameters [R2eff, I0], that is still a total mystery.  Maybe it is
better for now to give up on the idea of using the covariance matrix
as a quick error estimate :(

Regards,

Edward




On 29 August 2014 13:11, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> 
wrote:

Hi Edward.

I thing the estimation for of r2eff errors via Co-variance matrix
should be considered complete.

The facts:

We get the same co-variance as reported by scipy.minimise.leastsq.

Using the direct exponential Jacobian, seems to give double error on
R2eff, but after fitting,
the extracted parameters of 'r1', 'r2', 'pA', 'dw', 'kex' are very similar.

The chi2 values seems 4x as small, but that agrees with the R2eff
errors being 2x as big.

This should complete the analysis.

I have not found any "true" reference to which Jacobian to use.
I think it is my imagination, to use the chi2 function Jacobian.

It seems from various tutorials also refers to the direct Jacobian:
http://www.orbitals.com/self/least/least.htm

This is an experimental feature, and that is pointed out in the user 
function.

The last point here is, that it seems that just 50 Monte-Carlo
simulations is enough for the
estimation of R2eff errors.

After your implementation in C-code of the Jacobian, and the Hessian,
which give the possibility to use "Newton" as minimisation technique,
Monte-Carlo simulations are now so super-fast, that if one is in a
HURRY, at least 50 MC simulations will give a better, and more safe,
result than estimating from the Co-Variance matrix.

But now the option is there, giving freedom to the user to try
different possibilities.

I think the code should reside where it is, since it is still so 
experimental.
When the usability has been verified, it could be split up.
If the usability is low, then one can quickly delete it.

If minfx was extended to use constraints in BFGS, it should be quite
easy to make the Jacobian of the different relaxation dispersion
models, and implement those.

That will speed up the analysis, and also make it possible to extract
the estimated errors from the Jacobian.

But this is distant future.

Best
Troels

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
relax-devel@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Re: The estimation for R2eff errors via Co-variance matrix.

Header

Content

Related Messages