Re: Comparison of Monte Carlo simulations vs. covariance matrix.



Posted by Edward d'Auvergne on August 30, 2014 - 09:54:
Hi,

I don't have much time to reply now, but the key is to use simple
synthetic noise-free data.  Try converting the 5 intensities in
test_suite/shared_data/curve_fitting/numeric_topology/ into 5 Sparky
peak lists with a single spin.  Then test the Monte Carlo simulations
and covariance matrix user functions in relax.  These two relax
techniques should then match the numeric results from the super-basic
scripts in that directory!  This would then be converted into two
system tests.
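
A minimal sketch of this kind of check, written in pure Python rather than with the relax user functions.  Everything here is an assumption for illustration: the R2eff, I0, error and time values are hypothetical (they are not the data in numeric_topology), and a plain Gauss-Newton loop stands in for the relax optimisation code.  It fits noise-free synthetic data, then compares the covariance matrix errors with the Monte Carlo errors.

```python
import math
import random
import statistics

# Hypothetical values for illustration only (not the numeric_topology data).
R2EFF, I0, SIGMA = 0.5, 1000.0, 10.0
TIMES = [0.0, 1.0, 2.0, 3.0, 4.0]


def model(t, r2eff, i0):
    """Two-parameter exponential I(t) = I0 * exp(-R2eff * t)."""
    return i0 * math.exp(-r2eff * t)


def normal_matrix(times, r2eff, i0, sigma):
    """Return the elements (a, b, d) of the symmetric 2x2 matrix J^T.W.J."""
    a = b = d = 0.0
    w = 1.0 / sigma**2
    for t in times:
        e = math.exp(-r2eff * t)
        j_r, j_i = -i0 * t * e, e    # dI/dR2eff and dI/dI0.
        a += w * j_r * j_r
        b += w * j_r * j_i
        d += w * j_i * j_i
    return a, b, d


def fit(times, values, sigma, start, n_iter=20):
    """Weighted least squares fit using plain Gauss-Newton iterations."""
    r2eff, i0 = start
    w = 1.0 / sigma**2
    for _ in range(n_iter):
        a, b, d = normal_matrix(times, r2eff, i0, sigma)
        g_r = g_i = 0.0
        for t, y in zip(times, values):
            e = math.exp(-r2eff * t)
            res = y - i0 * e
            g_r += w * (-i0 * t * e) * res
            g_i += w * e * res
        det = a * d - b * b
        # Solve (J^T.W.J) delta = J^T.W.res for the 2x2 case.
        r2eff += (d * g_r - b * g_i) / det
        i0 += (a * g_i - b * g_r) / det
    return r2eff, i0


def covariance_errors(times, r2eff, i0, sigma):
    """Parameter errors from the diagonal of (J^T.W.J)^-1."""
    a, b, d = normal_matrix(times, r2eff, i0, sigma)
    det = a * d - b * b
    return math.sqrt(d / det), math.sqrt(a / det)


def monte_carlo_errors(times, r2eff, i0, sigma, n_sim=2000, seed=0):
    """Parameter errors as the standard deviation over simulated fits."""
    rng = random.Random(seed)
    clean = [model(t, r2eff, i0) for t in times]
    sims = []
    for _ in range(n_sim):
        noisy = [y + rng.gauss(0.0, sigma) for y in clean]
        sims.append(fit(times, noisy, sigma, (r2eff, i0)))
    return (statistics.stdev([s[0] for s in sims]),
            statistics.stdev([s[1] for s in sims]))


# Noise-free data: the fit must return the known answer.
clean = [model(t, R2EFF, I0) for t in TIMES]
fitted = fit(TIMES, clean, SIGMA, (R2EFF, I0))
cov_errors = covariance_errors(TIMES, fitted[0], fitted[1], SIGMA)
mc_errors = monte_carlo_errors(TIMES, fitted[0], fitted[1], SIGMA)
print("R2eff error, covariance vs. Monte Carlo:", cov_errors[0], mc_errors[0])
print("I0 error, covariance vs. Monte Carlo:", cov_errors[1], mc_errors[1])
```

With data this simple the two estimates should agree to within the sampling noise of the simulations, so a large systematic ratio between them would point to a bug rather than to the data.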

This was my plan, to complete in my spare time.  If you want to go
quickly, then feel free to follow these steps yourself rather than
waiting for me to do it.  I actually suggested this synthetic data
testing earlier to you
(http://thread.gmane.org/gmane.science.nmr.relax.devel/6807/focus=6840).
Synthetic noise-free data is an essential tool for implementing and
debugging any new analysis type, algorithm, protocol, etc.  The key is
that you know the answer you are searching for!  And synthetic data is
simple.  Nothing should ever be implemented and debugged using real
data, as a good looking result might be the consequence of a nasty
bug.

Regards,

Edward






On 30 August 2014 02:49, Troels Emtekær Linnet <tlinnet@xxxxxxxxx> wrote:
Hm.

The last idea I have is the division by the number of degrees of freedom.

So either 5-2, or 4-2.

That should be verified by a script with many different time points.

But then the errors for the intensities get very different.

Hm.
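
The degrees-of-freedom idea can be sketched as follows.  This is only an illustration of the common convention of scaling covariance matrix errors by the square root of the reduced chi2, chi2 / (N - p).  Whether this scaling explains the discrepancy is not established in this thread, and the function names are hypothetical.

```python
import math


def degrees_of_freedom(n_points, n_params):
    """Number of data points minus the number of fitted parameters."""
    return n_points - n_params


def scale_errors(errors, chi2, n_points, n_params):
    """Scale parameter errors by sqrt(chi2 / (N - p)), the reduced chi2 factor."""
    factor = math.sqrt(chi2 / degrees_of_freedom(n_points, n_params))
    return [error * factor for error in errors]


# The two candidates mentioned above, for the 2-parameter exponential.
print(degrees_of_freedom(5, 2))
print(degrees_of_freedom(4, 2))
```

A script with many different numbers of time points would show whether the error ratio actually follows this N - p dependence.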

On 30 Aug 2014 01:45, "Troels Emtekær Linnet" <tlinnet@xxxxxxxxxxxxx> wrote:

The sentence:

"then the covariance matrix above gives the statistical error on the
best-fit parameters resulting from the Gaussian errors 'sigma_i' on
the underlying data 'y_i'."

And here I note the wording:
"statistical error"
"Gaussian errors"

Best
Troels


2014-08-29 21:21 GMT+02:00 Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx>:
Hi Edward.

I also think it is some math somewhere.

I have a feeling that it is the creation of the Monte Carlo data with 2
sigma?  And then some log/exp calculation of R2eff.

If the errors are overestimated by a factor of 2, the chi2 values are
4 times smaller, and the space should be the same?
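
A small numerical check of this reasoning, using hypothetical noise-free values (R2eff = 0.5, I0 = 1000, error = 10) that are assumptions and not taken from the real data.  It evaluates chi2 over a grid of R2eff values with the original and with the doubled errors.

```python
import math


def chi2(values, back_calc, errors):
    """Standard chi2: the sum of squared, error-weighted residuals."""
    return sum(((y - f) / s) ** 2 for y, f, s in zip(values, back_calc, errors))


# Hypothetical noise-free data for illustration only.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [1000.0 * math.exp(-0.5 * t) for t in times]
errors = [10.0] * len(times)
doubled = [2.0 * s for s in errors]


def chi2_at(r2eff, errs):
    """The chi2 value at a given R2eff, with I0 held fixed at 1000."""
    back_calc = [1000.0 * math.exp(-r2eff * t) for t in times]
    return chi2(values, back_calc, errs)


# Grid search over R2eff from 0.30 to 0.70.
grid = [0.3 + 0.01 * k for k in range(41)]
best = min(grid, key=lambda r: chi2_at(r, errors))
best_doubled = min(grid, key=lambda r: chi2_at(r, doubled))
ratio = chi2_at(0.6, errors) / chi2_at(0.6, doubled)
print(best, best_doubled, ratio)
```

The position of the minimum does not move, but every chi2 value, and therefore the curvature around the minimum, is divided by 4.  Anything derived from the curvature or from the absolute chi2 value is affected.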

best
Troels

2014-08-29 17:06 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
I've just added the 2D Grace plots for this to the repository (r25444,
http://article.gmane.org/gmane.science.nmr.relax.scm/23194).  They are
also attached to the task for easier access
(https://gna.org/task/index.php?7822#comment107).  From these plots I
see that the I0 error appears to be reasonable, but that the R2eff
errors are overestimated by 1.9555.

The plots are very, very different.  It is clear that
chi2_jacobian=True just gives rubbish.  It is also clear that there is
a perfect correlation in R2eff when chi2_jacobian=False.  The plot
shows absolutely no scattering, therefore this indicates a crystal
clear mathematical error somewhere.  I wonder where that could be.  It
may not be a factor of 2, as the correlation is 1.9555.  So it might
be a bug that is more complicated.  In any case, performing a
dispersion analysis with errors overestimated by ~2 is not possible.
This will significantly change the curvature of the optimisation space
and will also have a huge effect on statistical comparisons between
models.
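
How the 1.9555 value was obtained is not stated here.  One plausible way to quantify such a plot, sketched below as an assumption, is the least-squares slope of a line through the origin together with the largest deviation from that line.  The error values are made up for illustration and are not taken from the Grace plots.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = k * x, with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def max_scatter(x, y, slope):
    """Largest absolute deviation of the points from the line y = slope * x."""
    return max(abs(b - slope * a) for a, b in zip(x, y))


# Hypothetical per-spin errors: Monte Carlo (x) vs. covariance matrix (y).
mc = [0.5, 1.0, 2.0]
cov = [1.0, 2.0, 4.0]
k = slope_through_origin(mc, cov)
print(k, max_scatter(mc, cov, k))
```

A slope different from 1 with zero scatter means every error is wrong by the same constant factor, which is the signature of a systematic scaling rather than of noise.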

Regards,

Edward



On 29 August 2014 16:56, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
The default is now chi2_jacobian=False.

You will hopefully see that the errors are double.

Best
Troels

2014-08-29 16:43 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
Terrible ;)  For R2eff, the correlation is 2.748 and the points are
spread out all over the place.  For I0, the correlation is 3.5 and the
points are also spread out everywhere.  Maybe I should try with the
change from:

relax_disp.r2eff_err_estimate(chi2_jacobian=True)

to:

relax_disp.r2eff_err_estimate(chi2_jacobian=False)

How should this be used?

Cheers,

Edward



On 29 August 2014 16:33, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Do you mean terrible or double?

Best
Troels

2014-08-29 16:15 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
Hi Troels,

I really cannot follow and judge how the techniques compare.  I must
be getting old.  So to remedy this, I have created the
test_suite/shared_data/dispersion/Kjaergaard_et_al_2013/exp_error_analysis/
directory (r25437,
http://article.gmane.org/gmane.science.nmr.relax.scm/23187).  This
contains 3 scripts for comparing R2eff and I0 parameters for the 2
parameter exponential curve-fitting:

1)  A simple script to perform Monte Carlo simulation error analysis.
This is run with 10,000 simulations to act as the gold standard.

2)  A simple script to perform covariance matrix error analysis.

3)  A simple script to generate 2D Grace plots to visualise the
differences.  Now I can see how good the covariance matrix technique
is :)

Could you please check and see if I have used the
relax_disp.r2eff_err_estimate user function correctly?  The Grace
plots show that the error estimates are currently terrible.

Cheers,

Edward

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
relax-devel@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel


