Re: r27203 - /trunk/specific_analyses/relax_disp/optimisation.py



Posted by Edward d'Auvergne on January 19, 2015 - 11:42:
Hi,

I'll take a closer look at the reference and technique a little later
- I'm currently doing 3 other things simultaneously.  Each technique
for studying errors would be useful in some way though.  There are not
many in relax, so feel free to add new user functions for each method
you would like to test out.  As these are to do with parameters and
their errors, these could be placed in the error_analysis user
function class (otherwise a new statistics user function class might
have been useful).  This is what I did while studying the book
'Numerical Optimisation' by Nocedal and Wright
(http://www.amazon.com/Numerical-Optimization-Operations-Financial-Engineering/dp/0387303030
- I bought myself a hard copy, but PDF versions are around).  In the
end, this became minfx (https://gna.org/projects/minfx/).  The list of
minfx techniques is quite similar to the first half of the table of
contents of this book (the linear and quadratic programming was not
implemented).  Anyway, if you have a technique to test and it has a
proper name, just create error_analysis user functions as you need.
You now know how to handle the specific API to do this extremely
quickly from pipe_control.error_analysis.  For example, this one could
be error_analysis.confidence_intervals, which takes the parameter name
as an argument and only works on a single parameter (for now).
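
A rough sketch of the kind of logic such a backend could wrap (plain
Python and scipy only, not the existing relax API - the function name
and arguments here are purely hypothetical):

import numpy as np
from scipy.stats import f

def confidence_interval_1d(chi2_func, values, chi2_min, n_points, conf_level=0.95):
    # Profile a single parameter:  keep every grid value whose chi-squared
    # lies below the F-test threshold (the model comparison approach
    # discussed further down this thread).
    P = 1
    dof = n_points - P
    F_crit = f.isf(1.0 - conf_level, P, dof)
    chi2_test = chi2_min * (F_crit * P / dof + 1.0)
    return np.array([v for v in values if chi2_func(v) <= chi2_test])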

It's a pity about the F-statistics - I was actually going to add these
distributions, as well as the chi-squared distributions, into relax
during my PhD studies for better comparisons to, as well as being able
to reproduce the results of, Art Palmer's Modelfree4.  But I never got
around to it, despite it being a simple problem and solution.  You may
have found that useful.  Anyway, if you have any such code that is
independent of the data store or the specific analyses, just dump it
into lib.statistics.

Regards,

Edward



On 19 January 2015 at 11:02, Troels Emtekær Linnet
<tlinnet@xxxxxxxxxxxxx> wrote:
Hi Edward.

I am now trying to follow pages 109-111 of
http://www.graphpad.com/faq/file/Prism4RegressionBook.pdf
"Generating confidence intervals via model comparison"

Here, I have locked all values except kex.
# The number of parameters to check is kex = 1.
P = 1
# Number of datapoints
N = 1952
# The degrees of freedom for this confidence interval
dof_conf = N - P
# The critical value of the F distribution with a p-value of 0.05 for 95%
# confidence.
# Can be calculated with Microsoft Excel:
# F = FINV(0,05; P; dof_conf) = FINV(0,05; 1; 1951) = 3,846229551
# Can also be calculated with: import scipy.stats; scipy.stats.f.isf(0.05, 1, 1951) = 3.8462295505435562
F = 3.8462295505435562
scale = F*P/dof_conf + 1 = 1.00197141443

Then I vary kex from 1000 to 5000, and take the values of kex where the
calculated chi2 is less than:
chi2_test = 1.00197141443 * 2324.5 = 2329.082
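
For reference, the same threshold can be reproduced with a few lines of
scipy (a minimal sketch using the values above):

from scipy.stats import f

P = 1                          # Number of parameters profiled (kex).
N = 1952                       # Number of data points.
dof_conf = N - P
F = f.isf(0.05, P, dof_conf)   # = 3.8462295505435562
scale = F*P/dof_conf + 1       # = 1.00197141443
chi2_test = scale * 2324.5     # = 2329.08; kex values with a chi2 below this
                               # threshold lie inside the 95% confidence interval.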

For the 100% dataset, the chi2 profile has a nice shape.

But this goes against what I have seen for the 50% dataset:
i_sort    dw_sort    pA_sort    kex_sort      chi2_sort
471       4.50000    0.99375    2125.00000    4664.31083
470       4.50000    0.99375    1750.00000    4665.23872

If I am unlucky, I have created local minima in the space.

So, I am now trying to do this for the 50 % dataset.

This method is interesting, since I can force kex out of its local minimum.
But it will be close to impossible to extend to more than one parameter.

Best
Troels

2015-01-19 10:43 GMT+01:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:

Ah!  It could be what I've seen with the model-free analysis.  It
could be that the high precision optimisation in relax avoids the
problems of parameter error overestimation due to the real minimum
being located in a broad region or tunnel in the space.  Do you know
the reason for the large kex errors in the original analysis?  Is it
possible to investigate this?  How were the errors calculated, and do
you know the exact implementation details?  For example what was the
optimisation starting point for each error simulation?  I spent my
entire PhD time solving such problems, reading 3 statistics books
cover-to-cover in the maths library, so I may be able to help.  Or at
least point you in a useful direction.

Regards,

Edward


On 19 January 2015 at 10:33, Troels Emtekær Linnet
<tlinnet@xxxxxxxxxxxxx> wrote:
Hi Edward.

I actually think that I have created local minima in my dataset, which
are not caught.

I am looking into it.

2015-01-19 10:30 GMT+01:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:

Hi,

Maybe we should discuss the problem in detail on the original thread
and see if there is a solution.  I wonder why the kex errors are so
different?

Regards,

Edward



On 19 January 2015 at 09:51, Troels Emtekær Linnet
<tlinnet@xxxxxxxxxxxxx> wrote:
Hi Edward.

I went through sor (sum of residuals), sos (sum of squares), and now
sse (sum of squared errors).

I agree with sse being the best, but I have reverted all my commits,
and found a solution through the API.

Just using the chi2 value, and finding degrees of freedom with the
API.

If one wants .sse, one can just quickly do

value.set(val=1.0, param="r2eff", error=True)
minimise.calculate(verbosity=1)

Anyway, in the end, the new method did not solve my problem.
STD_fit = sqrt(chi2 / dof)

Since dof is so big (many datapoints, a small number of parameters for
clustered fitting), STD_fit becomes close to 1.
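
A quick illustration with numbers of the magnitude discussed in this
thread (the exact values are assumptions, only the scale matters):

from math import sqrt

chi2 = 2324.5                 # Best-fit chi-squared of a clustered fit.
dof = 1952 - 1                # Many data points, few parameters.
STD_fit = sqrt(chi2 / dof)    # = 1.09, i.e. close to 1.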


Best
Troels


2015-01-19 9:35 GMT+01:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:

Hi Troels,

Could you rename spin.sos to spin.sse?  This is the acronym used in
the field and by other software - the sum of squared errors
(https://en.wikipedia.org/wiki/Residual_sum_of_squares,
http://www.palmer.hs.columbia.edu/software/modelfree_manual.pdf).  If
the individual SSE elements are divided by the squared experimental
errors sigma_i^2, then this is the chi2 value.  The SSE and chi2
statistics are related, and are identical in the case of unit errors.
Other acronyms, much less used in the NMR field, are SSR or RSS.  I
don't think I've ever encountered SOS before, outside of emergencies
(https://en.wikipedia.org/wiki/SOS).
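
In code, the relation is simply the following (a small self-contained
numpy illustration with made-up numbers):

import numpy as np

y = np.array([10.2, 11.1, 9.8])         # Measured values.
y_fit = np.array([10.0, 11.0, 10.0])    # Back-calculated values.
sigma = np.array([0.3, 0.3, 0.3])       # Experimental errors.

sse = np.sum((y - y_fit)**2)             # Sum of squared errors (SSE/RSS/SSR).
chi2 = np.sum(((y - y_fit)/sigma)**2)    # Chi-squared statistic.
# With unit errors (sigma_i = 1), the SSE and chi2 are identical.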

Cheers,

Edward

On 16 January 2015 at 23:19,  <tlinnet@xxxxxxxxxxxxx> wrote:
Author: tlinnet
Date: Fri Jan 16 23:19:50 2015
New Revision: 27203

URL: http://svn.gna.org/viewcvs/relax?rev=27203&view=rev
Log:
Implemented storing of sum of squares and the standard deviation of
these for relaxation dispersion, when doing a point calculation.

Task #7882 (https://gna.org/task/?7882): Implement Monte-Carlo
simulation, where errors are generated with width of standard
deviation or residuals.

Modified:
    trunk/specific_analyses/relax_disp/optimisation.py

Modified: trunk/specific_analyses/relax_disp/optimisation.py
URL: http://svn.gna.org/viewcvs/relax/trunk/specific_analyses/relax_disp/optimisation.py?rev=27203&r1=27202&r2=27203&view=diff
==============================================================================
--- trunk/specific_analyses/relax_disp/optimisation.py  (original)
+++ trunk/specific_analyses/relax_disp/optimisation.py  Fri Jan 16 23:19:50 2015
@@ -119,7 +119,7 @@
     @type spin_lock_nu1:        list of lists of numpy rank-1 float arrays
     @keyword relax_times_new:   The interpolated experiment specific fixed time period for relaxation (in seconds).  The dimensions are {Ei, Mi, Oi, Di, Ti}.
     @type relax_times_new:      rank-4 list of floats
-    @keyword store_chi2:        A flag which if True will cause the spin specific chi-squared value to be stored in the spin container.
+    @keyword store_chi2:        A flag which if True will cause the spin specific chi-squared value to be stored in the spin container together with the sum of squares of the residuals and the standard deviation of the sum of squares of the residuals.
     @type store_chi2:           bool
     @return:                    The back-calculated R2eff/R1rho value for the given spin.
     @rtype:                     numpy rank-1 float array
@@ -215,10 +215,15 @@
     # Make a single function call.  This will cause back calculation and the data will be stored in the class instance.
     chi2 = model.func(param_vector)

-    # Store the chi-squared value.
+    # Get the sum of squares 'sos' of the residuals between the fitted values and the measured values. Get the std deviation of these, std_sos.
+    sos, sos_std = model.get_sum_of_squares()
+
+    # Store the chi-squared value, sums of squares of residual and the standard deviation of sums of squares of residual.
     if store_chi2:
         for spin in spins:
             spin.chi2 = chi2
+            spin.sos = sos
+            spin.sos_std = sos_std

     # Return the structure.
     return model.get_back_calc()










