Re: r25526 - /branches/est_par_error/test_suite/system_tests/relax_disp.py -- September 02, 2014

Hi Troels,

For the dispersion models, excluding the 'R2eff' model, the data to be
randomised is the R2eff/R1rho values (and the measured R1 for
off-resonance data), not the peak intensities.  This looks like a
simple bug - I guess you'll have it fixed in one of the next commits.
The bootstrapping numbers should be similar to the MC simulations.

Oh, you should understand the situation where the covariance matrix
and bootstrapping techniques fail.  If you have a long flat valley or
tunnel going down to a small minimum, there are situations where noise
can constrict the tunnels causing a new minimum to appear somewhere in
the tunnel.  The noise can also squeeze a fine local minimum out of
existence, hence a new minimum will appear somewhere on the path of
steepest descent before the real minimum position.  These new minima
occur a lot in the model-free space, see my paper at
http://dx.doi.org/10.1007/s10858-006-9007-z, specifically figure 2
which shows this type of space and figure 4 which shows what happens
to Monte Carlo simulations.  It is actually relatively common when the
optimisation problem is non-linear.  And from your OpenDX mapping of
the dispersion space, I can guarantee you that in certain edge cases,
this will occur in the dispersion space as well.  The covariance
matrix technique makes assumptions about the quadrature of the space,
and these conditions break that assumption and make the error estimate
far too small.  And the bootstrapping technique is not capable of
simulating this correctly as the wrong data is randomised - it will
give a distorted picture of this phenomenon.  In such cases, Monte
Carlo simulations gives the best estimate.  But as you can see in
figure 4 of my paper, this technique can also have significant
problems if the squeezing of the space causes the new minimum to
appear at infinity :)

Regards,

Edward

On 2 September 2014 03:06,  <tlinnet@xxxxxxxxxxxxx> wrote:

Author: tlinnet
Date: Tue Sep  2 03:06:02 2014
New Revision: 25526

URL: http://svn.gna.org/viewcvs/relax?rev=25526&view=rev
Log:
Tried to implement bootstrap method.

But that goes horrible wrong.

The numbers are now:
param: r2a_err, with err: 0.48131663, compared to MC: 0.49354342, with 600 
boot 0.00000000
param: dw_err, with err: 117.99734537, compared to MC: 0.32813703, with 600 
boot 0.00000000
param: k_AB_err, with err: 0.64606994, compared to MC: 0.65384783, with 600 
boot 0.00000000

So, something is right, and something is really wrong.

It is clear, that the reported error estimation scales with the error.

So, either one should start with simulations from peak intensity and then 
forward.
Or, maybe the use of errors are wrong.

There are some rules for how errors can be added together.
If they are a product, then the fractional errors can be summed together.

But, I have to look into this.

task #7824(https://gna.org/task/index.php?7824): Model parameter ERROR 
estimation from Jacobian and Co-variance matrix of dispersion models.

Modified:
    branches/est_par_error/test_suite/system_tests/relax_disp.py

Modified: branches/est_par_error/test_suite/system_tests/relax_disp.py
URL: 
http://svn.gna.org/viewcvs/relax/branches/est_par_error/test_suite/system_tests/relax_disp.py?rev=25526&r1=25525&r2=25526&view=diff
==============================================================================
--- branches/est_par_error/test_suite/system_tests/relax_disp.py        
(original)
+++ branches/est_par_error/test_suite/system_tests/relax_disp.py        Tue 
Sep  2 03:06:02 2014
@@ -41,12 +41,13 @@
 from pipe_control.minimise import assemble_scaling_matrix
 from specific_analyses.relax_disp.checks import check_missing_r1
 from specific_analyses.relax_disp.estimate_r2eff import estimate_par_err, 
estimate_r2eff
-from specific_analyses.relax_disp.data import average_intensity, 
check_intensity_errors, generate_r20_key, get_curve_type, 
has_exponential_exp_type, has_r1rho_exp_type, loop_exp_frq, 
loop_exp_frq_offset, loop_exp_frq_offset_point, 
loop_exp_frq_offset_point_time, loop_time, return_grace_file_name_ini, 
return_param_key_from_data, spin_ids_to_containers
+from specific_analyses.relax_disp.data import average_intensity, 
check_intensity_errors, generate_r20_key, get_curve_type, 
has_exponential_exp_type, has_r1rho_exp_type, is_r1_optimised, 
loop_exp_frq, loop_exp_frq_offset, loop_exp_frq_offset_point, 
loop_exp_frq_offset_point_time, loop_point, loop_time, return_cpmg_frqs, 
return_grace_file_name_ini, return_param_key_from_data, 
spin_ids_to_containers, return_offset_data, return_r1_data, 
return_r1_err_data, return_r2eff_arrays, return_spin_lock_nu1
 from specific_analyses.relax_disp.data import INTERPOLATE_DISP, 
INTERPOLATE_OFFSET, X_AXIS_DISP, X_AXIS_W_EFF, X_AXIS_THETA, 
Y_AXIS_R2_R1RHO, Y_AXIS_R2_EFF
 from specific_analyses.relax_disp.model import models_info, nesting_param
-from specific_analyses.relax_disp.parameters import linear_constraints
+from specific_analyses.relax_disp.parameters import assemble_param_vector, 
linear_constraints, param_num
 from specific_analyses.relax_disp.variables import EXP_TYPE_CPMG_DQ, 
EXP_TYPE_CPMG_MQ, EXP_TYPE_CPMG_PROTON_MQ, EXP_TYPE_CPMG_PROTON_SQ, 
EXP_TYPE_CPMG_SQ, EXP_TYPE_CPMG_ZQ, EXP_TYPE_LIST, EXP_TYPE_R1RHO, 
MODEL_B14_FULL, MODEL_CR72, MODEL_CR72_FULL, MODEL_DPL94, MODEL_IT99, 
MODEL_LIST_ANALYTIC_CPMG, MODEL_LIST_FULL, MODEL_LIST_NUMERIC_CPMG, 
MODEL_LM63, MODEL_M61, MODEL_M61B, MODEL_MP05, MODEL_NOREX, 
MODEL_NS_CPMG_2SITE_3D_FULL, MODEL_NS_CPMG_2SITE_EXPANDED, 
MODEL_NS_CPMG_2SITE_STAR_FULL, MODEL_NS_R1RHO_2SITE, MODEL_NS_R1RHO_3SITE, 
MODEL_NS_R1RHO_3SITE_LINEAR, MODEL_PARAMS, MODEL_R2EFF, MODEL_TP02, 
MODEL_TAP03, PARAMS_R20
 from status import Status; status = Status()
+from target_functions.relax_disp import Dispersion
 from test_suite.system_tests.base_classes import SystemTestCase

 # C modules.
@@ -7377,6 +7378,38 @@
         my_dic = {}
         spin_id_list = []

+        # Set algorithm.
+        min_algor = 'simplex'
+        constraints = True
+        if constraints:
+            min_options = ('%s'%(min_algor),)
+            min_algor = 'Log barrier'
+            #min_algor = 'Method of Multipliers'
+            scaling_matrix = assemble_scaling_matrix(scaling=True)
+
+            # Collect spins
+            all_spin_ids = []
+            for cur_spin, mol_name, resi, resn, spin_id in 
spin_loop(full_info=True, return_id=True, skip_desel=True):
+                all_spin_ids.append(spin_id)
+
+            spins = spin_ids_to_containers(all_spin_ids[:1])
+
+            # Get constraints
+            A, b = linear_constraints(spins=spins, 
scaling_matrix=scaling_matrix[0])
+        else:
+            min_options = ()
+            A, b = None, None
+
+
+        sim_boot = 2
+
+        # Number of spectrometer fields.
+        fields = [None]
+        field_count = 1
+        if hasattr(cdp, 'spectrometer_frq_count'):
+            fields = cdp.spectrometer_frq_list
+            field_count = cdp.spectrometer_frq_count
+
         # Get the data.
         for cur_spin, mol_name, resi, resn, spin_id in 
spin_loop(full_info=True, return_id=True, skip_desel=True):
             # Add key to dic.
@@ -7392,6 +7425,10 @@
             # Generate spin string.
             spin_string = generate_spin_string(spin=cur_spin, 
mol_name=mol_name, res_num=resi, res_name=resn)

+            # Get the param_vector
+            param_vector = assemble_param_vector(spins=[cur_spin])
+            model = cur_spin.model
+
             param_key_list = []
             for exp_type, frq, offset, ei, mi, oi, in 
loop_exp_frq_offset(return_indices=True):
                 # Get the parameter key.
@@ -7418,6 +7455,73 @@
                         param_err_val = deepcopy(getattr(cur_spin, 
param_err))
                         my_dic[spin_id][param_key][param_err] = 
param_err_val

+                # Start boot strap.
+                values, errors, missing, frqs, frqs_H, exp_types, 
relax_times = return_r2eff_arrays(spins=[cur_spin], spin_ids=[spin_id], 
fields=fields, field_count=field_count)
+                # The offset data.
+                offsets, spin_lock_fields_inter, chemical_shifts, 
tilt_angles, Delta_omega, w_eff = return_offset_data(spins=[cur_spin], 
spin_ids=[spin_id], field_count=field_count)
+
+                # r1 data.
+                r1 = return_r1_data(spins=[cur_spin], spin_ids=[spin_id], 
field_count=field_count)
+                r1_err = return_r1_err_data(spins=[cur_spin], 
spin_ids=[spin_id], field_count=field_count)
+                r1_fit = is_r1_optimised(model)
+                model_param_num = param_num(spins=[cur_spin])
+
+                dispersion_points = cdp.dispersion_points
+                cpmg_frqs = return_cpmg_frqs(ref_flag=False)
+                spin_lock_nu1 = return_spin_lock_nu1(ref_flag=False)
+                num_spins=1
+
+                values_err = deepcopy(values)
+                si = 0
+                r2a_sim_l = []
+                dw_sim_l = []
+                k_AB_sim_l = []
+
+                for j in range(sim_boot):
+                    if j in range(0, 1000, 10):
+                        print("Simulation %i"%j)
+                    # Start minimisation.
+
+                    # Produce errors
+                    r2_err = []
+                    for j, error in enumerate(errors[ei][si][mi][oi]):
+                        r2_error = gauss(values[ei][si][mi][oi][j], error)
+                        #print values[ei][si][mi][oi][j], r2_error, error
+                        values_err[ei][si][mi][oi][j] = deepcopy(r2_error)
+
+                    # Init the Dispersion clas with data, so we can call 
functions in it.
+                    tfunc = Dispersion(model=model, 
num_params=model_param_num, num_spins=num_spins, num_frq=field_count, 
exp_types=exp_types, values=values_err, errors=errors, missing=missing, 
frqs=frqs, frqs_H=frqs_H, cpmg_frqs=cpmg_frqs, spin_lock_nu1=spin_lock_nu1, 
chemical_shifts=chemical_shifts, offset=offsets, tilt_angles=tilt_angles, 
r1=r1, relax_times=relax_times, r1_fit=r1_fit)
+                    results = generic_minimise(func=tfunc.func, args=(), 
x0=param_vector, min_algor=min_algor, min_options=min_options, A=A, b=b, 
full_output=True, print_flag=0)
+                    param_vector, chi2, iter_count, f_count, g_count, 
h_count, warning = results
+                    print chi2, warning
+
+                    # Loop over params.
+                    for i, param in enumerate(params):
+                        # If param in PARAMS_R20, values are stored in 
with parameter key.
+                        if param in PARAMS_R20:
+                            # Get the error.
+                            param_val = deepcopy(getattr(cur_spin, 
param)[param_key])
+                            r2a_sim_l.append(param_val)
+
+                        else:
+                            # Get the error.
+                            param_val = deepcopy(getattr(cur_spin, param))
+                            if param == 'dw':
+                                dw_sim_l.append(param_val)
+                            elif param == 'k_AB':
+                                k_AB_sim_l.append(param_val)
+
+                # Get stats on distribution.
+                sigma_r2a_sim = std(asarray(r2a_sim_l), ddof=1)
+                sigma_dw_sim = std(asarray(dw_sim_l), ddof=1)
+                sigma_k_AB_sim = std(asarray(k_AB_sim_l), ddof=1)
+                my_dic[spin_id][param_key]['sigma_r2a_sim'] = sigma_r2a_sim
+                my_dic[spin_id][param_key]['sigma_dw_sim'] = sigma_dw_sim
+                my_dic[spin_id][param_key]['sigma_k_AB_sim'] = 
sigma_k_AB_sim
+                my_dic[spin_id][param_key]['sigma_r2a_arr'] = 
asarray(r2a_sim_l)
+                my_dic[spin_id][param_key]['sigma_dw_arr'] = 
asarray(dw_sim_l)
+                my_dic[spin_id][param_key]['sigma_k_AB_arr'] = 
asarray(k_AB_sim_l)
+
             my_dic[spin_id]['param_key_list'] = param_key_list

         # After Monte-Carlo.
@@ -7456,11 +7560,14 @@
             param_key_list = my_dic[spin_id]['param_key_list']
             for param_key in param_key_list:
                 params_err = my_dic[spin_id]['params_err']
+                params = my_dic[spin_id]['params']
                 # Loop over params.
                 for i, param_err in enumerate(params_err):
+                    param = params[i]
                     param_err_val = my_dic[spin_id][param_key][param_err]
                     param_err_val_mc = 
my_dic[spin_id][param_key][param_err+'MC']
-                    print("param: %s, with err: %3.8f, compared to MC: 
%3.8f" % (param_err, param_err_val, param_err_val_mc) )
+                    param_err_val_sim = 
my_dic[spin_id][param_key]['sigma_%s_sim'%param]
+                    print("param: %s, with err: %3.8f, compared to MC: 
%3.8f, with %i boot %3.8f" % (param_err, param_err_val, param_err_val_mc, 
sim_boot, param_err_val_sim) )


     def test_tp02_data_to_ns_r1rho_2site(self, model=None):


_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-commits mailing list
relax-commits@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-commits

Re: r25526 - /branches/est_par_error/test_suite/system_tests/relax_disp.py

Header

Content

Related Messages