Re: [sr #3154] Implementation of Baldwin (2014) B14 model - 2-site exact solution model for all time scales -- May 06, 2014

Hi Andy,

I have modified your script to produce text files of your grid,
displaying the maximum R2eff difference for each grid point
(https://gna.org/support/download.php?file_id=20654).  This has
produced a number of maps which clearly show the problem:

map_Baldwin:  https://gna.org/support/download.php?file_id=20656
map_Carver:  https://gna.org/support/download.php?file_id=20655
map_Meiboom:  https://gna.org/support/download.php?file_id=20657
map_Nikolai:  https://gna.org/support/download.php?file_id=20658
map_NikolaiFun:  https://gna.org/support/download.php?file_id=20660
map_NikolaiLong:  https://gna.org/support/download.php?file_id=20659

For map_NikolaiFun (and map_Nikolai), it can be seen that the problem
area is located to only one region of the parameter space:

pB                   dw                   kex                  max_delta_R2eff
   0.400000000000000 1000.000000000000000    5.492802716530586
0.592752848196470
   0.252382937792077 1000.000000000000000    5.492802716530586
3.845196468401902
   0.159242868221399 1000.000000000000000    5.492802716530586
5.431089306584756
   0.100475457260383 1000.000000000000000    5.492802716530586
2.682512609040295
   0.063395727698445 1000.000000000000000    5.492802716530586
0.107557528833629
   0.040000000000000 1000.000000000000000    5.492802716530586
1.862111109685991
   0.025238293779208 1000.000000000000000    5.492802716530586
1.248136912461243
   0.015924286822140 1000.000000000000000    5.492802716530586
1.439357156013592
   0.010047545726038 1000.000000000000000    5.492802716530586
0.384699031445987
   0.006339572769844 1000.000000000000000    5.492802716530586
0.165313396220368
   0.004000000000000 1000.000000000000000    5.492802716530586
0.316275262141682
   0.400000000000000 1000.000000000000000    2.343672911592100
11.633432971003034
   0.252382937792077 1000.000000000000000    2.343672911592100
         nan
   0.159242868221399 1000.000000000000000    2.343672911592100
45.046754399379282
   0.100475457260383 1000.000000000000000    2.343672911592100
3.357238601090618
   0.063395727698445 1000.000000000000000    2.343672911592100
3.758080253732464
   0.040000000000000 1000.000000000000000    2.343672911592100
2.232620263385279
   0.025238293779208 1000.000000000000000    2.343672911592100
19.760618010603249
   0.015924286822140 1000.000000000000000    2.343672911592100
8.862633579602273
   0.010047545726038 1000.000000000000000    2.343672911592100
15.978405412625353
   0.006339572769844 1000.000000000000000    2.343672911592100
8.952365343475268
   0.004000000000000 1000.000000000000000    2.343672911592100
11.945565346143020
   0.400000000000000 1000.000000000000000    1.000000000000000
         nan
   0.252382937792077 1000.000000000000000    1.000000000000000
         nan
   0.159242868221399 1000.000000000000000    1.000000000000000
         nan
   0.100475457260383 1000.000000000000000    1.000000000000000
104.735381560928317
   0.063395727698445 1000.000000000000000    1.000000000000000
102.520903807959172
   0.040000000000000 1000.000000000000000    1.000000000000000
65.987026895834390
   0.025238293779208 1000.000000000000000    1.000000000000000
78.994037851916559
   0.015924286822140 1000.000000000000000    1.000000000000000
         nan
   0.010047545726038 1000.000000000000000    1.000000000000000
         nan
   0.006339572769844 1000.000000000000000    1.000000000000000
         nan
   0.004000000000000 1000.000000000000000    1.000000000000000
44.553251112908285

I.e. dw of 1000 rad.s^-1 and kex being very, very low.  In all other
regions of the parameter space, the max_delta_R2eff values are in the
order of 1e-8 or lower.  Looking at the equivalent region in
map_NikolaiLong:

pB                   dw                   kex                  max_delta_R2eff
   0.400000000000000 1000.000000000000000    5.492802716530586
0.000000022779017
   0.252382937792077 1000.000000000000000    5.492802716530586
0.000000370744953
   0.159242868221399 1000.000000000000000    5.492802716530586
0.000000004931843
   0.100475457260383 1000.000000000000000    5.492802716530586
0.000000003682589
   0.063395727698445 1000.000000000000000    5.492802716530586
0.000000007636758
   0.040000000000000 1000.000000000000000    5.492802716530586
0.000000002312100
   0.025238293779208 1000.000000000000000    5.492802716530586
0.000000003527191
   0.015924286822140 1000.000000000000000    5.492802716530586
0.000000003627945
   0.010047545726038 1000.000000000000000    5.492802716530586
0.000000004664540
   0.006339572769844 1000.000000000000000    5.492802716530586
0.000000001267414
   0.004000000000000 1000.000000000000000    5.492802716530586
0.000000006526546
   0.400000000000000 1000.000000000000000    2.343672911592100
0.000000013242452
   0.252382937792077 1000.000000000000000    2.343672911592100
0.000000074741680
   0.159242868221399 1000.000000000000000    2.343672911592100
0.000000004875817
   0.100475457260383 1000.000000000000000    2.343672911592100
0.000000123757648
   0.063395727698445 1000.000000000000000    2.343672911592100
0.000000011314873
   0.040000000000000 1000.000000000000000    2.343672911592100
0.000000054738139
   0.025238293779208 1000.000000000000000    2.343672911592100
0.000000011630442
   0.015924286822140 1000.000000000000000    2.343672911592100
0.000000063631723
   0.010047545726038 1000.000000000000000    2.343672911592100
0.000000004935935
   0.006339572769844 1000.000000000000000    2.343672911592100
0.000000023574356
   0.004000000000000 1000.000000000000000    2.343672911592100
0.000000021224806
   0.400000000000000 1000.000000000000000    1.000000000000000
0.000000577754879
   0.252382937792077 1000.000000000000000    1.000000000000000
0.000000347777326
   0.159242868221399 1000.000000000000000    1.000000000000000
0.000000228853165
   0.100475457260383 1000.000000000000000    1.000000000000000
0.000000259100318
   0.063395727698445 1000.000000000000000    1.000000000000000
0.000000179476287
   0.040000000000000 1000.000000000000000    1.000000000000000
0.000000030331906
   0.025238293779208 1000.000000000000000    1.000000000000000
0.000000064205007
   0.015924286822140 1000.000000000000000    1.000000000000000
0.000000276358881
   0.010047545726038 1000.000000000000000    1.000000000000000
0.000000110869264
   0.006339572769844 1000.000000000000000    1.000000000000000
0.000000195626962
   0.004000000000000 1000.000000000000000    1.000000000000000
0.000000211432798

This makes it clear where the higher numerical precision is required.
However as this situation will only arise when optimisation fails and
shoots off to infinity.  If the optimisation reaches a dw value of
1000 rad.s^-1, I do not believe that an optimisation algorithm could
ever be able to recover and return back to reasonable values.
Therefore I don't think that I'll be implementing a higher precision
version of Nikolai's code in relax.  It will only take time and I
don't see a benefit.

Anyway, your model that has been labelled as 'B14' in relax
(http://wiki.nmr-relax.com/B14) will soon be fully implemented and
documented.  As Troels mentioned, we will have two versions of your
equations implemented as separate models - one with R20A == R20B for
most users, and one with R20A != R20B for those who like adventure.
With all of the code optimisations we can perform within relax, that
Troels is currently working on, our implementation should be faster
than the Python code you originally sent us (which forms the basis of
the relax implementation).  Troels might be able to make it up to
twice as fast.  So you will soon have at your disposal a top quality
implementation with a very user friendly user interface!  This will
also give you access to a pile of powerful tools - grid search
capabilities for finding a non-biased initial optimisation position,
the fast and very accurate Nelder-Mead simplex optimiser, constraints
implemented with the powerful Log-Barrier iterative constraint
algorithm, model nesting in the auto-analysis for speed (i.e. we could
take the CR72 model results as the optimisation starting point for the
B14 model rather than performing a grid search), Monte Carlo
simulations for error estimates, 2D Grace graph plotting abilities for
parameter values, 3D optimisation space mapping capabilities (via the
dx.map user function), plotting parameter values directly onto 3D
structures in Molmol or PyMOL, etc.

Cheers,

Edward



On 6 May 2014 12:21, Andrew Baldwin <andrew.baldwin@xxxxxxxxxxxxx> wrote:

Okay. The proper definition of tcp for your code shows that both functions
(fun and rieko) behave similarly. Good!

I also agree that even with the strictest tests, a long double
implementation of the algorithm comes out with errors that are much smaller
than anything we could ever hope to measure experimentally. 23 decimal
places seems to be needed to get things to machine precision. This is
overkill. The paper makes all these points in a manner that I hope is clear
(see last fig).

Best,

Andy.


  The tcp in this function in relax is defined as 1/(4*nu_CPMG), but in
your script you have "tcp=1./(2*nu_cpmg);".  Apart from that, it looks
ok.  Changing to the correct tcp gives:

Relative times of the various approaches:
Baldwin 0.159632 1.0
Numerical 19.974315 125.127261451
Meiboom 0.056497 0.353920266613
Carver 0.080027 0.501321790117
Nikolai 0.453068 2.83820286659
NikolaiFun 0.284052 1.77941766062
NikolaiLong 50.020116 313.34642177
NikolaiLong 50.430594 315.917823494
Maximum error over grid (s-1):
Meiboom:          237214256.582
Baldwin:          6.57168541807e-10
CarverRichards:   29.9287668504
Nikolai fun:                104.735381561
Nikolai dougle (9):          142.46676955
Nikolai long double (18):     0.063449549192819563320223
Nikolai long double (23):     5.7775487944482434059452e-7

That's far more reasonable for Nikolai's fun!

I was hoping that the modification would make it more stable. If anything
the reverse is true. I would have included a note to this effect in the
paper (and still will if you can show it's better). Note the error is
very
temperamental. You can miss it even by setting up the grid search
differently. Give it a narrow grid, and you won't see the error. Only
relatively specific, but seemingly hard to predict sets of parameters
(all
with large deltaOs) lead it to explode.


It is worth looking at all deviations rather than just the single
maximum R2eff difference.  It could be that there is an edge case in
the grid whereby a maths domain error occurs.  Or an trig function
overflow.  The median deviation to avoid such outliers might also be
of worth.  By switching in your code all numpy.max() calls with
numpy.median(), and making sure the "Nikolai long double (*)" results
are pre-converted to a numpy float64 array, I then see an incredibly
different picture:

Relative times of the various approaches:
Baldwin 0.152725 1.0
Numerical 19.596446 128.3119725
Meiboom 0.05604 0.366934031756
Carver 0.077236 0.505719430349
Nikolai 0.43944 2.87732853167
NikolaiFun 0.273977 1.7939237191
NikolaiLong 50.243433 328.979754461
NikolaiLong 50.784841 332.524740547
Median error over grid (s-1):
Meiboom:          4.0064828013
Baldwin:          3.9630521087e-12
CarverRichards:   0.00661151728704
Nikolai fun:                6.62225829728e-12
Nikolai dougle (9):          6.70041799822e-12
Nikolai long double (18):     2.62099543458e-12
Nikolai long double (23):     2.27758174227e-12

!

So this is much clearer - outliers are the major difference you are
seeing.  A study of such outliers and their cause would be quite
interesting.  But in most of these cases, simple collapsed equations
for each model should be derivable at the edges which then produce the
expected R2eff over the entire parameter space rather than
computational artefacts in some corners.  Anyway, this result makes it
crystal clear that the different models should not be judged by the
single maximum outlier over the entire parameter space.

Regards,

Edward

Re: [sr #3154] Implementation of Baldwin (2014) B14 model - 2-site exact solution model for all time scales

Header

Content

Related Messages