Re: Speed up of the TSMFK01 model. -- September 09, 2013

Dear Edward.

There is an explanation for everything, including user errors. :-)

I have analysed the 0.48 M GuaHCl dataset (the folded protein), so it
is not comparable to the 1 M GuaHCl dataset (intermediate between
folded and unfolded), which is the one shown in the figures in the paper.

So I got the original data points for figure 3, which shows
ln(k_a) as a function of GuaHCl concentration.
k_a now fits perfectly for 0.48 M.

Today I have also analysed the 1 M dataset.

Everything matches to the first digit, so I am satisfied.

So, I will soon send a swarm of patches to include this dataset.

Thanks for looking!

I will look into the tsmfk01 code to speed it up.

To compare with the numerical methods, you mentioned that the
parameters could be converted automatically?

So could k_AB be calculated for the numerical methods as:
k_AB = kex*(1-pA)
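
A minimal sketch of that conversion, assuming two-site exchange with pA + pB = 1 and kex = k_AB + k_BA. The function names are hypothetical and are not part of relax:

```python
def k_ab_from_kex(kex, pA):
    """Forward rate k_AB = kex * (1 - pA), assuming pA + pB = 1."""
    if not 0.0 <= pA <= 1.0:
        raise ValueError("pA must lie between 0 and 1")
    return kex * (1.0 - pA)


def k_ba_from_kex(kex, pA):
    """Reverse rate k_BA = kex * pA, under the same assumption."""
    if not 0.0 <= pA <= 1.0:
        raise ValueError("pA must lie between 0 and 1")
    return kex * pA


# Hypothetical values: kex in rad/s, pA as a fraction.
k_ab = k_ab_from_kex(1000.0, 0.9)
k_ba = k_ba_from_kex(1000.0, 0.9)
```

With this, the kex and pA optimised by a numerical model could be turned into a k_AB for comparison with the TSMFK01 value.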

Best
Troels




Troels Emtekær Linnet


2013/9/9 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:

Hi Troels,

I'm still not sure why the TSMFK01 model results do not match what you
expect (http://thread.gmane.org/gmane.science.nmr.relax.scm/18555/focus=4531).
The code is very clean and there is nothing obvious.  The problems I
saw before with the k_AB parameter, you have now fixed.  So I really
don't know what is happening.  I would recommend looking at a spin
system with data at two fields, just in case the model is not stable
for single field strength data.  In any case, more testing of data is
required to work out what is happening.  If the motional parameters
are truly within the range of the TSMFK01 model, then comparison to
the numeric model should produce similar parameter values.  This is
also a useful sanity check.

Maybe you could write an email to Martin Tollinger about getting some
of the data used in the paper.  This would include peak heights (or
R2eff) as well as the optimised parameter values.  Having both is
important.  Some of his code is in relax, so he is aware of the
dispersion branch.  If you explain what you would like to do and how
you're in the process of implementing / debugging his model in relax,
I'm sure he'd be happy to help.  He may even still have the synthetic
data he used in his paper (http://dx.doi.org/10.1021/ja011300z) - that
would be the best for the checks.

On the subject of the subject line, I have a few points about the
lib.dispersion.tsmfk01 code to speed things up:

1)  The dw * tau_CP mathematical operation occurs twice - this is a
waste and the result can be stored as a variable and reused.  An easy
solution would be to put the denominator calculation before the
numerator, and the rest should be obvious.

2)  The tau_CP value is re-calculated each time.  These values could
be stored in a data structure which is set up when the target function
class is initialised (in the __init__() method).  That way this
calculation is avoided in the target function where it is much more
computationally expensive.

3)  Also related to point 2), Python has to convert your integer '2'
into a float prior to the multiplication.  If you use '2.0' instead,
then that avoids the time required for Python type conversions.

Implementing these changes drops the number of mathematical operations per loop
per function call from 9 to 6, and removes a type conversion.  So you
should get a good speed up.
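
The three points can be sketched roughly as follows. This is not the actual lib.dispersion.tsmfk01 code: the function names are hypothetical, the equation R2eff = R20A + k_AB - k_AB * sin(dw * tau_CP) / (dw * tau_CP) is assumed to be the form of the model, and tau_CP = 1 / (4 * nu_CPMG) is only one of the possible conventions.

```python
from math import sin


def precompute_tau_cp(nu_cpmg_values):
    """Points 2) and 3): compute tau_CP once at set-up time, with float constants.

    Assumes tau_CP = 1 / (4 * nu_CPMG); other groups use a different factor.
    """
    return [1.0 / (4.0 * nu) for nu in nu_cpmg_values]


def r2eff_tsmfk01(r20a, dw, k_ab, tau_cp_values):
    """Back-calculate R2eff for each pre-computed tau_CP value."""
    back_calc = []
    for tau_cp in tau_cp_values:
        # Point 1): the dw * tau_CP product is calculated once and reused.
        denom = dw * tau_cp
        if denom == 0.0:
            # sin(x)/x tends to 1 as x tends to 0, so the k_AB terms cancel.
            back_calc.append(r20a)
        else:
            back_calc.append(r20a + k_ab - k_ab * sin(denom) / denom)
    return back_calc


# Set-up is done once; the target function then only reuses tau_cp_values.
tau_cp_values = precompute_tau_cp([100.0, 200.0, 400.0])
r2eff = r2eff_tsmfk01(10.0, 1000.0, 5.0, tau_cp_values)
```

In the real target function class the pre-computed list would live on the object created in __init__(), so that each optimisation step only pays for the loop body.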

Regards,

Edward


P. S.  Be careful with the tau_CP to nu_CPMG calculation.  In relax,
the factor of 1/2 rather than 1/4 is often used.  This is the notation
used by CPMGFit.  Different groups define nu_CPMG differently!
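
To make the choice of convention explicit, the factor can be passed as a parameter. This is a small sketch with a hypothetical function name, assuming the two conventions differ only in whether tau_CP = 1 / (4 * nu_CPMG) or tau_CP = 1 / (2 * nu_CPMG):

```python
def tau_cp_from_nu_cpmg(nu_cpmg, factor=4.0):
    """Return tau_CP in seconds as 1 / (factor * nu_CPMG), with nu_CPMG in Hz."""
    if nu_cpmg <= 0.0:
        raise ValueError("nu_CPMG must be positive")
    return 1.0 / (factor * nu_cpmg)


# The same nu_CPMG gives tau_CP values differing by a factor of two.
tau_quarter = tau_cp_from_nu_cpmg(100.0, factor=4.0)
tau_half = tau_cp_from_nu_cpmg(100.0, factor=2.0)
```

Mixing the two conventions between the data and the model would scale dw * tau_CP by two, so it is worth checking which one the published values use.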

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
relax-devel@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel
