Re: Implementation of the Schanda, Lescanne and Marion numeric dispersion model code in relax. -- July 23, 2013

Hi Edward,

Yes, I imagine that it would be easy to get the code into python. Matlabis almost the same as python, so it's really easy to understand andtranslate the functions. Differences are, e.g. in the optimizationalgorithms, but I am sure you would follow the existing routines.


As for test data, the first thing that comes to my mind is this:
Massi, F.; Grey, M.; Palmer, A. G. Protein Sci 2005, 14, 735–742.
but there are tons of R1rho data out there (e.g. by AG Palmer & coworkers).

Paul

PS: Sorry for not having sent the last message to the mailing list, butas private message instead.




On 23.07.13 22:14, Edward d'Auvergne wrote:

[private]

Hi,

I've had a look at the code, and it looks like it would be very easy
for me to get this into relax!  Matlab code is almost Python syntax,
so its rather trivial.  The funNumrho() function I'd probably break
into a function for generating the R matrix, as is done for the CPMG
numerical solutions, and a function for the propagation elements and
R1rho' calculation.  The hard part would be the generation of
different test data sets and the checking of the data.  It might also
be interesting to copy in the funTrottPalmer() function.  This was one
of the R1rho models I was planning on looking at while I was still
looking at R1rho data with Martin, but didn't get to it before the
project was discontinued.

One part that relax doesn't handle yet, due to time constraints on the
project, was the support for off-resonance R1rho.  I was also working
with the high-powered HEHAHA R1rho experiments as published by
Griesinger, so off-resonance effects did not need to be handled.  It
would be good to support this, but right now I don't have the time.  I
would need to add a new user function for reading chemical shifts
(which would be easy).  Then when assembling the data for
optimisation, I would need a function to convert all the radial
frequencies to the correct one.  I would also have to add something to
the GUI, only for the off-resonance R1rho experiment, to allow the
user to input chemical shifts.  It's probably only a few hours of
work, so I might look at it later.  If you could point me in the right
direction for the off-resonance experiments, that would be
appreciated.  Note that I'll probably copy and paste most of this text
when you send the message to the mailing list.  Thank you again for
your code contributions!

Cheers,

Edward


P. S.  I'll look at the conversion once you make your code public.






On 23 July 2013 14:11, Paul Schanda <paul.schanda@xxxxxx> wrote:

Hi Edward,

I am not aware of software that does R1rho fitting. Well, that's not quite
true, I have a set of MATLAB functions that do R1rho fitting. I attach a
package of functions. however, as I said before, it is a MATLAB code. If you
have MATLAB, then the functions should directly.
You have to execute the macro sim_all, which will in turn call functions
defined in the separate .m files. The philosophy is very similar to python,
except that the functions are external.
In the sim_all there are a few flags that allow you to chose the model you
want to use. In fact, all the CPMG functions that you implemented were, at
least in part, based on this.

I did not have time to convert the R1rho to python, but you will see that
the Bloch-McConnell part is quite similar, except for omega1 terms in the
matrix, as the rf is on all the time. Plus there is some adjustment due to
the fact that the rf irradiation is not necessarily on-resonance, so the
magnetization has to be tilted into the effective field frame.
I guess that if you wanted to translate this to python, the best thing would
be to have MATLAB or Octave in parallel, so that you can follow step by step
whether the python code is doing what it's supposed to.

Paul




On 23.07.13 13:58, Edward d'Auvergne wrote:

Hi,

It's a pity that the data is not at hand.  It makes the implementation
much easier.  If you look into the test_suite/shared_data/dispersion/
directory, you will see that there is a collection of different data
types.  I try to have both published data whereby the result is known
(either from a paper, its supplementary material, or from the
corresponding author) as well as synthetic data generated from a known
model.  I then use both data types to test relax, as well as all other
software from the field.  This internal and external consistency is
extensively tested.  The relax tests are then truncated and turned
into system tests to lock in the correct behaviour.

For the synthetic data, I have currently only used the analytic models
to generate the data.  But simulating the full Bloch-McConnell
magnetisation propagation would also be useful.  Then the numeric
models can be tested on both (and the analytic models on both as
well).  Note that it is very important that the synthetic data does
not come from the same code that it will be tested against.  In all
the synthetic data directories of test_suite/shared_data/dispersion/,
you will see that there is a Python script used to generate the data.
These sometimes use parts of relax, for example the Sparky peak list
writing function or the relax data store to handle spin system data,
but that the dispersion code in relax is never used.  This is
important as otherwise a bug in the code will be propagated into the
data, and then it will not be seen by subsequent optimisation in
relax.  Comparison to other software makes such data more convincing,
but for R1rho data, I don't think that is possible.  Do you know of
R1rho model fitting software?

Cheers,

Edward




On 23 July 2013 13:06, Paul Schanda <paul.schanda@xxxxxx> wrote:

Hi Edward,

Looks very convincing. As an alternative to Flemmings data, you might
also
just create synthetic data, by using the Bloch-McConnell equations to
create
R2eff as a function of nu_cpmg. This could, by the way, also be used for
generating R1rho data. I don't have reference R1rho data.
paul



On 23.07.13 12:17, Edward d'Auvergne wrote:

Hi,

I forgot to mention how I validated that the numeric model code in
relax is functional and finding the correct result.  I would have
liked to compare the results with Flemming Hansen's CATIA program
(http://www.biochem.ucl.ac.uk/hansen/catia/), but this relax export
function is not implemented yet (unlike those for CMPGFit, ShereKhan,
and NESSY).  I therefore compared the results to those of the Carver
and Richards (CR72) analytic model using Flemming Hansen's data from
his 2008 J. Phys. Chem. B paper (http://dx.doi.org/10.1021/jp074793o).
    For residues 70 and 71, the analytic CR72 model results from the
various programs and the numeric relax results are:


The CR72 model
==============


Residue :70
-----------

Param       relax       NESSY       CPMGFit     ShereKhan
R2 (500)        7.011       7.236       6.263       7.011
R2 (800)        9.465      10.027       6.263       9.465
pA              0.990       0.989       0.978       0.990
dw              5.577       5.373       0.937       5.577
kex          1765.993    1522.059       1.774    1765.988
chi2           18.450       7.701      65.237      18.450


Residue :71
-----------

Param       relax       NESSY       CPMGFit     ShereKhan
R2 (500)        4.978           -       4.983       4.978
pA              0.997           -       0.994       0.997
dw              4.460           -       0.757       4.461
kex          1879.584           -       0.896    1879.618
chi2            1.379           -       1.382       1.379


The numeric models in relax
===========================

Residue :70
-----------

Param     NS 2-site 3D  NS 2-site expanded   NS 2-site star
R2 (500)         6.996               6.996            6.996
R2 (800)         9.453               9.453            9.453
pA               0.990               0.990            0.990
dw               5.645               5.645            5.645
kex           1723.443            1723.443         1723.078
chi2            18.098              18.098           18.098


Residue :71
-----------

Param     NS 2-site 3D  NS 2-site expanded   NS 2-site star
R2 (500)         4.980               4.980            4.980
pA               0.997               0.997            0.997
dw               4.629               4.629            4.629
kex           1805.665            1805.676         1805.110
chi2             1.371               1.371            1.371


This comes from the
test_suite/shared_data/dispersion/Hansen/software_comparison file.
For these residues, you can see that the CR72 and numeric results are
very similar.  It could be hard to argue that they are statistically
different.  But this shows that the code is functioning perfectly.

Regards,

Edward



On 23 July 2013 09:28, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi,

The 2-site numeric dispersion models for CPMG data are now fully
functional
within relax!  The following describes how these are implemented in
relax
(to be permanently archived at
http://dir.gmane.org/gmane.science.nmr.relax.devel).  I hope that you
will
check all of this by running relax and seeing for yourself.  I also
have
a
few questions below.  Note that the code is in the relax_disp branch of
the
source code repository and that there is no official release at
http://www.nmr-relax.com/download.html yet.

I have documented the numeric models now present in relax in Chapter 10
of
the user manual.  I have compiled the current manual for the branch and
uploaded it to
http://download.gna.org/relax/manual/relax_disp_manual.pdf.
Please have a look and see what you think.  The text is extremely
minimal
and needs to be expanded.  If you have any suggestions or additional
material (equations, figures, etc., and in LaTeX preferably), it would
be
happily accepted.  For example equations for the matrices and the
propagation used.  The aim is to be a complete stand-alone relaxation
dispersion reference for the users, pointing them to all the relevant
literature at all possible points.  Cheers!

As a summary, from the relax_disp.select_model user function
description,
the numeric models are:

       'NS 2-site 3D':  The reduced numerical solution for the 2-site
Bloch-McConnell equations using 3D magnetisation vectors whereby the
simplification R20A = R20B is assumed.  Its parameters are {R20, ...,
pA,
dw, kex}.

       'NS 2-site 3D full':  The full numerical solution for the 2-site
Bloch-McConnell equations using 3D magnetisation vectors.  Its
parameters
are {R20A, R20B, ..., pA, dw, kex}.

       'NS 2-site star':  The reduced numerical solution for the 2-site
Bloch-McConnell equations using complex conjugate matrices whereby the
simplification R20A = R20B is assumed.  It has the parameters {R20,
...,
pA,
dw, kex}.

       'NS 2-site star full':  The full numerical solution for the
2-site
Bloch-McConnell equations using complex conjugate matrices with
parameters
{R20A, R20B, ..., pA, dw, kex}.

       'NS 2-site expanded':  The numerical solution for the 2-site
Bloch-McConnell equations expanded using Maple by Nikolai Skrynnikov.
It
has the parameters {R20, ..., pA, dw, kex}.

I would be interested to hear if you have suggestions for saner model
names.

In the auto-analysis in relax - the blackbox analysis script to make
the
lives of users extremely easy - the *full models are not turned on by
default.  If curious, you can see auto-analysis script in the file
auto_analyses/relax_disp.py.  The auto-analysis script is normally
hidden
from the user.  Also, please run the relax GUI with:

$ relax --gui

and start the relaxation dispersion analysis to see how it is
implemented.
Note that the auto-analyses are used for the operation of all GUI
analyses.
Using scripts, the relax user is free to do anything they wish and
implement
any protocol they can imagine (if useful for other users, these can
then
be
converted into an auto-analysis).

Part of the auto-analysis that I have implemented is the concept of
using
optimised parameters from a simpler or equivalent model to speed up
optimisation by avoiding an expensive grid search.  For the 'full'
models
above, I use the parameters of the simpler model and start with R20A =
R20B
= R20.  Then, during optimisation, R20A and R20B diverge.

For the model equivalence, this is a bit different.  I use the
optimised
parameter values from the analytic Carver and Richards model (CR72) as
the
starting parameters for the 'NS 2-site 3D', 'NS 2-site star' and 'NS
2-site
expanded' models.  This avoids the grid search and really speeds up the
optimisation by orders of magnitude.  It can be done because the CR72
results are often very similar to the numeric results.  And even a
failed
CR72 solution is close enough to the minima to be used as a starting
point
for the numeric models.

I have a few other implementation questions, but I'll send these in a
different message for saner threads in the mailing list archives.

Cheers,

Edward


P. S.  Actually, I might take some of this text into the relax manual
:)



--
Paul Schanda, Ph.D.
Biomolecular NMR group
Institut de Biologie Structurale Jean-Pierre Ebel (IBS)
41, rue Jules Horowitz
F-38027 Grenoble
France
+33 438 78 95 55
paul.schanda@xxxxxx
http://www.ibs.fr/groups/biomolecular-nmr-spectroscopy?lang=en


--
Paul Schanda, Ph.D.
Biomolecular NMR group
Institut de Biologie Structurale Jean-Pierre Ebel (IBS)
41, rue Jules Horowitz
F-38027 Grenoble
France
+33 438 78 95 55
paul.schanda@xxxxxx
http://www.ibs.fr/groups/biomolecular-nmr-spectroscopy?lang=en



--
Paul Schanda, Ph.D.
Biomolecular NMR group
Institut de Biologie Structurale Jean-Pierre Ebel (IBS)
41, rue Jules Horowitz
F-38027 Grenoble
France
+33 438 78 95 55
paul.schanda@xxxxxx
http://www.ibs.fr/groups/biomolecular-nmr-spectroscopy?lang=en

Re: Implementation of the Schanda, Lescanne and Marion numeric dispersion model code in relax.

Header

Content

Related Messages