Re: Convergence of the full_analysis.py script.



Posted by Edward d'Auvergne on May 08, 2007 - 10:10:
Hi,

I should probably add a line to the script's documentation to say that
data from at least two field strengths is required.  Unfortunately
models tm4 = {tm, S2, te, Rex} and tm5 = {tm, S2, S2f, ts} are each
composed of 4 parameters, which is more than single field strength
data can support.  Therefore only models tm0, tm1, tm2, tm3, and tm9
are available for single field strength data.  As none of these is a
two timescale model-free model (Clore et al., 1990), any nanosecond
motions you have will be absorbed into the diffusion tensor, causing an
underestimation of the global correlation time.  The techniques
employed by the new protocol will therefore not work.
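
The parameter counting behind this can be sketched as follows.  This is
only an illustration and not relax code.  The parameter sets for tm4
and tm5 are those given above, the sets for the other models are
assumed from the relax model naming, and three data points per field
strength (R1, R2 and NOE) is also an assumption.

```python
# Parameter sets of the local tm models named in this message.  tm4 and tm5
# are as listed above; the remaining sets are assumed from the relax naming.
LOCAL_TM_MODELS = {
    "tm0": ("tm",),
    "tm1": ("tm", "S2"),
    "tm2": ("tm", "S2", "te"),
    "tm3": ("tm", "S2", "Rex"),
    "tm4": ("tm", "S2", "te", "Rex"),
    "tm5": ("tm", "S2", "S2f", "ts"),
    "tm9": ("tm", "Rex"),
}

# Assumption: R1, R2 and NOE are measured at each field strength.
DATA_PER_FIELD = 3


def usable_models(num_fields, models=LOCAL_TM_MODELS):
    """Return the models with no more parameters than data points per spin."""
    num_data = DATA_PER_FIELD * num_fields
    return sorted(name for name, params in models.items()
                  if len(params) <= num_data)


print(usable_models(1))  # single field strength
print(usable_models(2))  # two field strengths
```

With one field strength the 4 parameter models tm4 and tm5 drop out,
whereas with two field strengths all of the listed models pass the
count.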

With single field strength data, you will need to use the classic
approach of starting with an initial estimate of the diffusion tensor.
The program Tensor is useful for this (although relax could be used
instead).  You then fit the model-free models using a script similar
to 'mf_multimodel.py', removing the models with greater than 4
parameters.  Model elimination followed by model selection is then
used to select the best model-free model for each spin system (see the
'modsel.py' script).  Finally the diffusion tensor and all model-free
parameters are optimised together (no sample script exists for this,
although it is not too difficult to do).  Using this optimised
diffusion tensor, you then repeat the steps of model-free
optimisation, model-free model elimination, and model-free model
selection.  This is done separately for each of the four diffusion
tensors (sphere, prolate spheroid, oblate spheroid, and ellipsoid).
Model selection is used to select between these, and the very last
step is Monte Carlo simulations.
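
To make the order of operations explicit, here is a rough sketch which
simply lists the steps above in sequence.  It does not call relax or
Tensor at all.  The function name and the step wording are my own
shorthand for the description above, so treat it as a checklist rather
than a working analysis script.

```python
TENSORS = ("sphere", "prolate spheroid", "oblate spheroid", "ellipsoid")

# The three model-free steps which are run once and then repeated.
MODEL_FREE_CYCLE = (
    "model-free optimisation",
    "model-free model elimination",
    "model-free model selection",
)


def classic_protocol(tensors=TENSORS):
    """Return the ordered steps of the classic approach as plain strings."""
    steps = []
    for tensor in tensors:
        steps.append(f"[{tensor}] initial estimate of the diffusion tensor")
        steps.extend(f"[{tensor}] {step}" for step in MODEL_FREE_CYCLE)
        steps.append(f"[{tensor}] optimise the diffusion tensor together "
                     "with all model-free parameters")
        # Repeat the model-free steps using the optimised diffusion tensor.
        steps.extend(f"[{tensor}] {step}" for step in MODEL_FREE_CYCLE)
    steps.append("model selection between the diffusion tensors")
    steps.append("Monte Carlo simulations")
    return steps


for number, step in enumerate(classic_protocol(), start=1):
    print(f"{number:2d}. {step}")
```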

Even if this approach is taken I would still recommend using data at
more than a single field strength.  This allows you to more easily
differentiate artefactual Rex and nanosecond motions from true
internal motions in your system.  If your system is quite rigid then it
is less of a problem, but if it is exhibiting interesting dynamics
then additional data is very useful.  Oh, and temperature calibration
using MeOH, etc. is important not only between different experiments
but also between different magnets.

Cheers,

Edward



On 5/8/07, Clare-Louise Evans <pcxcle@xxxxxxxxxxxxxxxx> wrote:
Hi,

Thanks for your responses.  I didn't realise full_analysis.py was
insufficient for single field data, oops!

You mentioned in reply to my other message
(https://mail.gna.org/public/relax-users/2007-04/msg00006.html), that
performing model-free using models m1 and m5 was still ok for single
field data, although there could still be problems.  Currently, I've
only got access to a single field, can I simply edit the
full_analysis.py script to only run the optimisations using tm1 - tm5
and m1 - m5, or is it more complex than that?  Is the model-free.py ok
for use with single-field data?

Many thanks
Clare

Edward d'Auvergne wrote:
> Hi,
>
> As optimisation should converge very quickly when the differences
> between iterations are at such low significant figures, I'm unsure as to
> what is causing the problem.  For the spheroidal and ellipsoidal
> diffusion tensors convergence should occur within 10-15 iterations, and
> maximally after 20.  The reason could be that the script is trapped
> jumping between the same two models for each alternate iteration.  I
> have a feeling though that this is an issue caused by over-fitting and
> the parameter values are just drifting aimlessly through the parameter
> space.  Actually, thinking about this, I can almost guarantee that that
> is what is happening.  The data at only a single field strength
> (https://mail.gna.org/public/relax-users/2007-04/msg00006.html) is
> insufficient for the full_analysis.py script.  Sorry for not responding
> earlier.
>
> Cheers,
>
> Edward
>
>
>
> On Mon, 2007-04-30 at 15:11 +0100, Clare-Louise Evans wrote:
>
>> Dear Edward,
>>
>> Sorry for the thread hijack, but I noticed in the PS section of the
>> quote Hongyan has included below your comment regarding convergence.
>>
>> I am having problems with convergence at the moment when running the
>> full_analysis.py script.  I have run the optimisation of the diffusion
>> models on the same PC.  For the oblate, prolate and ellipsoid models I
>> have failed to reach convergence after 30+ rounds.  The sphere model
>> converged within 5 rounds.  However, when I look at the output the only
>> difference between the chi-squared and other parameter values is in the
>> 10th and later decimal places.  Surely, with having such small
>> differences between the values they can be considered to have
>> converged.  However, in your comment you state they have to be
>> identical.  If this really is the case then I'm not sure how to proceed
>> given that I'm failing to reach convergence on these 3 models?
>>
>> Apologies if this ties in to the question I asked yesterday regarding
>> full_analysis.py.
>>
>> Kind regards
>> Clare
>>
>> Hongyan Li wrote:
>>
>>> Dear Edward,
>>> Thanks for your early suggestion regarding different model selections. I have
>>> been busy on other stuff and only recently got time to focus on this subject
>>> again.
>>> I have run all the diffusion models e.g. isotropic, prolate, oblate and
>>> ellipsoid and model selections were made within each model (in aic directory).
>>> I am not sure how to write a script to select different models using
>>> 'model_selection()'. Could you please do me a favor?
>>> Your help is highly appreciated!
>>> Best wishes,
>>> Hongyan
>>>
>>> Quoting Edward d'Auvergne <edward.dauvergne@xxxxxxxxx>:
>>>
>>>
>>>
>>>> Hi,
>>>>
>>>> To compare the results what you need to employ is a technique from the
>>>> statistical field of model selection.  The spherical diffusion
>>>> (isotropic) + all model-free models of all selected residues is one
>>>> single mathematical model.  The prolate and oblate spheroids (prolate
>>>> and oblate axially symmetric anisotropic diffusion tensors) + all
>>>> model-free models, and the ellipsoid (fully anisotropic or three
>>>> different eigenvalues) + all model-free models, are three additional
>>>> mathematical models.  Therefore to compare these four different models
>>>> you need to select the model which best represents your relaxation
>>>> data.  These models are, however, not nested and therefore cannot be
>>>> compared using ANOVA F-tests!  Firstly the three types of diffusion
>>>> tensor are not nested (there is a reference from Dominique Marion's
>>>> group in which they say ANOVA statistics cannot be used but I can't
>>>> find it at the moment (although it shouldn't be too hard to track
>>>> down, it's related to Tensor)).  Secondly the model-free models
>>>> selected will be different between the four models.  Hence chi-squared
>>>> and F-tests cannot be used.
>>>>
>>>> A useful reference (I'm not at all biased ;) for this problem is my
>>>> paper d'Auvergne, E. J. and Gooley, P. R. (2003), (see
>>>> http://www.nmr-relax.com/refs.html for the full reference).  On page
>>>> 37 at the end of that paper I discuss how AIC model selection is
>>>> perfect for selecting between these non-nested models.  The AIC
>>>> criterion is still
>>>>
>>>> AIC = chi2 + 2k,
>>>>
>>>> however chi2 is the minimised chi-squared value for the complete model
>>>> and k is the sum of the number of diffusion parameters and number of
>>>> model-free parameters for all spin systems.  BIC model selection is
>>>> likely to work quite well as well.  If you have four runs, one for
>>>> each of the diffusion models, then the relax user function
>>>> 'model_selection()' is designed to select between these models.  I
>>>> hope this helps and hasn't been too biased.
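
As a small worked illustration of the AIC comparison described above,
the sketch below picks the run with the lowest AIC = chi2 + 2k.  It is
plain Python and not the relax 'model_selection()' user function, and
the chi2 and k values are made up purely for the example.

```python
def aic(chi2, k):
    """AIC = chi2 + 2k, with k the total number of parameters."""
    return chi2 + 2 * k


def select_by_aic(runs):
    """Return the name of the run with the lowest AIC.

    runs maps a run name to a (chi2, k) tuple for the complete model,
    i.e. diffusion tensor plus all model-free models.
    """
    return min(runs, key=lambda name: aic(*runs[name]))


# Hypothetical numbers, for illustration only.
runs = {
    "sphere": (1250.0, 151),
    "prolate": (1190.0, 154),
    "oblate": (1215.0, 154),
    "ellipsoid": (1188.0, 156),
}

for name, (chi2, k) in runs.items():
    print(f"{name:10s} chi2 = {chi2:7.1f}  k = {k:3d}  AIC = {aic(chi2, k):7.1f}")
print("selected:", select_by_aic(runs))
```

In this made-up case the ellipsoid has the lowest chi2 but the prolate
spheroid is selected, as the two extra parameters are not justified by
the small drop in chi2.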
>>>>
>>>> Cheers,
>>>>
>>>> Edward
>>>>
>>>>
>>>> P.S.  Prior to model selection between the diffusion models, the
>>>> diffusion models must have fully converged.  Multiple iterations of
>>>> optimisation of the model-free models, AIC model selection, and
>>>> optimisation of all parameters together (diffusion tensor + model-free
>>>> parameter of all residues) must be executed.  Convergence is when two
>>>> iterations possess identical chi-squared values, identical model-free
>>>> models, and identical parameter values.
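
The strict convergence test described in the P.S. could be written
along these lines.  This is a sketch only: the dictionary layout and
the key names 'chi2', 'models' and 'params' are hypothetical and are
not taken from full_analysis.py.

```python
def has_converged(previous, current):
    """Strict test: chi2, model-free models and parameters all identical."""
    return (previous["chi2"] == current["chi2"]
            and previous["models"] == current["models"]
            and previous["params"] == current["params"])


# Two hypothetical iterations differing only in the 10th decimal place.
iter_a = {"chi2": 1234.5678901234,
          "models": {"spin 1": "m2", "spin 2": "m4"},
          "params": {"spin 1": (0.85, 2.0e-11), "spin 2": (0.80, 1.5e-11, 1.2)}}
iter_b = dict(iter_a, chi2=1234.5678901235)

print(has_converged(iter_a, iter_a))  # identical iterations
print(has_converged(iter_a, iter_b))  # chi2 differs in the 10th decimal
```

Under this test a difference in the 10th decimal place still counts as
not converged.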
>>>>
>>>>
>>>> On 3/2/07, Hongyan Li <hylichem@xxxxxxxxxxxx> wrote:
>>>>
>>>>
>>>>> Dear relax users,
>>>>> I have managed to use Relax to run my dynamics data by both isotropic and
>>>>> axial-oblate models. My question is how to compare the results, by chi-square??
>>>>> what is the criteria to make judgment that which residues is better simulated
>>>>> with which model?
>>>>>
>>>>> Thanks for your kind help!
>>>>>
>>>>> Best wishes,
>>>>>
>>>>> Hongyan
>>>>>
>>>>> Dr. Hongyan Li
>>>>> Department of Chemistry
>>>>> The University of Hong Kong
>>>>> Pokfulam Road
>>>>> Hong Kong
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> relax (http://nmr-relax.com)
>>>>>
>>>>> This is the relax-users mailing list
>>>>> relax-users@xxxxxxx
>>>>>
>>>>> To unsubscribe from this list, get a password
>>>>> reminder, or change your subscription options,
>>>>> visit the list information page at
>>>>> https://mail.gna.org/listinfo/relax-users
>>>>>
>>>>>
>>>>>
>>> Dr. Hongyan Li
>>> Department of Chemistry
>>> The University of Hong Kong
>>> Pokfulam Road
>>> Hong Kong
>>>
>>>
>>>
>>>
>> This message has been checked for viruses but the contents of an attachment
>> may still contain software viruses, which could damage your computer system:
>> you are advised to perform your own checks. Email communications with the
>> University of Nottingham may be monitored as permitted by UK legislation.
>>
>>
>>
>
>







