Re: Full analysis issue



Posted by Edward d'Auvergne on May 27, 2008 - 14:31:
On Fri, May 9, 2008 at 5:45 PM, Sébastien Morin
<sebastien.morin.1@xxxxxxxxx> wrote:
Hi Ed,

First, thanks a lot for this help !

Second, I have to apologize for the length of this mail...


Ok...


My system is a 271 residue globular protein (230 residues with data at 3
fields = 2070 observables). A homologous protein is being studied in
the lab and analysing relaxation data using either the diffusion seeded
approach in ModelFree or the new protocol of the full_analysis script
yields similar results with a high mean S2 (~0.90) and a few Rex (15-20)
throughout the protein. Thus, the problem here with my system is
probably external to the approaches and the user...


Ok...


I tried using ModelFree with relax (script palmer.py : ModelFree as an
engine for optimization, but relax for automating and AIC model
selection) and got results similar to those from the full_analysis.py
approach... For the two situations tested (see below), no oscillation
occurred. Here are some stats :

=======================================================================
Approach        Diff     Iter  Chi2    AIC     Nb_Rex  <Rex>_+-_StdDev
==============  =======  ====  ======  ======  ======  ===============
palmer          prolate  15    ~12990  ~14060  182     1.602_+-_0.770

palmer_hybrid   prolate  12    ~ 2715  ~ 3660  129     0.902_+-_0.571

full            prolate   5    ~13090  ~14125  181     1.671_+-_0.782

full_hybrid     prolate   7    ~ 2750  ~ 3720  145     2.431_+-_1.546
=======================================================================

It seems that the new protocol is not the source of the problem.
Moreover, it is obvious from the AIC value (and also from the diffusion
tensor details, not shown here) that the hybrid (without the highly
flexible C-terminus) is a better description of the system. However, as
is seen here, the Rex values seem quite small and there are far too many
Rex (> 50 % of all residues)... These may thus not be significant, but
then, how can one exclude such "artifacts" when doing iterative
optimization (with either approach)..? How can one decide to choose
a model other than one with Rex when iterating to find the best diffusion
tensor..?

This Rex all over the place is an indication of trouble.  Especially
if the local tm models don't show the same Rex pattern!  These are
likely to be the artificial Rex values as described in Tjandra et al.,
1996 (also discussed in depth in d'Auvergne and Gooley, 2007 and 2008b
(http://dx.doi.org/10.1039/b702202f and
http://dx.doi.org/10.1007/s10858-007-9213-3)).  The problem is likely
that the diffusion tensor description is inadequate.


Ok...


Maybe, as you proposed, the problem arises because of the crystal
structure being inappropriate for describing the solution structure...
The crystal structure I use has a resolution of 1.95 A. Protons were not
visible but were added using CHARMM.  Moreover, different snapshots from
molecular mechanics in CHARMM were also tested to see if fluctuations in
NH bond orientation could yield better optimizations... It was not the case.

I'll try to assess this issue of the crystal structure by running tests
(with palmer.py and also full_analysis.py approaches) using a different
structure (a point mutant) also from crystallography... The
resolution of this structure is also quite low (1.75 A). Anyway, I don't
have a choice since no solution structure exists, nor any better crystal
structure... If the crystal structure is the cause of this
problem, what can one do ? Is one obliged to do the analysis with a
local_tm or a sphere diffusion tensor ? Is it a waste if one does so with
good quality data at three fields ???

The underlying structure is only one of a few issues which may trigger
this problem.  I'll describe more below.  But if this structure is not
representative of solution conditions, then the local tm and spherical
diffusion models are all you can use.  Because of its construction, the
local tm model is highly sensitive to noise and can be quite unstable,
as can be seen in the plot of the tm value, and hence this model
significantly benefits from more and higher quality data.  This will
improve the dynamic description obtained from the local tm model, and
hence help any next steps in the analysis.


Ok...


What about the AIC for the local_tm model VS the ellipsoid in the
full_analysis approach ? Here are some stats :

=======================================================================
Approach     Models  Diff       AIC
===========  ======  =========  ======
full         m1-m5   local_tm   ~ 4510
full         m1-m5   ellipsoid  ~12710

full         m0-m9   local_tm   ~ 4410
full         m0-m9   ellipsoid  ~ 5210

Ok, this huge differential decrease in AIC values between the 2
diffusion models using the different sets of model-free models is
probably an indication that the ellipsoid diffusion model with
model-free models m0 to m9 is absorbing an artifact.  This is probably
artificial Rex or ts values absorbing the inadequacy of the diffusion
model, a bit like the artificial Rex of Tjandra et al., 1996 but with
the diffusion model being more complex than the ellipsoid because of
the C-terminus (rather than the artificial Rex when using spherical
diffusion rather than a spheroid or ellipsoid).


full_hybrid  m1-m5   local_tm   ~ 4510
full_hybrid  m1-m5   ellipsoid  ~ 4720 *

full_hybrid  m0-m9   local_tm   ~ 4410
full_hybrid  m0-m9   ellipsoid  ~ 4570 **

So excluding the C-terminal tail fixes this, but still the ellipsoid
is insufficient.


=======================================================================
*  not converged after 35 rounds (oscillates)
** not converged after 26 rounds (oscillates)
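
The AIC comparison in the table above can be sketched in a few lines.
This is only an illustration, assuming the usual chi-squared form
AIC = chi2 + 2k (k being the number of fitted parameters) and using the
approximate AIC values quoted in the table.  The function and
dictionary names are hypothetical and are not part of relax.

```python
def aic(chi2, k):
    """AIC for a chi-squared fit with k parameters: AIC = chi2 + 2k."""
    return chi2 + 2 * k


def select_model(aic_values):
    """Return the name of the diffusion model with the lowest AIC."""
    return min(aic_values, key=aic_values.get)


# Approximate AIC values copied from the table in this message.
RUNS = {
    ("full", "m1-m5"): {"local_tm": 4510, "ellipsoid": 12710},
    ("full", "m0-m9"): {"local_tm": 4410, "ellipsoid": 5210},
    ("full_hybrid", "m1-m5"): {"local_tm": 4510, "ellipsoid": 4720},
    ("full_hybrid", "m0-m9"): {"local_tm": 4410, "ellipsoid": 4570},
}

for (approach, models), values in RUNS.items():
    best = select_model(values)
    # Gap between the ellipsoid and the local tm model for this run.
    gap = values["ellipsoid"] - values["local_tm"]
    print(f"{approach:12s} {models:6s} selected={best:9s} gap={gap}")
```

With these numbers the local_tm model is selected in all four runs, and
the gap shrinks from 8200 to 800 when going from m1-m5 to m0-m9 for the
full protein, and to 210 and 160 for the hybrid.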

As said before, the hybrid improves the description of the diffusion,
however, there is still a problem : first, the local_tm diffusion is
still selected over the ellipsoid (even if the difference is now
smaller), second, the ellipsoid optimizations don't converge and
oscillate...

The oscillation is not really a problem.  But somehow I (or anyone
who's interested) should try to add a method or algorithm to relax to
detect this oscillatory swinging between different universes around
the 'universal' solution to stop any automated procedures.
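
One way such an oscillation detector could work is sketched below.
This is a hypothetical illustration, not an existing relax function: it
assumes that each round of the iterative protocol can be summarised as a
hashable snapshot (for example, a tuple of the model selected for every
residue, possibly together with rounded diffusion parameters), and it
reports the first time a snapshot recurs.

```python
def find_cycle(history):
    """Return (start, period) for the first recurring snapshot, else None.

    `history` is a list of hashable snapshots, one per iteration round.
    A period of 1 means two consecutive rounds are identical (ordinary
    convergence); a period greater than 1 means the protocol is cycling
    between several states.
    """
    seen = {}  # snapshot -> round index where it first appeared
    for round_index, snapshot in enumerate(history):
        if snapshot in seen:
            start = seen[snapshot]
            return start, round_index - start
        seen[snapshot] = round_index
    return None


# Hypothetical history: selected model for two residues over four rounds.
history = [
    ("m2", "m4"),
    ("m4", "m4"),
    ("m2", "m5"),
    ("m4", "m4"),  # same as round 1, so the protocol has entered a cycle
]
print(find_cycle(history))
```

An automated procedure could stop as soon as a period greater than 1 is
reported, rather than running until a fixed maximum number of rounds.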

The selection of the local tm says one thing - that the most complex
hybrid model using an ellipsoidal core is insufficient.  The diffusion
is probably more complex.  Along these lines, there is one issue which
would cause the diffusion tensor to be more complex than a simple
isolated particle tumbling as an ellipsoid (with no large concerted
internal motions such as inter-domain dynamics).  That is a phenomenon
first investigated in Schurr et al., 1994 at the back of that paper,
and that is partial dimerisation.  If you have 5% dimer in the NMR
tube, even a non-specific dimerisation, then this could cause the
problems you are seeing.  The diffusion tensor would then be a
superposition of 2 very different ellipsoidal diffusion tensors, say
D1 and D2 weighted by the populations p1 and p2 (the isotropic and
spheroid tensors could be simplifications of D1 and/or D2).  Then the
single ellipsoid would be insufficient.  The local tm diffusion model
could absorb p1.D1+p2.D2 into the single tm (very roughly, considering
that each vector then experiences up to 2*5 global correlation times) and
return a slightly better picture of the internal dynamics, and hence
be chosen by AIC model selection.  This has been investigated
elsewhere by looking at concentration dependence, but I can't remember
off the top of my head the references right now.
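
The effect of a small dimer population can be illustrated with a
deliberately simplified sketch.  The assumptions are loud ones: both
species are treated as isotropic rigid rotors (the spherical
simplification of D1 and D2), exchange between them is assumed slow
compared to tumbling so that the spectral densities add with weights p1
and p2, the monomer tm of 13 ns is taken from the local tm values
quoted later in this message, and the dimer tm of 26 ns is a guess.
All names are hypothetical.

```python
import math


def j_iso(omega, tm):
    """Isotropic rigid-rotor spectral density (2/5) tm / (1 + (omega tm)^2)."""
    return 0.4 * tm / (1.0 + (omega * tm) ** 2)


def j_mixture(omega, populations, tms):
    """Population-weighted sum of isotropic spectral densities."""
    return sum(p * j_iso(omega, tm) for p, tm in zip(populations, tms))


P = (0.95, 0.05)                  # assumed monomer and dimer populations
TMS = (13e-9, 26e-9)              # assumed correlation times in seconds
OMEGA_N = 2 * math.pi * 60.8e6    # 15N Larmor frequency at 600 MHz (rad/s)

# The single tm that reproduces the mixture exactly at zero frequency.
tm_apparent = j_mixture(0.0, P, TMS) / 0.4

# The same single tm no longer matches the mixture at the 15N frequency.
j_mix = j_mixture(OMEGA_N, P, TMS)
j_single = j_iso(OMEGA_N, tm_apparent)
rel_error = (j_mix - j_single) / j_mix

print(f"apparent tm at J(0): {tm_apparent * 1e9:.2f} ns")
print(f"relative mismatch at omega_N: {rel_error:.3f}")
```

Under these assumptions no single correlation time fits the mixture at
all frequencies, so the mismatch (roughly 2% here) has to be absorbed by
other parameters of the model.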

I'm not sure if this would have an effect, but maybe the large
movement of the C-terminus is modulating the diffusion of the core as
well, shifting it away from ellipsoidal behaviour.  There are many
other dynamic events which could cause the full single ellipsoid
equations to be insufficient.  Whatever is happening I think you are
walking on the cutting edge.  The theory you need for your current
data set does not exist, as far as I know, let alone has been properly
tested.

Oh, one other thing that it could be (although I think I remember you
saying that that wasn't the case already) is that a number of
different NMR samples were used, and that the protein and/or salt
concentration was not 100% identical in each.  Although not in your
case, this could also be caused by improper temperature calibration,
i.e. not using MeOH or another temp reference to calibrate different
experiments and different spectrometers, or temperature compensatory
blocks at the start of the R2, or single scan interleaving, etc.


Now, what about the Rex and slow motions (ts) in the local_tm diffusion
? Here are some stats :

=======================================================================
Approach     Models  Diff       Nb_Rex  Nb_ts
===========  ======  =========  ======  =====
full         m1-m5   local_tm    58      30
full         m1-m5   ellipsoid  171      21

full         m0-m9   local_tm    63      41
full         m0-m9   ellipsoid  144      49

full_hybrid  m1-m5   local_tm    58      30
full_hybrid  m1-m5   ellipsoid  142 *    28

full_hybrid  m0-m9   local_tm    64      41
full_hybrid  m0-m9   ellipsoid  145 **   50
=======================================================================
*  not converged after 35 rounds (oscillates)
** not converged after 26 rounds (oscillates)

Maybe the flexible tail is causing your sample to oscillate, swimming
around like a cork-screw, in the NMR tube ;)  Seriously though, the
oscillation isn't a worry but the Rex parameter count is probably
demonstrating that this Rex is artificial, caused by the full
diffusion description being insufficient.


As you can see, there are way more Rex in the ellipsoid, which probably
means that there is a problem with the diffusion tensor... For the slow
ns motions, there doesn't seem to be significantly more in the ellipsoid
description... Moreover, the sphere diffusion tensor, which is not
NH-vector-orientation-dependent, also has a high degree of Rex, similar
ns motions and AIC values similar (just a bit higher) to what is
observed for the ellipsoid :

=======================================================================
Approach     Models  Diff       Nb_Rex  Nb_ts  AIC
===========  ======  =========  ======  =====  ======
full         m1-m5   sphere     191      20    ~15200

full         m0-m9   sphere     155      47    ~ 5640

full_hybrid  m1-m5   sphere     145      31    ~ 5190

full_hybrid  m0-m9   sphere     153      47    ~ 5030
=======================================================================

Should the sphere diffusion tensor yield results similar to the local_tm
? If there is a major difference between those two, does it mean that
concerted motions may be present and that a hybrid model could solve
the issue ?

No, the sphere should show artificial motions all over the place.  I
could guarantee that for all molecules, nothing tumbles truly as a
sphere!


Ok...


Now, are there concerted motions apparent from the local_tm results..? I
plotted the results from the local_tm run after AIC model selection
(Would it be better if I looked at the local_tm run for model 1 or 2
only ? Can model selection here bias the results ?) and couldn't find
any obvious link between different parts of the protein for one or more
parameters among S2, S2f, S2s, Rex, te, tf, ts, chi2.

I can't remember if anyone has tried to isolate concerted motions from
model-free results.  I wouldn't use the local tm values though as
these simply indicate the shape of the diffusion tensor (well very
roughly considering that the single tm mimics in reality a number of
global correlation times, i.e. 5 in the pure ellipsoid).


However, a small relation seems to exist between the local_tm distribution
and the domain (The inverse is seen for the S2, but to a lesser extent.
When looking at the tm1 run, the local_tm is also a bit smaller in the
same domain [a small difference of 0.5-1.0 ns for values of ~13 ns], but
the S2 are similar, which points to a difference for the two domains).

My protein is globular, but has two structural domains side by side, an
all alpha domain and an alpha/beta domain. In the homologous protein,
there seems to exist Rex at the interface (which spans a surface of four
10-residue beta strands, which is big and is expected to be quite
rigid). Maybe the two domains are a bit different in my system which
could cause the problems I encounter. I'll try to assess this by running
full_analysis runs on the different domains alone...

Because you have 2 domains, I would try the analysis like I did in
Horne et al., 2007 (http://dx.doi.org/10.1016/j.jmb.2007.05.067).
Maybe try a hybrid with the local tm values for the C-terminus, and 2
different diffusion tensors for each domain separately.  Maybe you
have inter-domain motions which would also explain why the single
ellipsoid is insufficient.  As long as the hybrid diffusion model
covers exactly the same spin systems as in all other models it is
compared to, you can construct highly complex models with relax.


Ok...


Well, I'm out of ideas now... If you have any ideas that could help, these
will be more than welcome !

I think I'm out of ideas too!  For now anyway.


I hope this discussion can also help other people solve difficulties
encountered in their analysis or help them get more information out of
their system...

Thanks a lot once more !

Cheers !



Sébastien


P.S. Again, sorry for the length of the mail...

That's not a problem.  I hope some of my suggestions will be useful.

Regards,

Edward


