Re: Model selection and local_tm -- April 02, 2008

Hi again.

On Mon, 2008-03-31 at 11:28, Edward d'Auvergne wrote:

Hi,

Please see below for my replies.

On Fri, Mar 28, 2008 at 2:59 PM, Carl Diehl <Carl.Diehl@xxxxxxxxx> wrote:

Hi.
 I used the full_analysis.py script for testing and evaluating relax in
 comparison with published data.

 The system was calcium-loaded Calbindin D9k, with R1 & NOE at 600 MHz
 and R2, R1 & NOE at 500 MHz. The relaxation data is of very high
 quality. The model selection was done using home-written software, so no
 ordinary model selection à la Modelfree.


What do you mean by model selection?  Did you use a different
technique from the statistical field of mathematical modelling and
model selection?  Did you use a frequentist method, a Bayesian method,
or hypothesis testing methods (the last of which is described in
textbooks from this field of knowledge as being very, very bad)?  Did
you use this to select between model-free models, diffusion models, or
the combined global model (model-free + diffusion)?

To clarify, in the example above I've only used published relaxation
values for D9k (Johan Kördel et al, Biochemistry 1992, 31, 4856-4866).
This is a pretty old article and I'm mainly using the data for comparing
and validating relax vs Modelfree. In other words none of the original
model selection was done by me. 
The diffusion model was isotropic.  
The model selection was done by first optimising the global sum of
squares vs model m2.
Residues with an R2/R1 ratio below the average value were optimised with
model m5.

 After running full_analysis.py (removing excess models),


By 'removing excess models', do you mean eliminating failed models?
What is an excess of models?

Models with more parameters than the number of experimental values.
MF_MODELS = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9']
LOCAL_TM_MODELS = ['tm0', 'tm1', 'tm2', 'tm3', 'tm4', 'tm5', 'tm6',
'tm7', 'tm9']
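
The filtering rule described here (drop any model with more parameters than experimental values) can be sketched as below. This is only an illustration: the parameter counts are assumed from the usual definitions of the model-free models m0-m9, with one extra parameter for the local tm in the tm0-tm9 variants, and the names N_DATA, MF_PARAM_COUNT, TM_PARAM_COUNT and allowed_models are made up for this sketch rather than taken from full_analysis.py.

```python
# Five relaxation data points per residue: R1 and NOE at 600 MHz plus
# R1, R2 and NOE at 500 MHz.
N_DATA = 5

# Assumed number of fitted parameters for each model-free model
# (e.g. m1 = {S2}, m2 = {S2, te}, m8 = {S2, tf, S2f, ts, Rex}).
MF_PARAM_COUNT = {
    'm0': 0, 'm1': 1, 'm2': 2, 'm3': 2, 'm4': 3,
    'm5': 3, 'm6': 4, 'm7': 4, 'm8': 5, 'm9': 1,
}

# The tm* models fit a local correlation time as one extra parameter.
TM_PARAM_COUNT = {'t' + name: k + 1 for name, k in MF_PARAM_COUNT.items()}


def allowed_models(param_count, n_data):
    """Keep the models with no more parameters than data points."""
    return [name for name, k in param_count.items() if k <= n_data]


MF_MODELS = allowed_models(MF_PARAM_COUNT, N_DATA)
LOCAL_TM_MODELS = allowed_models(TM_PARAM_COUNT, N_DATA)
print(MF_MODELS)
print(LOCAL_TM_MODELS)
```

Under these assumptions only tm8 (six parameters against five data points) is dropped, which matches the two lists above.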

model selection
gives me local_tm as the best model for the diffusion tensor.

Previous calculations (both 15N and 13C relaxation data) indicate that the
diffusion tensor is, to a good approximation, isotropic (only a very
slight anisotropy).

What do 'good', 'slight', etc. really mean?  From a mathematical
modelling perspective, I don't understand these.  Is a Da of
1.001+/-0.001 slight, or 1.2+/-0.01, or 1.6+/-0.3?  From my
experience, my opinion is that unless all the bond vectors point in
the same direction, isotropy will never be statistically significant
over the spheroids and ellipsoids.

Dpar/Dper = 1.08 +/- 0.001. As you say, the axially symmetric diffusion
model is statistically significant compared to the isotropic model (I
checked after digging up the paper).
An anisotropy of 1.08 has only a small effect on the relaxation rates
(R1 +/- 0.05, ~R2 +/- ~0.25, NOE +/- ~0.01), so an isotropic diffusion
tensor is a good approximation for most calculations.

 Looking at the local_tm values for the secondary structure, most of them
 have a local_tm which is similar to the isotropic tensor.


If you used AIC or BIC model selection between the two models, what
are the chi-squared values, criteria values, and parameter numbers for
each model?  In the test, did you compare the local_tm model to the
isotropic model?  Or did you compare the local_tm model to the
isotropic, 2 spheroids, and ellipsoid simultaneously?  Again, what does
the qualifier 'most' mean?  And how do the non-conforming residues not
conform?
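
For reference, the quantities asked for here combine as follows. The sketch uses the common chi-squared forms AIC = chi2 + 2k and BIC = chi2 + k ln(n), where k is the number of parameters and n the number of data points of the global model. The chi-squared values, parameter counts and n below are hypothetical placeholders, not results from the D9k analysis, and the function names are illustrative only.

```python
import math


def aic(chi2, k):
    """Akaike criterion in its chi-squared form."""
    return chi2 + 2 * k


def bic(chi2, k, n):
    """Bayesian criterion in its chi-squared form."""
    return chi2 + k * math.log(n)


def select(models, n, criterion='AIC'):
    """Return the name of the model with the lowest criterion value."""
    def score(item):
        chi2, k = item[1]
        return aic(chi2, k) if criterion == 'AIC' else bic(chi2, k, n)
    return min(models.items(), key=score)[0]


# Hypothetical (chi2, k) pairs for two global models fitted to n data points.
models = {'local_tm': (150.0, 200), 'sphere': (260.0, 151)}
n = 350
for name, (chi2, k) in models.items():
    print(name, aic(chi2, k), round(bic(chi2, k, n), 1))
print(select(models, n, 'AIC'), select(models, n, 'BIC'))
```

With these made-up numbers the two criteria pick different models, which is why the chi-squared values and parameter counts are needed alongside the name of the selected model.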

I used the full_analysis.py script and ran all diffusion models until
convergence according to protocol within the script. 
I will have to get back to you with the exact values of the model
selection.

 Is the local_tm model always correct? For a well folded protein, one
 would expect that the local_tm model should be invalid?


As this is mathematical modelling, there is no such thing as an
invalid model.  By definition of the term model, a model is an
approximation of something far more complex.  Therefore there is only
a grey scale of how well a model approximates reality (of course we
can never know what is reality).  For a folded, single domain,
globular protein (with no significant, floppy loops), then the sphere,
spheroids, or ellipsoid should be a good description and the local_tm
model will not be selected.  Could you reproduce the model selection
statistics for all 4?  If the local_tm model is chosen, then it is an
indication that something is not normal - quite possibly interesting
dynamics.  Unfortunately, there's not enough information in your post
for me to tell you exactly what happened.  One point that concerns me
is that you only have an R2 measurement at a single field strength.
Note that to differentiate between chemical exchange effects (which
are scaled quadratically with field strength) and internal nanosecond
motions (constant at different fields) and anisotropic tumbling of the
molecule (again constant), you really need the R2 collected at 2
fields.  But this may not necessarily be the reason you're seeing the
local_tm model being selected.  I'm sorry that I am not yet able to
give you a clear answer.
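
To illustrate the point about two fields: in the fast-exchange limit the exchange contribution Rex scales with the square of the field strength, so it changes between spectrometers in a predictable way. A minimal sketch, with a made-up Rex value and a hypothetical helper name:

```python
def scale_rex(rex, field_from_mhz, field_to_mhz):
    """Scale a fast-exchange Rex contribution (s^-1) between two fields."""
    return rex * (field_to_mhz / field_from_mhz) ** 2


rex_500 = 2.0  # hypothetical Rex at 500 MHz, in s^-1
rex_600 = scale_rex(rex_500, 500.0, 600.0)
print(round(rex_600, 2))
```

Under this assumption a 20% increase in field gives a 44% larger Rex, a change that a second R2 data set can expose.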

Chemical exchange is not an issue for D9k. D9k has floppy ends and a
loop in the middle; the rest of the sequence is rigid (S2 ~ 0.8-0.9).
Instead of measuring R2 at two fields, one option is to measure the 1H-15N
dipolar/CSA transverse and longitudinal cross-correlated relaxation rates
and extract R2 without chemical exchange. The Kay group also recently
published an article describing a way to measure R2 without chemical
exchange.

BTW, is it possible to incorporate the above types of relaxation rates
as another experimental value to fit against?

Regards 
-- 
Carl Diehl
Department of Biophysical Chemistry
Lund University
046-2220384
