Hi,
If you have 5 relaxation data sets, you can use the full_analysis.py script, but you will need to remove model tm8. This is the only model with 6 parameters, and the analysis might just work without it (the other tm0 to tm9 models may compensate adequately).
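If it helps, removing tm8 can be as simple as filtering it out of the model list before the analysis runs. This is only a sketch: the list name LOCAL_TM_MODELS is my assumption here, so check what your copy of full_analysis.py actually calls the variable holding the local tm models.

```python
# Illustrative sketch: with only 5 relaxation data sets, drop the
# 6-parameter model tm8 from the local tm model list.
# The name LOCAL_TM_MODELS is an assumption - check your copy of
# full_analysis.py for the actual variable name.
LOCAL_TM_MODELS = ["tm0", "tm1", "tm2", "tm3", "tm4",
                   "tm5", "tm6", "tm7", "tm8", "tm9"]

# Keep everything except tm8 (the only 6-parameter model).
LOCAL_TM_MODELS = [m for m in LOCAL_TM_MODELS if m != "tm8"]

print(LOCAL_TM_MODELS)
```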
I've looked at the script and it seems fine. I think the issue is that the model-free problem is not simply an optimisation issue. It is the simultaneous combination of global optimisation (mathematics) with model selection (statistics). You are not searching for the global minimum in one space, as in a normal optimisation problem, but for the global minimum across an enormous number of spaces simultaneously. I formulated the totality of this problem using set theory here http://www.rsc.org/Publishing/Journals/MB/article.asp?doi=b702202f or in my PhD thesis at http://eprints.infodiv.unimelb.edu.au/archive/00002799/.
In your script, the CONV_LOOP flag allows you to automatically loop over many global optimisations. Each iteration of the loop is the mathematical optimisation part, but the entire loop itself allows for the sliding between these different spaces. Note that this is a very, very complex problem involving huge numbers of spaces or universes, each of which consists of a large number of dimensions. There was a mistake in my Molecular BioSystems paper in that the number of spaces is really equal to n*m^l, where n is the number of diffusion models, m is the number of model-free models (10 if you use m0 to m9), and l is the number of spin systems. So if you have 200 residues, the number of spaces is on the order of 10 to the power of 200. The number of dimensions for this system is on the order of 10^2 to 10^3. So the problem is to find the 'best' minimum in 10^200 spaces, each consisting of 10^2 to 10^3 dimensions (the universal solution, or the solution in the universal set). The problem is just a little more complex than most people think!
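To make the n*m^l count concrete, here is a quick back-of-the-envelope calculation. The value n = 4 for the number of diffusion models is purely illustrative; only m = 10 and l = 200 come from the numbers above.

```python
# Worked example of the corrected space count n*m^l:
# n diffusion tensor models, m model-free models, l spin systems.
n = 4      # e.g. sphere, spheroid x2, ellipsoid (illustrative choice)
m = 10     # model-free models m0 to m9
l = 200    # residues, as in the example above

spaces = n * m**l

# 4 * 10^200 has 201 decimal digits, i.e. on the order of 10^200.
print(len(str(spaces)))
```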
So, my opinion of the problem is that the starting position of one of the 2 solutions is not good. In one (or maybe both) you are stuck in the wrong universe (out of billions of billions of billions of billions...), and you can't slide out of that universe using the looping procedure in your script. That's why I designed the new model-free analysis protocol used by the full_analysis.py script (http://www.springerlink.com/content/u170k174t805r344/?p=23cf5337c42e457abe3e5a1aeb38c520&pi=3 or the thesis again). The aim of this new protocol is that you start in a universe much closer to the one with the universal solution than you can ever get with the initial diffusion tensor estimate. Then you can easily slide, in less than 20 iterations, to the universal solution using the looping procedure. For a published example of this type of failure, see the section titled "Failure of the diffusion seeded paradigm" in the previous link to the "Optimisation of NMR dynamic models II" paper.
Does this description make sense? Does it answer all your questions?
Regards,
Edward
On Jan 10, 2008 5:49 PM, Douglas Kojetin <douglas.kojetin@xxxxxxxxx> wrote:
Hi All,
I am working with five relaxation data sets (r1, r2 and noe at 400 MHz; r1 and r2 at 600 MHz), and therefore cannot use the full_analysis.py protocol. I have obtained estimates for tm, Dratio, theta and phi using Art Palmer's quadric_diffusion program. I modified the full_analysis.py protocol to optimize a prolate tensor using these estimates (attached file: mod.py). I have performed the optimization of the prolate tensor using either (1) my original structure or (2) the same structure rotated and translated by the quadric_diffusion program. It seems that relax does not converge to a single global optimum, as different values of tm, Da, theta and phi are reported.
Using my original structure:
#tm = 6.00721299718e-09
#Da = 14256303.3975
#theta = 11.127323614211441
#phi = 62.250251959733312
Using the structure rotated/translated by the quadric_diffusion program:
#tm = 5.84350638161e-09
#Da = 11626835.475
#theta = 8.4006873071400197
#phi = 113.6068898953142
The only difference between the two calculations is the orientation of the input PDB structure file. For another set of five rates (different protein), there is a >0.3 ns difference in the converged tm values. Is my modified protocol (in mod.py) set up properly? Or is this a more complex issue in the global optimization? In previous attempts, I've also noticed that separate runs with differences in the estimates for Dratio, theta and phi also converge to different optimized diffusion tensor variables.
Doug
_______________________________________________
relax (http://nmr-relax.com)
This is the relax-users mailing list
relax-users@xxxxxxx
To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users