Re: influence of pdb orientation on model-free optimization?



Posted by Douglas Kojetin on January 10, 2008 - 20:13:
Hi Edward,

Thanks for the response. So, with 5 relaxation data sets, only tm8 should be removed -- no need to remove m8 as well? Also, if only 4 relaxation data sets were available, could {tm6-8 and m8} be removed to use the full_analysis.py protocol?

Thanks,
Doug


On Jan 10, 2008, at 1:31 PM, Edward d'Auvergne wrote:

Hi,

If you have 5 relaxation data sets, you can use the full_analysis.py
script, but you will need to remove model tm8.  This is the only model
with 6 parameters, and the analysis without it might just work
(the remaining tm0 to tm9 models may compensate adequately).
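In the script this amounts to deleting 'tm8' from the local tm model list before the analysis runs.  A minimal sketch in Python (the variable name LOCAL_TM_MODELS is an assumption — check your copy of full_analysis.py for the actual name of the list):

```python
# Hypothetical model list as it might appear in full_analysis.py
# (the variable name is an assumption; adapt it to your script).
LOCAL_TM_MODELS = ['tm0', 'tm1', 'tm2', 'tm3', 'tm4',
                   'tm5', 'tm6', 'tm7', 'tm8', 'tm9']

# Drop tm8, the only 6-parameter model, so 5 data sets suffice.
LOCAL_TM_MODELS = [m for m in LOCAL_TM_MODELS if m != 'tm8']

print(LOCAL_TM_MODELS)
```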

I've looked at the script and it seems fine.  I think the issue is
that the model-free problem is not simply an optimisation issue.  It
is the simultaneous combination of global optimisation (mathematics)
with model selection (statistics).  You are not searching for the
global minimum in one space, as in a normal optimisation problem, but
for the global minimum across an enormous number of spaces
simultaneously.  I formulated the totality of this problem using set
theory here http://www.rsc.org/Publishing/Journals/MB/article.asp?doi=b702202f
or in my PhD thesis at
http://eprints.infodiv.unimelb.edu.au/archive/00002799/.  In your
script, the CONV_LOOP flag allows you to automatically loop over many
global optimisations.  Each iteration of the loop is the mathematical
optimisation part.  But the entire loop itself allows for the sliding
between these different spaces.  Note that this is a very, very
complex problem involving huge numbers of spaces or universes, each of
which consists of a large number of dimensions.  There was a mistake
in my Molecular BioSystems paper in that the number of spaces is
really equal to n*m^l where n is the number of diffusion models, m is
the number of model-free models (10 if you use m0 to m9), and l is the
number of spin systems.  So if you have 200 residues, the number of
spaces is on the order of 10 to the power of 200.  The number of
dimensions for this system is on the order of 10^2 to 10^3.  So the
problem is to find the 'best' minimum in 10^200 spaces, each
consisting of 10^2 to 10^3 dimensions (the universal solution or the
solution in the universal set).  The problem is just a little more
complex than most people think!!!

So, my opinion of the problem is that the starting position of one of
the 2 solutions is not good.  In one (or maybe both) you are stuck in
the wrong universe (out of billions of billions of billions of
billions....).  And you can't slide out of that universe using the
looping procedure in your script.  That's why I designed the new
model-free analysis protocol used by the full_analysis.py script
(http://www.springerlink.com/content/u170k174t805r344/?p=23cf5337c42e457abe3e5a1aeb38c520&pi=3
or the thesis again).  The aim of this new protocol is to start in a
universe much closer to the one containing the universal solution
than you can ever get with the initial diffusion tensor estimate.
Then you can easily slide, in less than 20 iterations, to the
universal solution using the looping procedure.  For a published
example of this type of failure, see the section titled "Failure of
the diffusion seeded paradigm" in the previous link to the
"Optimisation of NMR dynamic models II" paper.
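The looping idea can be sketched abstractly: repeat (global optimisation + model selection) until the selected models and chi-squared value stop changing between iterations.  The toy below is not relax code, just the control flow, with a stand-in "optimiser" whose names and behaviour are entirely hypothetical:

```python
def converged(prev, curr, tol=1e-10):
    """Convergence test: the selected models and chi-squared both agree."""
    return prev is not None and prev[0] == curr[0] and abs(prev[1] - curr[1]) < tol

def analysis_loop(optimise, max_iter=20):
    """Repeat one full (optimise + model-select) pass until nothing changes.
    `optimise` returns a (selected_models, chi2) tuple for one iteration."""
    prev = None
    for i in range(max_iter):
        curr = optimise(prev)
        if converged(prev, curr):
            return curr, i
        prev = curr
    return prev, max_iter

# Stand-in "optimisation" (hypothetical): chi2 shrinks tenfold each
# pass until it bottoms out at a fixed point.
def fake_optimise(prev):
    chi2 = 1.0 if prev is None else max(prev[1] / 10, 1e-6)
    return ('m1', 'm2'), chi2

result, iters = analysis_loop(fake_optimise)
print(result, iters)  # converges well inside the 20-iteration budget
```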

Does this description make sense?  Does it answer all your questions?

Regards,

Edward



On Jan 10, 2008 5:49 PM, Douglas Kojetin <douglas.kojetin@xxxxxxxxx> wrote:
Hi All,

I am working with five relaxation data sets (r1, r2 and noe at 400
MHz; r1 and r2 at 600 MHz), and therefore cannot use the
full_analysis.py protocol.  I have obtained estimates for tm,
Dratio, theta and phi using Art Palmer's quadric_diffusion program.
I modified the full_analysis.py protocol to optimize a prolate tensor
using these estimates (attached file: mod.py).  I have performed the
optimization of the prolate tensor using either (1) my original
structure or (2) the same structure rotated and translated by the
quadric_diffusion program.  It seems that relax does not converge to
a single global optimum, as different values of tm, Da, theta and phi
are reported.

Using my original structure:
#tm = 6.00721299718e-09
#Da = 14256303.3975
#theta = 11.127323614211441
#phi = 62.250251959733312

Using the rotated/translated structure by the quadric_diffusion program:
#tm = 5.84350638161e-09
#Da = 11626835.475
#theta = 8.4006873071400197
#phi = 113.6068898953142
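One way to quantify how far apart the two orientations are is the angle between the unique tensor axes implied by (theta, phi).  A small sketch, assuming theta is the polar angle and phi the azimuthal angle in degrees (relax's convention should be checked), and folding out the +/- axis symmetry of the tensor:

```python
import math

def axis(theta_deg, phi_deg):
    """Unit vector from polar (theta) and azimuthal (phi) angles in degrees."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    return (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))

def axis_angle(a, b):
    """Angle in degrees between two axes (v and -v are equivalent)."""
    dot = abs(sum(x * y for x, y in zip(a, b)))
    return math.degrees(math.acos(min(1.0, dot)))

# The (theta, phi) pairs reported for the two runs above.
a = axis(11.127323614211441, 62.250251959733312)
b = axis(8.4006873071400197, 113.6068898953142)
print(axis_angle(a, b))  # roughly 9 degrees between the two unique axes
```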

The only difference between the two calculations is the orientation
of the input PDB structure file.  For another set of five rates
(different protein), there is a >0.3 ns difference in the converged
tm values.

Is my modified protocol (in mod.py) set up properly?  Or is this a
more complex issue in the global optimization?  In previous attempts,
I've also noticed that separate runs with differences in the
estimates for Dratio, theta and phi also converge to different
optimized diffusion tensor variables.

Doug


_______________________________________________
relax (http://nmr-relax.com)

This is the relax-users mailing list
relax-users@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users





