Re: Extremely long optimization times



Posted by Edward d'Auvergne on September 17, 2007 - 23:29:
Hi,

On 9/17/07, Sebastien Morin <sebastien.morin.1@xxxxxxxxx> wrote:

 Hi Ed,

 First, there were some bad assignments in my data set. I used the automatic
assignment (which takes an assigned peak list and propagates it to other
peak lists) procedure within NMRPipe for the first time and some peaks were
badly assigned.

Although this is a problem because of the bond vector orientations, its
effect should be incorrect internal motions, not long computation
times.


 Second, the PDB file is quite good as it is a representative conformation
from a 60 ns MD simulation using CHARMM. That said, the protein moves in the
simulation and, hence, the orientations also change. I could take another
conformation, which is what I'll do to cross-validate my models, but
nevertheless the orientations will change and subtle changes will appear.
This shouldn't be an issue since the vectors that move a lot in the
simulations should have correlating relaxation properties and that should be
seen in the models chosen.

The orientation changes should only affect the Euler angle values of
the diffusion tensor.  Nothing else should be affected by this.  The
internal motions of the simulation will affect the results of the
analysis, but the overall orientation really doesn't matter unless you
are comparing these Euler angles.


 Third, here are the stats for the ellipsoid optimization :

 round  t_total_(h)  t_opt_(h)  iter_opt  model_change  tm      a     b      g     chi2                comments
 =====  ===========  =========  ========  ============  ======  ====  =====  ====  ==================  =======================
  1     146          144        207       ---           12.423  18.8  159.7  99.1  9282.2280010132217  ok
  2      49           47         62       215           12.463  74.7  152.0  94.3  8793.0777454789404  ok
  3      16           14         19        16           12.448  78.0  152.3  96.9  8767.5325004348124  ok
  4      12           10         13         1           12.445  80.2  151.9  97.9  8765.5659442063006  ok
  5      19           17         23         2           12.445  83.1  151.7  98.3  8761.0001889287214  ok
  6      25           23         27         1           12.452  80.9  151.4  96.2  8744.6870170285692  ok
  7      16           14         19         1           12.445  83.1  151.7  98.3  8761.0001889287269  almost_5
  8      25           23         28         1           12.452  80.9  151.4  96.2  8744.6870170285729  almost_6
  9      14           12         17         1           12.445  83.1  151.7  98.3  8761.0001889287269  almost_5_and_exactly_7
 10      29           27         33         1           12.452  80.9  151.4  96.2  8744.6870170285656  almost_6_and_8
 11      stopped...................................

Are these stats from the results in the 'opt' directories?  Can you
possibly pin-point where in the calculation the problem is?  One
option is to increase the verbosity flag 'print_flag' in the
minimise() user function.  This may help in seeing the problem.


 As you can see, there is a kind of interchange between two runs at the end
of the optimization. In fact, from iteration 5 on, there is only one
residue for which the model changes, and it's always the same one. It flips
between model 5 and model 6. With model 6, the optimized values are a tf of
~17, a ts of ~25000 and an S2 of ~0.73 (chi2 ~40 in the aic file, but there
with ts ~1200). With model 5, they are a ts of ~650 and an S2 of ~0.78 (chi2
~50 in the aic file). How come such a high ts (25000) isn't eliminated..?
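A flip-flop like this can be caught automatically by comparing the chi2 of
each finished round against all earlier rounds, not only the previous one.
The following is a minimal sketch in plain Python, not part of relax or of
full_analysis.py. The function name find_cycle and the tolerance are
hypothetical choices; the chi2 values are copied from the table above.

```python
import math

# chi2 at the end of rounds 1-10, copied from the ellipsoid table.
CHI2 = [
    9282.2280010132217, 8793.0777454789404, 8767.5325004348124,
    8765.5659442063006, 8761.0001889287214, 8744.6870170285692,
    8761.0001889287269, 8744.6870170285729, 8761.0001889287269,
    8744.6870170285656,
]


def find_cycle(chi2_by_round, rel_tol=1e-9):
    """Return (earlier_round, later_round) for the first round whose chi2
    repeats a non-adjacent earlier round, or None. Rounds are 1-based."""
    for later in range(2, len(chi2_by_round)):
        # Skip the immediate predecessor: matching the previous round
        # would mean convergence, not oscillation.
        for earlier in range(later - 1):
            if math.isclose(chi2_by_round[earlier], chi2_by_round[later],
                            rel_tol=rel_tol):
                return earlier + 1, later + 1
    return None


print(find_cycle(CHI2))  # (5, 7)
```

With these numbers the check fires at round 7, which repeats round 5, so the
run could have been stopped three rounds earlier.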

In mathematical modelling, model elimination or model validation must
occur prior to the model selection step.  At that point ts is ~1.2
ns, and hence the model is not eliminated.  The final optimisation then
shifts ts up to 25 ns, and this is likely to be what is causing
the optimisation to take soooo long!  Is there something particular
about this residue?
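To make the ordering concrete, here is a small sketch of an elimination test
applied to the two ts values for model 6.  The cutoff of 1.5 * tm is an
assumption made for illustration (check the documentation of the elimination
user function for the rule actually applied), as are the function name and
the units (ts in ps, tm in ns).

```python
def exceeds_cutoff(ts_ps, tm_ns, factor=1.5):
    """True if ts (in ps) is larger than factor * tm (tm in ns).

    The factor of 1.5 is an assumed cutoff, not read from relax."""
    return ts_ps / 1000.0 > factor * tm_ns


TM_NS = 12.452  # tm of round 10, from the ellipsoid table

# ts at the model selection step (AIC file): below the cutoff, so kept.
print(exceeds_cutoff(1173, TM_NS))   # False
# ts after the final optimisation: above the cutoff, but no longer tested.
print(exceeds_cutoff(24904, TM_NS))  # True
```

Under this assumed cutoff the model passes where the test is applied and
only drifts past it afterwards, in the final optimisation.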

The iteration numbers are low, but these may be the number of
iterations of the method of multipliers algorithm.  For each iteration
there could possibly be thousands of steps of the Newton subalgorithm.
 I can't remember how the iteration number is generated, but the
print_flag option may show if this is the case.
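The difference between the two counts can be seen on a toy problem.  This
is a sketch only, not the code in relax: it minimises (x - 2)**4 subject
to x <= 1 with a textbook augmented Lagrangian, using plain Newton steps
for the inner problem.  The test function, the penalty rho and the
tolerances are all arbitrary choices.

```python
def f_grad(x):
    return 4.0 * (x - 2.0) ** 3


def f_hess(x):
    return 12.0 * (x - 2.0) ** 2


def method_of_multipliers(x=0.0, lam=0.0, rho=100.0, tol=1e-8,
                          max_outer=100, max_inner=50):
    """Minimise (x - 2)**4 subject to x <= 1, counting both iteration types."""
    outer_iters = 0
    newton_steps = 0
    for _ in range(max_outer):
        outer_iters += 1
        # Inner loop: Newton steps on the augmented Lagrangian, lam fixed.
        for _ in range(max_inner):
            shifted = lam + rho * (x - 1.0)
            active = shifted > 0.0
            grad = f_grad(x) + (shifted if active else 0.0)
            if abs(grad) < 1e-10:
                break
            hess = f_hess(x) + (rho if active else 0.0)
            x -= grad / hess
            newton_steps += 1
        # Outer step: measure the constraint violation, update the multiplier.
        violation = max(x - 1.0, -lam / rho)
        lam = max(0.0, lam + rho * (x - 1.0))
        if abs(violation) < tol:
            break
    return x, lam, outer_iters, newton_steps


x, lam, outer_iters, newton_steps = method_of_multipliers()
print("x = %.6f, outer iterations = %d, Newton steps = %d"
      % (x, outer_iters, newton_steps))
```

Only the outer count would appear as the iteration number if that is how it
is generated, while the work is done in the Newton steps.  On a hard problem
the gap between the two can be far larger than in this toy case.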


 round   AIC_or_OPT  model   S2    S2f   S2s   tf      ts      chi2
 =====   ==========  =====   ===   ====  ====  ======  ======  =========
  9      AIC         5       0.78  0.96  0.81  None      698   52
 10      AIC         6       0.78  0.97  0.80  11.2     1173   39
  9      OPT         5       0.78  0.96  0.81  None      630   ---
 10      OPT         6       0.73  0.93  0.79  16.8    24904   ---


 Fourth, the previous runs were made on 4 different computers which give
almost exactly the same calculation times, differing by maybe 10-15 %...
This shouldn't be what's causing those extremely long times...

This is unlikely to be the problem, but I was just wondering in case
there was an operating system or platform specific bug possibly in the
Numeric code.


 Fifth, I used the default algorithm within the full_analysis.py script.
How can I change the optimization algorithm so it's a two stage procedure
like you proposed ? Should I run several times with MIN_ALGOR = 'simplex'
and, after a few runs (maybe when the chi2 and number of iterations get to a
plateau) switch to MIN_ALGOR = 'newton' ?

Simply have two lines, one after the other, in the code where the
minimise() user function is called.  I.e. in the 'full_analysis.py'
file of the current 1.2 repository line:

# Minimise all parameters.
minimise('simplex', run=name)
minimise(MIN_ALGOR, run=name)

# Write the results.
...


That should be enough to solve the problem (hopefully).
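The idea behind the two stages can be illustrated outside of relax with a
toy 1D function.  This is only a sketch under stated assumptions: a simple
derivative-free pattern search stands in for simplex, the Newton stage is
plain Newton without the GMW Hessian modification or constraints, and the
function and starting point are made up.

```python
def f(x):
    return x**4 - 3.0 * x**3 + 2.0


def grad(x):
    return 4.0 * x**3 - 9.0 * x**2


def hess(x):
    return 12.0 * x**2 - 18.0 * x


def coarse_search(x, step=1.0, min_step=0.1):
    """Derivative-free stage: try x - step and x + step, halve on failure."""
    while step >= min_step:
        candidate = min((x - step, x + step), key=f)
        if f(candidate) < f(x):
            x = candidate
        else:
            step /= 2.0
    return x


def newton(x, tol=1e-10, max_steps=50):
    """High precision stage: plain Newton iterations on the gradient."""
    for _ in range(max_steps):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= g / hess(x)
    return x


start = 1.1  # the curvature is negative here, so Newton alone is misled
direct = newton(start)
two_stage = newton(coarse_search(start))
print("Newton only: %.6f   coarse then Newton: %.6f" % (direct, two_stage))
```

Started at 1.1, Newton on its own creeps towards the flat point at x = 0,
whereas the coarse stage first moves into the basin of the minimum at
x = 2.25 and Newton then finishes in a few steps.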

Cheers,

Edward



 I think that's almost everything I can find now...

 Let me know if you know how to catch those problems before they appear...

 Cheers


 Séb  :)






 Edward d'Auvergne wrote:
 Hi,

I've been trying to think of what could possibly be causing these
really long times, but I'm really not sure what is happening.
Unfortunately there just was not enough information in the post to
decipher the key to this problem. Is there something special about
those 7 residues? How accurate do you think their orientations are in
the PDB file you are using? And how accurate is the PDB file itself
with respect to all parts of the system?

Have you had a chance to investigate further as to what the issue
might be? For example, which part of the calculation is taking the
time? Is it the global optimisation of all parameters? Are the final
results of each round similar or completely different (selected model
wise and parameter value wise)? How do the iteration numbers compare
at each stage? Essentially a fine analysis and comparison of the
results files and the printout from relax will be necessary to track
down this abnormal computation time. Oh, are you running these on the
same computer as the previous analysis?

As for the optimisation algorithm being stuck, if you've used the
default algorithm then this shouldn't happen. Optimisation should
terminate. There are certain very rare situations where the algorithm
known as the GMW Hessian modification, which is used by default as a
subalgorithm by the Newton algorithm in relax, can take large amounts
of time to complete. You'll see this as an increase in the number of
iterations by 4 to 5 orders of magnitude. One way to test this is to
use a lower quality optimisation algorithm first and then complete to
high precision with the Newton algorithm. In this case I would use
simplex first followed by the default Newton algorithm and its default
subalgorithms. In all cases constraints should be used. This will
only solve the long computation times if the GMW algorithm is at
fault.

Regards,

Edward


On 9/4/07, Sebastien Morin <sebastien.morin.1@xxxxxxxxx> wrote:


 Hi all,

I am using the full_analysis.py script with data at three magnetic fields.

After a first complete cycle (going through the final optimization), I
realized that a few residues had extremely high chi-squared values (>
1000) no matter the diffusion model or model-free model chosen...

So I removed those residues (7 out of 222) and started the full_analysis
protocol again.

However, the optimization times are now extremely long and I should get
the final results in weeks...


Here are the available times (for local_tm, sphere and ellipsoid) :


Diffusion_model  Round  Time-before_N=222  X2       Time-now_N=215  X2      Comment
===============  =====  =================  =======  ==============  ======  =========================
local_tm         ---    12h30              45949    14h30           5802    OK, X2 much smaller

sphere           init   ---                1154338  ---             249255
                 1      2h30               65654    36h00           10303   Long, but X2 much smaller
                 2      2h30               65654    > 30h00

ellipsoid        init   ---                753535   ---             177764
                 1      4h00               64592    > 67h00         ??
                 2      2h30               64592    not_there_yet

Is it possible that the algorithms get stuck somewhere during the
optimization..?

I thought that removing badly fit residues would, on the contrary, speed
up calculations...

Thanks for ideas !


Sébastien :)

--
 ______________________________________
 _______________________________________________
 | |
 || Sebastien Morin ||
 ||| Etudiant au PhD en biochimie |||
 |||| Laboratoire de resonance magnetique nucleaire ||||
||||| Dr Stephane Gagne |||||
 |||| CREFSIP (Universite Laval, Quebec, CANADA) ||||
 ||| 1-418-656-2131 #4530 |||
 || ||
 |_______________________________________________|
 ______________________________________



_______________________________________________
relax (http://nmr-relax.com)

This is the relax-users mailing list
relax-users@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users











