Re: [bug #22730] Model-free auto-analysis - relax stops and quits at the polate step.



Posted by Edward d'Auvergne on October 21, 2014 - 10:36:
Hi Olena,

Please see below:

The problem with the repeating spherical diffusion tensor initialisation
disappeared as soon as I updated the relax version. The calculation has
also not stopped again since then.

This problem can only be caused by the operating system, as there are
two sequential actions in the protocol here.  Save the results file,
then check that it exists.  This is specifically to allow for
continuing calculations that were prematurely interrupted.  The
problem you saw was that the check for the file was infinitely
failing, meaning that the results file was probably not saved.  This
can happen if there is not enough room left on the disk, or if
something strange happened with the file system preventing relax from
seeing and reading the file.  I have modified the relax source code
(in the relax trunk,
http://www.nmr-relax.com/download.html#Source_code_repository) to now
better handle file system strangeness.  If this problem returns,
please create a new bug report for this so we can isolate the problem
and fix it (https://gna.org/bugs/?func=additem&group=relax).
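
The save-then-check sequence described above can be illustrated with a
minimal Python sketch.  This is not the relax source code: the function
name save_and_verify and the max_attempts parameter are hypothetical,
and the only point shown is that the retry count is bounded, so a file
that never appears produces an error rather than an endless loop.

```python
import os
import tempfile


def save_and_verify(path, data, max_attempts=3):
    """Write data to path, then confirm that the file can be seen.

    Hypothetical helper for illustration only, not relax's actual code.
    Returns the attempt number that succeeded, and raises RuntimeError
    instead of retrying forever if the file never shows up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            with open(path, "w") as handle:
                handle.write(data)
                handle.flush()
                os.fsync(handle.fileno())  # Ask the OS to push the bytes to disk.
        except OSError as err:
            # For example a full disk or an unreachable network drive.
            print("Attempt %d: write failed: %s" % (attempt, err))
            continue

        # The existence check that follows the save.
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return attempt
        print("Attempt %d: file not visible after saving." % attempt)

    raise RuntimeError("Cannot save or detect %r after %d attempts."
                       % (path, max_attempts))


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "results.txt")
    print(save_and_verify(target, "round 1 results\n"))
```

With a healthy file system the first attempt succeeds.  On a full or
misbehaving file system the sketch gives up after max_attempts tries.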


The last calculation reached round 28 of the prolate model, which seemed
strange to me, so I stopped it.

If you had the limit of 30, then it would have finished soon anyway
and have moved onto the next model.  You can decrease this limit if
you wish.


At the moment I do not know where the problem could be. We performed
temperature calibration prior to each experiment, as you suggest. We also
used the single-scan interleaving approach. Therefore, unless it can be
seen in the acquired data, I do not think this is the reason because, as I
explained, this point has been considered.

If this is the case, then it should not be the quality of the data!


The protein I am studying is a homodimer, composed of the two head-to-head 
dimerised C-terminal domains to which the N-terminal domains are attached 
via flexible linkers.

This could be the problem.  The A-BB-A domain system will not tumble
as a simple sphere, spheroid, or ellipsoid.  In this case, the
standard diffusion tensors are just an approximation to a much more
complicated Brownian tumbling.


As this is a homodimer, I see only the number of 15N-1H HSQC peaks
corresponding to a monomer.

That is to be expected.


To study this protein as a dimer (as it is) I duplicated the data in the 
final DC files, in accordance with the pdb file of the dimer.

Due to symmetry, this should hopefully give identical results compared
to analysing a single domain.  The key is whether the domain symmetry
lines up with the Brownian diffusion tensor and its symmetry.  I guess
that is the aim.


My aim is to run the model-free analysis for the entire protein. But I am
also interested in analysing the domains separately, so I began with
calculations for the N-terminal part, cutting away the rest of the data.
This is exactly the part I sent you. Do you think that could cause the
problem? If so, I should just run the model-free analysis for the entire
protein and see whether it stays within 2 to 15 rounds.

If there are domain motions between the two domains in one unit then,
as I mentioned above, the simple Brownian diffusion tensors are only a
rough approximation.  You will be able to fit the data, but it can be
problematic.  The residual components of the complex diffusion not
modelled by the simple single tensor will appear as either artificial
Rex or nanosecond motions (see my Mol. Biosyst. review at
http://dx.doi.org/10.1039/b702202f).  These artificial motions could
cause the global optimisation algorithm to take a huge number of
rounds to converge.  In any case, you should compare to the global
'local tm' model to try to judge if the internal motions are real or
not.  With such complexity in the system, you must understand the
consequences and side effects to be able to strongly interpret the
final results.  And you should not worry if optimisation takes a huge
amount of time, as the current level of model-free theory is not
advanced enough for such internal domain motions.
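
As a rough illustration of the comparison with the 'local tm' model
suggested above, the following Python sketch flags Rex or slow motion
parameters that are selected under the final diffusion tensor but are
absent for the same residue in the local tm results.  The dictionaries,
the parameter labels ("rex", "ts") and the function name are all
assumptions made for this example.  They are not read from relax, and
the values are made up.

```python
# Parameter labels treated as possible artefacts of a poor diffusion
# tensor: chemical exchange and the slow (nanosecond) correlation time.
# These labels are an assumption for this sketch.
SUSPECT = {"rex", "ts"}


def flag_suspect_motions(local_tm, final):
    """Return {residue: [params]} found in the final model but not in local tm.

    Both arguments map a residue number to the set of model-free
    parameters selected for that residue.
    """
    flagged = {}
    for residue, params in final.items():
        extra = (set(params) & SUSPECT) - set(local_tm.get(residue, ()))
        if extra:
            flagged[residue] = sorted(extra)
    return flagged


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    local_tm = {10: {"s2"}, 11: {"s2", "rex"}, 12: {"s2", "te"}}
    final = {10: {"s2", "rex"}, 11: {"s2", "rex"}, 12: {"s2", "s2f", "ts"}}
    print(flag_suspect_motions(local_tm, final))
```

A flagged residue is only a hint that the motion deserves a closer
look.  It is not proof that the motion is artificial.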


As for the structure quality, what kind of parameters should be considered
in this case? Do you see any problems in the input data I sent you?

If some bond vectors are shifted from the solution average
orientation, then for these you will see artificial Rex or nanosecond
motions.  Which of these two effects appears is determined by the bias
of the vector orientation with respect to the diffusion tensor, and by
whether tm is under- or overestimated.  See the review for a detailed
explanation.

Regards,

Edward


Thanks for your effort.

Regards,
Olena
________________________________________
From: edward.dauvergne@xxxxxxxxx [edward.dauvergne@xxxxxxxxx] on behalf of Edward d'Auvergne [edward@xxxxxxxxxxxxx]
Sent: 20 October 2014 17:02
To: Edward d'Auvergne
Cc: Olena Dobrovolska; relax-devel@xxxxxxx; Francesco Musiani
Subject: Re: [bug #22730] Model-free auto-analysis - relax stops and quits at the polate step.

Hi Olena,

I was wondering if you had seen the repeating spherical diffusion
tensor initialisation problem again, or if this problem has
disappeared?  The bug report (https://gna.org/bugs/?22730) is that the
analysis just stopped.  Have you seen this stopping again in your
current calculations?  You should note that in the GUI, there is a
default maximum of 30 rounds of optimisation.  If you have stopped at
this limit, then that is ok.  It could mean one of a few things:

- There is a problem with the data.  This sometimes happens when not
using the best per-experiment temperature calibration and
per-experiment temperature control methods
(http://www.nmr-relax.com/manual/Temperature_control_calibration.html).
You should carefully note the points in that link.  Often the single
90 pulse MeOH calibration used to determine the difference between the
VT unit and the spectrometer is not good enough for relaxation data.

- The affected diffusion tensor is not a good description of your
system.  This happens when the molecule is not tumbling as a single
rigid body.  I.e. there are freely moving domains.  In this case, the
single diffusion tensor is a poor estimate of the real global motion
of the system.  In such cases you should be very careful about
artificial Rex or nanosecond motions (see my 2007 Mol. Biosyst. paper
at http://dx.doi.org/10.1039/b702202f for a review about this
problem).  If you don't see the motion in the noisy local tm global
model, then it is likely to not be real.

- The input structure is of low quality.

Normally you should be finished within 2 to 15 rounds of
optimisation, and anything more is an indication that something is not
quite right.  I look forward to hearing what the current status of
this problem is.
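
The idea of a round limit can be sketched in Python as follows.  This
is a generic convergence loop written for illustration and not the
auto-analysis code itself: run_round is a hypothetical callable that
returns a comparable summary of one round, and the defaults of 30 and
15 simply mirror the numbers mentioned above.

```python
def iterate_until_converged(run_round, max_iter=30, warn_after=15):
    """Repeat run_round(i) until two successive rounds agree.

    Returns (rounds_used, converged).  The loop always stops at
    max_iter, and prints a warning once warn_after rounds are exceeded.
    """
    previous = None
    for i in range(1, max_iter + 1):
        current = run_round(i)
        if i > warn_after:
            print("Round %d: more than %d rounds, something may not be right."
                  % (i, warn_after))
        if current == previous:
            return i, True
        previous = current
    return max_iter, False


if __name__ == "__main__":
    # Made-up round summaries that stop changing from round 4 onwards.
    print(iterate_until_converged(lambda i: min(i, 4)))
```

In this toy run the summaries of rounds 4 and 5 are identical, so the
loop reports convergence at round 5.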

Cheers,

Edward


P. S.  Note that this is on the public and permanently archived
relax-devel mailing list and not the bug report.  Your response will
be accepted by the mailing list moderator after a certain while.



On 15 October 2014 17:31, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
Hi Olena,

I will reply on the relax-devel mailing list, as it is a bit
ridiculous to have a long conversation on the bug tracker.  I have
received your private messages with the Input_state.bz2,
analysis.log.bz2, and analysis.log.7z files.  These are not attached,
so they will remain private.  Please reply to this message rather than
continuing the conversation in the bug tracker.

I have now run the analysis up to round_2 of the spherical diffusion
tensor.  I stopped here, as that is where the bug reports do not
match.  The problem I saw previously in your log file 'analysis_first
part.txt' (at https://gna.org/support/download.php?file_id=22627) is
that the spherical diffusion tensor initial optimisation had been
executed 42085 times (https://gna.org/bugs/index.php?22730#comment9).
But in your new analysis.log file, this is not the case.  It is only
executed once:

[edward@localhost bug_22730]$ grep -c "^Function value:   200507.263" *.txt *.log
analysis_first part.txt:42085
analysis_last10000lines.log:70
analysis.log:1
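
For anyone without grep at hand, the same count can be reproduced with
a short Python sketch.  The function name count_matching_lines is
hypothetical, and the demo writes a throwaway file with made-up
contents rather than reading the real log files.

```python
import os
import re
import tempfile


def count_matching_lines(path, pattern):
    """Count the lines in the file at path matching the regular expression."""
    regex = re.compile(pattern)
    count = 0
    # errors="replace" keeps the count going over undecodable bytes.
    with open(path, errors="replace") as handle:
        for line in handle:
            if regex.search(line):
                count += 1
    return count


if __name__ == "__main__":
    # Self-contained demo on a temporary log file with made-up contents.
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
        tmp.write("Function value:   200507.263\n"
                  "Some other line\n"
                  "Function value:   200507.263\n")
    pattern = r"^Function value:   200507\.263"
    print("%s:%d" % (tmp.name, count_matching_lines(tmp.name, pattern)))
    os.remove(tmp.name)
```

As with grep -c, the result is the number of matching lines per file
and not the total number of matches.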

So here we now have two completely unrelated problems!  The first
problem reported (https://gna.org/support/download.php?file_id=22627)
is not replicated in your current log file or in my analysis attempt.
Do you know why?  This massive repetition looks like what happens when
the 'mpirun' command line program is incorrectly used or incorrectly
set up.  Could that be the cause?  Can you replicate this
super-massive repetition problem?  Or could it be that your 7z
compression software somehow corrupted the log file?  Could this be a
simple mistake?  This is where I am quite confused, as it should be
impossible with relax's source code for this infinite repetition to
occur.  The only case is if there are strange things happening with
your file system so that relax cannot detect the file.  That is why I
asked you if your home directory was NFS, SMB, or some other network
drive.

There is one other case I can think of here, and that is if your file
system no longer had any room.  Then relax will not be able to save
the results file, hence it will not be able to detect it and then it
will repeat the last optimisation round forever.  Did you have such a
problem?  I have not written the dauvergne_protocol auto-analysis to
handle this situation, as I assumed that a Python Error would stop the
analysis.  That may not be the case when running with OpenMPI.
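
A quick way to rule out the full-disk case before starting a long run
is sketched below in Python.  The function name has_room and the 100 MB
threshold are assumptions for the example, and the check uses the
standard library call shutil.disk_usage().

```python
import shutil


def has_room(directory, needed_bytes):
    """Return True if the file system holding directory has needed_bytes free."""
    return shutil.disk_usage(directory).free >= needed_bytes


if __name__ == "__main__":
    # Hypothetical threshold: require 100 MB free in the current directory.
    if has_room(".", 100 * 1024 ** 2):
        print("Enough free space to save the results files.")
    else:
        print("Warning: the disk is nearly full, results may not be saved.")
```

The same check could be repeated before each save, as free space can
run out part-way through a long calculation.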

I would like to solve this infinite repetition first, as that has
taken up a lot of my time already.  This is an impasse which prevents
the rest of the analysis from ever occurring.  I.e. you cannot get to
the prolate model.  Then I'll look and see if there is another issue
affecting your data.

Cheers,

Edward


On 15 October 2014 10:34, anonymous <NO-REPLY.INVALID-ADDRESS@xxxxxxx> wrote:
Follow-up Comment #12, bug #22730 (project relax):

Hi,

I have no idea why the information containing the data and its
specification is missing. I followed the protocol, loading the DC files
(after loading the H and N spins from the pdb file) and specifying all the
parameters (temperature control etc.). Then, prior to execution, I saved
the input as input_state.bz2 as recommended. Perhaps last time I opened an
input_state.bz2 file which did not load the data? But then why did it
work?

Anyway, this time I upgraded relax to version 3.3.1 and loaded the files
once again from scratch. I would appreciate it if you could have a look at
the log file which, compressed to 1.8 Mb, I uploaded here:
https://mega.co.nz/#fm/HI4ShI7T

The relax calculations are currently at the 'prolate' stage, but the last
file was generated at 4 pm yesterday. Should it be like that? If you think
there could be a problem with the input data, I can send it to you as well
via private email.

I would appreciate your help in finding and solving the problem.

Regards,
Olena

    _______________________________________________________

Reply to this item at:

  <http://gna.org/bugs/?22730>

_______________________________________________
  Message sent via/by Gna!
  http://gna.org/



