Re: Time of running, Model selection and global/cluster analysis for relaxation dispersion analysis


Posted by Edward d'Auvergne on June 13, 2013 - 18:16:
Hi,

You can find this information using the GUI, programmatically or by
reading the XML results file.  The relax data store contains a special
hierarchical data structure for holding information about your spin
systems.  This is the molecule, residue, and spin data structure which
you will see when:

- Looking at the contents of the spin viewer window in the GUI,
- Opening a relax results or state XML file in a text editor,
- Looking at the cdp.mol[i].res[j].spin[k] data structure.

In each of these cases, look for the 'model' variable and its value
will be the selected model.  Programmatically you would use, for
example, the following relax code:

from pipe_control.mol_res_spin import spin_loop

# Load the saved state, then loop over all selected spins.
state.load('my_state.bz2')
for spin, spin_id in spin_loop(return_id=True, skip_desel=True):
    print("Spin '%s', model '%s'" % (spin_id, spin.model))

You will then have a printed list of the selected model for each spin
(though without clustering information).  I hope this helps.
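To illustrate where the 'model' variable sits in that hierarchy, here is a minimal plain-Python mock of the molecule, residue, and spin structure (the class and attribute names are illustrative only, not the relax API):

```python
# Minimal mock of the cdp.mol[i].res[j].spin[k] hierarchy.  The real
# objects live in the relax data store; these classes are illustrative.
class Spin:
    def __init__(self, name, model, select=True):
        self.name = name        # spin name, e.g. 'N'
        self.model = model      # the dispersion model selected for this spin
        self.select = select    # deselected spins carry select=False

class Residue:
    def __init__(self, num, spins):
        self.num = num          # residue number
        self.spin = spins       # list of Spin containers

class Molecule:
    def __init__(self, name, residues):
        self.name = name        # molecule name
        self.res = residues     # list of Residue containers

# Two spins with 'LM63' selected, mirroring the clusters discussed below.
mol = [Molecule('protein', [Residue(3, [Spin('N', 'LM63')]),
                            Residue(5, [Spin('N', 'LM63')])])]

# Walk the hierarchy by hand, skipping deselected spins as spin_loop() does.
for m in mol:
    for r in m.res:
        for s in r.spin:
            if s.select:
                print("Spin '#%s:%i@%s', model '%s'" % (m.name, r.num, s.name, s.model))
```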

Regards,

Edward



On 13 June 2013 17:59, Troels Emtekær Linnet <tlinnet@xxxxxxxxx> wrote:
Hi Edward.

Setting the Monte Carlo simulations to 3 for the auto-analysis certainly
helped.  I am now down to between half an hour and a full hour for all
methods. :-)
Great!

But could you help me pinpoint how to find which model is selected?
I can't find any button for it in the GUI.  Is there one?

I can do it by grepping the log file:
grep -B 6 "The model from the data" LOGFILE.txt | head -n 20

The spin cluster ['#protein:3@N'].
# Data pipe    Num_params_(k)    Num_data_sets_(n)    Chi2         Criterion
No Rex         1                 14                   105.22792    107.22792
LM63           3                 14                   6.86584      12.86584
CR72           4                 14                   6.63718      14.63718
IT99           4                 14                   6.62371      14.62371
The model from the data pipe 'LM63' has been selected.
--
The spin cluster ['#protein:5@N'].
# Data pipe    Num_params_(k)    Num_data_sets_(n)    Chi2         Criterion
No Rex         1                 14                   102.30715    104.30715
LM63           3                 14                   2.28925      8.28925
CR72           4                 14                   2.28845      10.28845
IT99           4                 14                   2.41391      10.41391
...
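For reference, with method='AIC' the 'Criterion' column in these tables is Akaike's Information Criterion, chi2 + 2k, and the model with the lowest value is selected.  A minimal sketch using the '#protein:3@N' values above:

```python
# AIC as used in relax's model selection: chi2 plus twice the number of
# model parameters k.  Lower is better.
def aic(chi2, k):
    return chi2 + 2.0 * k

# (chi2, k) values copied from the '#protein:3@N' cluster table above.
models = {
    'No Rex': (105.22792, 1),
    'LM63':   (6.86584, 3),
    'CR72':   (6.63718, 4),
    'IT99':   (6.62371, 4),
}

best = min(models, key=lambda m: aic(*models[m]))
print(best)  # → LM63
```

Here 'No Rex' is heavily penalised by its chi2 value, while the extra parameters of 'CR72' and 'IT99' do not buy enough chi2 improvement to beat 'LM63'.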


But what if relax is run without the '-l LOGFILE.txt' flag?

Does the cdp class contain the information?

Best
Troels



Troels Emtekær Linnet


2013/6/11 Edward d'Auvergne <edward@xxxxxxxxxxxxx>

Hi,

I'll answer below:


I performed a 'CPMG fixed' relaxation dispersion analysis for a dataset
with 68 residues, having 22 intensity files with 4 triple replications.

It took from 5 pm until 1 pm the following day, approximately 20 hours.
The analysed models were: 'R2eff', 'No Rex', 'LM63', 'CR72'.

Is 20 hours expected for an analysis, or should I look for errors
somewhere?

This depends.  By default relax uses much higher precision optimisation
than most other software.  This is based on the philosophy that more
accurate results are worth the wait, especially considering that this
time is small compared to the measurement and processing time (and the
inevitable re-measurements).  In addition, many Monte Carlo simulations
are used to determine your parameter errors accurately.  For comparison,
note that a full Lipari-Szabo model-free analysis can take between 1 day
and 2 weeks to complete.

For initial analyses where errors are not so important, the number of
simulations can be dropped massively to speed things up.  This can
also be done for students.  The result is that the error estimates of
the parameters are horrible but, in some cases, excluding publication,
that is not such a problem.  This will not affect model selection
either.  Therefore if errors are not important for specific cases,
then set the number of MC sims to 3.  Then watch how much quicker
things run.

Oh, if this is for students using the GUI, you could hack a special
version of relax to have the auto-analysis perform much lower
precision optimisation.  It should be possible to make things run very
quickly for them.  If they are using scripting, but still the relaxation
dispersion auto-analysis, then no hacking is necessary.  The function
tolerance and maximum number of iterations can be set using special
tricks ;)


Relax made a model selection:

model_selection(method='AIC', modsel_pipe='final', bundle=None, pipes=['No Rex', 'LM63', 'CR72'])

How can I inspect which model is then chosen?

Well, the hard way would be to open the relax saved state in the
'final' directory, or to look at the logs.  That information is
unfortunately not presented in a text file.  How would you suggest
that such information be presented?


Then I would like to make a global fit / cluster analysis.
Is this implemented yet?

Yes.  You need to use the relax_disp.cluster user function.


And should I use the:
User functions (n-z) -> relax_disp -> cluster ?

I am still thinking about how to bring this into the relaxation
dispersion auto-analysis GUI element without requiring me to write a
lot of code!


By the way.
The sherekhan input function is super great!

:)

The CPMGFit and NESSY input user functions will hopefully also be of
use to some people.  However, if you start adding some new models to
relax, then relax will soon have all of the capabilities that these
programs possess.  And if Paul Schanda's numerical integration Python
code is merged into relax, then relax will soon surpass all of this
software.


Thanks a lot :-)
This is super great

You're welcome.  I'm hoping that you can help to make this even greater ;)

Regards,

Edward




