Re: [bug #22055] The processor.run_queue() does not clean up in uni_processor? -- May 20, 2014

It might be quite useful for you.  It may tell you that some of the
test cases that have really been bugging you can be safely ignored.

Regards,

Edward



On 20 May 2014 19:31, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:

Hi Ed.

Hm, I have thought about this before.

And thought I had written it up as a task.

I will do it now.
My supervisor has also asked for this feature.

best
Troels



2014-05-20 18:56 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:

Hi Troels,

In your testing, for sanity you may also want to look at calculating
the alpha value from:

    - Millet, O. et al. (2000) The static magnetic field dependence of
chemical exchange linebroadening defines the NMR chemical shift time
scale. J. Am. Chem. Soc., 122, 2867–2877.
(http://dx.doi.org/10.1021/ja993511y).

Specifically equation (20).  This calculation could behind a user
function called relax_disp.calc_alpha.  You could calculate it
directly from the optimised parameters or from the experimental R2eff
values.  According to the paper:

    0 <= alpha  < 1     slow exchange,
    alpha = 1   intermediate exchange,
    1 < alpha <= 2    fast exchange.

This will help you determine if your parameters are in the correct
exchange regime for the model you are studying.  This is implemented
as part of the ShereKhan web interface, click on the [i] button next
to "Residue Selection" for details.

For your testing of the CR72 model, you will require an alpha value
between 0 and 1.  Fast exchange will cause the dw and pA values to
convolute and merge, so you cannot distinguish them.  Hence
optimisation will look like it has failed.  You can have different dw
and pA value combinations to produce the same phi_ex value, and it
will be impossible to find the original values again.  That is why
there are fast exchange models with the phi_ex parameter.

Regards,

Edward



On 20 May 2014 17:46, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi,

In this case, we need to perform a careful comparison between relax
and other software - by generating input data and see how both
programs respond.  The key is to identify if the problem is in the
implementation (i.e. the relax code) or in the theory (the equations).
 We can fix the first, not the second (the solution to the second is
to use a different model).  The problem also needs to be carefully
stated:  Using the parameter set {x: 1.0, y: 2.0} and model Z, and
setting the R2eff errors to 0.1, R2eff values were back calculated;
then this perfect R2eff data was optimised against model Q.

To standardise such a problem it might be worth creating a
relax_disp.r2eff_back_calc user function.  The contents of this is
already in your scripts.  This would simplify the scripts and help
with debugging.

The chi2 surface from the CR72_fail_one_field.py script also shows no
clear minimum which means, assuming the CR72 code in relax is correct
(and it matches ShereKhan perfectly), that the problem is not well
conditioned.  Why this is the case needs to be investigated.  Is it
relax's implementation?  Or is it the CR72 model itself?  You can set
up the exact same grid search relax uses in ShereKhan, so that is a
good reference test.  It is well known that the CR72 model does not
hold in all cases, so it is likely that you have hit such a case here.
 This is the reason why there are now so many alternative dispersion
models.

Regards,

Edward





On 20 May 2014 16:49, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:

Hi Ed.

This is a very difficult case, but I am sure that relax has failed in
the error catching, making
the simplex algorithm to fail, when computing certain math domain errors.

The data I am referring to is:
http://thread.gmane.org/gmane.science.nmr.relax.devel/5302

I have proven in:
https://gna.org/bugs/?22024
bug #22024: Minimisation space for CR72 is catastrophic. The chi2
surface over dw and pA is bounded.

That relax fail to find values, even when dw is 1 ppm.

I will fix the speed-up, and rerun the data to see if it has been fixed.

Best
Troels


2014-05-20 15:43 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:

Hi Troels,

Thanks for fixing the bug!


No problems.

But I think the worst most pity thing, is that I have waster 1/4-1/2
year of my PhD to fix this bug:

bug #22032: Minimisation explodes when using Grid_INC=None
https://gna.org/bugs/index.php?22032


I'm still trying to work out what this bug is about.  But if you start
optimising with the dw = 0.0, this is not a bug!  You just cannot ever
start optimising from there.  Not many optimisation algorithms can
survive that and find the real solution (global optimisation
algorithms such as simulated annealing and genetic algorithms
excluded).  If you simplify the CR72 equations with dw = 0.0, you will
see that pA, pB, and kex fall out of the equation and you actually end
up with R2eff = R20.  This is expected as when the chemical shift
difference is zero, there is no chemical exchange.  Therefore when dw
= 0.0, then the pA and kex parameters are undefined.   Undefined
parameters kill many optimisation algorithms - this is why you'll see
order parameters of 1 often in a model-free analysis (prior to people
using relax), as in this case the tau_e correlation time parameter is
undefined.

This bug is however only 9 days old.  Do you mean one of the other bugs?

My last three weeks analysis have shown that the error catching at
boundary values had a huge impact on the minimisation search. It
simply went to scrap.


Before or after?  Do you have an example that can be converted into a
system test?

This was also the case when running grid search. Touching the math
domain errors, caught simplex to stop.
https://gna.org/bugs/download.php?file_id=20704
https://gna.org/bugs/download.php?file_id=20698


These files are not attached to bug #22032
(https://gna.org/bugs/index.php?22032).  This is rather bug #22024
(https://gna.org/bugs/?22024).  However, it is clear from those
chi-squared surfaces that there is nothing to optimise to.  There is
no clear minimum within the space.  We should discuss things under
each separate bug report.

As I mentioned before, this one is not really a bug.  This is simply
what happens when you have very accurate optimisation combined with a
poorly conditioned problem - i.e. the parameter values are close to
insignificance.  This is an edge case, and no one in the field will
care for such a case as its statistical significance is orders of
magnitude less than zero ;)  I wouldn't bother wasting time on that
one, as the only real tool you can use from the field of mathematical
optimisation is a re-parametrisation of the problem.  I used this in
the model-free analysis to go from the {S2f, S2s, ts} parameters to
{S2, S2s, ts} for the 2 time scale models.  This had a huge effect on
optimisation and such problems.  But in the relaxation dispersion
analysis, it will be rather difficult to come up with a new parameter
set.  And rather pointless just to solve such a difficult edge case.

I have tried so hard to understand why the behaviour of relax is
different from other programs.


Do you have data showing that other software with the same input data
gives different results?  From all my testing, I don't see this:

http://svn.gna.org/viewcvs/*checkout*/relax/trunk/test_suite/shared_data/dispersion/software_comparison?revision=HEAD

Well, apart from the expected differences due to bugs, strange
assumptions of certain old software, and the differences Andy
mentioned (this never reached the mailing list as he had an
attachment!).

I have tested, and inspected code, and tried other programs. relax was
just stubborn.


Could you present such comparisons on the mailing list?  Which
software found the solution that relax could not?  And how was relax
set up?

Realising this issue has been a relieve!

I am happy that the processor.run_queue() bug only is related to the 
test-suite.
It hopefully did not affect production???


It is only visible in the test suite, and only if certain tests fail.
It is only annoying for relax developers.

I can have night-mares about these bugs, as that can mean that
scientists are wasting their time.
Including mine.


This bug is not an issue for relax users.  They cannot be affected by
it.  Other bugs can however, and that is why relax has a test suite.
It is also the reason for the existence of relax - because I found
insanely fatal bugs in other NMR dynamics software that had been used
for 20 years!

Anyway, to save your time with the dispersion work, you should follow
a few simple rules.  If the dispersion curve is insignificant, i.e.
the difference of the maximum and minimum R2eff value is less than a
certain value, say 0.5 rad.s^-1, then don't bother with it!  Most of
the dispersion bug reports where you report optimisation problems fall
into this category.  I strongly doubt that any other dispersion
software will handle these cases - they will give different results
but also not find the real solution (this is likely to be the source
of differences you see to published results).  And never start
optimisation for positions where other parameters are undefined - i.e.
dw = 0.0.  That covers the other half of the bug reports.  You are
otherwise battling nearly impossible battles.

Regards,

Edward

Re: [bug #22055] The processor.run_queue() does not clean up in uni_processor?

Header

Content

Related Messages