Re: pA = 0.5 problems



Posted by Edward d'Auvergne on April 30, 2014 - 10:46:
Hi,

I should expand on the statistics a bit more.  Maybe using AIC would
clarify whether the difference comes from noise or from real data.
Here is a short table:

Set          Chi2    k  AIC
Individual   32.97  10  52.97
Cluster      48.79   8  64.79
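The AIC values in the table follow from AIC = chi2 + 2k; a minimal
sketch, just reproducing the arithmetic with the numbers from the table
above (not a relax API call):

```python
# Compare individual vs. clustered fits via the Akaike Information
# Criterion, AIC = chi2 + 2*k, where k is the number of parameters.
def aic(chi2, k):
    """Akaike Information Criterion for a least-squares fit."""
    return chi2 + 2 * k

# Values taken from the table above.
fits = {
    "Individual": {"chi2": 32.97, "k": 10},
    "Cluster":    {"chi2": 48.79, "k": 8},
}

for name, fit in fits.items():
    # The fit with the lower AIC is preferred, despite extra parameters.
    print(f"{name}: AIC = {aic(fit['chi2'], fit['k']):.2f}")
```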

So even using AIC, which penalises the extra parameters, the
individual fit is better.  Statistically, this is not just a case of
fitting more noise in the non-clustered fit.  That is significant!
One thing I noticed is that dw is the same for both spins in the
clustered fit.  Could you check whether this happens for other
clustering cases as well?  It should be different for each spin.
Maybe there is an important bug there.
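A quick way to spot this in other clustered runs is to collect the
fitted dw value per spin and flag identical values.  A minimal sketch,
using the spin IDs and dw values from the clustered results quoted
below rather than any relax API:

```python
# Hypothetical check: in a clustered fit, dw should generally differ
# per spin, so identical values across the whole cluster may indicate
# a bug.  Values taken from the clustered results below.
clustered_dw = {":10@N": 2.7, ":11@N": 2.7}

unique_dw = set(clustered_dw.values())
if len(unique_dw) == 1:
    print("Warning: all spins share dw =", unique_dw.pop())
```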

Regards,

Edward



On 30 April 2014 10:16, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
> I tried to generate ShereKhan output, but since I have time_T2 values
> of 0.04 and 0.06 for the two fields, I cannot generate the input
> files for ShereKhan.

ShereKhan should support this, and it would be a good test for relax.
The second line of the input file has this time.  Was it that relax
could not create the input files rather than ShereKhan not handling
this?


> My problem originates from wanting to compare results from an Igor
> Pro script.
> Yet another software solution.

Have you run the Igor Pro script to compare to relax?  With the same
input data, all software solutions should give the same result.  This
is important - you need to determine if the issue is with relax or
with the data itself.  It is best to first assume that the problem is
with relax, then see if other software produces a different result
(the more comparisons here the better).  Maybe relax is not handling
the two different times correctly.  Otherwise if everything has the pA
= 0.5 problem then the solution, if one exists, will be very
different.


> I now get the expected pA value of 0.97 if I cluster two residues.

This could indicate that the pA = 0.5 issue is in the data itself,
probably due to noise.  You should confirm this by comparing to other
software though.  Comparing to the 'NS CPMG 2-site expanded' might
also be useful.


> If I do an initial grid search with inc=21 and use
> relax_disp.set_grid_r20_from_min_r2eff(force=False), I get this:

As I mentioned before
(http://thread.gmane.org/gmane.science.nmr.relax.scm/20597/focus=5390),
maybe it would be better to shorten this user function name, as it is
a little misleading - it is about custom value setting rather than the
grid search, despite being useful for the latter.


> :10@N GRID   r2600=20.28 r2500=18.48 dw=1.0 pA=0.900 kex=2000.80 chi2=28.28 spin_id=:10@N resi=10 resn=G
> :10@N MIN    r2600=19.64 r2500=17.88 dw=0.7 pA=0.500 kex=2665.16 chi2=14.61 spin_id=:10@N resi=10 resn=G
> :10@N Clust  r2600=18.43 r2500=16.98 dw=2.7 pA=0.972 kex=3831.77 chi2=48.79 spin_id=:10@N resi=10 resn=G
>
> :11@N GRID   r2600=19.54 r2500=17.96 dw=1.0 pA=0.825 kex=3500.65 chi2=47.22 spin_id=:11@N resi=11 resn=D
> :11@N MIN    r2600=14.98 r2500=15.08 dw=1.6 pA=0.760 kex=6687.15 chi2=18.36 spin_id=:11@N resi=11 resn=D
> :11@N Clust  r2600=18.19 r2500=17.31 dw=2.7 pA=0.972 kex=3831.77 chi2=48.79 spin_id=:11@N resi=11 resn=D

If you sum the chi-squared values, which is possible as these are all
the same model, then you can compare the individual fits against the
clustered fit.  The individual fit total chi-squared value is 14.61 +
18.36 = 32.97, while the cluster value is 48.79.  This is very
important - the individual fit is much, much better.  You should make
a plot of the fitted curves for both and compare.  Note that a better
fit does not mean a better result, as you are fitting both a data
component and a noise component, so the better fit might be due to the
noise component.  This is why clustering exists.
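The comparison can be sketched as follows, using the per-spin chi2
values from the results above (plain arithmetic, not a relax API
call):

```python
# Sum the per-spin chi2 values of the individual fits and compare the
# total against the single clustered-fit chi2.  As the model is the
# same, the chi-squared values are directly additive.
individual_chi2 = {":10@N": 14.61, ":11@N": 18.36}
cluster_chi2 = 48.79

total_individual = sum(individual_chi2.values())
print(f"Individual total chi2: {total_individual:.2f}")
print(f"Clustered chi2:        {cluster_chi2:.2f}")
print("Individual fit is better:", total_individual < cluster_chi2)
```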


> Ideally, I would like to cluster 68 residues.
>
> But as you can see, if several of my residues start out with dw/pA
> far from the clustered result, this minimisation takes a hilariously
> long time.

I can see how this would be a problem for your mass screening
exercises.  This will probably require a lot of investigation on your
part to solve, as I have not seen any solution published in the
literature.  Though if you could find one there, that would probably
save you a lot of time.  You could also ask others in the field.  If
you remember
(http://thread.gmane.org/gmane.science.nmr.relax.devel/4647/focus=4648),
you changed the parameter averaging to the parameter median for the
clustering, so maybe that is having an effect.  Anyway, you first need
to compare to other software or models and see if there is a problem
in relax, before trying to invent a solution.

Regards,

Edward


