Re: Relax_fit.py problem


Posted by Tyler Reddy on October 15, 2008 - 16:57:
Hey,

Farrow et al. (1994) Biochemistry, 33: 5984-6003 also draw a similar conclusion
(paragraph starting at bottom left of p. 5988) and apply the RMS value of the
noise as an estimate of the standard deviation of peak intensity. If I'm not
mistaken, this is exactly the assumption relax makes for steady-state NOE error
propagation, which uses the sum of squares equation from this paper as well.

Also of interest on p. 5988,

"The distribution of the difference in intensities of identical peaks in
duplicate spectra should have a standard deviation [sqrt(2)] times greater than
the standard deviation of the individual peaks."

They again conclude that duplicate and RMS baseline data errors are consistent
within those bounds. If the Kay and Palmer labs are going with this conclusion
(even if it doesn't really tell us which error is more appropriate), it seems
like it's a good bet that you can estimate standard deviation this way.
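
The sqrt(2) factor in the quoted sentence can be checked numerically. The
following is only a minimal sketch, assuming independent Gaussian noise of
equal size on both duplicate spectra; the function name, the peak height and
the noise level are made up for illustration and come from neither paper nor
from relax.

```python
import math
import random
import statistics


def duplicate_difference_ratio(sigma=1.0, true_height=100.0, n_peaks=20000, seed=0):
    """Return SD(difference between duplicates) / SD(single measurement)."""
    rng = random.Random(seed)
    # Two independent measurements of the same peak height (hypothetical values).
    first = [rng.gauss(true_height, sigma) for _ in range(n_peaks)]
    second = [rng.gauss(true_height, sigma) for _ in range(n_peaks)]
    # Differences between identical peaks in the duplicate spectra.
    diffs = [a - b for a, b in zip(first, second)]
    return statistics.stdev(diffs) / statistics.stdev(first)


ratio = duplicate_difference_ratio()
print(round(ratio, 3), round(math.sqrt(2), 3))
```

The ratio approaches sqrt(2) because the variances of two independent
measurements add when they are subtracted.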

However, I'm still not clear on the relationship between curve fit errors and
the errors measured directly from the spectra. I'm not sure how the nonlinear
fitting error factors in for the relax R1 and R2 curve-fitting scripts.
Certainly, if curve-fit error alone could be used, that would make things
easier since no error measurement on the T1/T2 experiment spectra would be
needed; you could just dump the peak heights to relax.

Tyler




Quoting Tyler Reddy <TREDDY@xxxxxx>:

Hi,

I'll try to dig up those references. The other thing I find confusing is that
some groups use the curve fit error for the parameters. So, the errors in R1
and R2 per residue are actually from the nonlinear curve fitting process
itself. In theory, if there is no error in peak height then the fit is
perfect. So I wonder if there is yet another relationship to think about if
you want to use those values?!

I have these values already for T1 and T2 parameters and their curve fitting
errors (though I haven't figured out how to propagate these errors to the
reciprocal rate constants, or if that will even be meaningful), but I'm not
sure how they compare to the other 2 'error types' we are talking about.
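
For the propagation of a relaxation time error to the reciprocal rate
constant, the usual first-order result for R = 1/T is
sigma_R = sigma_T / T^2. The sketch below assumes the error is small relative
to T; the function name and the numbers are hypothetical and not taken from
the thread.

```python
def rate_from_time(t, sigma_t):
    """Convert a relaxation time and its error into a rate and its error.

    Uses first-order error propagation for R = 1/T:
        sigma_R = |dR/dT| * sigma_T = sigma_T / T**2
    """
    rate = 1.0 / t
    sigma_rate = sigma_t / t**2
    return rate, sigma_rate


# Hypothetical example: T1 = 0.5 s with a curve-fit error of 0.01 s.
r1, r1_err = rate_from_time(0.5, 0.01)
print(r1, r1_err)
```

The relative error is unchanged by the conversion: sigma_R / R equals
sigma_T / T.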

Certainly, S/N = peak height/RMS baseline noise  (From Cavanagh textbook)

And while there are many references that throw around the sqrt(2) in various
equations, I haven't seen a comprehensive explanation yet.

Tyler




Quoting Edward d'Auvergne <edward.dauvergne@xxxxxxxxx>:

Hi,

That was the reference I used many years ago when I first added these
abilities to relax.  The text is a little confusing, but the important
line is the first of that paragraph you mention:

"The uncertainties in the measured peak heights, sigma_h, were set
equal to the root-mean-square baseline noise in the spectra."

So if one looks at the code in relax, there is no multiplication by
sqrt(2).  As this was a long time ago, I'm not sure if this is the
most correct approach.  The confusing chi-squared tests comparing
sigma_h and sqrt(2)*sigma_h may show no statistically significant
difference, but the number of parameters is identical in both cases and
only the weighting constant changes.  So the lack of a statistically
significant difference doesn't mean that one weight is better than the
other or that both weights are correct.
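
The point about the weighting constant can be illustrated with a toy fit.
This is a sketch only, assuming a single uniform error for all peak heights
and a one-parameter exponential decay with I0 fixed at 1. The peak heights,
sigma_h value and grid search are invented for the example and are not how
relax optimises.

```python
import math

times = [0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8]
heights = [0.97, 0.91, 0.80, 0.68, 0.54, 0.37, 0.21]  # made-up peak heights
sigma_h = 0.02                                        # made-up baseline RMS


def chi2(rate, sigma):
    """Chi-squared of I(t) = exp(-rate * t) with one uniform error sigma."""
    sse = sum((h - math.exp(-rate * t)) ** 2 for t, h in zip(times, heights))
    return sse / sigma**2


def best_rate(sigma):
    """Crude grid search over the rate from 1.000 to 3.000 in steps of 0.001."""
    grid = [i / 1000.0 for i in range(1000, 3001)]
    return min(grid, key=lambda r: chi2(r, sigma))


rate_a = best_rate(sigma_h)
rate_b = best_rate(math.sqrt(2) * sigma_h)
chi2_ratio = chi2(rate_a, sigma_h) / chi2(rate_b, math.sqrt(2) * sigma_h)
print(rate_a, rate_b, chi2_ratio)
```

Scaling every error by sqrt(2) halves the chi-squared value but leaves the
fitted rate where it was, so the fit itself cannot say which of the two
weights is the right one.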

There is another early reference (or two) in which the NOE error
formula is given.  I think that may have more information, but I'm
struggling to remember what that reference is and cannot find it at
the moment.  And there may be more recent papers performing a much
more thorough noise analysis.  It could even be done using synthetic
spectra with white noise added (I recently did this to test the effect
of white noise on the uncertainty in peak chemical shift position to
validate Ad Bax's RDC error formula LW/SN - strangely the results were
far more complex than this formula).

There is a bit of time to work out the correct conversion from baseplane
RMSD to peak height uncertainty, as I need to wait for Sebastien to finish
the work on loading NMRView (as well as Sparky and XEasy) peak list
intensities.  The rearrangements I plan to do will affect the code he
is working on.

Regards,

Edward



On Mon, Oct 13, 2008 at 7:44 PM, Tyler Reddy <TREDDY@xxxxxx> wrote:
Hi Edward,

Palmer et al. (1991) JACS. 113: 4371-4380 is a nice reference for the error
conversion. It looks like the value for standard deviation between peaks in
paired spectra is sqrt(2) multiplied by the base plane RMS value (in
particular, see the short paragraph at the top right of page 4375 in this
manuscript). However, the authors seem to use the base plane RMS values
regardless, and then verify that the qualitative conclusions do not change
when using the more conservative error estimates (i.e. multiplying by 1.4).

There's an extensive discussion of using chi-square critical values to
verify the validity of this relationship between the noise types, though I
must concede that I don't grasp all the details after the first reading.

Tyler


Quoting Edward d'Auvergne <edward.dauvergne@xxxxxxxxx>:

Hi,

There are three ways that an error analysis can be done for relaxation
curve fitting, although one of those is only partly implemented in
relax at the moment (that means it won't work until I write some
computer code).  These are:

1.  Collect all spectra in duplicate, triplicate, or more if you
really have a lot of NMR time to kill, for absolutely no reason.  The
peak intensity error for a single spin is calculated as the standard
deviation for each peak.  Because this is inaccurate for a low replica
number, this error is averaged for all peaks to give one error per
spectrum.  This error is then used in the Monte Carlo simulations.

2.  If only some spectra are duplicated, then the average of the
errors for all spectra is calculated.  This gives a single error value
for all spins and all spectra.  This is then used in the Monte Carlo
simulations.

3.  This is the error analysis technique which is not fully
implemented yet.  If no spectra are recorded in duplicate, then one
needs to use the RMSD of the base plane noise.  This is similar to
what relax uses for the NOE analysis (hence shouldn't be too hard to
implement for relaxation curve fitting).  I would need to find the
reference, but I think this value needs to be divided or multiplied by
root 2 to convert it to a peak height uncertainty.  Does anyone know a
reference for this?  Then a separate error value for all spins and all
spectra can be used in the Monte Carlo simulations.
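
The third option can be sketched roughly as follows.  This is an
illustration only and not the relax implementation: it assumes a
two-parameter exponential I(t) = I0 * exp(-R * t), uses the baseplane RMSD
directly as the peak height error (no sqrt(2) factor), and fits with a
simple golden-section search that assumes a single minimum within the
search bounds.  All names, peak heights and the RMSD value are hypothetical.

```python
import math
import random
import statistics


def fit_exponential(times, heights, lo=0.01, hi=20.0, tol=1e-7):
    """Least-squares fit of I(t) = I0 * exp(-R * t); returns (I0, R)."""

    def best_i0(rate):
        # For a fixed rate, the optimal I0 has a closed-form solution.
        decay = [math.exp(-rate * t) for t in times]
        return sum(h * d for h, d in zip(heights, decay)) / sum(d * d for d in decay)

    def sse(rate):
        i0 = best_i0(rate)
        return sum((h - i0 * math.exp(-rate * t)) ** 2 for t, h in zip(times, heights))

    # Golden-section search over the rate (assumes one minimum in [lo, hi]).
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    rate = (a + b) / 2
    return best_i0(rate), rate


def monte_carlo_rate_error(times, heights, rmsd, n_sim=500, seed=1):
    """Return (fitted rate, Monte Carlo standard deviation of the rate)."""
    rng = random.Random(seed)
    i0, rate = fit_exponential(times, heights)
    # Back-calculate ideal peak heights from the fitted parameters.
    back_calc = [i0 * math.exp(-rate * t) for t in times]
    sim_rates = []
    for _ in range(n_sim):
        # Randomise each height with Gaussian noise of width rmsd, then refit.
        noisy = [h + rng.gauss(0.0, rmsd) for h in back_calc]
        sim_rates.append(fit_exponential(times, noisy)[1])
    return rate, statistics.stdev(sim_rates)


times = [0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8]
heights = [0.97, 0.91, 0.80, 0.68, 0.54, 0.37, 0.21]  # made-up peak heights
rate, rate_err = monte_carlo_rate_error(times, heights, rmsd=0.02)
print(rate, rate_err)
```

If a conversion factor between the baseplane RMSD and the peak height error
turns out to be needed, it would only change the rmsd argument passed in.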

Wei Xia has recently asked the same question
(https://mail.gna.org/public/relax-users/2008-09/msg00000.html).  It
might be worth reading my reply at
https://mail.gna.org/public/relax-users/2008-09/msg00002.html.  So
this feature will be added to relax, but the question is how long will
that take.  I'd first need the error conversion factor from RMSD of
base plane noise to peak height, and then add the ability to use the
RMSD value in relaxation curve fitting.  The first part will be the
hardest, but you'll need that to do a proper Monte Carlo simulation
error analysis for the curve fitting.  To do the second part I would
set up a mini analysis, let's call it a 'system test' because it tests
the system - relax - to see if the analysis works, and then make this
system test pass - i.e. implement the feature.

Don't forget that the errors in a complex analysis (e.g. model-free and
reduced spectral density mapping) are just as important as the values
themselves, if not more.  Getting these wrong will really damage
optimisation, model selection, and error propagation to the final
parameters via Monte Carlo simulations.  So both your model-free
values and errors will be incorrect.

Regards,

Edward


On Wed, Oct 8, 2008 at 5:07 PM, Tyler Reddy <TREDDY@xxxxxx> wrote:

Hello,

It seems that Relax_fit.py requires replicate data because average and
standard deviation values are used downstream in the analysis. With no
replicate data (since I don't have any) the output is shown below. Also,
commenting out the average and error propagation across multiple spectra

#relax_fit.mean_and_error()

doesn't work either, and I get another error output that is looking for an
averaged value. I'll probably try using a duplicate data set to circumvent
this for now (unless this is actually another problem).

Tyler

Output:

relax> relax_fit.mean_and_error()

Calculating the average intensity and standard deviation of all spectra.

Time point:  0.01 s
Number of spectra:  1
Standard deviation for time point 0:  0.0

Time point:  0.050000000000000003 s
Number of spectra:  1
Standard deviation for time point 1:  0.0

Time point:  0.10000000000000001 s
Number of spectra:  1
Standard deviation for time point 2:  0.0

Time point:  0.20000000000000001 s
Number of spectra:  1
Standard deviation for time point 3:  0.0

Time point:  0.29999999999999999 s
Number of spectra:  1
Standard deviation for time point 4:  0.0

Time point:  0.5 s
Number of spectra:  1
Standard deviation for time point 5:  0.0

Time point:  0.80000000000000004 s
Number of spectra:  1
Standard deviation for time point 6:  0.0
Traceback (most recent call last):
 File "/Applications/relax-1.3.1/relax", line 408, in <module>
  Relax()
 File "/Applications/relax-1.3.1/relax", line 125, in __init__
  self.interpreter.run(self.script_file)
 File "/Applications/relax-1.3.1/prompt/interpreter.py", line 270, in run
  return run_script(intro=self.__intro_string, local=self.local, script_file=script_file, quit=self.__quit_flag, show_script=self.__show_script, raise_relax_error=self.__raise_relax_error)
 File "/Applications/relax-1.3.1/prompt/interpreter.py", line 531, in run_script
  return console.interact(intro, local, script_file, quit, show_script=show_script, raise_relax_error=raise_relax_error)
 File "/Applications/relax-1.3.1/prompt/interpreter.py", line 427, in interact_script
  execfile(script_file, local)
 File "relax_fit_T1_500Mhz.py", line 45, in <module>
  relax_fit.mean_and_error()
 File "/Applications/relax-1.3.1/prompt/relax_fit.py", line 96, in mean_and_error
  relax_fit_obj.mean_and_error()
 File "/Applications/relax-1.3.1/specific_fns/relax_fit.py", line 729, in mean_and_error
  sd = sd / float(num_dups)
ZeroDivisionError: float division


_______________________________________________
relax (http://nmr-relax.com)

This is the relax-users mailing list
relax-users@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users







