Posted by Dr. Klaus-Peter Neidig on November 19, 2010 - 10:40:
Dear Edward,

After coming back from England, let me provide some comments.

Some of the literature uses the words error and uncertainty interchangeably.
The quantity sigma^2 used in least-squares fitting would be the square of
this error or uncertainty.

The PDC output refers to the error, so the relative errors are equal: error_R1/R1 = error_T1/T1, i.e. error_R1 = error_T1/T1^2.

Where do the errors come from? The source of the errors is the spectrum.
Each data point has an error. Commonly this error is derived from the signal-to-noise
ratio, usually obtained from signal-free regions. In most cases the standard deviation
is calculated from such regions and then multiplied by a factor. The PDC uses
multiple such regions, takes the one with the lowest sdev and multiplies it by 2.0.
There is no unique definition of this factor. When used as a peak picking
threshold we often use 3.5 * sdev. The noise is calculated for each plane of the pseudo-3D
separately.
But this would not account for systematic errors. Therefore it is advised to
also repeat experiments and check the variation of data points from spectrum
to spectrum. Unfortunately it is up to the NMR user to do this or not, and in most
cases we are happy to see 1-3 mixing times repeated at least once. That is not
enough to calculate proper errors, therefore we just take the spread (max difference)
as an estimate and apply it to all planes.
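A minimal numpy sketch of the two estimates described above (the region coordinates, array shapes and my reading of "spread" are illustrative assumptions, not PDC code):

```python
import numpy as np

def noise_sdev(plane, regions, factor=2.0):
    """Noise estimate for one plane of a pseudo-3D spectrum: take the
    standard deviation of several signal-free regions, keep the lowest
    one and multiply by a factor (the PDC uses 2.0; as a peak picking
    threshold 3.5 is common)."""
    sdevs = [plane[rows, cols].std() for rows, cols in regions]
    return factor * min(sdevs)

def spread_error(repeats):
    """Error estimate from repeated mixing times: the largest max-min
    spread among the duplicated points, applied to all planes.

    repeats -- dict mapping mixing time -> list of peak intensities,
               one entry per acquisition of that time
    """
    spreads = [max(v) - min(v) for v in repeats.values() if len(v) > 1]
    return max(spreads)

# Per-plane noise of a synthetic 3-plane pseudo-3D of pure noise.
rng = np.random.default_rng(0)
planes = rng.normal(0.0, 1.0, size=(3, 64, 64))
regions = [(slice(0, 16), slice(0, 16)), (slice(48, 64), slice(48, 64))]
per_plane_noise = [noise_sdev(p, regions) for p in planes]

# Spread from two duplicated mixing times (0.01 s and 0.20 s).
err = spread_error({0.01: [105.0, 103.5], 0.05: [80.2], 0.20: [40.1, 41.0]})
```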

Next, the user has the freedom to choose how the peak integrals are calculated. We offer: plain
peak intensity, peak shape integral, peak area integral and peak fit. They all have their
disadvantages; in many cases the peak intensity is not even the worst choice.
The error estimation also depends on the chosen method. In the case of the shape
and area integrals the integral error goes down with the number of integrated
data points, since at least the random errors partially cancel out. In the case of peak fit
we take the error as it comes out of Levenberg-Marquardt, internally based on
covariance analysis. Since the number of data points per peak in pseudo-3D spectra
is small, the 2D peak fit usually results in large errors. Also, the peak fitting
assumes line shapes to be Gaussian, Lorentzian or mixtures of both, but the actual peak
shape often differs from that.
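The partial cancellation of random errors in a summed integral can be illustrated numerically (a toy model, not PDC code): the noise of a sum over N points grows like sqrt(N) while the signal grows like N, so the relative error falls like 1/sqrt(N).

```python
import numpy as np

def relative_integral_error(n_points, signal=10.0, sigma=1.0,
                            n_trials=100_000, seed=42):
    """Relative error of an integral summed over n_points data points,
    each carrying independent Gaussian noise of width sigma."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_trials, n_points)).sum(axis=1)
    integrals = signal * n_points + noise
    return integrals.std() / integrals.mean()

# Quadrupling the number of points roughly halves the relative error.
for n in (1, 4, 16):
    print(n, relative_integral_error(n))
```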

When it comes to relaxation parameter fitting, we supply the peak integral errors to
Levenberg-Marquardt and get the fitted parameters and their errors back, again based
on covariance analysis. Alternatively the user may run MC simulations, usually 1000.
The input variation of the integrals comes from a Gaussian random generator; the
width of the Gaussian distribution for each integral is taken identical to the estimated
error of that integral. The literature says that the error then obtained from MC should be identical
to the error obtained from LM. I checked this by eye; they are at least similar.
All errors, regardless of whether from LM or MC, are finally multiplied by a factor obtained from
a Student's t distribution at the given confidence level and degrees of freedom.
The numbers we get at the end agree with what we get from Matlab, which has become
an internal standard for us at Bruker.
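The covariance-versus-Monte-Carlo comparison can be sketched with scipy (the model, noise level, number of mixing times and the 95% confidence level are illustrative assumptions, not the PDC implementation):

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import t as student_t

def model(time, amp, r1):
    # Mono-exponential decay fitted to the peak integrals.
    return amp * np.exp(-r1 * time)

rng = np.random.default_rng(1)
times = np.linspace(0.01, 2.0, 12)
sigma = 2.0                      # estimated peak integral error (uniform here)
data = model(times, 100.0, 1.5) + rng.normal(0.0, sigma, times.size)

# Errors from the Levenberg-Marquardt covariance analysis.
popt, pcov = curve_fit(model, times, data, p0=[90.0, 1.0],
                       sigma=np.full(times.size, sigma), absolute_sigma=True)
err_cov = np.sqrt(np.diag(pcov))

# Monte Carlo: refit 1000 synthetic data sets whose Gaussian input variation
# equals the estimated integral error, then take the spread of the fits.
fits = []
for _ in range(1000):
    synth = model(times, *popt) + rng.normal(0.0, sigma, times.size)
    p, _ = curve_fit(model, times, synth, p0=popt)
    fits.append(p)
err_mc = np.std(fits, axis=0)

# Both sets of errors are finally scaled by the Student's t factor at the
# chosen confidence level, with dof = data points - fitted parameters.
dof = times.size - 2
t_factor = student_t.ppf(0.975, dof)      # 95% two-sided confidence
print("covariance errors:", err_cov * t_factor)
print("Monte Carlo errors:", err_mc * t_factor)
```

For this near-ideal synthetic data the two estimates come out similar, matching the "at least similar" observation above.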

When it comes to model-free modelling, the errors obtained so far are taken into account.
But there is heavy criticism from some of the experts. They say, for example, that the errors
tend to be too small, especially the NOE error, which is only based on the ratio of 2 numbers,
whereas the T1 and T2 errors are based on fitting 10-20 input values. Therefore in the PDC we
allow the user to override the determined errors with default errors, e.g. 2% for NOE and 1% for T1
and T2. They also say that the modelling output should contain the back-calculated T1, T2
and NOE and should get markers if T1 and T2 are well reproduced regardless of what the NOE
is doing. I don't want to comment on this, but I have implemented it in the PDC.

During the curve fitting we refer to T1 and T2 for no deeper reason. The older Bruker software
tools did it, and a module in TopSpin is for example called the T1/T2 module. Today I'm
only a developer (in former times I could define projects by myself) and other people officially
tell me what to do. That is the only reason for using T1 and T2. As soon as it goes beyond the
relaxation curve fitting, everything else internally continues with R1 and R2; most of the literature
presents the formulas with R1 and R2, and I didn't want to rewrite all of these and introduce mistakes.
Technically it would be no problem to present everything in terms of R1 and R2, but by now several
people already use the software and nobody has complained. Perhaps with a future version I
should just allow the user to switch between the one representation and the other. I will discuss this with
some people here.

I think everybody here understands that Bruker is not a research institution and does not
have the resources and knowledge to do the modelling on the level you do. In my talks I
advertised that our PDC allows a very convenient data analysis, especially if the Bruker
release pulse programs (written by Wolfgang Bermel) are used. With our old T1/T2 module
in TopSpin we had a big problem. It was so bad that many people just used NMRPipe, Sparky,
CCPN or anything else, even though that meant a lot of manual work. I found it convenient
to additionally offer some of the diffusion tensor and modelling functionality to be a bit more complete.
I'm absolutely happy, however, that I can say there is relax available, which is much more advanced
and can read our output.
What disappoints me a bit at the moment is the behaviour of the users (independent of the software
they use). Typically they say: it is good to have all the modelling, but what can we do with it? The
overall dynamical features of the molecule are quite obvious already from looking at the NOE,
T1/T2 or reduced spectral densities. Many people just use relaxation data to check if there is
aggregation.
Your customer contacts should be much better than mine; what is your experience?

Just to indicate the future resources for the PDC: until ENC 2011 I have permission to use
50% of my time to add more features, e.g. to allow user-defined spectral density functions and
to use multiple fields for the modelling. But I have already got a more general project; it will be called
Dynamics Center and must cover all kinds of dynamics, including diffusion, kinetics and
some solid-state stuff like REDOR experiments. Applications will include smaller molecules.

Best regards,
Peter



On 11/16/2010 7:10 PM, Edward d'Auvergne wrote:
Dear Peter,

Thank you for posting this info to the relax mailing lists.  It is
much appreciated.  I hadn't thought too much about this, but this is
as you say: an error propagation through a ratio.  The same occurs
within the steady-state NOE error calculation.  As y=1/B and errA=0,
we could simply take the PDC file data and convert the error as:

sigma_R1 = sigma_T1 / T1^2.

This would be a 100% exact error calculation.  Therefore within relax,
we will only need to read the final relaxation data from the PDC files
and nothing about the peak intensities.  Reading additional
information from the PDC files could be added later, if someone needs
that.  One thing that would be very useful would be to have higher
precision values and errors in the PDC files.  5 or more significant
figures versus the current 2 or 3 would be of great benefit for
downstream analyses.  For a plot this is not necessary but for high
precision and highly non-linear analysis such as model-free (and SRLS
and spectral density mapping), this introduces significant propagating
truncation errors.  It would be good to avoid this issue.
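The conversion proposed above could be sketched as follows (the function name is illustrative, not relax's API):

```python
def t1_to_r1(t1, sigma_t1):
    """Convert a T1 value and its error from a PDC file to R1.

    R1 = 1/T1, and error propagation through the ratio (with errA = 0
    for the constant numerator) gives sigma_R1 = sigma_T1 / T1**2.
    """
    return 1.0 / t1, sigma_t1 / t1**2

r1, sigma_r1 = t1_to_r1(0.5, 0.01)   # T1 = 0.5 s  ->  R1 = 2.0 1/s
```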

An additional question is about the error calculation within the
Protein Dynamics Centre.  For model-free analysis, the errors are just
as important or maybe even more important than the data itself.  So it
is very important to know that the errors input into relax are of high
quality.  Ideally the R1 and R2 relaxation rate errors input into
relax would be from the gold standard of error propagation - Monte
Carlo simulations.  Is this what the PDC uses, or is the less accurate
jackknife technique used, or the even less accurate covariance
matrix estimate?  And how are replicated spectra used in the PDC?  For
example, if only a few time points are duplicated, if all time points
are duplicated, if all time points are triplicated (I've seen this
done before), or if no time points are duplicated.  How does the PDC
handle each situation and how are the errors calculated?  relax
handles these all differently, and this is fully documented at
http://www.nmr-relax.com/api/1.3/prompt.spectrum.Spectrum-class.html#error_analysis.
 Also, does the PDC use peak heights or peak volumes to measure signal
intensities?

Sorry for all the questions, but I have one more.  All of the
fundamental NMR theories work in rates (model-free, SRLS, relaxation
dispersion, spectral density mapping, Abragam's relaxation equations
and their derivation, etc.), and most of the NMR dynamics software
accepts rates and their errors and not times.  The BMRB database now
will also accept rates in their new version 3.1 NMR-STAR definition
within the Auto_relaxation saveframe.  Also most people in the
dynamics field publish R1 and R2 plots, while T1 and T2 plots are much
rarer (unless you go back to the 80's).  If all Bruker users start to
publish Tx plots while most of the rest publish Rx plots, comparisons
between different molecular systems will be complicated.  So is there
a specific reason the PDC outputs in relaxation times rather than in
rates?

Cheers,

Edward



On 16 November 2010 06:52, Neidig Klaus-Peter
<Klaus-Peter.Neidig@xxxxxxxxxxxxxxxxx> wrote:
Dear all, Dear Michael & Edward,

I'm currently on the way to England, thus only a short note:

The error of an inverse is a special case of the error of a ratio. A search for "error propagation" on the internet yields
hundreds of hits. There are also some discussions about correlation between the involved quantities.

If y=A/B with given errors of A and B, then the absolute error of y is y * sqrt[(errA/A)^2 + (errB/B)^2]

If A=1, you get that the error of y is y*errB/B, since the error of a constant is 0.

I compared, by eye, the errors I got from Marquardt when fitting a*exp(-Rt) instead of a*exp(-t/T); they agree up to
a number of digits.

I hope I did it right.

Best regards,
Peter
_______________________________________________
relax (http://nmr-relax.com)

This is the relax-devel mailing list
relax-devel@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel



--
Bruker BioSpin

Dr. Klaus-Peter Neidig
Head of Analysis Group
NMR Software Development

Bruker BioSpin GmbH
Silberstreifen 4
76287 Rheinstetten

Germany
 Phone: +49 721 5161-6447
 Fax:     +49 721 5161-6480


Bruker BioSpin GmbH: Sitz der Gesellschaft/Registered Office: Rheinstetten, HRB 102368 Amtsgericht Mannheim
Geschäftsführer/Managing Directors: Joerg Laukien, Dr. Bernd Gewiese, Dr. Dieter Schmalbein, Dr. Gerhard Roth



This message and any attachments may contain trade secrets or privileged, undisclosed or otherwise confidential information. If you have received this e-mail in error, you are hereby notified that any review, copying or distribution of it and its attachments is strictly prohibited. Please inform the sender immediately and delete/destroy the original message and any copies.
Thank you.

