Dear Edward,

after coming back from England, let me provide some comments.

Some of the literature uses the words error and uncertainty interchangeably. The quantity sigma^2 used in least squares fitting would be the square of the error or uncertainty. The PDC output refers to error, so error_R1 = error_T1/T1^2.

Where do the errors come from? The source of the errors is the spectrum. Each data point has an error. Commonly this error is derived from the signal-to-noise, usually obtained from signal-free regions. In most cases the standard deviation is calculated from such regions and then multiplied by a factor. The PDC uses multiple such regions, takes the one with the lowest sdev and multiplies it by 2.0. There is no unique definition of this factor. When used as a peak picking threshold we often use 3.5 * sdev. The noise is calculated for each plane of the pseudo 3D separately.

But this would not account for systematic errors. Therefore it is advised to also repeat experiments and check the variation of data points from spectrum to spectrum. Unfortunately it is up to the NMR user to do this or not, and in most cases we are happy to see 1-3 mixing times repeated at least once. That is not enough to calculate proper errors, therefore we just take the spread (max difference) as an estimate and apply it to all planes.

Next, the user has the freedom to choose how the peak integrals are calculated. We offer: just peak intensity, peak shape integral, peak area integral and peak fit. They all have their disadvantages; in many cases the peak intensity is not even the worst. The error estimation also depends on the chosen method. In the case of shape and area integrals, the integral error goes down with the number of integrated data points, since at least the random errors partially cancel out. In the case of peak fit we take the error as coming out of Levenberg-Marquardt, internally based on covariance analysis.
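As a rough illustration of the noise estimate described earlier in this message (several signal-free regions, the lowest standard deviation, multiplied by 2.0), here is a minimal Python sketch. The function name noise_estimate, the list-of-lists input and the use of the sample standard deviation are assumptions made for illustration only; the actual PDC implementation is not shown in this thread.

```python
import statistics


def noise_estimate(regions, factor=2.0):
    """Estimate a per-point error from signal-free regions of one plane.

    regions: list of sequences of intensities taken from signal-free areas
             (hypothetical input format, not the PDC's own data structure).
    factor:  multiplier applied to the lowest standard deviation; the
             message quotes 2.0 for errors and 3.5 for peak picking.
    """
    # Sample standard deviation of every region with at least two points.
    sdevs = [statistics.stdev(region) for region in regions if len(region) > 1]
    if not sdevs:
        raise ValueError("need at least one region with two or more points")
    # Take the quietest region and scale its standard deviation.
    return factor * min(sdevs)


# Example: two made-up signal-free regions from the same plane.
regions = [
    [1.0, -1.0, 1.0, -1.0],
    [0.5, -0.5, 0.5, -0.5],
]
plane_error = noise_estimate(regions)
```

Since the noise is calculated for each plane separately, such a function would be called once per plane of the pseudo 3D.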
Since the number of data points per peak in pseudo 3D spectra is small, the 2D peak fit usually results in large errors. Also, the peak fitting assumes line shapes to be Gaussian, Lorentzian or mixtures of both, but the actual peak shape often differs from that.

When it comes to relaxation parameter fitting, we supply the peak integral errors to Levenberg-Marquardt and get fitted parameters and their errors back, again based on covariance analysis. Alternatively the user may run MC simulations, usually 1000. The input variation of the integrals comes from a Gaussian random generator; the width of the Gaussian distribution for each integral is taken to be identical to the estimated error of that integral. The literature says that the error then obtained from MC should be identical to the error obtained from LM. I checked this by eye, they are at least similar. All errors, regardless of whether they come from LM or MC, are finally multiplied by a factor obtained from a Student's t distribution at the given confidence level and degrees of freedom. The numbers we get at the end agree with what we get from Matlab, which has become an internal standard for us at Bruker.

When it comes to model-free modelling, the errors obtained so far are taken into account. But there is heavy criticism from some of the experts. They say for example that the errors tend to be too small, especially the NOE error, which is only based on the ratio of 2 numbers, whereas the T1 and T2 errors are based on fitting of 10-20 input values. Therefore in the PDC we allow the user to override the determined errors with default errors, e.g. 2% for NOE and 1% for T1 and T2. They also say that the modelling output should contain the back-calculated T1, T2 and NOE and should get markers if T1 and T2 are well reproduced regardless of what NOE is doing. I don't want to comment on this but I have implemented it in the PDC.

During the curve fitting we refer to T1, T2 for no deeper reason.
The older Bruker software tools did it, and a module in TopSpin is for example called the T1/T2 module. Today I'm only a developer (in former times I could define projects by myself) and other people officially tell me what to do. That is the only reason for using T1, T2. As soon as it goes beyond the relaxation curve fitting, everything else internally continues with R1, R2; most of the literature presents the formulas with R1, R2 and I didn't want to rewrite all these and introduce mistakes. Technically, it would be no problem to present everything in terms of R1, R2, but now several people already use the software and nobody has complained. Perhaps with a future version I should just allow the user to switch between the one or the other representation. I will discuss this with some people here.

I think everybody here understands that Bruker is not a research institution and does not have the resources and knowledge to do the modelling on the level you do. In my talks I advertised that our PDC allows a very convenient data analysis, especially if the Bruker release pulse programs (written by Wolfgang Bermel) are used. With our old T1/T2 module in TopSpin we had a big problem. It was so bad that many people just used nmrPipe, Sparky, ccpn or anything else, even though that meant a lot of manual work. I found it convenient to furthermore offer some of the diffusion tensor and modelling stuff to be a bit more complete. I'm absolutely happy, however, if I can say that there is relax available, which is much more advanced and can read our output.

What disappoints me a bit at the moment is the behaviour of the users (independent of the software they use). Typically, they say it is good to have all the modelling, but what can we do with it? The overall dynamical features of the molecule are quite obvious already from looking at the NOE, T1/T2 or reduced spectral densities. Many people just use relaxation data to check if there is aggregation.
Your customer contacts should be much better than mine, what is your experience?

Just to indicate the future resources for the PDC: Until ENC 2011 I have permission to use 50% of my time to add more features, e.g. allow user defined spectral density functions and use multiple fields for modelling. But I already got a more general project, it will be called Dynamics Center, that must cover all kinds of dynamics; that includes diffusion, kinetics and some solid state stuff like REDOR experiments. Applications will include smaller molecules.

Best regards,

Peter


On 11/16/2010 7:10 PM, Edward d'Auvergne wrote:

Dear Peter,

Thank you for posting this info to the relax mailing lists. It is much appreciated. I hadn't thought too much about this, but this is as you say: an error propagation through a ratio. The same occurs within the steady-state NOE error calculation. As y=1/B and errA=0, we could simply take the PDC file data and convert the error as:

    sigma_R1 = sigma_T1 / T1^2.

This would be a 100% exact error calculation. Therefore within relax, we will only need to read the final relaxation data from the PDC files and nothing about the peak intensities. Reading additional information from the PDC files could be added later, if someone needs that.

One thing that would be very useful would be to have higher precision values and errors in the PDC files. 5 or more significant figures versus the current 2 or 3 would be of great benefit for downstream analyses. For a plot this is not necessary, but for high precision and highly non-linear analyses such as model-free (and SRLS and spectral density mapping), this introduces significant propagating truncation errors. It would be good to avoid this issue.

An additional question is about the error calculation within the Protein Dynamics Centre. For model-free analysis, the errors are just as important or maybe even more important than the data itself.
So it is very important to know that the errors input into relax are of high quality. Ideally the R1 and R2 relaxation rate errors input into relax would be from the gold standard of error propagation - Monte Carlo simulations. Is this what the PDC uses, or is the less accurate jackknife technique used, or the even less accurate covariance matrix estimate? And how are replicated spectra used in the PDC? For example, if only a few time points are duplicated, if all time points are duplicated, if all time points are triplicated (I've seen this done before), or if no time points are duplicated. How does the PDC handle each situation and how are the errors calculated? relax handles these all differently, and this is fully documented at http://www.nmr-relax.com/api/1.3/prompt.spectrum.Spectrum-class.html#error_analysis. Also, does the PDC use peak heights or peak volumes to measure signal intensities?

Sorry for all the questions, but I have one more. All of the fundamental NMR theories work in rates (model-free, SRLS, relaxation dispersion, spectral density mapping, Abragam's relaxation equations and their derivation, etc.), and most of the NMR dynamics software accepts rates and their errors and not times. The BMRB database will now also accept rates in their new version 3.1 NMR-STAR definition within the Auto_relaxation saveframe. Also, most people in the dynamics field publish R1 and R2 plots, while T1 and T2 plots are much rarer (unless you go back to the 80's). If all Bruker users start to publish Tx plots while most of the rest publish Rx plots, comparisons between different molecular systems will be complicated. So is there a specific reason the PDC outputs relaxation times rather than rates?

Cheers,

Edward


On 16 November 2010 06:52, Neidig Klaus-Peter <Klaus-Peter.Neidig@xxxxxxxxxxxxxxxxx> wrote:

Dear all, Dear Michael & Edward,

I'm currently on the way to England, thus only a short note:

The error of an inverse is a special case of the error of a ratio.
A search for "error propagation" on the internet yields hundreds of hits. There are also some discussions about correlation between the involved quantities.

If y=A/B with given errors of A and B, then the absolute error of y is

    y * sqrt[(errA/A)^2 + (errB/B)^2]

If A=1, the error of y is y*errB/B, since the error of a constant is 0.

I compared the results, by eye and up to a number of digits, with the errors I got from Marquardt if I fit a*exp(-Rt) instead of a*exp(-t/T). I hope I did it right.

Best regards,

Peter

_______________________________________________
relax (http://nmr-relax.com)

This is the relax-devel mailing list
relax-devel@xxxxxxx

To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-devel.
--
Bruker
BioSpin
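
The ratio rule discussed in this thread, and its special case for an inverse (A=1, errA=0), can be sketched in a few lines of Python. This is only an illustration under the usual assumptions of first-order error propagation with uncorrelated A and B; the function names ratio_error, inverse_error and t1_to_r1 are hypothetical and are not taken from the PDC or from relax.

```python
import math


def ratio_error(a, err_a, b, err_b):
    """Absolute error of y = a/b, assuming a and b are uncorrelated."""
    y = a / b
    # Relative errors add in quadrature for a ratio.
    return abs(y) * math.sqrt((err_a / a) ** 2 + (err_b / b) ** 2)


def inverse_error(b, err_b):
    """Absolute error of y = 1/b: y * err_b / b = err_b / b^2."""
    return err_b / b ** 2


def t1_to_r1(t1, err_t1):
    """Convert a relaxation time and its error into a rate and its error."""
    return 1.0 / t1, inverse_error(t1, err_t1)


# Example with made-up numbers: T1 = 0.5 s with an error of 0.01 s.
r1, err_r1 = t1_to_r1(0.5, 0.01)
```

With A=1 and errA=0, ratio_error(1.0, 0.0, t1, err_t1) and inverse_error(t1, err_t1) give the same value, which is the conversion sigma_R1 = sigma_T1 / T1^2 quoted in the thread.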