Hi Edward,

after reading the mails again and checking the code again I think it would be good to do another cross-check, perhaps best with an artificial example: suppose we have mixing times 10, 10, 10 (3 replicates), 20, 30, 30 (2 replicates), 40, 50 and spectra with 3 peaks.

A: I would take peak1, then check for the largest difference in intensity within the first 3 replicates and within the 2 replicates. The largest difference found is then assigned as a systematic error to peak1 at all mixing times. The procedure is repeated for peak2 and peak3. Most likely peak1, peak2 and peak3 then have different systematic errors. That would be a worst case scenario per peak and is what is currently active in the PDC.

B: The absolute worst case scenario would be to check for the largest difference ever occurring and assign that to all 3 peaks at all mixing times. This is what we used in earlier versions of the PDC.

C: Your variance averaging would work as follows: loop over the first 3 replicates and calculate the variance for each peak; loop over the 2 replicates and calculate the variance for each peak; finally take the average over the 6 variances and assign this as a systematic error to all peaks at all mixing times.

Is this correct? I think A: and C: should be offered. After implementing this I see in my data that C: produces much larger errors than A: and that the errors of the fitted parameters get much larger, sometimes larger than the values themselves. This does not seem acceptable.

In the meantime I also spoke to Wolfgang Bermel. My statement that peaks with longer T2 seem to vary more in replicates is not obvious to him. When looking at the data, it turned out that I had looked at peaks that not only have longer T2 but also larger resonance offsets. Wolfgang suspects that there might be an offset effect.

The statement that spectra taken at longer mixing times show a lower base plane noise seems not to be true in general. The trend I had seen in one of the spectra is not fully convincing, so we checked a couple of other spectra, all taken with the latest release pulse programs. With those data we could not really confirm the trend. We concluded to keep an eye on that from now on.

Best regards,
Peter

On 11/24/2010 3:11 PM, Edward d'Auvergne wrote:

On 24 November 2010 10:31, Dr. Klaus-Peter Neidig <peter.neidig@xxxxxxxxxxxxxxxxx> wrote:

Dear Edward,

I talked to some colleagues at Bruker to discuss your various items. In general there is agreement that being as consistent with relax as possible is a good idea. However, other software at Bruker should not be affected (some of the code is used in multiple areas) and changes that cause too much effort and are already documented and in use should be avoided. Enhancing the export functionality to provide more information to relax is regarded as a good way to go.

Hi,

Yes, it would be a bad idea to make big changes. Interfacing with relax will be quite straightforward and probably no changes will be required from Bruker's side.

In detail:

Do the users define the signal-free region? For example there is usually more noise in the random coil part of the spectrum due to unwanted junk in the sample, so the peaks in this region have larger errors. What about signals close to the water signal? Is it possible to have different errors for these peaks?

No, the PDC currently does not allow different errors to be assigned to different peaks. We take several regions of the spectrum automatically, e.g. close to the 4 corners, and finally use the one with the lowest noise.
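As a small aside, here is a minimal sketch in Python/NumPy of how I understand options A: and C: from the example at the top of this mail. The peak intensities are made-up numbers for illustration only, and the use of the sample variance (n-1) and of the square root of the averaged variance as the final error are assumptions on my side, not necessarily what the PDC or relax actually does.

# Artificial example: 3 replicates at mixing time 10, 2 replicates at 30, 3 peaks.
import numpy as np

# Rows = peak1..peak3, columns = replicate spectra at one mixing time.
# Hypothetical intensity values, not taken from any real data set.
reps_10 = np.array([[100.0, 103.0,  98.0],
                    [ 80.0,  79.0,  82.0],
                    [ 60.0,  66.0,  63.0]])
reps_30 = np.array([[ 70.0,  74.0],
                    [ 55.0,  54.0],
                    [ 40.0,  45.0]])

def option_a(*replicate_sets):
    # Per peak: largest intensity difference within any replicated mixing time,
    # assigned to that peak at all mixing times (one error per peak).
    diffs = [r.max(axis=1) - r.min(axis=1) for r in replicate_sets]
    return np.max(diffs, axis=0)

def option_c(*replicate_sets):
    # Per peak and per replicate set: the variance; then the average over all
    # variances (6 in this example), assigned to all peaks at all mixing times.
    variances = np.concatenate([r.var(axis=1, ddof=1) for r in replicate_sets])
    return np.sqrt(variances.mean())

print("A:", option_a(reps_10, reps_30))   # three per-peak errors
print("C:", option_c(reps_10, reps_30))   # one global error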
We have to rely on some assumptions like proper baseline correction, a reasonably clean sample and careful post-processing, if any.

Ah, ok. We will have to make sure users manually exclude spins with signals near the water region. Again, this is an easy thing to solve.

I don't understand the origin of this scaling. Since Art Palmer's seminal 1991 publication (http://dx.doi.org/10.1021/ja00012a001), the RMSD of the pure base-plane noise has been used for the standard deviation. The key text is on page 4375:

Thanks for the various references. The factor we used so far is empirical, but no problem, I will remove it. Other software pieces at Bruker that use such factors are not affected. I did run some tests already: the fitted parameters do not really change, but of course the errors of the fitted parameters do. This needs to be documented for the user.

Thank you. In a model-free analysis the accuracy of the errors is just as important, or maybe even more important, than the data itself, so this will make a big difference.

From my work a long time ago, I noticed that the spectral errors decreased with an increasing relaxation period. This can be taken into account in relax if all spectra are duplicated/triplicated/etc. But if not, then the errors for all spectra are averaged (using variance averaging, not standard deviation averaging). For a single duplicated/triplicated spectrum, the error is taken as the average variance of each peak. So when some, but not all, spectra are replicated, there will be one error for all spin systems at all relaxation periods. This sounds similar to what the PDC does, but is it exactly the same?

I think the Bruker internal opinion was different in the sense that one should either have enough replicates (which never happens) and do sdev averaging, or just provide an estimate of the systematic error. The change in intensity obviously depends on the peaks. We have seen peaks with longer T2 varying much more than those with shorter T2. We concluded to assume a worst case error scenario and check for the largest difference of all peak intensities at replicated mixing times. This error was then applied (added) to all peaks at all mixing times. I should also mention that in the PDC the user has the option to replace peak intensities/integrals of replicated mixing times by their mean, so that each mixing time occurs only once in the fit. This yields slightly different results. The request for implementing this came from our application people, who obviously talked to their customers.

Having enough replicates would never be possible for properly determining errors, so it must be crudely estimated. The maximum error would mean that the errors would probably never be underestimated. But for downstream analysis, underestimating some errors (and overestimating others) would be better, as no bias would be introduced into the errors. Using the maximum error applied to all peaks will actually result in molecules appearing more rigid than they are! The result will be the hiding of many motional processes. So the second option would be better. Even better would be to use an average error rather than the maximum, by averaging variances (sdev values cannot be averaged as they cannot be summed, but variances sum). Is the peak intensity error located in the PDC files? I cannot see it in our current testing file https://gna.org/file/testT2.txt?file_id=10641.

Since changing the error estimation of replicates could influence the results significantly, we do not want to blindly do it.
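As a small numeric illustration of the variance-averaging point above (the numbers are made up, and the pooling rule shown is an assumption for illustration, not a statement of what relax does internally):

# Why variances, and not sdev values, are averaged: variances are additive,
# standard deviations are not.
import numpy as np

sdevs = np.array([1.0, 3.0])                         # two per-peak errors (sdev)
variances = sdevs ** 2                               # 1.0 and 9.0

pooled_from_variances = np.sqrt(variances.mean())    # sqrt(5.0) ~ 2.24
naive_sdev_average = sdevs.mean()                    # 2.0, systematically smaller

print(pooled_from_variances, naive_sdev_average)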
The proposal would be to offer several options (variance averaging, sdev averaging, worst case estimate).

--
Bruker BioSpin