Hi,
Cheers. Would you have new NOE and R2 data file examples as well? If
you could, please try to attach the files to the task
(https://gna.org/task/?7180) rather than attaching it to a message, as
big attachments are quite a drain on the infrastructure. I will
comment more below.
On 22 February 2011 11:31, Dr. Klaus-Peter Neidig
<peter.neidig@xxxxxxxxxxxxxxxxx> wrote:
Dear Edward,
below is the document I had recently sent. I have added 3 pictures and marked those options that you
recommend, i.e. peak intensities, variance averaging in case of replicates and fit using the determined
integral (or intensity) errors.
The choice the height or volume of the peak isn't too bad, but the
field most often uses heights (and almost always use the terminology
'heights' ;) as the measure as volumes are problematic when peaks
overlap. Does the variance averaging include using the baseplane RMSD
measure? Or is this only for when spectra are duplicated? Does the
PDC make a distinction between all spectra replicated and a few
replicated? These points are not clear from the 3 figures, though it
might be clearer when using the PDC.
For the error estimation method, the options "error estimation by fit"
or "error estimation by weighted fit" is not clear. Does this mean
using the low quality covariance matrix estimate (the estimated errors
are from the estimated matrix)? What are the weights? Is the
technique behind the unweighted and weighted "error estimation by fit"
explained to the user? Are the gold standard Monte Carlo simulations
not the default because of the longer computation time?
One last thing, I still really cannot understand the error in the
'results' section and the scale. Why does the scale even exist? Does
this have something to do with the Bruker NC_PROC parameter? Errors
are simply errors, they are the standard deviation defined very
precisely as the square root of the variance. When presenting the
details about a normal distribution, you give the mean and variance,
not a scaled variance. There is no need to scale the error. In the
results section with the header:
Peak name F1 [ppm] F2 [ppm] T1 [s] error errorScale
The actual real error is equal to the presented error divided by
errorScale. Why is a scaled error, which is more than double the real
error, presented to the user? I can only imagine that this will cause
user confusion.
That picture still contains the word Monte Carlo. There will be an internal meeting here at Bruker about
this topic and I have to wait what comes out. Your statement is that the PDC is rather doing bootstrapping
and that this is not useful to determine the errors of the fit parameters. I will put this into the discussion.
It most definitely is Bootstrapping and not Monte Carlo simulations.
To the statistical field, it is well known that Bootstrapping does not
produce an error estimate. Whether the numbers are similar in certain
situations is not relevant.
In the output I changed "used integral errors" just into "integrals" not as I proposed yesterday "experimental
integral errors" which I think sounds stupid. In fact I now write integrals and integral errors for the 2 tables.
Strictly spoken integrals could be intensities of course.
Sorry, my confusion was not from the name but from the subsequent use
of those values by the PDC. All of the terminology made it clear that
it was the errors.
I also attached a new output example and this time I adjusted "variance averaging" instead of "worst case.."
Variance averaging results in one number that is added to the base plane RMSD. We then have identical
numbers in each column. Yesterday we used the "worst case.." option produces a number for each peak
that was added to the base plane RMSD. All error numbers in the table then were different.
I've made a comparison of relax's fit and the PDC values below. The
error estimates are not identical, but they are very close. The PDC
looks to be underestimating more then overestimating the errors (when
comparing to the gold standard). This is probably because of the
difference between the error estimate using the itself estimated
covariance matrix verses Monte Carlo simulations.
Cheers,
Edward
[edward@localhost pdc_files]$ relax convert_data_r1.py
relax repository checkout
Molecular dynamics by NMR data analysis
Copyright (C) 2001-2006 Edward d'Auvergne
Copyright (C) 2006-2011 the relax development team
This is free software which you are welcome to modify and redistribute
under the conditions of the
GNU General Public License (GPL). This program, including all
modules, is licensed under the GPL
and comes with absolutely no warranty. For details type 'GPL' within
the relax prompt.
Assistance in using the relax prompt and scripting interface can be
accessed by typing 'help' within
the prompt.
script = 'convert_data_r1.py'
----------------------------------------------------------------------------------------------------
#! /usr/bin/env python
from string import split, strip
# The file data.
file = open('testT1.txt')
lines = file.readlines()
file.close()
# The data - PDC T2, PDC T2 err, PDC scale factor, relax R2, relax R2
err (MC sim).
data = [
['Gln', 2, None, None, None, 2.1931543671, 0.0150953094197],
['Ile', 3, None, None, None, 2.33146407173, 0.0160577446857],
['Phe', 4, None, None, None, 2.3482964848, 0.019713488727],
['Val', 5, None, None, None, 2.2490812645, 0.0164275028474],
['Lys', 6, None, None, None, 2.27941978889, 0.015118434456],
['Thr', 7, None, None, None, 2.21973220186, 0.0142724888389],
['Leu', 8, None, None, None, 2.24024670209, 0.0220723067644],
['Thr', 9, None, None, None, 1.9682398599, 0.0320202880659],
['Gly', 10, None, None, None, 2.08801972269, 0.0133512466715],
['Lys', 11, None, None, None, 1.99776427143, 0.0119683402446],
['Thr', 12, None, None, None, 2.16448623389, 0.0174891646325],
['Ile', 13, None, None, None, 2.27255388183, 0.0203045621019],
['Thr', 14, None, None, None, 2.19928611083, 0.0137179789481],
['Leu', 15, None, None, None, 2.27943290864, 0.01841703596],
['Glu', 16, None, None, None, 2.08711900135, 0.0110867843226],
['Val', 17, None, None, None, 2.28472626854, 0.0149813716965],
['Glu', 18, None, None, None, 2.10803348086, 0.0165871439247],
['Ser', 20, None, None, None, 2.24335721396, 0.0136423661701],
['Asp', 21, None, None, None, 2.3491468761, 0.0123466109271],
['Thr', 22, None, None, None, 2.27508355307, 0.0179254994426],
['Ile', 23, None, None, None, 2.46729511309, 0.022032166148],
['Asn', 25, None, None, None, 2.39562318634, 0.0182835550536],
['Val', 26, None, None, None, 2.13340147848, 0.0124491345714],
['Lys', 27, None, None, None, 2.36258684463, 0.021452496764],
['Ala', 28, None, None, None, 2.38480717279, 0.0167515948037],
['Lys', 29, None, None, None, 2.32696831127, 0.0186312168945],
['Thr', 30, None, None, None, 2.42063807895, 0.016486280768],
['Gln', 31, None, None, None, 2.37225963882, 0.0145645530678],
['Asp', 32, None, None, None, 2.34327154101, 0.0141218929004],
['Lys', 33, None, None, None, 2.1927627806, 0.0133120293255],
['Glu', 34, None, None, None, 2.23809535599, 0.0144347657445],
['Gly', 35, None, None, None, 2.2461346515, 0.0122274121117],
['Ile', 36, None, None, None, 1.99122925973, 0.010910732682],
['Asp', 39, None, None, None, 2.29368330103, 0.0126561091892],
['Gln', 40, None, None, None, 2.25464084772, 0.0146681195393],
['Gln', 41, None, None, None, 2.22441709558, 0.0180042863108],
['Arg', 42, None, None, None, 2.28040886598, 0.0226407473358],
['Leu', 43, None, None, None, 2.17231047053, 0.0237437949291],
['Ile', 44, None, None, None, 2.27954369008, 0.0201909508149],
['Phe', 45, None, None, None, 2.26618944243, 0.0208950941528],
['Ala', 46, None, None, None, 2.17293180472, 0.0496094013897],
['Gly', 47, None, None, None, 2.19171718851, 0.0143718164149],
['Lys', 48, None, None, None, 2.20180904452, 0.0159731114836],
['Gln', 49, None, None, None, 2.11417079383, 0.0137897460469],
['Leu', 50, None, None, None, 2.24457803209, 0.0193730177104],
['Glu', 51, None, None, None, 2.13348726698, 0.0178416908891],
['Asp', 52, None, None, None, 2.0460014256, 0.0130027410002],
['Arg', 54, None, None, None, 2.19214984513, 0.0173101801326],
['Thr', 55, None, None, None, 2.27509231624, 0.0229093320248],
['Leu', 56, None, None, None, 2.36912374644, 0.0178696073777],
['Ser', 57, None, None, None, 2.29814984784, 0.0129737044328],
['Asp', 58, None, None, None, 2.37723175535, 0.0159530364094],
['Tyr', 59, None, None, None, 2.22560065336, 0.0147756564606],
['Asn', 60, None, None, None, 2.28460144064, 0.0121672447449],
['Ile', 61, None, None, None, 2.34228508004, 0.0180155615894],
['Gln', 62, None, None, None, 1.97375590837, 0.0127517602123],
['Lys', 63, None, None, None, 2.16513054478, 0.00997271942879],
['Glu', 64, None, None, None, 2.31915705344, 0.0199426460098],
['Ser', 65, None, None, None, 2.18919504068, 0.0149076372426],
['Thr', 66, None, None, None, 2.1815681959, 0.0155283232121],
['Leu', 67, None, None, None, 2.26786646063, 0.0200115697764],
['His', 68, None, None, None, 2.24651038599, 0.0204872753534],
['Leu', 69, None, None, None, 2.24847954259, 0.0171435701978],
['Val', 70, None, None, None, 2.31549239579, 0.0204085987616],
['Leu', 71, None, None, None, 2.17566168322, 0.0141512899583],
['Arg', 72, None, None, None, 2.1637210197, 0.0124881076678],
['Leu', 73, None, None, None, 1.85663248042, 0.00972554000058],
['Arg', 74, None, None, None, 1.61255064461, 0.0109491376192],
['Gly', 75, None, None, None, 1.16260116356, 0.0127191083108],
['Gly', 76, None, None, None, 0.763062241201, 0.00312485609259]]
# Get the data.
in_data = False
index = 0
for line in lines:
# Split the line.
row = split(line, "\t")
# Strip the rubbish.
for j in range(len(row)):
row[j] = strip(row[j])
# Empty line.
if len(row) == 0:
continue
# The section.
if row[0] == 'SECTION:' and row[1] == 'results':
in_data = True
continue
# Not in the data section.
if not in_data:
continue
# The header line.
if row[0] == 'Peak name':
continue
# The values.
data[index][2] = float(row[3])
data[index][3] = float(row[4])
data[index][4] = float(row[5])
# Increment the residue index.
index += 1
r1 = []
r1_err = []
print("%-4s %-3s %-15s %-15s %-15s %-15s" % ('Name', 'Num', 'r1',
'err', 'r1_relax_ratio', 'err_relax_ratio'))
for i in range(len(data)):
r1.append(1.0/data[i][2])
r1_err.append((data[i][3] / data[i][4]) / data[i][2]**2)
print("%-4s %-3s %15.10f %15.10f %15.10f %15.10f" % (data[i][0],
data[i][1], r1[-1], r1_err[-1], r1[-1]/data[i][5],
r1_err[-1]/data[i][6]))
----------------------------------------------------------------------------------------------------
Name Num r1 err r1_relax_ratio
err_relax_ratio
Gln 2 2.1929824561 0.0148809999 0.9999216147
0.9858029042
Ile 3 2.3315458149 0.0163053899 1.0000350609
1.0154221671
Phe 4 2.3485204321 0.0203675004 1.0000953659
1.0331758464
Val 5 2.2492127755 0.0166899526 1.0000584732
1.0159762412
Lys 6 2.2794620470 0.0157196001 1.0000185390
1.0397637516
Thr 7 2.2197558269 0.0145190072 1.0000106432
1.0172722778
Leu 8 2.2401433692 0.0218844554 0.9999538743
0.9914892740
Thr 9 1.9681165125 0.0315532109 0.9999373311
0.9854130882
Gly 10 2.0881186051 0.0135167223 1.0000473570
1.0123940199
Lys 11 1.9976028765 0.0124515847 0.9999192122
1.0403769011
Thr 12 2.1645021645 0.0175877778 1.0000073600
1.0056385321
Ile 13 2.2727272727 0.0193061408 1.0000762978
0.9508277371
Thr 14 2.1992522542 0.0137104168 0.9999846056
0.9994487429
Leu 15 2.2794620470 0.0184387216 1.0000127831
1.0011774750
Glu 16 2.0872469213 0.0117003083 1.0000612902
1.0553383197
Val 17 2.2846698652 0.0148687303 0.9999753129
0.9924812359
Glu 18 2.1079258010 0.0157358966 0.9999489193
0.9486802945
Ser 20 2.2431583670 0.0139071770 0.9999113619
1.0194109174
Asp 21 2.3490721165 0.0119404908 0.9999681759
0.9671067493
Thr 22 2.2753128555 0.0182975360 1.0001007886
1.0207545988
Ile 23 2.4673081668 0.0226133071 1.0000052907
1.0263769318
Asn 25 2.3957834212 0.0180897942 1.0000668865
0.9894024526
Val 26 2.1335609132 0.0116003422 1.0000747326
0.9318191703
Lys 27 2.3623907394 0.0204662691 0.9999169956
0.9540273708
Ala 28 2.3849272597 0.0166220009 1.0000503550
0.9922637881
Lys 29 2.3272050268 0.0189131753 1.0001017270
1.0151336546
Thr 30 2.4207213750 0.0167409705 1.0000344108
1.0154485866
Gln 31 2.3724792408 0.0148096778 1.0000925708
1.0168302292
Asp 32 2.3430178069 0.0144732222 0.9998917180
1.0248783413
Lys 33 2.1929824561 0.0131268532 1.0001001821
0.9860895665
Glu 34 2.2381378693 0.0140511562 1.0000189953
0.9734246076
Gly 35 2.2461814915 0.0118392398 1.0000208536
0.9682539250
Ile 36 1.9912385504 0.0108451543 1.0000046658
0.9939895492
Asp 39 2.2935779817 0.0127101007 0.9999540829
1.0042660471
Gln 40 2.2547914318 0.0147550484 1.0000667885
1.0059263816
Gln 41 2.2241992883 0.0182240447 0.9999020834
1.0122058949
Arg 42 2.2805017104 0.0217261363 1.0000407139
0.9596033213
Leu 43 2.1724961981 0.0234451234 1.0000854977
0.9874210682
Ile 44 2.2794620470 0.0205899990 0.9999641844
1.0197637155
Phe 45 2.2660321777 0.0217755697 0.9999306039
1.0421379092
Ala 46 2.1729682747 0.0483898292 1.0000167837
0.9754165112
Gly 47 2.1915406531 0.0148651034 0.9999194534
1.0343232176
Lys 48 2.2016732717 0.0155452463 0.9999383358
0.9732134099
Gln 49 2.1141649049 0.0142897383 0.9999972145
1.0362582636
Leu 50 2.2446689113 0.0197525364 1.0000404883
1.0195900640
Glu 51 2.1335609132 0.0185638164 1.0000345192
1.0404740540
Asp 52 2.0458265139 0.0132912614 0.9999145105
1.0221892002
Arg 54 2.1920210434 0.0176455608 0.9999412441
1.0193747658
Thr 55 2.2753128555 0.0229745618 1.0000969364
1.0028472997
Leu 56 2.3691068467 0.0178614867 0.9999928667
0.9995455566
Ser 57 2.2983222248 0.0131015178 1.0000750068
1.0098517268
Asp 58 2.3769907297 0.0162427580 0.9998986108 1.0181609029
Tyr 59 2.2256843980 0.0154779709 1.0000376279 1.0475318627
Asn 60 2.2846698652 0.0125098898 1.0000299503 1.0281612720
Ile 61 2.3424689623 0.0183247678 1.0000785055 1.0171632834
Gln 62 1.9739439400 0.0128640189 1.0000952659 1.0088033912
Lys 63 2.1649707729 0.0098056514 0.9999262068 0.9832475002
Glu 64 2.3191094620 0.0194617967 0.9999794790 0.9758883889
Ser 65 2.1891418564 0.0151022978 0.9999757060 1.0130577769
Thr 66 2.1815008726 0.0153661351 0.9999691400 0.9895553365
Leu 67 2.2680880018 0.0202677453 1.0000976871 1.0128013689
His 68 2.2466861379 0.0203686835 1.0000782333 0.9942114404
Leu 69 2.2487069935 0.0157971135 1.0001011576 0.9214599619
Val 70 2.3153507756 0.0204338316 0.9999388380 1.0012363831
Leu 71 2.1758050479 0.0144021081 1.0000658947 1.0177240510
Arg 72 2.1635655560 0.0124230435 0.9999281499 0.9947899129
Leu 73 1.8566654289 0.0098446747 1.0000177464 1.0122496780
Arg 74 1.6126431221 0.0110867953 1.0000573486 1.0125724704
Gly 75 1.1626555052 0.0130808608 1.0000467414 1.0284416584
Gly 76 0.7627765065 0.0029557520 0.9996255421 0.9458842109
.