Hi Andy and Troels, Please see below. Note I have CCed Nikolai, Martin, and Flemming as we are talking about their work and they may have an interest in the discussions. This is a public mailing list, so any messages sent are permanently archived and can never be deleted from the web. This thread, including all messages, is archived at http://thread.gmane.org/gmane.science.nmr.relax.devel/5410 (as well as 3 other places on the web). The implementation of your analytic model Andy is discussed in http://thread.gmane.org/gmane.science.nmr.relax.devel/5414. On 4 May 2014 22:59, Andrew Baldwin <andrew.baldwin@xxxxxxxxxxxxx> wrote:
Dear Troels, The existence of typos is exceedingly probable! The editors have added more errors in the transcrpition, so it'll be curious to see how accurate the final draft will be. Your comments come just in time for alterations. So I'll pay a pint of beer for every identified error. I didn't use that particular supplementary section to directly calculate anything, so this is the place in the manu where errors are most likely to occur. I do note the irony that this is probably one of few bits in the article that people are likely to read. 1) Yep - that's a typo. alpha- should definitely be KEG-KGE (score 1 pint). 2) Yep! zs should be zetas (score 1 pint).
As this is open for the world to see, and permanently archived, you'll have to keep your word ;) Hopefully the editor will still accept these important corrections.
3) This one is more interesting. Carver and Richard's didn't define a deltaR2.
This is not quite correct. Thought they didn't have a deltaR2 parameter, they did use the notation 1/T2A - 1/T2B (equation A8 of the original paper as Troels mentioned). I have respected that notation in relax: http://www.nmr-relax.com/manual/full_CR72_2_site_CPMG_model.html Also see section 10.3.3 in the PDF (http://download.gna.org/relax/manual/relax.pdf).
To my knowledge, I think Nikolai was the first to do that in his 2002 paper on reconstructing invisible states.
For reference here, the paper is J. Am. Chem. Soc. 2002, 124, 12352-12360 (http://dx.doi.org/10.1021/ja0207089). The notation defined here is: Delta_R = RB - RA. This is due to the separation of components in equation 2 - the RA part, as well as site A chemical shift, shifting to the first component and then being subtracted from the diagonal in the second. Because of this equation 2, Nikolai's definition of Delta_R is opposite from the Carver and Richards 1/T2A - 1/T2B component. But if site B was not "invisible" and you could see both exchanging peaks, then you could have an equivalent of equation 2 where R2B and omega_B are in the first component and then Delta_R' would be defined as RA - RB. You could then study the Delta_R and Delta_R' together. Note that at no point in relax do we use Nikolai's Delta_R definition. In Nikolai and Martin's code - the Maple expansion - which is in relax, the R20A == R20B assumption is made (see http://www.nmr-relax.com/manual/NS_2_site_expanded_CPMG_model.html). And in the other numeric models you will see that we never use a difference: http://www.nmr-relax.com/api/3.1/lib.dispersion.ns_matrices-module.html http://www.nmr-relax.com/api/3.1/lib.dispersion.ns_matrices-pysrc.html#rcpmg_2d http://www.nmr-relax.com/api/3.1/lib.dispersion.ns_matrices-pysrc.html#rcpmg_3d http://www.nmr-relax.com/api/3.1/lib.dispersion.ns_matrices-pysrc.html#rr1rho_3d_3site http://www.nmr-relax.com/api/3.1/lib.dispersion.ns_matrices-pysrc.html#rr1rho_3d
So here, I think you'll find that my definition is the correct one. There is a typo in the Carver Richard's equation.
Why is this typo in equation A8 not mentioned in the literature? There should also be a published re-derivation of these equations showing where this mistake occurred. This should stand out like a sore thumb!
I've just done a brief lit review, and as far as I can see, this mistake has propagated into every transcription of the equation since.
Well if there is a mistake in the original publication, that is to be expected. A re-derivation showing the mistake is also to be expected.
I should note that in the manu in this section. Likely this has gone un-noticed as differences in R2 are difficult to measure by CPMG, instead R2A=R2B being assumed. Though thanks to Flemming, it is now possible to break this assumption: J Am Chem Soc. 2009 Sep 9;131(35):12745-54. doi: 10.1021/ja903897e.
The last column of Table 2 of the Carver and Richards' paper shows that they themselves have analysed synthetic cases where R20A != R20B. Note that in this paper from Flemming, that he uses the notation R2,int(A) - R2,int(B). But the numerical solution in equations 4 and 5 do not use this difference (well partial solution as these are the differentials). Anyway, the original Carver and Richards model allows the R20A == R20B assumption to be broken.
Also, in the Ishima and Torchia paper looking at the effects of relaxation differences: Journal of biomolecular NMR. 2006 34(4):209-19
In this paper, they optimise the original Carver and Richards model. But they do not mention the typo in the original paper. Here their definition of Delta_R2 is actually the systematic error in the CPMG experiment (see equation 1), not the R20 rate difference between the two states. Also if you look at the first paragraph in the section labelled as "Data fitting using Carver–Richards equation", you can see that they only optimised the parameters R20, delta_omega, pA and tex. So as far as I was aware, they have used R20A == R20B throughout. Hence this does not demonstrate a mistake in the Carver and Richards equations.
I understand their derivations started from Nikolai's formula, which is exact, so the error wouldn't come to light there either, necessarily.
I don't see how these publications support the fact that the original Carver and Richards equation is wrong. Maybe there are better references?
To prove the point try this: numerically evaluate R2,effs over a big grid of parameters, and compare say Nikolai's formula to your implementation of Carver Richard's with a +deltaR2 and a -deltaR2 in the alpha- parameter. Which agrees better? You'll see that dR2=R2E-R2G being positive wins.
This can be performed in relax by using the 'NS CPMG 2-site 3D full' model to simulate the R2eff data, and the 'CR72 full' model with and without the 'typo correction' to fit it. Rather than an extensive grid search, a single point where R20A and R20B are quite different should be a sufficient demonstration.
The attached should also help. It's some test code I used for the paper that has been briefly sanitised. In it, the numerical result, Carver Richards, Meiboom, Nikolai's and my formula are directly compared for speed (roughly) and precision.
Could you please attach files to https://gna.org/support/?3155 or the other support requests (https://gna.org/support/?group=relax) rather than sending it to public mailing lists. Cheers!
The code added for Nikolai's formula at higher precision was contributed by Nikolai after the two of us discussed some very odd results that can come from his algorithm. Turns out evaluating his algorithm at double precision (roughly 9 decimal places, default in python and matlab) is insuffficient.
Which optimisation algorithm did you use and what function tolerance, X tolerance (minimum step size), and maximum number of iterations did you use? These settings are often dominant to the equations of the base model. Do you have a script or parameter set which demonstrates the failure of Nikolai's Maple expansion? I would like to get this into relax to blast it with the Nelder-Mead simplex algorithm which trounces Levenberg-Marquardt in accuracy and speed. There is also 3D space mapping functionality (the dx.map user function) which can be used to directly show the change in the optimisation space due to precision, etc.
About 23 decimal places is necessary for exactness. This is what is discussed in the supplementary info in the recent paper. In short, on my laptop: formula / relative speed / biggest single point error in s-1 over grid Mine 1 7.2E-10 Meiboom 0.3 9E7 Carver Richards 0.53 8 Nikolai (double precision 9dp) 3.1 204 Nikolai (long double 18dp) 420 3E-2 Nikolai (23 dp) 420 2E-7 Numerical 118 0 The error in Carver Richard's with the alpha- sign made consistent with the Carver Richard's paper gives an error of 4989 s-1 in the same test. Ie, it's a bad idea. Note that the precise calculations in python at arbitrary precision are slow not because of the algorithm, but because of the mpmath module, which deals with C code via strings. Accurate but slow. This code has a small amount of additional overhead that shouldn't strictly be included in the time. I've no doubt also that algorithms here could be sped up further by giving them a bit of love. Those prefaces aside, I'm surprised that your benchmarking last Friday showed Nikolai's to be so much slower than mine? In my hands (attached) at double precision, Nikolai's formula is only 3x slower than mine using numpy. Though this is not adequate precision for mistake free calculations.
It's ok. In relax, the models are implemented as the original authors intended. If there is a mistake in the original model, then that mistake is replicated. But, importantly, well documented. The subsequent paper which fixes the model will then be used to implement a second model with the fix. This is to allow relax users to test historical results and to compare them to all other models and fixed models. It also means that history is not rewritten and that original equations are permanently preserved. Nikolai's code was implemented as 64-bit - due to it being Matlab code - so that's what we will preserve in relax. For reference, Troels' post with the timings is at http://thread.gmane.org/gmane.science.nmr.relax.devel/5410/focus=5515. The time for your model, Andy, was 6.001s whereas Nikolai's Maple expansion (the "NS CPMG 2-site expanded" model in relax) was 8.735s. I have made sure that Nikolai's code is as optimised as possible in relax. For example in the script you attached, you have a lot of integers: t3 = mp.mpc(0,1)/2; t4 = t3*dw; t5 = Ka/2; t6 = Ra/2; t7 = Kb/2; t8 = Rb/2; t9 = -2*(mp.mpc(0,1)); t14 = 2*(mp.mpc(0,1)); t20 = 2*Kb*Ka; t22 = 2*Ka*Rb; t24 = 2*Ra*Kb; t26 = 2*Ra*Rb; t28 = 2*Kb*Rb; t30 = 2*Ka*Ra; If you use 2.0 instead of 2, then Python type conversions are avoided, speeding up the code. As far as I'm aware, such a trick is not required in Matlab. There are many other small optimisations I have made. For your model labelled as "B14" in relax, these optimisation are not present yet. But the factor of 3x rather than current 1.5x will probably be replicated once the code is optimised.
Perhaps more physically compelling, to get my equation to reduce to Carver Richard's, alpha- has to be defined as I show it. Note that the origin of this parameter is its role in the Eigenvalue of the 2x2 matrix. This is the one advantage of a derivation - you can see where the parameters come from and get a feel for what their meaning is. So this typo, and the fact that Carver and Richards inadvertently change their definition of tau_cpmg half way through the paper, are both errors in the original manuscript. Though given that proof reading and type-setting in 1972 would have been very very painful, that there are this few errors is still actually quite remarkable. I notice in your CR72 code, you are currently using the incorrect definition, so I would suggest you change that.
We should show this to be the case in relax's implementation before changing the Carver and Richards formula. Yes, I have used the original definition. But I have not encountered literature, apart from your paper Andy (http://dx.doi.org/10.1007%2Fs10858-012-9694-6), that says that it is incorrect. Therefore I would really also like to have a direct literature quote + derivation/demonstration of the Carver and Richards typo before we change it. And a internal demonstration of inconsistency between the 'NS CPMG 2-site 3D full' and 'CR72 full' models in relax. Once a strong proof of the Carver and Richards mistake is demonstrated, I would then insist that the corrected model be called something different and that we preserve the original equations as the 'CR72 full' model. The Carver and Richards model is so historically important for the dispersion field that we must preserve it as is. Cheers, Edward