Dispersion GUI mode - comparison of the analyses

To compare the non-clustered and clustered analyses statistically, the advanced Akaike's Information Criterion (AIC), as derived in d'Auvergne and Gooley (2003), can be used. The required values are stored in the recorded log files. Open the ~/tmp/dispersion/log_non_clustered file and search for the model selection section. The text for residues 59 to 67 should be:

The spin cluster [':59@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2          Criterion     
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   1577.42286    1581.42286    
CR72 - relax_disp (Mon Feb 17 18:00:16 2014)                       5                 30                   31.48415      41.48415      
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   31.84758      41.84758      
The model from the data pipe 'CR72 - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':60@N'].
# Data pipe                                       Num_params_(k)    Num_data_sets_(n)    Chi2          Criterion     
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)    2                 30                   2647.97449    2651.97449    
The model from the data pipe 'No Rex - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':61@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2           Criterion      
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   15019.24382    15023.24382    
CR72 - relax_disp (Mon Feb 17 18:00:16 2014)                       5                 30                   77.50622       87.50622       
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   74.73334       84.73334       
The model from the data pipe 'NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':62@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2         Criterion    
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   722.91592    726.91592    
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   30.11618     40.11618     
The model from the data pipe 'NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':63@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2          Criterion     
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   5455.72135    5459.72135    
CR72 - relax_disp (Mon Feb 17 18:00:16 2014)                       5                 30                   58.56731      68.56731      
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   59.90738      69.90738      
The model from the data pipe 'CR72 - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':64@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2           Criterion      
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   13736.91051    13740.91051    
CR72 - relax_disp (Mon Feb 17 18:00:16 2014)                       5                 30                   28.66223       38.66223       
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   29.54008       39.54008       
The model from the data pipe 'CR72 - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':65@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2          Criterion     
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   2498.29408    2502.29408    
CR72 - relax_disp (Mon Feb 17 18:00:16 2014)                       5                 30                   35.13518      45.13518      
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   36.17043      46.17043      
The model from the data pipe 'CR72 - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':66@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2         Criterion    
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   962.74016    966.74016    
CR72 - relax_disp (Mon Feb 17 18:00:16 2014)                       5                 30                   15.02929     25.02929     
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   15.08439     25.08439     
The model from the data pipe 'CR72 - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.


The spin cluster [':67@N'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2           Criterion      
No Rex - relax_disp (Mon Feb 17 18:00:16 2014)                     2                 30                   16773.20431    16777.20431    
CR72 - relax_disp (Mon Feb 17 18:00:16 2014)                       5                 30                   118.17857      128.17857      
NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)    5                 30                   111.56710      121.56710      
The model from the data pipe 'NS CPMG 2-site expanded - relax_disp (Mon Feb 17 18:00:16 2014)' has been selected.

For the log file from the clustered analysis (the ~/tmp/dispersion/log_clustered file), the text should be as follows:

The spin cluster [':59', ':60', ':61', ':62', ':63', ':64', ':65', ':66', ':67'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2           Criterion      
No Rex - relax_disp (Sun Feb 23 19:36:51 2014)                     16                240                  56914.82514    56946.82514    
NS CPMG 2-site expanded - relax_disp (Sun Feb 23 19:36:51 2014)    26                240                  510.96553      562.96553      
The model from the data pipe 'NS CPMG 2-site expanded - relax_disp (Sun Feb 23 19:36:51 2014)' has been selected.
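The table rows in these logs have a regular layout, so they can be pulled out programmatically rather than copied by hand. The following is a minimal sketch, not part of relax: the function name `parse_model_selection` and the regular expression `ROW` are hypothetical, and the layout is assumed to be exactly as in the excerpts above. The sample text is embedded here for illustration; in practice it would be read from the log file.

```python
import re

# Sample copied from the clustered log excerpt; in practice, read the text
# from the log file instead of embedding it.
LOG_TEXT = """\
The spin cluster [':59', ':60', ':61', ':62', ':63', ':64', ':65', ':66', ':67'].
# Data pipe                                                        Num_params_(k)    Num_data_sets_(n)    Chi2           Criterion
No Rex - relax_disp (Sun Feb 23 19:36:51 2014)                     16                240                  56914.82514    56946.82514
NS CPMG 2-site expanded - relax_disp (Sun Feb 23 19:36:51 2014)    26                240                  510.96553      562.96553
The model from the data pipe 'NS CPMG 2-site expanded - relax_disp (Sun Feb 23 19:36:51 2014)' has been selected.
"""

# Hypothetical pattern for one table row: model name, the data pipe suffix,
# then the four numeric columns k, n, chi-squared and criterion.
ROW = re.compile(
    r"^(?P<model>.+?) - relax_disp \(.*?\)"
    r"\s+(?P<k>\d+)\s+(?P<n>\d+)"
    r"\s+(?P<chi2>\d+\.\d+)\s+(?P<aic>\d+\.\d+)\s*$"
)


def parse_model_selection(text):
    """Return one dict per table row found in the log text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "model": match.group("model"),
                "k": int(match.group("k")),
                "n": int(match.group("n")),
                "chi2": float(match.group("chi2")),
                "aic": float(match.group("aic")),
            })
    return rows


rows = parse_model_selection(LOG_TEXT)
for row in rows:
    print(row["model"], row["k"], row["n"], row["chi2"], row["aic"])
```

In the rows shown above, the 'Criterion' value equals the chi-squared value plus twice the number of parameters, which gives a quick consistency check on the parsed numbers.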

The numbers for the 'NS CPMG 2-site expanded' model can be compared directly. Summing the parameter number, data set number, chi-squared value and AIC value (labelled 'Criterion' in the logs) over the individual spins of the non-clustered analysis gives totals that can be set against the clustered values. Residue 60 has no entry for this model in the non-clustered log, so eight residues contribute to the sums.

Analysis	Parameter number (k)	Data set number (n)	Chi-squared value	AIC value
Non-clustered	40	240	388.966	468.966
Clustered	26	240	510.966	562.966
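The non-clustered totals can be reproduced from the log excerpt. The sketch below copies the 'NS CPMG 2-site expanded' rows from the non-clustered log by hand; the names `NON_CLUSTERED`, `CLUSTERED` and `column_sums` are illustrative only and are not part of relax.

```python
# (k, n, chi2, criterion) for the 'NS CPMG 2-site expanded' rows of the
# non-clustered log. Residue 60 has no row for this model in the log,
# so it does not appear here.
NON_CLUSTERED = {
    59: (5, 30, 31.84758, 41.84758),
    61: (5, 30, 74.73334, 84.73334),
    62: (5, 30, 30.11618, 40.11618),
    63: (5, 30, 59.90738, 69.90738),
    64: (5, 30, 29.54008, 39.54008),
    65: (5, 30, 36.17043, 46.17043),
    66: (5, 30, 15.08439, 25.08439),
    67: (5, 30, 111.56710, 121.56710),
}

# The single 'NS CPMG 2-site expanded' row of the clustered log.
CLUSTERED = (26, 240, 510.96553, 562.96553)


def column_sums(rows):
    """Sum each of the four columns over all rows."""
    rows = list(rows)
    return tuple(sum(row[i] for row in rows) for i in range(4))


totals = column_sums(NON_CLUSTERED.values())

print("Non-clustered: k=%d n=%d chi2=%.3f AIC=%.3f" % totals)
print("Clustered:     k=%d n=%d chi2=%.3f AIC=%.3f" % CLUSTERED)
print("Lower AIC:", "non-clustered" if totals[3] < CLUSTERED[3] else "clustered")
```

The summed data set number matches the clustered value of 240, which confirms that both analyses are being judged against the same amount of data.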

The AIC value is much lower for the non-clustered analysis (468.966 compared to 562.966). This result is therefore the most parsimonious, the one closest to Occam's razor as defined by frequentist statistics, and the non-clustered analysis is the statistically better description of the experimental data for this set of residues. For a different cluster of spins, the result may be different. If the assumption of the same dynamics for all spins (both the populations p_A and the exchange rates k_ex) is correct, the results of the clustered analysis are nevertheless useful, as clustering can decrease the parameter uncertainty. If the assumption is not correct, then the decrease in parameter uncertainty will be coupled with a parameter bias, a shift of the parameter away from reality. This should be avoided at all costs.
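The trade-off between fit quality and parameter count can be written out. In both logs the 'Criterion' column equals the chi-squared value plus twice the number of parameters. Taking that form as given, and using the rounded values from the table, the difference between the two analyses splits into a fit term and a penalty term:

```latex
\begin{align*}
\mathrm{AIC} &= \chi^2 + 2k, \\
\Delta\mathrm{AIC} &= \mathrm{AIC}_\mathrm{clustered} - \mathrm{AIC}_\mathrm{non\text{-}clustered} \\
  &= \left(\chi^2_\mathrm{clustered} - \chi^2_\mathrm{non\text{-}clustered}\right)
     + 2\left(k_\mathrm{clustered} - k_\mathrm{non\text{-}clustered}\right) \\
  &= (510.966 - 388.966) + 2\,(26 - 40) \\
  &= 122.000 - 28 = 94.000.
\end{align*}
```

Clustering removes 14 parameters and so lowers the penalty by 28, but the chi-squared value rises by about 122, leaving the clustered AIC about 94 higher.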

To perform a relaxation dispersion analysis on your own system, care should be taken in the setup, the choice of models and the design of the clustering:

Inspect the dispersion curves of all spin systems one by one and decide if any spins should be deselected for the entire analysis (due to the data being of insufficient quality).
Depending on the dynamics of the system, the type of data collected (SQ CPMG vs. MMQ CPMG vs. R_1ρ), and personal preferences, choose the limited set of models to be used in the analysis. For this, the published literature should be consulted. Only use models that you are sure are suited to the system being studied.
Decide on a number of spin clustering schemes to compare to the non-clustered analysis.
To decide which analysis best represents the dynamics of the system, a balance needs to be struck between the statistical significance (based on modern frequentist statistics such as AIC), the decrease in parameter uncertainty, and the increase in parameter bias.
All results for all spins should be carefully inspected and compared.

The relax user manual (PDF), created 2024-06-08.