Author: bugman Date: Wed May 14 00:36:39 2014 New Revision: 23163 URL: http://svn.gna.org/viewcvs/relax?rev=23163&view=rev Log: Added a new section for the stereochemistry analysis to the N-state model chapter of the manual. This is just an initial introduction and an inclusion of the sample script. Modified: trunk/docs/latex/n_state.tex Modified: trunk/docs/latex/n_state.tex URL: http://svn.gna.org/viewcvs/relax/trunk/docs/latex/n_state.tex?rev=23163&r1=23162&r2=23163&view=diff ============================================================================== --- trunk/docs/latex/n_state.tex (original) +++ trunk/docs/latex/n_state.tex Wed May 14 00:36:39 2014 @@ -83,3 +83,171 @@ This is not used in optimisation but rather is used to calculate NOE constraint violations. These violations are then compared to evaluate the ensemble. In the stereochemistry auto-analysis, these violations will also be converted to Q factors to allow direct comparison with RDC Q factors. + + + +% Stereochemistry. +%%%%%%%%%%%%%%%%%% + +\section{Determining stereochemistry in dynamic molecules} + +A published application of the N-state model in relax is: +\begin{itemize} + \item \bibentry{Sun11} +\end{itemize} + +This analysis of the stereochemistry of a small molecule consists of two steps. +The first part is to determine the relative configuration. +The idea is to use NMR data (consisting of RDCs and NOEs) to find the relative configuration. +Ensembles of 10 members are created from molecular dynamics simulations (MD)\index{molecular dynamics simulation} or simulated annealing (SA)\index{simulated annealing}. +These are then ranked by the RDC Q factor and NOE violation. +By converting the NOE violation into a Q factor: +\begin{equation} + Q_{\textrm{NOE}}^2 = \frac{U}{\sum_i \textrm{NOE}^2}, +\end{equation} + +where U is the quadratic flat bottom well potential, i.e.\ the NOE violation in \AA$^2$, and the denominator is the sum of all squared NOEs. +A combined Q factor is calculated as: +\begin{equation} + Q_{\textrm{total}}^2 = Q_{\textrm{NOE}}^2 + Q_{\textrm{RDC}}^2. +\end{equation} + +The second step is to distinguish enantiomers. +As NMR data is symmetric, it cannot distinguish enantiomers. +Therefore an optical technique such as \href{http://en.wikipedia.org/wiki/Optical\_rotatory\_dispersion}{optical rotatory dispersion} can be used. +For molecules experiencing large amounts of motion, sampling all possible conformations, calculating the expected dispersion properties, and calculating an averaged dispersion curve is not feasible. +The idea is therefore to combine NMR and ORD by taking the best NMR ensembles from step one to use for ORD spectral prediction. + + +% Auto-analysis. +%~~~~~~~~~~~~~~~ + +\subsection{Stereochemistry -- the auto-analysis} + + +Step one of the N-state model is implemented as an auto-analysis. +This is located in the module \module{auto\_analysis\pysep{}stereochem\_analysis} (see \url{http://www.nmr-relax.com/api/3.1/auto_analyses.stereochem_analysis-module.html}). +The auto-analysis is accessed via the \module{Stereochem\_\linebreak[0]analysis} class, the details of which can be seen at \url{http://www.nmr-relax.com/api/3.1/auto_analyses.stereochem_analysis.Stereochem_analysis-class.html}. + + +% The sample script. +%~~~~~~~~~~~~~~~~~~~ + +\subsection{Stereochemistry -- the sample script} + +The following script was used for the analysis in \citet{Sun11}. +It is used to complete the first step of the analysis, the determination of relative configuration, and for the generation of ensembles for the second step of the analysis. + + +\begin{lstlisting} +"""Script for the determination of relative stereochemistry. + +The analysis is preformed by using multiple ensembles of structures, randomly sampled from a given set of structures. The discrimination is performed by comparing the sets of ensembles using NOE violations and RDC Q factors. + +This script is split into multiple stages: + + 1. The random sampling of the snapshots to generate the N ensembles (NUM_ENS, usually 10,000 to 100,000) of M members (NUM_MODELS, usually ~10). The original snapshot files are expected to be named the SNAPSHOT_DIR + CONFIG + a number from SNAPSHOT_MIN to SNAPSHOT_MAX + ".pdb", e.g. "snapshots/R647.pdb". The ensembles will be placed into the "ensembles" directory. + + 2. The NOE violation analysis. + + 3. The superimposition of ensembles. For each ensemble, Molmol is used for superimposition using the fit to first algorithm. The superimposed ensembles will be placed into the "ensembles_superimposed" directory. This stage is not necessary for the NOE analysis. + + 4. The RDC Q factor analysis. + + 5. Generation of Grace graphs. + + 6. Final ordering of ensembles using the combined RDC and NOE Q factors, whereby the NOE Q factor is defined as:: + + Q^2 = U / sum(NOE_i^2), + + where U is the quadratic flat bottom well potential - the NOE violation in Angstrom^2. The denominator is the sum of all squared NOEs - this must be given as the value of NOE_NORM. The combined Q is given by:: + + Q_total^2 = Q_NOE^2 + Q_RDC^2. +""" + +# relax module imports. +from auto_analyses.stereochem_analysis import Stereochem_analysis + + +# Stage of analysis (see the docstring above for the options). +STAGE = 1 + +# Number of ensembles. +NUM_ENS = 100000 + +# Ensemble size. +NUM_MODELS = 10 + +# Configurations. +CONFIGS = ["R", "S"] + +# Snapshot directories (corresponding to CONFIGS). +SNAPSHOT_DIR = ["snapshots", "snapshots"] + +# Min and max number of the snapshots (corresponding to CONFIGS). +SNAPSHOT_MIN = [0, 0] +SNAPSHOT_MAX = [76, 71] + +# Pseudo-atoms. +PSEUDO = [ + ["Q7", ["@H16", "@H17", "@H18"]], + ["Q9", ["@H20", "@H21", "@H22"]], + ["Q10", ["@H23", "@H24", "@H25"]] +] + +# NOE info. +NOE_FILE = "noes" +NOE_NORM = 50 * 4**2 # The NOE normalisation factor (sum of all NOEs squared). + +# RDC file info. +RDC_NAME = "PAN" +RDC_FILE = "pan_rdcs" +RDC_SPIN_ID1_COL = 1 +RDC_SPIN_ID2_COL = 2 +RDC_DATA_COL = 2 +RDC_ERROR_COL = None + +# Bond length. +BOND_LENGTH = 1.117 * 1e-10 + +# Log file output (only for certain stages). +LOG = True + +# Number of buckets for the distribution plots. +BUCKET_NUM = 200 + +# Distribution plot limits. +LOWER_LIM_NOE = 0.0 +UPPER_LIM_NOE = 600.0 +LOWER_LIM_RDC = 0.0 +UPPER_LIM_RDC = 1.0 + + +# Set up and code execution. +analysis = Stereochem_analysis( + stage=STAGE, + num_ens=NUM_ENS, + num_models=NUM_MODELS, + configs=CONFIGS, + snapshot_dir=SNAPSHOT_DIR, + snapshot_min=SNAPSHOT_MIN, + snapshot_max=SNAPSHOT_MAX, + pseudo=PSEUDO, + noe_file=NOE_FILE, + noe_norm=NOE_NORM, + rdc_name=RDC_NAME, + rdc_file=RDC_FILE, + rdc_spin_id1_col=RDC_SPIN_ID1_COL, + rdc_spin_id2_col=RDC_SPIN_ID2_COL, + rdc_data_col=RDC_DATA_COL, + rdc_error_col=RDC_ERROR_COL, + bond_length=BOND_LENGTH, + log=LOG, + bucket_num=BUCKET_NUM, + lower_lim_noe=LOWER_LIM_NOE, + upper_lim_noe=UPPER_LIM_NOE, + lower_lim_rdc=LOWER_LIM_RDC, + upper_lim_rdc=UPPER_LIM_RDC +) +analysis.run() +\end{lstlisting}