r6050 - /1.3/docs/latex/data_analysis.tex -- May 01, 2008

Author: bugman
Date: Thu May  1 17:16:03 2008
New Revision: 6050

URL: http://svn.gna.org/viewcvs/relax?rev=6050&view=rev
Log:
Removed a LaTeX file which was split into separate chapters long ago.


Removed:
    1.3/docs/latex/data_analysis.tex

Removed: 1.3/docs/latex/data_analysis.tex
URL: 
http://svn.gna.org/viewcvs/relax/1.3/docs/latex/data_analysis.tex?rev=6049&view=auto
==============================================================================
--- 1.3/docs/latex/data_analysis.tex (original)
+++ 1.3/docs/latex/data_analysis.tex (removed)
@@ -1,380 +1,0 @@
-% Data analysis.
-%%%%%%%%%%%%%%%%
-
-\chapter{Data analysis}
-
-
-
-% Introduction.
-%%%%%%%%%%%%%%%
-
-\section{Introduction}
-
-This chapter aims to explain not only the steps involved in the data 
analysis of relaxation data but also how to use the program relax to achieve 
this.  Although a work in progress it covers how to calculate the NOE, how to 
optimise and find the $\Rone$ and $\Rtwo$ relaxation rates, and how to 
implement model-free analysis.
-
-
-% Calculating the NOE.
-%%%%%%%%%%%%%%%%%%%%%%
-
-\section{Calculating the NOE}
-\index{NOE|textbf}
-
-The calculation of NOE values is a straight forward and quick procedure 
which involves two components -- the calculation of the value itself and the 
calculation of the errors.  To understand the steps involved the execution of 
a sample NOE calculation script will be followed in detail.
-
-
-% The sample script.
-\subsubsection{The sample script}
-
-\begin{exampleenv}
-\# Script for calculating NOEs. \\
- \\
-\# Create the run \\
-name = `noe' \\
-run.create(name, `noe') \\
- \\
-\# Load the sequence from a PDB file. \\
-pdb(name, `Ap4Aase\_new\_3.pdb', load\_seq=1) \\
- \\
-\# Load the reference spectrum and saturated spectrum peak intensities. \\
-noe.read(name, file=`ref.list', spectrum\_type=`ref') \\
-noe.read(name, file=`sat.list', spectrum\_type=`sat') \\
- \\
-\# Set the errors. \\
-noe.error(name, error=3600, spectrum\_type=`ref') \\
-noe.error(name, error=3000, spectrum\_type=`sat') \\
- \\
-\# Individual residue errors. \\
-noe.error(name, error=122000, spectrum\_type=`ref', res\_num=114) \\
-noe.error(name, error=8500, spectrum\_type=`sat', res\_num=114) \\
- \\
-\# Deselect unresolved residues. \\
-deselect.read(name, file=`unresolved') \\
- \\
-\# Calculate the NOEs. \\
-calc(name) \\
- \\
-\# Save the NOEs. \\
-value.write(name, param=`noe', file=`noe.out', force=1) \\
- \\
-\# Create grace files. \\
-grace.write(name, y\_data\_type=`ref', file=`ref.agr', force=1) \\
-grace.write(name, y\_data\_type=`sat', file=`sat.agr', force=1) \\
-grace.write(name, y\_data\_type=`noe', file=`noe.agr', force=1) \\
- \\
-\# View the grace files. \\
-grace.view(file=`ref.agr') \\
-grace.view(file=`sat.agr') \\
-grace.view(file=`noe.agr') \\
- \\
-\# Write the results. \\
-results.write(name, file=`results', dir=None, force=1) \\
- \\
-\# Save the program state. \\
-state.save(`save', force=1)
-\end{exampleenv}
-
-
-% Initialisation of the run.
-\subsubsection{Initialisation of the run} \label{NOE initialisation}
-
-Firstly to simplify referencing of the run name in the relevent functions 
the name \texttt{`noe'} is assigned to to the object \texttt{name} by the 
command
-
-\example{name = `noe'}
-
-Therefore instead of typing \texttt{`noe'} each time the run needs to be 
referenced, \texttt{name} can be used instead.  The run is created by the 
command
-
-\example{run.create(name, `noe')}
-
-This user function will then create a run which is named \texttt{`noe'}, the 
second argument setting the run type to that of calculating the NOE.  Setting 
the run type is important so that the program knows which user functions are 
compatible with the run, for example the function \texttt{minimise()} is 
meaningless in this sample script as the NOE values are directly calculated 
rather than optimised.
-
-
-% Loading the data.
-\subsubsection{Loading the data}
-
-The first thing which need to be completed prior to any residue specific 
command is to load the sequence.  In this case the command
-
-\example{pdb(name, `Ap4Aase\_new\_3.pdb', load\_seq=1)}
-\index{PDB}
-
-will extract the sequence from the PDB file `Ap4Aase\_new\_3.pdb'.  The 
first argument specifies the run into which the sequence will be loaded, the 
second specifies the file name, and the third causes the function to extract 
the sequence rather than just load the PDB into relax.  Although the PDB 
coordinates have been loaded into the program the structure serves no purpose 
when calculating NOE values.
-
-The next two commands
-
-\begin{exampleenv}
-noe.read(name, file=`ref.list', spectrum\_type=`ref') \\
-noe.read(name, file=`sat.list', spectrum\_type=`sat')
-\end{exampleenv}
-
-load the peak heights\index{peak!height} of the reference and saturated NOE 
experiments (although the volume\index{peak!volume} could be used instead).  
The keyword argument \texttt{format} has not been specified hence the default 
format of a Sparky\index{computer prgrams!Sparky} peak list (saved after 
typing \texttt{`lt'}) is assumed.  If the program XEasy\index{computer 
prgrams!XEasy} was used to analyse the spectra the argument 
\texttt{format='xeasy'} is necessary.  The first column of the file should be 
the Sparky assignment string and it is assumed that the 4$^\textrm{th}$ 
column contains either the peak height.  If you have any other format you 
would like read by relax please send an email to the relax development 
mailing list\index{mailing list!relax-devel} detailing the software used, the 
format of the file (specifically where the residue number and peak 
intensity\index{peak!intensity} are located), and possibly attaching an 
example of the file itself.
-
-
-% Setting the errors.
-\subsubsection{Setting the errors}
-
-In this example the errors where measured from the base plain noise.  The 
Sparky RMSD\index{RMSD} function was used to estimate the maximal noise 
levels across the spectrum in regions containing no peaks.  For the reference 
spectrum the RMSD was approximately 3600 whereas in the saturated spectrum 
the RMSD was 3000.  These errors are set by the commands
-
-\begin{exampleenv}
-noe.error(name, error=3600, spectrum\_type=`ref') \\
-noe.error(name, error=3000, spectrum\_type=`sat')
-\end{exampleenv}
-
-For the residue G114, the noise levels are significantly increased compared 
to the rest of the protein as the peak is located close to the water signal.  
The higher errors for this residue are specified by the commands
-
-\begin{exampleenv}
-noe.error(name, error=122000, spectrum\_type=`ref', res\_num=114) \\
-noe.error(name, error=8500, spectrum\_type=`sat', res\_num=114)
-\end{exampleenv}
-
-
-% Unresolved residues.
-\subsubsection{Unresolved residues}
-
-As the peaks of certain residues overlap to such an extent that the heights 
cannot be resolved a simple text file was created called \texttt{unresolved} 
in which each line consists of a single residue number.  By using the command
-
-\example{deselect.read(name, file=`unresolved')}
-
-all residues in the file \texttt{unresolved} are excluded from the analysis.
-
-
-% The NOE.
-\subsubsection{The NOE}
-
-At this point the NOE can be calculated.  The user function
-
-\example{calc(name)}
-
-will calculate both the NOE and the errors.  The NOE value will be 
calculated using the formula
-\begin{equation}
-NOE = \frac{I_{sat}}{I_{ref}},
-\end{equation}
-
-\noindent where $I_{sat}$ is the intensity of the peak in the saturated 
spectrum and $I_{ref}$ is that of the reference spectrum.  The error is 
calculated by
-\begin{equation}
-\sigma_{NOE} = \sqrt{\frac{(\sigma_{sat} \cdot I_{ref})^2 + (\sigma_{ref} 
\cdot I_{sat})^2}{I_{ref}}},
-\end{equation}
-
-\noindent where $\sigma_{sat}$ and $\sigma_{ref}$ are the peak intensity 
errors in the saturated and reference spectra respectively.  To create a file 
of the NOEs the command
-
-\example{value.write(name, param=`noe', file=`noe.out', force=1)}
-
-will create a file called \texttt{noe.out} with the NOE values and errors.  
The force flag will cause any file with the same name to be overwritten.  An 
example of the format of \texttt{noe.out} is
-
-{\footnotesize \begin{verbatim}
-Num  Name  Value                         Error
-1    GLY   None                          None
-2    PRO   None                          None
-3    LEU   None                          None
-4    GLY   0.12479588727508535           0.020551827436105764
-5    SER   0.42240815792914105           0.02016346825976852
-6    MET   0.45281703194372114           0.026272719841642134
-7    ASP   0.60727570079478255           0.032369427242382849
-8    SER   0.63871921623680161           0.024695665815261791
-9    PRO   None                          None
-10   PRO   None                          None
-11   GLU   None                          None
-12   GLY   0.92927160307645906           0.059569089743604184
-13   TYR   0.88832516377296256           0.044119641308479306
-14   ARG   0.84945042565860407           0.060533543601110441
-\end{verbatim}}
-
-
-% Viewing the results.
-\subsubsection{Viewing the results}
-
-Any two dimensional data set can be plotted in relax in conjunction with the 
program \href{http://plasma-gate.weizmann.ac.il/Grace/}{Grace}\index{computer 
program!Grace|textbf}.  The program is also known as Xmgrace and was 
previously known as ACE/gr or Xmgr.  The highly flexible relax user function 
\texttt{grace.write} is capable of producing 2D plots of any x-y data sets.  
The three commands
-
-\begin{exampleenv}
-grace.write(name, y\_data\_type=`ref', file=`ref.agr', force=1) \\
-grace.write(name, y\_data\_type=`sat', file=`sat.agr', force=1) \\
-grace.write(name, y\_data\_type=`noe', file=`noe.agr', force=1)
-\end{exampleenv}
-
-create three separate plots of the peak intensity of the reference and 
saturated spectra as well as the NOE.  The x-axis in all three defaults to 
the residue number.  As the x and y-axes can be any parameter the command
-
-\example{grace.write(name, x\_data\_type=`ref', y\_data\_type=`sat', 
file=`ref\_vs\_sat.agr', force=1)}
-
-would create a plot of the reference verses the saturated intensity with one 
point per residue.  Returning to the sample script three Grace data files are 
created \texttt{ref.agr}, \texttt{sat.agr}, and \texttt{noe.agr} and placed 
in the default directory \texttt{./grace}.  These can be visualised by 
opening the file within Grace.  However relax will do that for you with the 
commands
-
-\begin{exampleenv}
-grace.view(file=`ref.agr') \\
-grace.view(file=`sat.agr') \\
-grace.view(file=`noe.agr')
-\end{exampleenv}
-
-An example of the output after modifying the axes is shown in 
figure~\ref{fig: NOE plot}.
-
-\begin{figure}
-\centerline{\includegraphics[width=0.8\textwidth, bb=0 0 792 
612]{images/noe.eps.gz}}
-\caption[NOE plot]{A Grace\index{software!Grace|textbf} plot of the NOE 
value and error against the residue number.  This is an example of the output 
of the user function \texttt{grace.write()}.}\label{fig: NOE plot}
-\end{figure}
-
-
-
-% Relaxation curve-fitting.
-%%%%%%%%%%%%%%%%%%%%%%%%%%%
-
-\newpage
-\section{The $\Rone$ and $\Rtwo$ relaxation rates - relaxation curve-fitting}
-\index{relaxation curve-fitting|textbf}
-
-Relaxation curve-fitting involves a number of steps including the loading of 
data, the calculation of both the average peak 
intensity\index{peak!intensity} across replicated spectra and the standard 
deviations\index{standard deviation} of those peak intensities, selection of 
the experiment type, optimisation of the parameters of the fit, Monte Carlo 
simulations\index{Monte Carlo simulation} to find the parameter errors, and 
saving and viewing the results.  To simplify the process a sample script will 
be followed step by step as was done with the NOE calculation.
-
-% The sample script.
-\subsubsection{The sample script}
-
-\begin{exampleenv}
-\# Script for relaxation curve-fitting. \\
- \\
-\# Create the run. \\
-name = `rx' \\
-run.create(name, `relax\_fit') \\
- \\
-\# Load the sequence from a PDB file. \\
-pdb(name, `Ap4Aase\_new\_3.pdb', load\_seq=1) \\
- \\
-\# Load the peak intensities. \\
-relax\_fit.read(name, file=`T2\_ncyc1.list', relax\_time=0.0176) \\
-relax\_fit.read(name, file=`T2\_ncyc1b.list', relax\_time=0.0176) \\
-relax\_fit.read(name, file=`T2\_ncyc2.list', relax\_time=0.0352) \\
-relax\_fit.read(name, file=`T2\_ncyc4.list', relax\_time=0.0704) \\
-relax\_fit.read(name, file=`T2\_ncyc4b.list', relax\_time=0.0704) \\
-relax\_fit.read(name, file=`T2\_ncyc6.list', relax\_time=0.1056) \\
-relax\_fit.read(name, file=`T2\_ncyc9.list', relax\_time=0.1584) \\
-relax\_fit.read(name, file=`T2\_ncyc9b.list', relax\_time=0.1584) \\
-relax\_fit.read(name, file=`T2\_ncyc11.list', relax\_time=0.1936) \\
-relax\_fit.read(name, file=`T2\_ncyc11b.list', relax\_time=0.1936) \\
- \\
-\# Calculate the peak intensity averages and the standard deviation of all 
spectra. \\
-relax\_fit.mean\_and\_error(name) \\
- \\
-\# Deselect unresolved residues. \\
-deselect.read(name, file=`unresolved') \\
- \\
-\# Set the relaxation curve type. \\
-relax\_fit.select\_model(name, `exp') \\
- \\
-\# Grid search. \\
-grid\_search(name, inc=11) \\
- \\
-\# Minimise. \\
-minimise(`simplex', run=name, constraints=0) \\
- \\
-\# Monte Carlo simulations. \\
-monte\_carlo.setup(name, number=500) \\
-monte\_carlo.create\_data(name) \\
-monte\_carlo.initial\_values(name) \\
-minimise(`simplex', run=name, constraints=0) \\
-monte\_carlo.error\_analysis(name) \\
- \\
-\# Save the relaxation rates. \\
-value.write(name, param=`rx', file=`rx.out', force=1) \\
- \\
-\# Grace plots of the relaxation rate. \\
-grace.write(name, y\_data\_type=`rx', file=`rx.agr', force=1) \\
-grace.view(file=`rx.agr') \\
- \\
-\# Save the program state. \\
-state.save(file=name + `.save', force=1)
-\end{exampleenv}
-
-
-% Initialisation of the run and loading of the data.
-\subsubsection{Initialisation of the run and loading of the data}
-
-The start of this sample script is very similar to that of the NOE 
calculation on page~\pageref{NOE initialisation}.  The two commands
-
-\begin{exampleenv}
-name = `rx' \\
-run.create(name, `relax\_fit')
-\end{exampleenv}
-
-initialise the run by setting the variable \texttt{name} to \texttt{`rx'} to 
be used in the calls to user functions and creating a run called 
\texttt{`rx'}.  The run type is set to relaxation curve-fitting by the 
argument \texttt{`relax\_fit'}.  The sequence is extracted from a 
PDB\index{PDB} file using the same command as the NOE calculation script
-
-\example{pdb(name, `Ap4Aase\_new\_3.pdb', load\_seq=1)}
-\index{PDB}
-
-To load the peak intensities\index{peak!intensity} into relax the user 
function \texttt{relax\_fit.read} is executed.  Two important keyword 
arguments to this command are the file name and the relaxation time period of 
the experiment in seconds.  It is assumed that the file format is that of a 
Sparky\index{computer prgrams!Sparky} peak list.  Using the format argument, 
this can be changed to XEasy\index{computer prgrams!XEasy} text window output 
format.  To be able to import any other type of format please send an email 
to the relax development mailing list\index{mailing list!relax-devel} with 
the details of the format.  Adding support for new formats is trivial.  The 
following series of commands will load peak intensities from six different 
relaxation periods, four of which have been duplicated
-
-\begin{exampleenv}
-relax\_fit.read(name, file=`T2\_ncyc1.list', relax\_time=0.0176) \\
-relax\_fit.read(name, file=`T2\_ncyc1b.list', relax\_time=0.0176) \\
-relax\_fit.read(name, file=`T2\_ncyc2.list', relax\_time=0.0352) \\
-relax\_fit.read(name, file=`T2\_ncyc4.list', relax\_time=0.0704) \\
-relax\_fit.read(name, file=`T2\_ncyc4b.list', relax\_time=0.0704) \\
-relax\_fit.read(name, file=`T2\_ncyc6.list', relax\_time=0.1056) \\
-relax\_fit.read(name, file=`T2\_ncyc9.list', relax\_time=0.1584) \\
-relax\_fit.read(name, file=`T2\_ncyc9b.list', relax\_time=0.1584) \\
-relax\_fit.read(name, file=`T2\_ncyc11.list', relax\_time=0.1936) \\
-relax\_fit.read(name, file=`T2\_ncyc11b.list', relax\_time=0.1936)
-\end{exampleenv}
-
-
-% The rest of the setup.
-\subsubsection{The rest of the setup}
-
-Once all the peak intensity data has been loaded a few calculations are 
required prior to optimisation.  Firstly the peak intensities for individual 
residues needs to be averaged across replicated spectra.  The peak intensity 
errors also have to be calculated using the standard deviation formula.  
These two operations are executed by the user function
-
-\example{relax\_fit.mean\_and\_error(name)}
-
-Any residues which cannot be resolved due to peak overlap were included in a 
file called \texttt{`unresolved'}.  This file consists solely of one residue 
number per line.  These residues are excluded from the analysis by the user 
function
-
-\example{deselect.read(name, file=`unresolved')}
-
-Finally the experiment type is specified by the command
-
-\example{relax\_fit.select\_model(name, `exp')}
-
-The argument \texttt{`exp'} sets the relaxation curve to a two parameter 
\{$\mathrm{R}_x$, $I_0$\} exponential which decays to zero.  The formula of 
this function is
-\begin{equation}
- I(t) = I_0 e^{-\mathrm{R}_x \cdot t},
-\end{equation}
-
-\noindent where $I(t)$ is the peak intensity at any time point $t$, $I_0$ is 
the initial intensity, and $\mathrm{R}_x$ is the relaxation rate (either the 
$\Rone$ or $\Rtwo$).  Changing the user function argument to \texttt{`inv'} 
will select the inversion recovery experiment.  This curve consists of three 
paremeters \{$\Rone$, $I_0$, $I_{\infty}$\} and does not decay to zero.  The 
formula is
-\begin{equation}
- I(t) = I_{\infty} - I_0 e^{-\Rone \cdot t}.
-\end{equation}
-
-
-% Optimisation.
-\subsubsection{Optimisation}
-
-Now that everything has been setup minimision can be used to optimise the 
parameter values.  Firstly a grid search is applied to find a rough starting 
position for the subsequent optimisation algorithm.  Eleven increments per 
dimension of the model (in this case the two dimensions \{$\mathrm{R}_x$, 
$I_0$\}) is sufficient.  The user function for executing the grid search is
-
-\example{grid\_search(name, inc=11)}
-
-The next step is to select one of the minimisation algorithms to optimise 
the model parameters.  Currently for relaxation curve-fitting only simplex 
minimisation is supported.  This is because the relaxation curve-fitting C 
module is incomplete only implementing the chi-squared function.  The 
chi-squared gradient (the vector of first partial derivatives) and 
chi-squared Hessian (the matrix of second partial derivatives) are not yet 
implemented in the C modules and hence optimisation algorithms which only 
employ function calls are supported.  Simplex minimisation is the only 
technique in relax which fits this criteron.  In addition constraints cannot 
be used as the constraint algorithm is dependent on gradient calls.  
Therefore the minimisation command for relaxation curve-fitting is forced to 
be
-
-\example{minimise(`simplex', run=name, constraints=0)}
-
-
-% Error analysis.
-\subsubsection{Error analysis}
-
-Only one technique adequately estimates parameter errors when the parameter 
values where found by optimisation -- Monte Carlo simulations\index{Monte 
Carlo simulation}.
-
-\textbf{\textit{Please write me!}}
-
-
-
-% Model-free analysis.
-%%%%%%%%%%%%%%%%%%%%%%
-
-\newpage
-\section{Model-free analysis}
-\index{model-free analysis|textbf}
-
-\textbf{\textit{Please write me!}}
-
-Until this chapter is written please look at the sample script 
\texttt{`full\_analysis.py'}.
-
-
-% Reduced spectral density mapping.
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-
-\newpage
-\section{Reduced spectral density mapping}
-\index{reduced spectral density mapping|textbf}
-
-\textbf{\textit{Please write me!}}
-
-Until this chapter is written please look at the sample script 
\texttt{`jw\_mapping.py'}.
r6050 - /1.3/docs/latex/data_analysis.tex

Header

Content

Related Messages