Re: Curve fitting and input -- October 11, 2008

On Thu, Oct 9, 2008 at 10:57 PM, Sébastien Morin
<sebastien.morin.1@xxxxxxxxx> wrote:

Hi all,

Talking with a collegue interested in getting his R1, R2 and NOE from
relax, I realized that input of peak amplitudes is now supported in two
formats, Sparky and XEasy, i.e. using peak lists which include
intensities. I would like to propose two ideas concerning the reading of
peak amplitudes within relax...

1.
Someone proposed the implementation for the reading of a NMRView peak
list. I agree that this should be done. However, NMRView peak lists
include both a peak intensity and a peak volume. (What about in Sparky
and XEasy ?) Some people prefer using peak intensities, some argue that
peak volumes are less error prone since they average local noise
(intensities being sensitive to noise spikes). Hence, when inputing a
NMRView peak list, one should specify the wish to use intensities or
volumes. This will have an impact on how errors are calculated,
especially if standard deviation of the spectra noise is to be used as
an error source...


The proposal was by Ryan MB Hoffman <rmb dot hoffman at gmail dot com>
in the bug #11913 report at http://gna.org/bugs/?11913.  Note that
I've created the 'spectral_errors' branch in response to Wei Xia's
post at https://mail.gna.org/public/relax-users/2008-09/msg00000.html.
 Tyler Reddy <TREDDY at dal dot ca> has also asked about this
(https://mail.gna.org/public/relax-users/2008-10/msg00016.html).  Wei
and Tyler, I've added you to the CC list in case you may be interested
in this discussion.  Please ignore these messages if you are not
though.  Although this branch is to develop the code to handle the
situation where no duplicate spectra have been collected in an R1 or
R2 experiment, this branch will affect what you propose.  See my
message at https://mail.gna.org/public/relax-users/2008-09/msg00002.html
for details.  So I would suggest to that these ideas go into this
branch.

The idea I had is to create a new user function class called
'spectrum' to handle reading peak intensities and handling spectral
errors (and any other spectrum related methods needed in the future).
A number of diverse user functions will be collected into this class
and modified to be more generic.  For example:

spectrum.read_intensities() -> from relax_fit.read() and noe.read().
spectrum.error() -> from noe.error().

For inputting peak intensities (be that heights or volumes), the
spectrum.read_intensities() user function can be used and the data
stored in the SpinContainer instances.  This function can be modified
to handle a number of file formats including 'generic', 'sparky',
'xeasy', 'nmrpipe', and any other formats a relax user is willing to
help us with.  It might be best to default to 'generic' to be software
agnoistic.  This 'generic' format can be of the format:

mol_name    res_num    res_name    spin_num    spin_name    intensity

We should introduce the 'sep' arg as well, so that columns can be
whitespace delimited, comma delimited, tab delimited, etc.  The column
position arguments need to be added to allow multiple arrangements of
this format, including certain columns being missing (mol and spins
for example).  For the other formats, the intensity column can
currently be specified so the user can input, say, Sparky data where
peak heights (or volumes) are not in the 4th column.

Unlike certain other programs, relax's flexibility allows the user to
do whatever they wish - even if what they are doing is completely
wrong.  The philosophy for relax is that it's not up to the developers
to decide how a user should do their analysis (although that can be
influenced by including simple and well designed sample scripts for
performing the fundamental analysis types).  So using volumes or
heights is up to the user (neither here is wrong, they both have their
advantages and disadvantages).  And by using the intensity column
argument, the NMRView user can choose whether heights or volumes will
be used.

2.
Why use separate peak lists to input amplitudes ? Why not also let
people use whatever program they like to extract peak intensities or
volumes and ask for an input text file including ALL amplitudes,
something formatted like :

  #    res.    A_1    A_2    A_3    A_4    ...    A_x

where A are amplitudes. Using this approach, the different delays would
need to be specified in the script or within a separate input file such as:

A_1   0.01
A_2   0.01
A_3   0.03
...     ....
A_x   xxx


What do you think ?


This is a great idea - it's where I got the name 'generic'.  Since
it's your idea, do you have a better name than 'generic'?  By
combining multiple calls to spectrum.read_intensities() while changing
the intensity column argument, this file can then be read.  I've used
this approach before with the relax_data.read() user function.
Actually, the design of this user function would be the perfect model
for spectrum.read_intensities().

If you'd like to play around with these ideas in the 'spectral_errors'
branch, that would be more than welcome.  This is a situation which
would really benefit from first implementing a series of system tests
for reading all of these formats, including your generic format.  The
argument unit tests for the user function interface would also be very
helpful for catching common typo bugs.  Oh, we also have to be careful
about reading 1.3.1 and 1.3.2 XML results files if we rename or
restructure any variables.

Cheers,

Edward

Re: Curve fitting and input

Header

Content

Related Messages