On Thu, Oct 9, 2008 at 10:57 PM, Sébastien Morin <sebastien.morin.1@xxxxxxxxx> wrote:
Hi all,

Talking with a colleague interested in getting his R1, R2 and NOE from relax, I realized that input of peak amplitudes is now supported in two formats, Sparky and XEasy, i.e. using peak lists which include intensities. I would like to propose two ideas concerning the reading of peak amplitudes within relax...

1. Someone proposed implementing the reading of NMRView peak lists. I agree that this should be done. However, NMRView peak lists include both a peak intensity and a peak volume. (What about Sparky and XEasy?) Some people prefer using peak intensities, while others argue that peak volumes are less error-prone since they average local noise (intensities being sensitive to noise spikes). Hence, when inputting an NMRView peak list, one should specify whether intensities or volumes are to be used. This will have an impact on how errors are calculated, especially if the standard deviation of the spectral noise is to be used as an error source...
The proposal was by Ryan MB Hoffman <rmb dot hoffman at gmail dot com> in the bug #11913 report at http://gna.org/bugs/?11913. Note that I've created the 'spectral_errors' branch in response to Wei Xia's post at https://mail.gna.org/public/relax-users/2008-09/msg00000.html. Tyler Reddy <TREDDY at dal dot ca> has also asked about this (https://mail.gna.org/public/relax-users/2008-10/msg00016.html). Wei and Tyler, I've added you to the CC list in case you are interested in this discussion. Please ignore these messages if you are not.

Although this branch is for developing the code to handle the situation where no duplicate spectra have been collected in an R1 or R2 experiment, it will affect what you propose. See my message at https://mail.gna.org/public/relax-users/2008-09/msg00002.html for details. So I would suggest that these ideas go into this branch.

The idea I had is to create a new user function class called 'spectrum' to handle reading peak intensities and handling spectral errors (and any other spectrum-related methods needed in the future). A number of diverse user functions will be collected into this class and modified to be more generic. For example:

    spectrum.read_intensities()  ->  from relax_fit.read() and noe.read().
    spectrum.error()             ->  from noe.error().

For inputting peak intensities (be that heights or volumes), the spectrum.read_intensities() user function can be used and the data stored in the SpinContainer instances. This function can be modified to handle a number of file formats including 'generic', 'sparky', 'xeasy', 'nmrpipe', and any other formats a relax user is willing to help us with. It might be best to default to 'generic' to be software agnostic. This 'generic' format can be of the form:

    mol_name res_num res_name spin_num spin_name intensity

We should introduce the 'sep' arg as well, so that columns can be whitespace delimited, comma delimited, tab delimited, etc.
The column position arguments need to be added to allow multiple arrangements of this format, including certain columns being missing (mol and spins for example). For the other formats, the intensity column can currently be specified so the user can input, say, Sparky data where peak heights (or volumes) are not in the 4th column.

Unlike certain other programs, relax is flexible enough to let users do whatever they wish, even if what they are doing is completely wrong. The philosophy for relax is that it's not up to the developers to decide how a user should do their analysis (although that can be influenced by including simple and well-designed sample scripts for performing the fundamental analysis types). So using volumes or heights is up to the user (neither is wrong here; both have their advantages and disadvantages). And by using the intensity column argument, the NMRView user can choose whether heights or volumes will be used.
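To make the 'generic' format, the 'sep' arg and the column position arguments a bit more concrete, here is a minimal Python sketch. None of this is existing relax code: the function names, the 0-based column arguments, the use of None to mark a missing column and the skipping of '#' comment lines are all assumptions made for illustration only.

```python
# Hypothetical sketch only, not the relax API.  Column arguments are 0-based
# and None marks a column that is absent from the file.

def parse_generic_line(line, sep=None, mol_name_col=0, res_num_col=1,
                       res_name_col=2, spin_num_col=3, spin_name_col=4,
                       int_col=5):
    """Split one line of the 'generic' format into a dictionary."""
    # With sep=None, str.split() splits on any run of whitespace.
    fields = [field.strip() for field in line.split(sep)]

    columns = {
        "mol_name": mol_name_col,
        "res_num": res_num_col,
        "res_name": res_name_col,
        "spin_num": spin_num_col,
        "spin_name": spin_name_col,
    }

    # Missing columns are stored as None rather than raising an error.
    record = {}
    for name, col in columns.items():
        record[name] = None if col is None else fields[col]

    # Residue and spin numbers become integers, the intensity a float.
    for name in ("res_num", "spin_num"):
        if record[name] is not None:
            record[name] = int(record[name])
    record["intensity"] = float(fields[int_col])
    return record


def read_generic(lines, **kwargs):
    """Parse every data line, skipping blank lines and '#' comments."""
    records = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        records.append(parse_generic_line(stripped, **kwargs))
    return records


# Full six-column layout, whitespace delimited.
full = ["# mol_name res_num res_name spin_num spin_name intensity",
        "protein 3 GLY 17 N 1.25e5"]
full_records = read_generic(full)

# Comma-delimited layout with the mol and spin columns missing.
short = ["3,GLY,98000.0"]
short_records = read_generic(short, sep=",", mol_name_col=None, res_num_col=0,
                             res_name_col=1, spin_num_col=None,
                             spin_name_col=None, int_col=2)
```

The same int_col argument is what would let a user pick heights or volumes from a file that carries both.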
2. Why use separate peak lists to input amplitudes? Why not also let people use whatever program they like to extract peak intensities or volumes and ask for an input text file including ALL amplitudes, something formatted like:

    # res.  A_1  A_2  A_3  A_4  ...  A_x

where A are amplitudes. Using this approach, the different delays would need to be specified in the script or within a separate input file such as:

    A_1  0.01
    A_2  0.01
    A_3  0.03
    ...  ....
    A_x  xxx

What do you think?
This is a great idea - it's where I got the name 'generic'. Since it's your idea, do you have a better name than 'generic'? This file can then be read by making multiple calls to spectrum.read_intensities(), changing the intensity column argument each time. I've used this approach before with the relax_data.read() user function. Actually, the design of that user function would be the perfect model for spectrum.read_intensities().

If you'd like to play around with these ideas in the 'spectral_errors' branch, you'd be more than welcome. This is a situation which would really benefit from first implementing a series of system tests for reading all of these formats, including your generic format. The argument unit tests for the user function interface would also be very helpful for catching common typo bugs. Oh, we also have to be careful about reading 1.3.1 and 1.3.2 XML results files if we rename or restructure any variables.

Cheers,

Edward
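As a rough illustration of the repeated-read approach described above, the sketch below reads a multi-column amplitude file one column at a time and pairs each column with its delay. This is plain Python, not relax code: read_column() and read_all() are hypothetical helpers, and the example amplitudes are made up. Only the first three delay values are taken from the example in the quoted proposal.

```python
# Hypothetical sketch only, not the relax API.

def read_column(lines, int_col, res_num_col=0, sep=None):
    """Return {res_num: intensity} using the intensities from one column."""
    intensities = {}
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and the '# res. A_1 A_2 ...' style header.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split(sep)
        intensities[int(fields[res_num_col])] = float(fields[int_col])
    return intensities


def read_all(lines, delays, sep=None):
    """Read every amplitude column, one read_column() call per delay.

    'lines' must be a list (not a one-shot file iterator) as it is scanned
    once per column.  Repeated delays are kept as separate points, so
    duplicate spectra are not overwritten.
    """
    curves = {}
    for offset, delay in enumerate(delays):
        # Column 0 holds the residue number, amplitudes start at column 1.
        column = read_column(lines, int_col=1 + offset, sep=sep)
        for res_num, intensity in column.items():
            curves.setdefault(res_num, []).append((delay, intensity))
    return curves


# Made-up amplitudes in the proposed layout.
amplitudes = ["# res. A_1 A_2 A_3",
              "3 1000.0 990.0 800.0",
              "4 2000.0 2010.0 1500.0"]

# Delays for A_1, A_2 and A_3, as in the example delay file.
delays = [0.01, 0.01, 0.03]

curves = read_all(amplitudes, delays)
```

Storing (delay, intensity) pairs rather than keying on the delay is a deliberate choice here, since the example delay list repeats 0.01.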