Re: [bug #21599] allow import of nmr data from ccpn projects (optionally also export)



Posted by Edward d'Auvergne on February 10, 2014 - 09:59:
Hi Wayne,

Welcome to the relax mailing lists!  It would be great if you could
look into this.  You are most welcome to develop the code needed.
There is detailed information in the relax development chapter of the
manual explaining the steps required, the coding conventions, etc.
See http://www.nmr-relax.com/manual/relax_development.html or
http://download.gna.org/relax/manual/relax.pdf (the PDF is of higher
quality).  I'll answer your points below:


> Justin will have a suitable project.

This would be useful, as the first step to implementing anything in
relax is to create a 'system test'.  This would involve having a
hugely truncated data set sitting in the relax test_suite/shared_data/
directories.  One or more tests would be added to
test_suite/system_tests/ which would execute all of the desired relax
user functions and then check that the data has been correctly loaded.
Finally, the code is written to implement the CCPN data reading.  Once
the tests pass, the implementation is complete.  All of this
would happen in a dedicated relax development branch.

The tests, when written well, also ensure that the feature will remain
fully functional and bug-free for as long as the relax software
exists.
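
The test-first workflow described above can be sketched in plain Python.
This is only an illustration using the standard unittest module rather
than the relax test suite classes.  The read_peak_heights() function, the
two-column file layout and the file name are hypothetical placeholders,
not a CCPN format or relax code.

```python
import os
import tempfile
import unittest


def read_peak_heights(path):
    """Hypothetical reader: map residue number to peak height from a 2-column file."""
    heights = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            res_num, height = line.split()
            heights[int(res_num)] = float(height)
    return heights


class CcpnImportTest(unittest.TestCase):
    """Stand-in for a system test: load a truncated data set, then check it."""

    def setUp(self):
        # A hugely truncated data set, written to a temporary directory
        # (in relax this would live under test_suite/shared_data/).
        self.tmp_dir = tempfile.TemporaryDirectory()
        self.data_file = os.path.join(self.tmp_dir.name, "truncated_peaks.txt")
        with open(self.data_file, "w") as handle:
            handle.write("# res_num height\n3 1200.5\n4 980.0\n")

    def tearDown(self):
        self.tmp_dir.cleanup()

    def test_heights_loaded(self):
        # The check is written first; the reader is then implemented until it passes.
        heights = read_peak_heights(self.data_file)
        self.assertEqual(heights, {3: 1200.5, 4: 980.0})
```

Saved to a file, the test can be run with "python -m unittest" followed
by the file name.  The point of the sketch is the ordering: the data and
the check exist before the reading code is considered finished.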


> I can help on the software side (although I'll need some input from Rasmus
> as well).

This would be great!  I have talked to Rasmus about the data model,
but this was in the context of replacing the relax data store with the
CCPN data model.  However this data model was deemed insufficient for
all the diverse needs of relax and it would restrain the relax
developers too much.  Also ripping out the heart of a large and
complex program and replacing it with something completely different
is probably not feasible.


> I assume that Relax doesn't want to ship CCPN code so likely there would
> need to be an environment variable to indicate where the CCPN code exists,
> if it exists.

There is probably no need to support the entirety of the CCPN data
model.  This also depends on whether the target is the CCPN data model
itself or the output files from CCPN analysis.  For example, support
for reading peak lists from CCPN analysis was planned, see the files in
test_suite/shared_data/peak_lists/ccpn_analysis/ and
https://gna.org/bugs/?17341 (however this was never implemented).

Note that small parts of the CCPN code could be shipped with relax to
support fragments of the data model.  The code could go into the
extern/ relax package, with the text of
https://www.gnu.org/licenses/lgpl-2.1.html added to the COPYING file.
It is ok to distribute LGPL v2.1+ libraries with relax (which itself is
GPL v3+).  The extern.sobol package is itself LGPL licenced.


> We use an environment variable CCPNMR_TOP_DIR to indicate where the
> top-level CCPN directory is, but we don't set it globally (e.g. in .cshrc
> or .bashrc) but instead it is set in a script that you run when you want to
> run the program.  But that doesn't mean it couldn't be set globally by
> someone who wants to use Relax + CCPN.  The relevant directory to add to
> the PYTHONPATH is then CCPNMR_TOP_DIR + '/ccpnmrX.Y/python', where X.Y =
> major.minor release.  Possibly we should just assume that that directory
> has been added to the PYTHONPATH so the imports then work, and if the
> imports don't work then there is no CCPN code (!).

In relax, no environment variables are used at the moment.
Environment variables tend to cause Mac OS X and MS Windows users to
freak out.  The best approach, which matches how paths are currently
handled in relax, would be for the user to supply the CCPNMR_TOP_DIR
location when calling the user function.  The path would either be
hardcoded in the user's scripts or selected manually in the GUI.
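
A minimal sketch of passing the CCPN location explicitly instead of
relying on an environment variable.  The function names and the default
module name 'memops' are assumptions made for illustration and are not
part of relax.  The directory layout follows the
CCPNMR_TOP_DIR + '/ccpnmrX.Y/python' pattern quoted above, with the X.Y
version string supplied by the caller.

```python
import importlib
import importlib.util
import os
import sys


def ccpn_python_dir(top_dir, version):
    """Build '<top_dir>/ccpnmr<version>/python' from a user-supplied top directory."""
    return os.path.join(top_dir, "ccpnmr" + version, "python")


def enable_ccpn(top_dir, version, module_name="memops"):
    """Add the CCPN python directory to sys.path and report if the import would work.

    module_name is an assumption: substitute whichever top-level CCPN
    package the reading code actually needs.
    """
    python_dir = ccpn_python_dir(top_dir, version)
    # Only touch sys.path if the directory really exists.
    if os.path.isdir(python_dir) and python_dir not in sys.path:
        sys.path.insert(0, python_dir)
    # Make sure newly added directories are seen by the import system.
    importlib.invalidate_caches()
    # find_spec() returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None
```

A user function could call something like enable_ccpn() with the path
given in the script or chosen in the GUI, and raise a clear error when it
returns False, which covers the "if the imports don't work then there is
no CCPN code" case without any global configuration.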


> Loading CCPN projects is one line of code so I can help with that.  The one
> issue that arises is how to pick out the data the user wants to use in
> Relax, and any other parameters that need setting.  Rasmus is working on a
> similar issue for programs like Cyana, so I will ask him about his thoughts.

From the perspective of relax, there would definitely need to be more
than one line of code.  And this again depends on whether the CCPN data
model is read or the output files from CCPN analysis.  The targets for
relax would be the following user functions:

- spectrum.read_spins: for reading out the spin information consisting
  of the molecule name, the residue name and number, and the spin name
  and number (each of these would be optional).  See
  http://www.nmr-relax.com/manual/spectrum_read_spins.html.
- spectrum.read_intensities: for reading out the peak intensity
  information (either heights or volumes, as specified by the user
  function).  See
  http://www.nmr-relax.com/manual/spectrum_read_intensities.html.
- chemical_shift.read: for reading out the CS data.  See
  http://www.nmr-relax.com/manual/chemical_shift_read.html.
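
To illustrate the kind of information these targets need, here is a
minimal sketch of a record with optional spin fields and a choice between
heights and volumes.  The PeakRecord class, the spin_id() and intensity()
helpers and the 'measure' argument are hypothetical and are not relax
code.  The '#molecule:residue@spin' notation is assumed to mirror the
relax spin ID convention in simplified form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeakRecord:
    """Hypothetical container; every spin field is optional."""
    mol_name: Optional[str] = None
    res_num: Optional[int] = None
    res_name: Optional[str] = None
    spin_num: Optional[int] = None
    spin_name: Optional[str] = None
    height: Optional[float] = None
    volume: Optional[float] = None


def spin_id(record):
    """Build a '#molecule:residue@spin' style identifier from whatever is present."""
    parts = []
    if record.mol_name is not None:
        parts.append("#" + record.mol_name)
    # Prefer the residue number, falling back to the residue name.
    if record.res_num is not None:
        parts.append(":" + str(record.res_num))
    elif record.res_name is not None:
        parts.append(":" + record.res_name)
    # Prefer the spin name, falling back to the spin number.
    if record.spin_name is not None:
        parts.append("@" + record.spin_name)
    elif record.spin_num is not None:
        parts.append("@" + str(record.spin_num))
    return "".join(parts)


def intensity(record, measure="height"):
    """Return the peak height or volume, as selected by the caller."""
    if measure not in ("height", "volume"):
        raise ValueError("measure must be 'height' or 'volume'")
    value = getattr(record, measure)
    if value is None:
        raise ValueError("the record has no " + measure)
    return value
```

Whichever source is chosen, the CCPN data model or a CCPN analysis peak
list, the reading code would need to fill in something equivalent to
these fields before handing the data over to relax.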

Other information could be read into relax by creating new user
functions in the future.  From the perspective of adding code to CCPN
to output something that relax can read, this might be problematic,
unless that output is a simple *.list peak list which would be read by
the above 3 user functions.  For example, in the future the internal
format of the chemical shift information is likely to change, as the
anisotropic and rhombic components of the chemical shift tensor are
important for RNA/DNA, polysaccharides and small molecules.  What do
you think?  Would the CCPN analysis peak list or the raw data model
itself be the best target?

Cheers,

Edward


