Re: Redesign of the relax data model: 3. Molecules, residues, and spins



Posted by Edward d'Auvergne on January 15, 2007 - 09:17:
On 1/8/07, gary thompson <garyt@xxxxxxxxxxxxxxx> wrote:
>   On Wed, 2006-10-11 at 17:02 +1000, Edward d'Auvergne wrote:
>

[snip]

>   Before reading this post, please read the previous posts:
>
>   * The parent message 'Redesign of the relax data model:  A HOWTO for
>   breaking relax.' located at
>   https://mail.gna.org/public/relax-devel/2006-10/msg00053.html
>   (Message-id:
>   <1160550133.9523.54.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).
>
>   * Section 1 'Redesign of the relax data model:  1.  Why change?' located
>   at https://mail.gna.org/public/relax-devel/2006-10/msg00054.html
>   (Message-id:
>   <1160551172.9523.60.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).
>
>   * Section 2 'Redesign of the relax data model:  2.  A new run concept'
>   located at https://mail.gna.org/public/relax-devel/2006-10/msg00056.html
>   (Message-id:
>   <1160555137.9523.70.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).
>
>
>
>   3.  Molecules, residues, and spins
>

[snip]

>   3.2  The data selection concept - identifying spin systems
>
>   3.2.1  Function arguments
>
>   The current way that spins are identified in the user functions (as well
>   as internal relax functions) is through the residue number and/or
>   residue name.  There is no formal or consistent way that this is done
>   though.  Some arguments are called 'res_num' while others are just
>   'num'.  The proposal is to standardise the interface and create the file
>   called 'generic_fns/spin_selector.py'.  Using the three-level spin data
>   model introduced in section 3.1, six identifiers are possible.  These
>   are:
>
>   Molecule number, 'data.mol[0].num' (e.g. the NMR model number).
>   Molecule name,   'data.mol[0].name' (e.g. the chain or segment ID).
>   Residue number,  'data.mol[0].res[0].num'.
>   Residue name,    'data.mol[0].res[0].name'.
>   Atom number,     'data.mol[0].res[0].spin[0].num' (e.g. the PDB atom
>   number).
>   Atom name,       'data.mol[0].res[0].spin[0].name' (e.g. the PDB atom
>   name).
>
>   These could be synonymous with the spin identifying function arguments
>   'mol_num', 'mol_name', 'res_num', 'res_name', 'atom_num', and
>   'atom_name'.  These would all default to the inactive value of None and
>   would be the very last arguments of the relevant user functions.  Are
>   there other ways that a spin or set of spins can be identified?

One answer would be to use a little-language concept. For example,
Molmol and the UCSF systems use:

#<molecule_number> | <molecule_name>:<residue_selection>[,<residue_selection>...]@<atom_selection>[,<atom_selection>]

residue_selection=<residue_number> | <residue_range> | <residue_type>
residue_range=<residue_number>-<residue_number>
atom_selection=<string_and_wildcards>


This reduces selection to a single argument plus a simple parser. The parser would yield selection objects which can report whether a given molecule, residue, or spin is selected, and which can be passed around the system. Having a selection object engenders clarity and simplicity:

e.g.

class selection:
    def selected_spins(self):
        """Return an iterator over the selected spins, where each spin is a
        reference of the form 'self.relax.data.mol[0].res[16].spin[3]'."""


This allows for considerable flexibility for the user and a simple internal structure.
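
As a rough illustration of the single-argument idea, here is a minimal sketch of such a parser in Python. The class name 'Selection', its method names, and the use of shell-style wildcards via fnmatch are assumptions made for this example only; they are not part of relax, Molmol, or the UCSF tools, and the grammar handled is just the simplified form given above.

```python
import re
from fnmatch import fnmatchcase


class Selection:
    """Hypothetical selection object built from a string like '#A:1-3,GLY@N*'."""

    def __init__(self, select_string):
        # Split into the optional '#molecule', ':residues', and '@atoms' parts.
        m = re.match(r'^(?:#([^:@]+))?(?::([^@]+))?(?:@(.+))?$', select_string)
        if m is None:
            raise ValueError("Invalid selection string: %r" % select_string)
        self.mol = m.group(1)
        self.res = m.group(2).split(',') if m.group(2) else []
        self.atoms = m.group(3).split(',') if m.group(3) else []

    def has_molecule(self, name):
        # No molecule part means every molecule is selected.
        return self.mol is None or self.mol == str(name)

    def has_residue(self, num, name):
        # No residue part means every residue is selected.
        if not self.res:
            return True
        for token in self.res:
            rng = re.match(r'^(-?\d+)-(-?\d+)$', token)
            if rng:
                # A residue range such as '1-3'.
                if int(rng.group(1)) <= num <= int(rng.group(2)):
                    return True
            elif token == str(num) or token == name:
                # A single residue number or a residue type.
                return True
        return False

    def has_atom(self, name):
        # Atom tokens are treated as shell-style wildcard patterns.
        return not self.atoms or any(fnmatchcase(name, p) for p in self.atoms)


sel = Selection('#A:1-3,GLY@N*')
```

With this sketch, 'sel.has_residue(2, "ALA")' is true because 2 falls in the range 1-3, and 'sel.has_atom("CA")' is false because 'CA' does not match the pattern 'N*'.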

This is a much better way of selecting spin systems! Is there a reference for this specification? Duplicating the reference in the user function docstrings to explain how to use this argument would be important. Does anyone know of alternatives to this?


>   3.4  The spin loop
>

Wouldn't a function that returned an iterator be better?

That would be much better. Originally I designed relax to use as few Python-specific language features as possible. The reason was to enable relax or parts of relax to be ported to other languages (like relax's C modules). This is of little importance now as all the number-crunching, CPU-intensive code is located in the 'math_fns' directory. Using a generator/iterator through the yield statement would be a much cleaner solution.
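
To make the generator idea concrete, here is a minimal sketch. The 'Container' class and the toy data are hypothetical stand-ins for the proposed molecule, residue, and spin containers, and only two of the six identifiers are handled to keep the example short; this is not the relax implementation.

```python
class Container:
    """Hypothetical stand-in for a molecule, residue, or spin container."""

    def __init__(self, num, name, **children):
        self.num = num
        self.name = name
        # Attach child lists such as 'res' or 'spin' as attributes.
        self.__dict__.update(children)


def spin_loop(molecules, res_num=None, atom_name=None):
    """Yield each spin matching the optional residue number and atom name."""
    for mol in molecules:
        for res in mol.res:
            # Skip the residue if there is no match to 'res_num'.
            if res_num is not None and res.num != res_num:
                continue
            for spin in res.spin:
                # Skip the spin if there is no match to 'atom_name'.
                if atom_name is not None and spin.name != atom_name:
                    continue
                yield spin


# Toy data: one molecule with two residues of two spins each.
mols = [Container(1, 'A', res=[
    Container(1, 'GLY', spin=[Container(1, 'N'), Container(2, 'H')]),
    Container(2, 'ALA', spin=[Container(3, 'N'), Container(4, 'H')]),
])]

nitrogen_nums = [spin.num for spin in spin_loop(mols, atom_name='N')]
```

The calling code then writes an ordinary 'for spin in spin_loop(...)' loop instead of passing a spin-specific function into the loop.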


>   Many parts of relax require looping over all the relaxation data (or
>   spins).  The implementation of this proposal will require nested looping
>   over all molecules, all residues, and all spins combined with tests for
>   matches to the 'mol_num', 'mol_name', 'res_num', 'res_name', 'atom_num',
>   and 'atom_name' arguments.  Rather than implementing this numerous times
>   throughout the program, the loop could be implemented just once within
>   the function 'self.relax.generic_fns.spin_selector.spin_loop()'.  In
>   addition to the six identifiers, this new function could accept as an
>   argument a spin-specific function passed by the part of the code
>   requesting the loop.  The 'spin_loop()' function will then pass the data
>   structure 'spin', which is for example an alias to
>   'self.relax.data.mol[0].res[16].spin[3]', to the spin-specific function.
>   A sample implementation of the loop function could be:
>
>
>       from re import match
>
>       def spin_loop(self, fn=None, mol_num=None, mol_name=None, res_num=None,
>                     res_name=None, atom_num=None, atom_name=None):
>           """Function for selectively looping over all spins."""
>
>           # Reassign the data container.
>           data = self.relax.data[self.relax.run]
>
>           # Loop over the molecules.
>           for mol in data.mol:
>               # Skip the molecule if there is no match to 'mol_num'.
>               if isinstance(mol_num, int) and mol.num != mol_num:
>                   continue
>               elif isinstance(mol_num, str) and not match(mol_num, str(mol.num)):
>                   continue
>
>               # Skip the molecule if there is no match to 'mol_name'.
>               if mol_name is not None and not match(mol_name, str(mol.name)):
>                   continue
>
>               # Loop over the residues.
>               for res in mol.res:
>                   # Skip the residue if there is no match to 'res_num'.
>                   if isinstance(res_num, int) and res.num != res_num:
>                       continue
>                   elif isinstance(res_num, str) and not match(res_num, str(res.num)):
>                       continue
>
>                   # Skip the residue if there is no match to 'res_name'.
>                   if res_name is not None and not match(res_name, str(res.name)):
>                       continue
>
>                   # Loop over the spins.
>                   for spin in res.spin:
>                       # Skip the spin if there is no match to 'atom_num'.
>                       if isinstance(atom_num, int) and spin.num != atom_num:
>                           continue
>                       elif isinstance(atom_num, str) and not match(atom_num, str(spin.num)):
>                           continue
>
>                       # Skip the spin if there is no match to 'atom_name'.
>                       if atom_name is not None and not match(atom_name, str(spin.name)):
>                           continue
>
>                       # Execute the supplied spin-specific function, passing
>                       # in the data for the current spin.
>                       fn(spin)
>
>
>   It will be up to the spin-specific function passed in by the calling
>   function to handle the 'spin.select' value.  Because of the complexity
>   of the loop, the use of this single 'spin_loop()' function will simplify
>   the relax code base, will minimise potential bugs, and will simplify
>   future changes to the relax data model (if necessary).

Use of an iterator object will provide flexibility, as iterators can be
wrapped, filtered, and generally mucked about with using Python's loops
and itertools. What's more, they are a doddle to code, as all you do is
write an ordinary function and call yield with a value each time you
have identified a selected spin
(http://www.python.org/dev/peps/pep-0255/). This also allows
arbitrary selections to be added as wrapper iterators or filtered
iterators.
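
As a small illustration of wrapping and filtering, the sketch below layers a filter and 'itertools.islice' over a trivial spin generator. Representing spins as dictionaries with a 'select' key, and the function name 'selected_only', are assumptions made for this example; in relax the generator would yield the real spin containers.

```python
from itertools import islice


def spin_loop(spins):
    """Yield every spin; a stand-in for a selection-aware generator."""
    for spin in spins:
        yield spin


def selected_only(spin_iter):
    """Wrapper iterator passing through only the spins flagged as selected."""
    return (spin for spin in spin_iter if spin['select'])


# Hypothetical toy spins, each with a name and a selection flag.
spins = [
    {'name': 'N', 'select': True},
    {'name': 'H', 'select': False},
    {'name': 'CA', 'select': True},
    {'name': 'C', 'select': True},
]

# Take only the first two selected spins, evaluated lazily.
first_two = [s['name'] for s in islice(selected_only(spin_loop(spins)), 2)]
```

Because each layer is itself an iterator, further filters can be stacked without touching the underlying loop.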

Writing a function which yields the spin system specific data container would be better. I'll get to the rest of your post, Gary, in a second email.

Cheers,

Edward


