re: Redesign of the relax data model: 3. Molecules, residues, and spins -- January 10, 2007

On Sun, 2007-01-07 at 22:17 +0000, gary thompson wrote:

[snip]

  3.2  The data selection concept - identifying spin systems

  3.2.1  Function arguments

  The current way that spins are identified in the user functions (as well
  as internal relax functions) is through the residue number and/or
  residue name.  There is no formal or consistent way that this is done
  though.  Some arguments are called 'res_num' while others are just
  'num'.  The proposal is to standardise the interface and create the file
  called 'generic_fns/spin_selector.py'.  Using the three-level spin data
  model introduced in section 3.1, six identifiers are possible.  These
  are:

  Molecule number, 'data.mol[0].num' (e.g. the NMR model number).
  Molecule name,   'data.mol[0].name' (e.g. the chain or segment ID).
  Residue number,  'data.mol[0].res[0].num'.
  Residue name,    'data.mol[0].res[0].name'.
  Atom number,     'data.mol[0].res[0].spin[0].num' (e.g. the PDB atom
  number).
  Atom name,       'data.mol[0].res[0].spin[0].name' (e.g. the PDB atom
  name).

  These could be synonymous with the spin identifying function arguments
  'mol_num', 'mol_name', 'res_num', 'res_name', 'atom_num', and
  'atom_name'.  These would all default to the inactive value of None and
  would be the very last arguments of the relevant user functions.  Are
  there other ways that a spin or set of spins could be identified?


One answer would be to use a 'little language' concept.  For example,
Molmol and the UCSF systems use:

#<molecule_number> | <molecule_name>:<residue_selection>[,<residue_selection>...]@<atom_selection>[,<atom_selection>]

residue_selection = <residue_number> | <residue_range> | <residue_type>
residue_range     = <residue_number>-<residue_number>
atom_selection    = <string_and_wildcards>


This reduces selection to a single argument plus a simple parser.  The
parser would yield selection objects that can report whether a given
molecule, residue, or spin is selected, and that can be passed around
the system.
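
As a rough sketch of what such a selection object might look like, here
is one possible reading of the grammar above.  The class name
'Selection', its methods, and the use of shell-style wildcards for atom
names are all assumptions made for illustration; none of this is
existing relax, Molmol, or UCSF code.

```python
import re
from fnmatch import fnmatchcase


class Selection:
    """Hypothetical selection object built from text such as '#1:2-5,GLY@N,H*'."""

    # Optional '#molecule', ':residues' and '@atoms' parts, in that order.
    _pattern = re.compile(
        r'^(?:#(?P<mol>[^:@]+))?(?::(?P<res>[^@]+))?(?:@(?P<atom>.+))?$')

    def __init__(self, text):
        found = self._pattern.match(text)
        if found is None:
            raise ValueError("Unparsable selection: %r" % text)
        self.mol = found.group('mol')
        self.res = self._split(found.group('res'))
        self.atoms = self._split(found.group('atom'))

    @staticmethod
    def _split(part):
        # A missing part yields an empty list, which means "match everything".
        return [token.strip() for token in part.split(',')] if part else []

    def has_molecule(self, num=None, name=None):
        if self.mol is None:
            return True
        return self.mol == str(num) or self.mol == name

    def has_residue(self, num=None, name=None):
        if not self.res:
            return True
        for token in self.res:
            span = re.match(r'^(-?\d+)-(-?\d+)$', token)
            if span and num is not None:
                # A residue range such as '2-5' (inclusive at both ends).
                if int(span.group(1)) <= num <= int(span.group(2)):
                    return True
            elif token == str(num) or token == name:
                return True
        return False

    def has_atom(self, num=None, name=None):
        if not self.atoms:
            return True
        return any(token == str(num)
                   or (name is not None and fnmatchcase(name, token))
                   for token in self.atoms)


sel = Selection('#1:2-5,GLY@N,H*')
```

With this sketch, 'sel.has_residue(num=3, name="ALA")' is true because 3
falls inside the range 2-5, and 'sel.has_atom(name="HA")' is true
because of the 'H*' wildcard.  An omitted part of the string selects
everything at that level.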


[snip again]


  3.4  The spin loop


Wouldn't a function that returns an iterator be better?

  Many parts of relax require looping over all the relaxation data (or
  spins).  The implementation of this proposal will require nested looping
  over all molecules, all residues, and all spins combined with tests for
  matches to the 'mol_num', 'mol_name', 'res_num', 'res_name', 'atom_num',
  and 'atom_name' arguments.  Rather than implementing this numerous times
  throughout the program, the loop could be implemented just once within
  the function 'self.relax.generic_fns.spin_selector.spin_loop()'.  In
  addition to the six identifiers, this new function could accept as an
  argument a spin-specific function passed by the part of the code
  requesting the loop.  The 'spin_loop()' function will then pass the data
  structure 'spin', which is for example an alias to
  'self.relax.data.mol[0].res[16].spin[3]', to the spin-specific function.
  A sample implementation of the loop function could be:


      # The 'match' function used below is assumed to be re.match.
      from re import match

      def spin_loop(self, fn=None, mol_num=None, mol_name=None, res_num=None,
                    res_name=None, atom_num=None, atom_name=None):
          """Function for selectively looping over all spins."""

          # Reassign the data container.
          data = self.relax.data[self.relax.run]

          # Loop over the molecules.
          for mol in data.mol:
              # Skip the molecule if there is no match to 'mol_num'.
              if isinstance(mol_num, int) and mol.num != mol_num:
                  continue
              elif isinstance(mol_num, str) and not match(mol_num,
                                                          str(mol.num)):
                  continue

              # Skip the molecule if there is no match to 'mol_name'.
              if mol_name is not None and not match(mol_name, str(mol.name)):
                  continue

              # Loop over the residues.
              for res in mol.res:
                  # Skip the residue if there is no match to 'res_num'.
                  if isinstance(res_num, int) and res.num != res_num:
                      continue
                  elif isinstance(res_num, str) and not match(res_num,
                                                              str(res.num)):
                      continue

                  # Skip the residue if there is no match to 'res_name'.
                  if res_name is not None and not match(res_name,
                                                        str(res.name)):
                      continue

                  # Loop over the spins.
                  for spin in res.spin:
                      # Skip the spin if there is no match to 'atom_num'.
                      if isinstance(atom_num, int) and spin.num != atom_num:
                          continue
                      elif isinstance(atom_num, str) and not match(
                              atom_num, str(spin.num)):
                          continue

                      # Skip the spin if there is no match to 'atom_name'.
                      if atom_name is not None and not match(
                              atom_name, str(spin.name)):
                          continue

                      # Execute the supplied spin-specific function,
                      # passing in the data for the current spin.
                      fn(spin)


  It will be up to the spin-specific function passed in by the calling
  function to handle the 'spin.select' value.  Because of the complexity
  of the loop, the use of this single 'spin_loop()' function will simplify
  the relax code base, will minimise potential bugs, and will simplify
  future changes to the relax data model (if necessary).


Use of an iterator object will provide flexibility, as iterators can be
wrapped, filtered, and generally mucked about with using Python's loops
and itertools.  What's more, they are a doddle to code: all you do is
write an ordinary function and call yield with a value each time you
have identified a selected spin
(http://www.python.org/dev/peps/pep-0255/).  This also allows arbitrary
selection to be added as wrapper iterators or filtered iterators.
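
To make the idea concrete, here is a minimal sketch of the loop written
as a generator.  It takes only three of the six identifiers and uses
exact matching to stay short.  The toy containers built with
SimpleNamespace and the helper names 'selected' and 'make_spin' are
stand-ins invented for this example, not the real relax data
structures.

```python
from itertools import islice
from types import SimpleNamespace


def spin_loop(molecules, mol_name=None, res_num=None, atom_name=None):
    """Yield each spin matching the given identifiers (None matches everything)."""
    for mol in molecules:
        if mol_name is not None and mol.name != mol_name:
            continue
        for res in mol.res:
            if res_num is not None and res.num != res_num:
                continue
            for spin in res.spin:
                if atom_name is not None and spin.name != atom_name:
                    continue
                yield spin


def selected(spins):
    """Wrapper iterator: pass through only the spins flagged as selected."""
    return (spin for spin in spins if spin.select)


def make_spin(num, name, select=True):
    # Toy stand-in for 'data.mol[i].res[j].spin[k]'.
    return SimpleNamespace(num=num, name=name, select=select)


mols = [SimpleNamespace(num=1, name='A', res=[
    SimpleNamespace(num=1, name='GLY',
                    spin=[make_spin(1, 'N'), make_spin(2, 'H', select=False)]),
    SimpleNamespace(num=2, name='ALA',
                    spin=[make_spin(3, 'N'), make_spin(4, 'H')]),
])]

# The caller keeps an ordinary for loop and composes filters as needed.
nitrogens = [spin.num for spin in spin_loop(mols, atom_name='N')]
active = [spin.num for spin in selected(spin_loop(mols))]
first_two = [spin.num for spin in islice(spin_loop(mols), 2)]
```

The calling code no longer needs to pass in a spin-specific function,
and the handling of 'spin.select' becomes one more wrapper that can be
applied or left off.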


The UCSF selection syntax is sufficiently powerful for all relax needs,
as well as being simple and well known amongst potential users. It seems
like an excellent alternative to the current spin selection methods.
Coding the parser as an iterator is also a good idea. 

To extend things a bit further, we could incorporate all of this with a
functor similar to that proposed for handling multiple run selections
(https://mail.gna.org/public/relax-devel/2007-01/msg00013.html and
https://mail.gna.org/public/relax-devel/2007-01/msg00020.html ). Of
course the spin functor would operate at a different level of code to
the run functor - whereas all user functions would be instances of the
run functor, only certain internal functions (those that act on a single
spin) would be instances of the spin functor.
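
As a very rough illustration of what a spin functor could look like,
here is a sketch.  It does not reproduce the run functor from the
messages linked above; the class name 'SpinFunctor', the decorator
usage, and the dictionary-based spins are assumptions made purely for
this example.

```python
class SpinFunctor:
    """Hypothetical wrapper that applies a single-spin function to many spins."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, spins, *args, **kwargs):
        # Apply the wrapped function to each spin in turn and collect the results.
        return [self.fn(spin, *args, **kwargs) for spin in spins]


@SpinFunctor
def scale_r1(spin, factor):
    """Internal function written as if it only ever sees one spin."""
    spin['r1'] = spin['r1'] * factor
    return spin['r1']


# Any iterable of spins will do, including one produced by a selection iterator.
spins = [{'name': 'N', 'r1': 1.5}, {'name': 'N', 'r1': 2.0}]
results = scale_r1(spins, 2.0)
```

The internal function stays simple because it deals with exactly one
spin, while the looping and selection live in one place.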


Chris
