On Sun, 2007-01-07 at 22:17 +0000, gary thompson wrote:

[snip]
3.2 The data selection concept - identifying spin systems

3.2.1 Function arguments

The current way that spins are identified in the user functions (as well as internal relax functions) is through the residue number and/or residue name. There is no formal or consistent way that this is done though. Some arguments are called 'res_num' while others are just 'num'. The proposal is to standardise the interface and create the file called 'generic_fns/spin_selector.py'. Using the three-level spin data model introduced in section 3.1, six identifiers are possible. These are:

    Molecule number, 'data.mol[0].num' (e.g. the NMR model number).
    Molecule name, 'data.mol[0].name' (e.g. the chain or segment ID).
    Residue number, 'data.mol[0].res[0].num'.
    Residue name, 'data.mol[0].res[0].name'.
    Atom number, 'data.mol[0].res[0].spin[0].num' (e.g. the PDB atom number).
    Atom name, 'data.mol[0].res[0].spin[0].name' (e.g. the PDB atom name).

These could be synonymous with the spin identifying function arguments 'mol_num', 'mol_name', 'res_num', 'res_name', 'atom_num', and 'atom_name'. These would all default to the inactive value of None and would be the very last arguments of the relevant user functions. Are there other ways that a spin or set of spins can be identified?

One answer would be to use a little language concept. Thus, for example, Molmol and the UCSF systems use

    #<molecule number> | <molecule_name>:<residue_selection>[, <residue_selection>...]@<atom_selection> [,<atom_selection>]

    residue_selection=<residue_number> | <residue_range> | <residue_type>
    residue_range=<residue_number>-<residue_number>
    atom_selection=<string_and_wildcards>

Note that this reduces selection to a single argument plus a simple parser. The parser would yield selection objects that can report whether a given molecule, residue, or spin is selected and that can be passed around the system.
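To make the "single argument plus a simple parser" idea concrete, here is a minimal sketch in Python. It assumes a simplified form of the syntax ('#mol:res,res@atom,atom', with every part optional) and does not claim to reproduce the exact Molmol or UCSF grammar. The names 'Selection', '_any_match', and the 'contains_*' methods are hypothetical and are not part of relax.

```python
import re
from fnmatch import fnmatchcase


def _any_match(tokens, num, name):
    """True if no tokens were given, or if any token matches the number or name."""
    if not tokens:
        return True
    for tok in tokens:
        rng = re.fullmatch(r"(-?\d+)-(-?\d+)", tok)
        if rng:
            # Numeric range such as '3-5' (inclusive at both ends).
            if num is not None and int(rng.group(1)) <= num <= int(rng.group(2)):
                return True
        elif re.fullmatch(r"-?\d+", tok):
            # Single number.
            if num == int(tok):
                return True
        elif name is not None and fnmatchcase(name, tok):
            # Name, possibly with shell-style wildcards such as 'H*'.
            return True
    return False


class Selection:
    """Hypothetical selection object parsed from e.g. '#1:3-5,GLY@N,H*'."""

    def __init__(self, text):
        m = re.fullmatch(r"(?:#([^:@]*))?(?::([^@]*))?(?:@(.*))?", text.strip())
        if m is None:
            raise ValueError("cannot parse selection: %r" % text)
        mol, res, atom = m.groups()
        # An absent part means "no restriction" at that level.
        self.mol = [mol.strip()] if mol else []
        self.res = [t.strip() for t in res.split(",")] if res else []
        self.atom = [t.strip() for t in atom.split(",")] if atom else []

    def contains_mol(self, num=None, name=None):
        return _any_match(self.mol, num, name)

    def contains_res(self, num=None, name=None):
        return _any_match(self.res, num, name)

    def contains_spin(self, num=None, name=None):
        return _any_match(self.atom, num, name)


sel = Selection("#1:3-5,GLY@N,H*")
print(sel.contains_res(4, "ALA"), sel.contains_spin(name="CA"))
```

A selection object like this could replace the six separate keyword arguments: the calling code would pass one string, and the looping code would ask the object whether each molecule, residue, and spin is selected.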
[snip again]
3.4 The spin loop

Wouldn't a function that returned an iterator be better?

Many parts of relax require looping over all the relaxation data (or spins). The implementation of this proposal will require nested looping over all molecules, all residues, and all spins combined with tests for matches to the 'mol_num', 'mol_name', 'res_num', 'res_name', 'atom_num', and 'atom_name' arguments. Rather than implementing this numerous times throughout the program, the loop could be implemented just once within the function 'self.relax.generic_fns.spin_selector.spin_loop()'. In addition to the six identifiers, this new function could accept as an argument a spin-specific function passed by the part of the code requesting the loop. The 'spin_loop()' function will then pass the data structure 'spin', which is for example an alias to 'self.relax.data.mol[0].res[16].spin[3]', to the spin-specific function. A sample implementation of the loop function could be:

    from re import match

    def spin_loop(self, fn=None, mol_num=None, mol_name=None, res_num=None, res_name=None, atom_num=None, atom_name=None):
        """Function for selectively looping over all spins."""

        # Reassign the data container.
        data = self.relax.data[self.relax.run]

        # Loop over the molecules.
        for mol in data.mol:
            # Skip the molecule if there is no match to 'mol_num'.
            if isinstance(mol_num, int) and not mol.num == mol_num:
                continue
            elif isinstance(mol_num, str) and not match(mol_num, str(mol.num)):
                continue

            # Skip the molecule if there is no match to 'mol_name'.
            if mol_name is not None and not match(mol_name, str(mol.name)):
                continue

            # Loop over the residues.
            for res in mol.res:
                # Skip the residue if there is no match to 'res_num'.
                if isinstance(res_num, int) and not res.num == res_num:
                    continue
                elif isinstance(res_num, str) and not match(res_num, str(res.num)):
                    continue

                # Skip the residue if there is no match to 'res_name'.
                if res_name is not None and not match(res_name, str(res.name)):
                    continue

                # Loop over the spins.
                for spin in res.spin:
                    # Skip the spin if there is no match to 'atom_num'.
                    if isinstance(atom_num, int) and not spin.num == atom_num:
                        continue
                    elif isinstance(atom_num, str) and not match(atom_num, str(spin.num)):
                        continue

                    # Skip the spin if there is no match to 'atom_name'.
                    if atom_name is not None and not match(atom_name, str(spin.name)):
                        continue

                    # Execute the supplied spin-specific function, passing in the data for the current spin.
                    fn(spin)

It will be up to the spin-specific function passed in by the calling function to handle the 'spin.select' value. Because of the complexity of the loop, the use of this single 'spin_loop()' function will simplify the relax code base, will minimise potential bugs, and will simplify future changes to the relax data model (if necessary).

Use of an iterator object will provide flexibility, as iterators can be wrapped, filtered, and generally mucked about with using Python's loops and itertools. What's more, they are a doddle to code: all you do is write an ordinary function and yield a value each time you have identified a selected spin (http://www.python.org/dev/peps/pep-0255/). This also allows arbitrary selection to be added as wrapper iterators or filtered iterators.
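As a rough illustration of the iterator suggestion, the same nested loop can be written as a generator in the style of PEP 255. This is only a sketch under stated assumptions: the data container is passed in explicitly rather than read from 'self.relax.data', the helper '_matches' is hypothetical, and the 'SimpleNamespace' objects are stand-ins for the real relax data structures.

```python
import re
from itertools import islice
from types import SimpleNamespace as NS


def _matches(pattern, value):
    """True if 'pattern' is None (inactive), equals 'value', or regex-matches it."""
    if pattern is None:
        return True
    if isinstance(pattern, int):
        return value == pattern
    return re.match(pattern, str(value)) is not None


def spin_loop(data, mol_num=None, mol_name=None, res_num=None, res_name=None,
              atom_num=None, atom_name=None):
    """Generator yielding every spin that matches the six identifiers."""
    for mol in data.mol:
        if not (_matches(mol_num, mol.num) and _matches(mol_name, mol.name)):
            continue
        for res in mol.res:
            if not (_matches(res_num, res.num) and _matches(res_name, res.name)):
                continue
            for spin in res.spin:
                if _matches(atom_num, spin.num) and _matches(atom_name, spin.name):
                    yield spin


# Stand-in data: one molecule, two residues, three spins (assumed layout).
data = NS(mol=[NS(num=1, name="A", res=[
    NS(num=1, name="GLY", spin=[NS(num=1, name="N", select=True),
                                NS(num=2, name="H", select=False)]),
    NS(num=2, name="ALA", spin=[NS(num=3, name="N", select=True)]),
])])

# The caller decides what to do with 'spin.select', here with a plain filter.
selected_n = [s.num for s in spin_loop(data, atom_name="N") if s.select]

# Iterators compose with itertools, e.g. taking only the first spin.
first = list(islice(spin_loop(data), 1))
```

Compared with passing in a callback 'fn', the generator leaves control with the caller: filtering on 'spin.select', stopping early, or chaining further selections needs no change to 'spin_loop()' itself.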
The UCSF selection syntax is sufficiently powerful for all relax needs, as well as being simple and well known amongst potential users. It seems like an excellent alternative to the current spin selection methods. Coding the parser as an iterator is also a good idea.

To extend things a bit further, we could incorporate all of this with a functor similar to that proposed for handling multiple run selections (https://mail.gna.org/public/relax-devel/2007-01/msg00013.html and https://mail.gna.org/public/relax-devel/2007-01/msg00020.html ). Of course the spin functor would operate at a different level of code to the run functor: whereas all user functions would be instances of the run functor, only certain internal functions (those that act on a single spin) would be instances of the spin functor.

Chris