Re: r3245 - /1.3/generic_fns/selection.py -- April 03, 2007

I've spent a bit of time in the last week or so trying to implement
boolean operators in the mol-res-spin selection language. I've come to
the conclusion that this is not possible with the current
implementation of the spin loop and related functions.

Consider the selection "#Ap4Aase:4 | #RNA". We mean this to select
residue 4 of the molecule Ap4Aase, and all residues of the molecule RNA.
In the current implementation, however, it selects all residues of both
molecules. The residue_loop looks like:

def residue_loop(data, selection_object):
    for mol in data:
        if mol not in selection_object:
            continue
        # Both Ap4Aase and RNA get to here; Ap4Aase from the first clause
        # of the selection, RNA from the second.
        for res in mol.residues:
            if res not in selection_object:
                continue
            # All residues get here, thanks to the second clause of the
            # selection. Because it doesn't explicitly select residues,
            # all residues are implicitly selected, and there is no way of
            # knowing which molecule res belongs to.
            yield res
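
To make the failure concrete, here is a minimal, self-contained
reproduction. Mol, Res and NaiveUnion are hypothetical stand-ins that I
made up for this mail, not the real relax classes; the only assumption is
that the selection object answers "in" separately for molecules and for
residues.

```python
class Res:
    def __init__(self, num):
        self.num = num


class Mol:
    def __init__(self, name, residue_nums):
        self.name = name
        self.residues = [Res(n) for n in residue_nums]


class NaiveUnion:
    """Hypothetical union of (mol_name, res_num) clauses; None matches all."""

    def __init__(self, *clauses):
        self.clauses = clauses

    def __contains__(self, obj):
        if isinstance(obj, Mol):
            return any(m == obj.name for m, _ in self.clauses)
        # A residue is tested with no knowledge of its parent molecule,
        # so the "any residue" clause for RNA matches every residue.
        return any(r is None or r == obj.num for _, r in self.clauses)


def residue_loop(data, selection_object):
    for mol in data:
        if mol not in selection_object:
            continue
        for res in mol.residues:
            if res not in selection_object:
                continue
            yield mol.name, res.num


data = [Mol("Ap4Aase", [3, 4, 5]), Mol("RNA", [1, 2])]
selection = NaiveUnion(("Ap4Aase", 4), ("RNA", None))  # "#Ap4Aase:4 | #RNA"
selected = list(residue_loop(data, selection))
```

With this toy data, selected contains residues 3, 4 and 5 of Ap4Aase
as well as both RNA residues, rather than only residue 4 of Ap4Aase.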

I see two solutions to the problems I'm running into:

1) Subtly change the data structure so that each spin 'knows' what
residue it belongs to, and each residue knows what molecule it belongs
to (i.e. instances of the SpinContainer class have an attribute residue
that points to the residue instance containing that spin). Then
restructure the spin loop as:

def spin_loop(data, selection_object):
    for spin in data.spins:
        if spin in selection_object:
            yield spin


This has a drawback in terms of efficiency, in that all spins in the
data structure must be explicitly considered, whereas the current nested
spin loop only considers spins that belong to selected residues, and
only residues that belong to selected molecules. I'm not sure how much
of a hit this will amount to in real situations.
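
A rough sketch of how option 1 could look, assuming back-pointers named
residue and molecule. Everything here apart from the SpinContainer name
is hypothetical (the Clause and Union helpers, the constructor
arguments, the build_spins function), and it is meant only to show that
the flat loop gives the right answer once each spin can be traced back
to its molecule.

```python
class MoleculeContainer:
    def __init__(self, name):
        self.name = name


class ResidueContainer:
    def __init__(self, num, molecule):
        self.num = num
        self.molecule = molecule  # back-pointer to the parent molecule


class SpinContainer:
    def __init__(self, name, residue):
        self.name = name
        self.residue = residue  # back-pointer to the parent residue


class Clause:
    """Hypothetical single '#mol:res' clause; None matches everything."""

    def __init__(self, mol=None, res=None):
        self.mol = mol
        self.res = res

    def __contains__(self, spin):
        res = spin.residue
        if self.mol is not None and res.molecule.name != self.mol:
            return False
        if self.res is not None and res.num != self.res:
            return False
        return True


class Union:
    """Hypothetical boolean OR: a spin matches if any clause matches."""

    def __init__(self, *clauses):
        self.clauses = clauses

    def __contains__(self, spin):
        return any(spin in clause for clause in self.clauses)


def spin_loop(spins, selection_object):
    # Every spin is tested; nothing is pruned at molecule or residue level.
    for spin in spins:
        if spin in selection_object:
            yield spin


def build_spins(layout):
    spins = []
    for mol_name, res_nums in layout.items():
        mol = MoleculeContainer(mol_name)
        for num in res_nums:
            spins.append(SpinContainer("N", ResidueContainer(num, mol)))
    return spins


spins = build_spins({"Ap4Aase": [3, 4, 5], "RNA": [1, 2]})
selection = Union(Clause("Ap4Aase", 4), Clause("RNA"))  # "#Ap4Aase:4 | #RNA"
selected = [(s.residue.molecule.name, s.residue.num)
            for s in spin_loop(spins, selection)]
```

Here selected holds only residue 4 of Ap4Aase plus both RNA residues,
at the cost of testing all five spins.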


2) More radically change the implementation of the spin loop, such that
it is subsumed into the Selection class, i.e. instances of the Selection
class will have a method called spin_loop (and residue_loop, and
molecule_loop), which returns the equivalent iterator object. Then we
effectively (though not literally) do the boolean operations on the list
of selected spins, not on the abstract selection object.
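
A rough sketch of option 2, again under my own assumptions: the data is
reduced to a nested dict of molecule name -> residue number -> spin
names, and UnionSelection and the use of the | operator are hypothetical
illustrations rather than a proposed API. Each simple selection walks
the nested structure itself, so unselected molecules and residues are
skipped early, and the union is formed from the spins each side yields.

```python
class Selection:
    """Hypothetical single '#mol:res' selection; None matches everything."""

    def __init__(self, mol=None, res=None):
        self.mol = mol
        self.res = res

    def spin_loop(self, data):
        for mol_name, residues in data.items():
            if self.mol is not None and mol_name != self.mol:
                continue  # the whole molecule is skipped
            for res_num, spins in residues.items():
                if self.res is not None and res_num != self.res:
                    continue  # the whole residue is skipped
                for spin in spins:
                    yield (mol_name, res_num, spin)

    def __or__(self, other):
        return UnionSelection(self, other)


class UnionSelection:
    """Hypothetical boolean OR over the spins yielded by two selections."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def spin_loop(self, data):
        seen = set()
        for part in (self.left, self.right):
            for item in part.spin_loop(data):
                if item not in seen:  # do not yield a spin twice
                    seen.add(item)
                    yield item

    def __or__(self, other):
        return UnionSelection(self, other)


data = {
    "Ap4Aase": {3: ["N"], 4: ["N"], 5: ["N"]},
    "RNA": {1: ["N"], 2: ["N"]},
}
selection = Selection("Ap4Aase", 4) | Selection("RNA")  # "#Ap4Aase:4 | #RNA"
selected = list(selection.spin_loop(data))
```

In this sketch the spins come out grouped by clause rather than in data
order, which may or may not matter for the callers of spin_loop.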


Clearly option 2 is a more radical departure from the agreed design, but
it is likely to have better performance characteristics. Any thoughts on
the best way forward?


Chris