Re: Resolution to bug #5501, "residue deselection problem on relax_data.read()". -- April 07, 2006

Sorry for dragging my feet on this one...

I've implimented the following fix:


Dragging your feet?  It hasn't been that long considering that the bug
isn't fatal.  Thanks for the fix.

relax_data.read() no-longer attempts to deselect residues.
This means that all residues in the sequence will be selected,
irrespective of whether or not they contain data, unless the user says
so, or minimise(), grid_search(), or calc() are called.
Over-fitting is prevented on calls to these functions by de-selecting
residues with k > n.  This overfit_deselect() function treats mf, jw,
and relax_fit run types appropriately, and also checks for structural
data where required.


I've checked your modifications located at
https://mail.gna.org/public/relax-commits/2006-04/msg00022.html, and I
have a few suggestions.  With the 'relax_data.read()' user function,
removing the deselection may be the only way to rationally handle the
situation while allowing the 'select' and 'unselect' user function
classes to function (see below).  The k > n test you've coded should
catch all problems (as well as the test of existence of relax_data).

The function 'self.overfit_deselect()' which you've added to
'generic_fns/minimise.py' would fit better into the program layout if
it was split into three.  The code inside the function is specific to
the run type and would therefore be better suited if three functions
called 'self.overfit_deselect()' were located within the files
'jw_mapping.py', 'model_free.py', and 'relax_fit.py' in the directory
'specific_fns' (simply containing your current code).  To handle this,
the file 'specific_fns/specific_setup.py' need three entries for the
three run types.  Then, within the three functions 'self.calc()',
'self.grid_search()', and 'self.minimise()' of
'generic_fns/minimise.py', code such as the following could be used:

    def calc(self, run=None, print_flag=1):
        """Function for calculating the function value."""

        # Test if the run exists.
        if not run in self.relax.data.run_names:
            raise RelaxNoRunError, run

        # Function type.
        function_type =
self.relax.data.run_types[self.relax.data.run_names.index(run)]

mod     # Specific calculate and over-fitting deselect function setup.
        calculate = self.relax.specific_setup.setup('calculate', 
function_type)
mod     overfit_deselect =
self.relax.specific_setup.setup('overfit_deselect', function_type)

mod     # Deselect residues lacking data:
mod     overfit_deselect(run)

        # Monte Carlo simulation calculation.

The lines beginning with 'mod' have changed.  This code arrangement
allows for all the model-free specific code to be located within a
single file.  Hence, the layout of the code of the entire program is
easier to navigate.  The tests you have implemented are perfect
though.  Finally, a test for the existence of the sequence may catch
future bugs as well (in the bizarre situation of someone accidentally
minimising the wrong run).


The only draw-back I see with this is that the user probably won't
expect minimise() to change residue selections. Not too much of a big
deal though...


They shouldn't expect it.  The user should at minimum be watching the
results so they should see the blank lines.  If the user isn't
checking the results, then this default behaviour is good as it should
stop junk results from being blindly published.

The other fix suggested was that relax_data.read() re-select residues
when data is found.  I don't like this option because the user might do
unselect.res() before doing relax_data.read(). Then having
relax_data.read() do re-selections could override the users selctions
ie. relax_data.read() has no way of knowing if a residue is de-selected
because a of previous relax_data.read() call or because of an explicit
unselect.res()


I agree, I didn't think of that scenario when I suggested the idea. 
The behaviour of the reselection is irrational.

The fix as I've implemented it passes the test-suite as well as a
barrage of relavant sample scripts (though see bug #5698)


I'll look at that bug report and send another email.

Edward

Re: Resolution to bug #5501, "residue deselection problem on relax_data.read()".

Header

Content

Related Messages