Sorry for dragging my feet on this one... I've implimented the following fix:
Dragging your feet? It hasn't been that long considering that the bug isn't fatal. Thanks for the fix.
relax_data.read() no-longer attempts to deselect residues. This means that all residues in the sequence will be selected, irrespective of whether or not they contain data, unless the user says so, or minimise(), grid_search(), or calc() are called. Over-fitting is prevented on calls to these functions by de-selecting residues with k > n. This overfit_deselect() function treats mf, jw, and relax_fit run types appropriately, and also checks for structural data where required.
I've checked your modifications located at https://mail.gna.org/public/relax-commits/2006-04/msg00022.html, and I have a few suggestions. With the 'relax_data.read()' user function, removing the deselection may be the only way to rationally handle the situation while allowing the 'select' and 'unselect' user function classes to function (see below). The k > n test you've coded should catch all problems (as well as the test of existence of relax_data). The function 'self.overfit_deselect()' which you've added to 'generic_fns/minimise.py' would fit better into the program layout if it was split into three. The code inside the function is specific to the run type and would therefore be better suited if three functions called 'self.overfit_deselect()' were located within the files 'jw_mapping.py', 'model_free.py', and 'relax_fit.py' in the directory 'specific_fns' (simply containing your current code). To handle this, the file 'specific_fns/specific_setup.py' need three entries for the three run types. Then, within the three functions 'self.calc()', 'self.grid_search()', and 'self.minimise()' of 'generic_fns/minimise.py', code such as the following could be used: def calc(self, run=None, print_flag=1): """Function for calculating the function value.""" # Test if the run exists. if not run in self.relax.data.run_names: raise RelaxNoRunError, run # Function type. function_type = self.relax.data.run_types[self.relax.data.run_names.index(run)] mod # Specific calculate and over-fitting deselect function setup. calculate = self.relax.specific_setup.setup('calculate', function_type) mod overfit_deselect = self.relax.specific_setup.setup('overfit_deselect', function_type) mod # Deselect residues lacking data: mod overfit_deselect(run) # Monte Carlo simulation calculation. The lines beginning with 'mod' have changed. This code arrangement allows for all the model-free specific code to be located within a single file. Hence, the layout of the code of the entire program is easier to navigate. The tests you have implemented are perfect though. Finally, a test for the existence of the sequence may catch future bugs as well (in the bizarre situation of someone accidentally minimising the wrong run).
The only draw-back I see with this is that the user probably won't expect minimise() to change residue selections. Not too much of a big deal though...
They shouldn't expect it. The user should at minimum be watching the results so they should see the blank lines. If the user isn't checking the results, then this default behaviour is good as it should stop junk results from being blindly published.
The other fix suggested was that relax_data.read() re-select residues when data is found. I don't like this option because the user might do unselect.res() before doing relax_data.read(). Then having relax_data.read() do re-selections could override the users selctions ie. relax_data.read() has no way of knowing if a residue is de-selected because a of previous relax_data.read() call or because of an explicit unselect.res()
I agree, I didn't think of that scenario when I suggested the idea. The behaviour of the reselection is irrational.
The fix as I've implemented it passes the test-suite as well as a barrage of relavant sample scripts (though see bug #5698)
I'll look at that bug report and send another email. Edward