> For 1) I would prefer the NaN catching to be outside of the
> 'minimise/' directory. It should be safe to assume that that code
> will soon not be part of relax. As for handling NaNs within the
> minimisation code I know of no other minimisation package that does
> this - if the user sends garbage to it then returning garbage should
> be expected. The sender and receiver code should do the cleanup. I
> do however think that testing for NaN during optimisation (in the
> 'maths_fns' code) is too computationally expensive. If optimisation
> terminates in a reasonable time then I don't think we should test for
> NaNs during the number crunching phase.
>
> We should check what the overhead is before we say too expensive.
The number of times the family of functions 'self.func()' within the file 'maths_fns/mf.py' is called to generate the chi-squared value during optimisation is huge. When running relax, this function is the most called function in the entire code base. Putting the test for NaN within this function, or in the 'minimise/' code straight after this function is called, would be the most computationally expensive solution possible. A much more efficient design would be to catch the NaN just after optimisation has terminated, as the test would then only need to be done once per optimisation rather than thousands of times.
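As a rough sketch of where the single test would sit (the names below are hypothetical and not the actual relax or minimise API), the check happens once after the minimiser returns, not inside the target function:

    # Hypothetical sketch only - RelaxNaNError and check_chi2() are made up
    # names illustrating the single post-optimisation NaN test.
    class RelaxNaNError(Exception):
        """Raised when the final chi-squared value is NaN."""

    def check_chi2(chi2):
        # Only NaN compares unequal to itself (IEEE 754), so this needs no
        # fpconst or bit pattern tricks.
        if chi2 != chi2:
            raise RelaxNaNError("NaN chi-squared value after optimisation.")

The call to check_chi2() would go immediately after the generic minimisation entry point returns, so it is executed once per optimisation rather than once per chi-squared function call.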
> For 2) and 3) the NaN value comes from the chi2 variable which is just
> a standard Python floating point number rather than a Numeric
> construct. Will the shift to Numpy actually change the behaviour of
> the default Python floats? Or will it just change the behaviour of
> vectors, matrices, and other linear algebra? Or is there a function
> similar to the fpconst.py function isNaN() which can be used to catch
> it? Anyway, the 1.3 line is probably the best place to test the shift
> from Numeric to Numpy - although in a private branch first.
>
> My understanding is that numpy generally propagates NaNs and that pure
> fp maths also propagates them. The only place there used to be
> problems is in ufuncs, which used not to propagate NaNs but raise
> exceptions in Numeric. There is a function similar to isNaN, called
> isnan (and isinf), in scipy... In general we could grep for the use of
> isnan and isinf in the numpy/scipy codebase to see whether they are
> caught much or just propagated. A quick look in scipy/numpy shows only
> a very few uses of isnan in numpy or scipy.
I just had a look at scipy and the isnan function is defined in 'Lib/special/cephes/isnan.c'. They catch it based on the bit pattern, as you suggested previously, but depending on whether the platform is 'IBMPC', 'DEC', or 'MIEEE'. It should be pretty easy to implement a similar solution in relax in pure Python in not too many lines.
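For example, something along these lines would do it (just a sketch, assuming standard IEEE 754 doubles and ignoring the 'DEC' style formats; the even simpler x != x test would also work and avoids the bit fiddling entirely):

    # Sketch of a bit pattern based NaN test, assuming IEEE 754 doubles.
    import struct

    def isnan(x):
        """True if x is NaN: exponent bits all set and a non-zero mantissa."""
        bits = struct.unpack('>Q', struct.pack('>d', x))[0]
        exponent = (bits >> 52) & 0x7ff
        mantissa = bits & ((1 << 52) - 1)
        return exponent == 0x7ff and mantissa != 0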
> As for the test suite, the optimisation code is completely untested.
> It's where the major breakages occur, although the code in
> 'maths_fns/' is problematic as well. A shift to Numpy will require
> changes to both 'maths_fns/' and 'minimise/'. To catch problems the
> four optimisation classes will need to be tested - standard single
> residue, diffusion tensor, all parameters (model-free + diffusion
> params), and the residue specific local tm models. It shouldn't be
> too hard to code a number of tests for this as they can all use the
> same data. Then all the optimisation algorithms in ALL combinations
> need to be tested - that is quite a few. However as these minimisers
> will be separated from relax, this won't be so easy.
>
> I don't quite follow why this won't be easy. The combinatorial
> feature is of course a problem, but I guess the likely combinations
> are the first target.
It would be easy. I can use the data of Schurr et al. (1994), which I have reanalysed for a paper in preparation, for the test-suite. Then the tests should be as simple as writing a number of relax user scripts - although placed within the test-suite. However the tests for the optimisation algorithms probably shouldn't go into the relax test-suite - that code will eventually be removed from relax. Still, for our purposes it might be good to have a second set of tests within the test-suite to test both the model-free code and the minimisation code simultaneously.
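To sketch the structure I have in mind (the class, method, and file names here are made up for illustration and are not the real test-suite code), a single test class could cover the four optimisation classes using that one data set:

    # Hypothetical sketch of the test structure, not the actual relax
    # test-suite API.  Each test would run a small relax user script.
    import unittest

    class TestOptimisation(unittest.TestCase):
        def setUp(self):
            # The Schurr et al. (1994) relaxation data (file name made up).
            self.data = 'test_suite/data/schurr_1994'

        def test_single_residue(self):
            """Standard single residue model-free optimisation."""

        def test_diffusion_tensor(self):
            """Optimisation of the diffusion tensor parameters."""

        def test_all_parameters(self):
            """Model-free plus diffusion parameters optimised together."""

        def test_local_tm(self):
            """The residue specific local tm models."""

The second set of tests for the minimisation algorithms themselves could then reuse the same setUp() data while looping over the algorithm combinations.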
> I believe though that throwing a RelaxError when NaNs occur is the
> best option. That is because NaN should NEVER occur. Even though it
> may cause a week long calculation to die at the very end, hence the
> optimisation was for nothing, an error should still be thrown (it's
> much more likely that a week long optimisation will die at the very
> start). The reason for throwing a RelaxError and killing the
> calculation is simple. Despite the calculation running until the end
> - it will need to be rerun. If the NaN only occurs for a single
> residue the entire protein (the diffusion tensor) is nevertheless
> affected.
> Surely not if you skip data with NaN values?
Do you really want to do this? The NaN value is a sign that something is fatally wrong.
> I have to say at the end of the week long calculation I would like to see the result. Thus for example (this came from Chris) if a grid search failed I would personally like the residue to be deselected (and maybe a warning generated on the console) and then the calculation should go on. In general I feel exceptions are a blunt tool for these sorts of problems as you lose the program state and don't get results on where the calculation was going for everything other than the faulty data.
Sorry, my example of the week long calculation failing at the very end was a hypothetical which is probably impossible. The NaN value within model-free analysis is guaranteed to be caused by garbage input data, hence the RelaxError will be thrown before the calculations really get under way. The example of the week long calculation assumed that the new model-free optimisation protocol implemented in the sample script 'full_analysis.py' is being run. That includes about 15 rounds of the iterative full optimisation of the system for each of the spherical diffusion tensor, the prolate spheroid, the oblate spheroid, and the ellipsoid. For each of these iterations many results files are generated. So even if the NaN and subsequent RelaxError occurs at the very end of the analysis - the results up to that point will be easily accessible. Optimisation can even continue from a point just before the error occurred. The amount of program state and computation time that is lost is relatively small.
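To make that concrete (a rough sketch only, with made up function names rather than the real full_analysis.py code), the protocol is essentially a loop which writes a results file after every round, so a late NaN only costs the round in progress:

    # Hypothetical sketch of the full_analysis.py style protocol - the two
    # functions are stand-ins for the real optimisation and results code.
    def optimise_round(tensor, round):
        pass    # full optimisation of one round (may raise a RelaxError on NaN)

    def write_results(tensor, round):
        pass    # write the results file for this round

    for tensor in ['sphere', 'prolate', 'oblate', 'ellipsoid']:
        for round in range(15):
            optimise_round(tensor, round)
            write_results(tensor, round)    # earlier rounds stay on disk

    # If a RelaxError is raised in round n, the results files for rounds
    # 0 to n-1 already exist, so the analysis can restart from that point.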
> Also, for example, if one MC calculation produced a NaN and killed it all, that would be annoying in the extreme.
By construction of Monte Carlo simulations I can't see this as being possible. If the NaN occurs in the MC simulation, it must have previously occurred in the original optimisation.
> I can think of counter examples: for example, a NaN while calculating a tensor should bring things to a close, but it would be nice to have a default error handler that saved state to some meaningful place.
That would be useful - just difficult to code. If code is written which will dump the saved state in the current directory just prior to throwing the RelaxError (hint: the RelaxError base class BaseError in 'errors.py') it will certainly be accepted into relax.
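A minimal sketch of the idea (BaseError does exist in 'errors.py', but everything else here - the method, the file name, and how the state is passed in - is hypothetical):

    # Hypothetical sketch: dump the program state to the current directory
    # just before the RelaxError propagates.
    import pickle, time

    class BaseError(Exception):
        def save_state(self, state):
            """Pickle the program state to a time-stamped file in the current directory."""
            file_name = 'relax_state_%s.pickle' % time.strftime('%Y%m%d_%H%M%S')
            state_file = open(file_name, 'wb')
            pickle.dump(state, state_file)
            state_file.close()

Each specific RelaxError class could then call save_state() just before it is raised, so that any fatal error leaves a recoverable dump behind.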
Edward