Re: [bug #6503] Uncaught nan in xh_vect


Posted by Edward d'Auvergne on August 04, 2006 - 18:54:
Maths operations on NaNs shouldn't be an error, but they should *always*
return NaN (Python gets this much right, but only because all the
platforms it runs on get it right, and Python blindly inherits this
behaviour). Comparison operations on NaN also shouldn't be an error, but
they should *always* return False (this Python often gets wrong, because
the platforms it runs on get it wrong, and Python blindly inherits...).
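
In concrete terms, and platform permitting (even constructing the NaN
below can fail on interpreters older than Python 2.6), the behaviour
being described is:

    nan = float('nan')   # construction itself is platform dependent pre-2.6
    print(nan + 1.0)     # nan - maths on NaN returns NaN, no exception
    print(nan * 0.0)     # nan - still NaN, not 0.0
    print(nan == nan)    # False - NaN compares unequal even to itself
    print(nan < 1.0)     # False
    print(nan > 1.0)     # False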

To understand the 'correct' behaviour, it's worth understanding why NaN
exists: consider a big calculation involving lots of input data, giving
many different results (a model-free calculation, e.g.). Suppose one or
two of the inputs are wrong (I should be so lucky). Suppose that, because
of the bad inputs, somewhere in the calculation an undefined
mathematical operation occurs (a divide by zero or log(0), say). There
are two options: (1) either raise an exception and bring the whole
calculation to a halt, or (2) indicate that the result of the bad
operation is undefined, but continue the calculation on the rest of the
data.

Clearly, if I have spent hours calculating good results from good data, I
don't want to throw all of that away just because one operation is bad.
What I do want is that all the results that are contaminated by the bad
operation are clearly identifiable. This is the purpose of NaN - it
marks out the result of an undefined floating-point operation, and
propagates itself into all results that depend on it. It should do this
without throwing exceptions or otherwise halting whatever other
calculations might be going on independently within the same process.

Good point - it will be clearly visible in the results file (assuming minimisation ever terminates).
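
To make that concrete, the propagation and the clean-up might look something like this (made-up numbers, relying on the x != x test that comes up below):

    nan = float('nan')
    data = [1.0, 2.0, nan, 4.0]              # one contaminated input
    results = [2.0 * x + 1.0 for x in data]  # NaN propagates: [3.0, 5.0, nan, 9.0]
    bad = [i for i, r in enumerate(results) if r != r]  # only NaN satisfies r != r
    print(results)
    print(bad)   # [2] - the contaminated result is flagged, nothing else is lost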

The most important bit of all of this is that NaNs always propagate, i.e.
f(NaN) = NaN for all maths functions. Python gets this much right.
Python doesn't get NaN creation right - in many instances when an
undefined operation occurs, Python throws one of its bizarre array of
FloatingPointErrors, rather than just returning NaN or INF. Python also
makes testing for NaNs very difficult, because all the comparison
behaviour is undefined. Here I have been a bit harsh on Python in my
previous posts - as of Python 2.4, comparisons that use the x==y or x!=y
syntax behave as they should, at least on Windows and gcc based
platforms. As I demonstrated, though, using cmp() leads to all manner of
chaos!
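
A minimal sketch of the portable test that falls out of this, relying only on the x != y syntax and avoiding cmp() entirely for the reason just given:

    def isnan(x):
        # NaN is the only float for which x != x is True
        return x != x

    print(isnan(float('nan')))   # True
    print(isnan(1.0))            # False
    # cmp(float('nan'), 0.0) is exactly where the chaos starts - its
    # result is platform dependent and cannot be trusted.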

The floating point errors occurred quite a bit until I caught most of them. I think I converted them to a RelaxError, although it sounds like the creation of a NaN is a much better behaviour. Those Lorentzians in the model-free spectral density functions are particularly bad when using a grid search where the correlation time starts at zero. Python deserves to be picked on a bit - implementing the IEEE 754 standard should have happened a long time ago, even if that means avoiding the underlying C libraries.
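
Roughly, the NaN-returning alternative would look something like this (an illustrative stand-in for one of the Lorentzian terms, not the actual relax spectral density code):

    def lorentzian_term(omega, tau):
        # A grid search that starts the correlation time tau at zero makes
        # the 1.0/tau below blow up; returning NaN keeps the rest of the
        # grid search alive instead of raising an exception.
        try:
            tau_eff = 1.0 / (1.0 / tau + 1.0)   # hypothetical effective time
            return tau_eff / (1.0 + (omega * tau_eff)**2)
        except ZeroDivisionError:
            return float('nan')

    print(lorentzian_term(1.0, 0.0))   # nan rather than a halted grid search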


> > I will also look into placing an upper iteration limit on the backtracking line search to prevent it from hanging.

The other possibility is a test to ensure the line search is actually
going somewhere, but that could be expensive in terms of performance.

This may be difficult. The efficacy of the backtracking line search (Nocedal and Wright, 1999), or of any optimisation algorithm, rests on mathematical proofs. Adding this type of test and then mathematically proving that the algorithm will still work is well above my capabilities. Iteration limits, though, shouldn't be too much of an issue if they are large enough. It will be expensive, but minimisation will then terminate.
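
For the record, a capped version of the backtracking line search would look something like the sketch below (an illustration only, not the relax minimise code; rho and c are the standard Armijo constants from Nocedal and Wright, and max_iter is the new limit):

    def backtracking(func, x, p, fx, dgx, alpha=1.0, rho=0.5, c=1e-4, max_iter=100):
        # func: objective function, x: current point (list), p: search direction,
        # fx:   func(x), dgx: directional derivative grad f(x).p (negative).
        for i in range(max_iter):
            x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
            f_new = func(x_new)
            # Armijo sufficient decrease test.  If f_new is NaN this
            # comparison is False, so the step is rejected and alpha shrinks.
            if f_new <= fx + c * alpha * dgx:
                return alpha, x_new, f_new
            alpha = rho * alpha
        # Iteration limit hit: terminate cleanly instead of hanging.
        return None, x, fx

    # e.g. one step of minimising f(x) = x0**2 from x0 = 2 along p = -grad:
    print(backtracking(lambda v: v[0]**2, [2.0], [-4.0], 4.0, -16.0))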


> Chris, if you set the 'print_flag' value to say 5, does relax just sit there doing nothing or does the line search continue forever? If it hangs without printing anything it could be a bug in Numeric. In version 24.x the eigenvalue function freezes and ctrl-c does nothing (try the eigenvalue Hessian modification together with Newton minimisation to see the result). You have to manually 'kill' relax.

This is really bizarre - if I do the minimisation with print_flag<=3, the
minimisation gets stuck in a loop, but ctrl-c works. If print_flag>3,
ctrl-c doesn't work, and I have to kill relax. Ctrl-c not working
possibly indicates that we are stuck somewhere in the C world (in
Numeric, e.g.) and so the signal never gets out to the Python world. But
why does the print_flag level make a difference?

Ah, that could be because the higher amount of output actually prints out the eigenvalues - a fatal operation under Numeric 24. I can't check that at the moment though.

Edward


