Re: [bug #6503] Uncaught nan in xh_vect



Posted by Chris MacRaild on August 08, 2006 - 11:07:
On Mon, 2006-08-07 at 19:40 +1000, Edward d'Auvergne wrote:
On 8/7/06, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:
Edward d'Auvergne wrote:

It ought to be said that NaN, inf, etc. have very clearly defined bit
patterns in memory that are defined by IEEE 754 (cf.
http://steve.hollasch.net/cgindex/coding/ieeefloat.html and
http://www.psc.edu/general/software/packages/ieee/ieee.html). Thus it's a
question of Python/compilers not putting NaN in the right place and not
handling comparisons etc. properly when NaNs are present. It should
always be possible to detect whether a NaN is present from the bit pattern
in memory (and this is the way fpconst.py does it). It is of course
possible to test whether NaNs are supported correctly on a particular
architecture and compiler by comparisons etc., and if they are not,
warnings could be raised or alternative code executed.
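
To make the bit-pattern point concrete, here is a minimal sketch of such a check using only the standard struct module. It is not the fpconst.py code; the function names is_nan_bits and is_inf_bits are made up for illustration, and it assumes 64-bit IEEE 754 doubles (11 exponent bits, 52 mantissa bits).

```python
import struct

EXPONENT_MASK = 0x7FF          # 11 exponent bits, all ones for NaN and inf
MANTISSA_MASK = (1 << 52) - 1  # 52 mantissa (fraction) bits


def _double_bits(x):
    """Return the 64-bit IEEE 754 pattern of x as an unsigned integer."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits


def is_nan_bits(x):
    """Hypothetical helper: True if the bit pattern of x encodes a NaN."""
    bits = _double_bits(x)
    exponent = (bits >> 52) & EXPONENT_MASK
    # NaN: exponent all ones and a non-zero mantissa.
    return exponent == EXPONENT_MASK and (bits & MANTISSA_MASK) != 0


def is_inf_bits(x):
    """Hypothetical helper: True if the bit pattern of x encodes +/- inf."""
    bits = _double_bits(x)
    exponent = (bits >> 52) & EXPONENT_MASK
    # Infinity: exponent all ones and a zero mantissa (sign bit is ignored).
    return exponent == EXPONENT_MASK and (bits & MANTISSA_MASK) == 0


# A quiet NaN built directly from its bit pattern, without any arithmetic.
(quiet_nan,) = struct.unpack(">d", struct.pack(">Q", 0x7FF8000000000000))
```

Because the test never compares floats, it does not depend on how the platform implements comparisons involving NaN.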


Another thing to say is that though fpconst.py is GPL incompatible due
to some arcana about patent termination cases
(http://en.wikipedia.org/wiki/Apache_License) in the Apache license,
given the way it works it should be relatively easy to cook up our own
version which would be bulletproof and low maintenance in the same way as
fpconst.py is... and also quite legal.


Could the fpconst.py file be legally included in a relax distribution?

I am not a lawyer and the waters seem to be muddy...


      Is the Apache license compatible with the GPL (GNU Public License)?

It is the unofficial position of The Apache Software Foundation that the
Apache license *is* compatible with the GPL. However, the Free Software
Foundation holds a different position, although we have not been able to
get them to give us categorical answers to our queries asking for
details on just what aspects they consider incompatible.

Did you contact the FSF about the issue?

Whether to mix software covered under these two different licenses
*must* be a determination made by those attempting such a synthesis.

According to the FSF list of licences
(http://www.gnu.org/philosophy/license-list.html), all of the Apache
licences are GPL incompatible.  For version 2 of the licence they say:

This is a free software license but it is incompatible with the GPL.
The Apache Software License is incompatible with the GPL because it
has a specific requirement that is not in the GPL: it has certain
patent termination cases that the GPL does not require. (We don't
think those patent termination cases are inherently a bad idea, but
nonetheless they are incompatible with the GNU GPL.)

After looking at the GPL FAQ and thinking about it for a while, I
don't believe the file can be put into the relax distribution.  From
the link http://www.gnu.org/licenses/gpl-faq.html#MereAggregation I
would classify the co-distribution of relax with fpconst.py as an
imported module to be a single program rather than an aggregation of
programs.  Because of the licence incompatibility, the link
http://www.gnu.org/licenses/gpl-faq.html#GPLIncompatibleAlone explains
how it is not legal to co-distribute.

However, I don't see that there is a problem if the file is in a separate
module... plenty of people install Apache-based software and GPL
software on their Python machines. At worst all that would happen is
someone would complain and we would have to write our own...
(I reckon I could do the whole thing in a few hours at most, including
some unit tests.)

Because of what I wrote above, it would probably have to be listed as a
dependency.  However I don't really like that idea as dependencies are
a pain.  Do you have the link to its homepage?  All the links I've
tried that look like it would be its homepage (for example
http://www.warnes.net/rwndown) are still down.  Having it in-built
into relax would be better.

I agree that having an additional dependency is a bad idea. The options
then are: 1) Do nothing, and accept Python's default NaN handling.
Although I agree that NaNs are a rare occurrence, I think this option is
just putting off the inevitable - as more users push relax into more
places, it's inevitable that NaNs will appear again and give us more
headaches, probably in ways we can't anticipate. 2) Code our own version
of fpconst for relax. Then we could catch NaNs as and when we need to.
3) Make the move from Numeric to Numpy. As I understand it, this should
give us both the ability to catch NaNs and reliable NaN
comparisons (i.e. NaN will compare False with everything). I've done some
very preliminary testing towards this, and at first glance it seems
easier than I expected. I have run the convertcode script that is
distributed with Numpy on all modules in the maths directory of the
relax 1.3 branch. All tests pass except the openDx tests, which also
failed in 1.3 before the conversion. Obviously this will require a lot
of care and more thorough testing than the current test-suite allows,
but it at least seems feasible. As we have discussed before, that shift
to Numpy will be necessary soon enough anyway, so perhaps now is the
time to give it a shot?
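
As an illustration of what reliable NaN comparisons look like, here is a small sketch using plain Python floats on a current interpreter with IEEE 754 semantics (an assumption; it says nothing about how Numeric or Numpy behave on a given platform). Note that under IEEE 754 it is the equality and ordering comparisons that are False for NaN, while != is True, which is what makes the x != x test work. The helper name is_nan is made up for this example.

```python
import math

nan = float("nan")

# Equality and ordering comparisons that involve NaN all evaluate to False,
# including the comparison of NaN with itself.
comparisons = [
    nan == nan,
    nan < 1.0,
    nan > 1.0,
    nan <= nan,
    nan >= 1.0,
]


def is_nan(x):
    """Hypothetical helper: NaN is the only float value not equal to itself."""
    return x != x


# The standard library offers the same check directly as math.isnan().
agrees_with_stdlib = is_nan(nan) == math.isnan(nan)
```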


However I still question whether the floating point value of NaN
actually needs to be caught at all - it would be useful, but is
unnecessary.  It's a rare occurrence to begin with and I can easily
modify the optimisation component of relax to break the infinite
loops.  Therefore does anything else need to be done?  It will be
glaringly obvious in the final results file if NaNs get through the
analysis.  And if there is an issue in relax's handling of NaN then
that should be classified as a relax bug.
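
One plausible way such an infinite loop arises is a convergence test of the form abs(f_old - f_new) < tol, which can never become True once the function value is NaN. The sketch below shows a guard against that, under the assumption of a generic iterative minimiser; it is not the relax optimisation code, and minimise, step and halving_step are hypothetical names.

```python
import math


def minimise(step, x0, tol=1e-8, max_iter=1000):
    """Hypothetical minimiser loop that refuses to iterate on NaN.

    step(x) must return the pair (new_x, function_value_at_new_x).
    """
    x = x0
    f_old = float("inf")
    for iteration in range(max_iter):
        x, f_new = step(x)
        # Without this guard the convergence test below is always False
        # for NaN, so the loop would only stop at max_iter (or never, if
        # no iteration limit were imposed).
        if math.isnan(f_new):
            raise ArithmeticError(
                "NaN function value at iteration %d" % iteration)
        if abs(f_old - f_new) < tol:
            return x, f_new
        f_old = f_new
    raise RuntimeError("no convergence after %d iterations" % max_iter)


def halving_step(x):
    """Toy step for f(x) = x**2: move halfway towards the minimum at 0."""
    new_x = 0.5 * x
    return new_x, new_x * new_x
```

Raising an exception rather than returning silently means the failure is reported at the point where the NaN first appears, instead of only showing up in the final results file.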


[snip]

NaN should propagate - however I do see one point at which they will
disappear and that is during AIC model selection.  The AIC value will
be NaN (NaN + 2k = NaN) and any model with a non-NaN AIC value will be
selected.  Unless of course all model-free models for that residue are
affected.  Yet if they do propagate the results for that residue will
make no sense.  

Agreed, and that is one reason why we need to catch NaNs (or have
properly defined comparisons on NaNs). In the current situation, if we
do model selection on both NaN and finite values, the behaviour will be
undefined. The options are to use a system that has defined NaN
comparisons (Numpy seems to fit the bill) or to remove NaN values before
model selection (another test in the model elimination, perhaps?).
Obviously this last option requires that we are able to catch NaNs in a
defined way.
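
A rough sketch of the second option, removing NaN values before model selection, might look as follows. It assumes AIC = chi2 + 2k as in the quoted text, with each model given as a (chi2, k) pair; the function select_model and the model names are invented for the example and are not relax's model elimination code.

```python
import math

# Why the unfiltered comparison is unreliable: with plain Python floats the
# result of min() depends on where the NaN sits in the sequence.
nan_first = min([float("nan"), 1.0])   # NaN is kept, 1.0 < NaN is False
nan_last = min([1.0, float("nan")])    # 1.0 is kept, NaN < 1.0 is False


def select_model(models):
    """Hypothetical AIC selection that eliminates NaN models first.

    models maps a model name to (chi2, k), where k is the number of
    parameters.  Returns (best_name, eliminated_names); best_name is None
    when every model has a NaN AIC value.
    """
    aic = {name: chi2 + 2 * k for name, (chi2, k) in models.items()}
    eliminated = sorted(name for name, value in aic.items()
                        if math.isnan(value))
    finite = {name: value for name, value in aic.items()
              if not math.isnan(value)}
    if not finite:
        return None, eliminated
    # Only well-defined comparisons remain at this point.
    return min(finite, key=finite.get), eliminated
```

Returning the eliminated names alongside the selected model keeps the NaN cases visible to the user rather than dropping them silently.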

Chris

That should be an indication to the end user that
there is a problem and that they should look at the final results file
in detail - none of the logs or intermediate results files need to be
initially examined to identify that there is a problem.  If, however,
we can prevent users from encountering this problem as much as
possible, while making it possible to continue running if it does
occur, then NaNs shouldn't be a problem in relax.

Edward




