Re(2): r2801 - /branches/test_suite/float.py -- November 15, 2006

Apologies in advance for the slightly strange format: I am replying by Gmail
from home, having cut and pasted the message from the archive (this is my
second attempt, as the first lost all of my edits). N.B. I will only send this
to the list for now, as I don't have Ed's e-mail address at home.


edward.dauvergne@xxxxxxxxx wrote:

>Author: bugman
>Date: Sat Nov 11 10:57:35 2006
>New Revision: 2801
>
>URL: http://svn.gna.org/viewcvs/relax?rev=2801&view=rev
>Log:
>Two functions for converting a string bit pattern to a float and a number of
>IEEE-754 constants.
>
>The functions 'bitpatternToFloat()' and 'bitpatternToInt()' have been added
>to 'float.py'.  The first converts a 64 bit IEEE-754 bit pattern in the form
>of a string into a 64 bit Python float.  The second converts an arbitrary bit
>pattern into its integer representation.  By looping over each 8 characters
>of the string (1 byte), 'bitpatternToFloat()' calls 'bitpatternToInt()' to
>create an array of integers.  Then 'packBytesAsPyFloat()' is called to convert
>the byte array into the float.  These two functions convert between big and
>little endian when necessary.
>
>
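For readers following the log message, here is a minimal sketch of the same
conversion in modern Python 3. It is not the committed code: the name
bitpattern_to_float() is hypothetical, and it leans on the standard struct
module rather than on 'bitpatternToInt()' and 'packBytesAsPyFloat()'. It only
handles big-endian input.

```python
import struct


def bitpattern_to_float(bits):
    """Convert a 64-character big-endian bit string into a Python float.

    Hypothetical helper for illustration, not the function from r2801.
    """
    if len(bits) != 64 or set(bits) - {"0", "1"}:
        raise ValueError("expected 64 characters, each '0' or '1'")
    # Each group of 8 characters is one byte, most significant byte first.
    octets = bytes(int(bits[i:i + 8], 2) for i in range(0, 64, 8))
    # '>d' tells struct that the bytes form a big-endian IEEE-754 double.
    return struct.unpack(">d", octets)[0]


pos_inf = bitpattern_to_float("0" + "1" * 11 + "0" * 52)
neg_zero = bitpattern_to_float("1" + "0" * 63)
one = bitpattern_to_float("0011111111110000" + "0" * 48)
```

Because struct is told the byte order explicitly, no manual reversal of the
string is needed in this sketch.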
Nice functions! However, I would suggest that these should go in the utility
functions or the test suite. I am trying to keep float.py as the bare minimum
file for testing floating point numbers and floating point constants.


I agree - these string to float conversions are auxiliary to the intended
purpose of the IEEE-754 module.  Unfortunately, as you hadn't committed
the utility function code yet, I wasn't able to put the functions
there :).

This is true, but at this stage the product is just too half finished for
anyone else, and I wanted to concentrate on the primary problem (i.e. float.py)
and complete that first unit, test it, etc. before I did any work on the
other capabilities.


How would you like to design the ieee754 module?  Have you searched
for documentation describing how to create standard Python modules,
i.e. the format and structure of the modules?  Should the utility
functions be part of the module?  Or should they be their own module?


Something I have to think about, and I need to read Python in a Nutshell.
Basically, if the utility functions are private they shouldn't be part of the
public interface. Otherwise it is just a question of keeping the module
focused...

The 'bitpatternToFloat()' function I added is 100% specific to
IEEE-754 though.  Do you think it would be a good idea, for future
Python inclusion and to improve on 'fpconst', to create a module which
includes as many IEEE-754 related functions as possible?

I would try to keep the number of functions to a minimum and maintain fpconst
compatibility. All other functions can go in separate modules if needed.


A couple of other matters as well

1. The constants are not much use at the end of the file: if anything
else in the file wants to refer to them, it won't be able to.


I didn't expect anything in 'float.py' to use them.  I just expanded
the constants proposed in PEP-754
(http://www.python.org/dev/peps/pep-0754/) into a complete list of all
the IEEE-754 special numbers.  That was pretty simple to do.  They can
be imported into the unit tests as an accurate replacement for
FLOAT_EPSILON, etc.  Their main usage would be by the users of the
module.


Fair enough, but the top is still the place to put them (this is also where
users would expect to look for constants...).


2. I was thinking of a test case with random numbers in the exponent of
the NaN, which then reports which bit pattern failed....

You mean mantissa?


Sorry, mental typo.
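A rough sketch of the random mantissa test described in point 2, written for
Python 3. The helper names are hypothetical, and math.isnan stands in for
whichever NaN test float.py ends up exposing. Any pattern the predicate
rejects is returned as a 64 character bit string so it can be reported.

```python
import math
import random
import struct

EXPONENT_ALL_ONES = 0x7FF << 52


def nan_from_mantissa(mantissa, sign=0):
    """Build a double with an all-ones exponent and the given 52 bit mantissa."""
    if not 0 < mantissa < (1 << 52):
        raise ValueError("a NaN needs a non-zero 52 bit mantissa")
    bits = (sign << 63) | EXPONENT_ALL_ONES | mantissa
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def failing_nan_patterns(is_nan, trials=1000, seed=0):
    """Return the bit patterns (as strings) that the predicate fails to flag."""
    rng = random.Random(seed)  # fixed seed so that any failure is reproducible
    failures = []
    for _ in range(trials):
        mantissa = rng.randrange(1, 1 << 52)
        sign = rng.randrange(2)
        if not is_nan(nan_from_mantissa(mantissa, sign)):
            bits = (sign << 63) | EXPONENT_ALL_ONES | mantissa
            failures.append(format(bits, "064b"))
    return failures


failures = failing_nan_patterns(math.isnan)
```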

The three NaN just give the users of the module a few different NaN
numbers to play with.  I can't stand the way I've named them though!


I would suggest that we don't want to have multiple NaNs. Decide on one good
bit pattern that is easy to identify and then use it everywhere. This will
make debugging a lot easier. One suggestion is to have a memorable hex
pattern, e.g. 'deadara' or something similar.
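A small Python 3 sketch of a single NaN carrying a recognisable hex payload.
The payload 0xDEADBEEF is only a placeholder chosen for this illustration, not
a value proposed in the thread, and the sketch assumes a platform where the
struct module copies NaN payloads through unchanged (typical for IEEE-754
hardware, but not guaranteed everywhere).

```python
import math
import struct

# Placeholder payload, chosen only because it is easy to spot in a hex dump.
NAN_PAYLOAD = 0xDEADBEEF
# Sign 0, exponent all ones, top mantissa bit (the quiet bit) set, then payload.
NAN_BITS = 0x7FF8000000000000 | NAN_PAYLOAD

NaN = struct.unpack(">d", struct.pack(">Q", NAN_BITS))[0]


def float_to_hex_bits(value):
    """Return the 64 bit pattern of a float as 16 hex digits, big-endian."""
    return struct.pack(">d", value).hex()
```

Dumping a suspect value with float_to_hex_bits() then shows at a glance
whether it is the module's own NaN or one produced elsewhere.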

As an aside, from my (limited) reading of the IEEE spec it seems that the
extra patterns are just there to give the designers of FPUs some flexibility.

Now when we come round to unit testing, that's a different matter: here we
want lots of NaNs, and if the names are ugly, who cares, as long as they
don't change!

3. I was also thinking of having test cases with minimum and maximum bit
patterns, e.g. max denorm, min denorm, NaN with the highest mantissa bit set,
NaN with the lowest mantissa bit set, etc.


These can still be pre-defined constants within the ieee754 module for
others to use.  We may as well give the users of the module access to
a complete series of ieee754 special numbers, including the limits.
Essentially I have added all of the 64 bit special values discussed in
the current Wikipedia article http://en.wikipedia.org/wiki/IEEE-754.

See my comments on NaNs.


>By passing big-endian 64 bit patterns to the 'bitpatternToFloat()' function,
>the following IEEE-754 constants are defined:
>    PosZero:           0000000000000000000000000000000000000000000000000000000000000000
>    NegZero:           1000000000000000000000000000000000000000000000000000000000000000
>    PosEpsilonDenorm:  0000000000000000000000000000000000000000000000000000000000000001
>    NegEpsilonDenorm:  1000000000000000000000000000000000000000000000000000000000000001
>    PosEpsilonNorm:    0000000000010000000000000000000000000000000000000000000000000001
>    NegEpsilonNorm:    1000000000010000000000000000000000000000000000000000000000000001
>    PosMax:            0111111111101111111111111111111111111111111111111111111111111111
>    NegMin:            1111111111101111111111111111111111111111111111111111111111111111
>    PosInf:            0111111111110000000000000000000000000000000000000000000000000000
>    NegInf:            1111111111110000000000000000000000000000000000000000000000000000
>    PosNaN_A:          0111111111110000000000000000000000000000001000000000000000000000
>    NegNaN_A:          1111111111110000000000000000000000000000001000000000000000000000
>    PosNaN_B:          0111111111110000000000000000011111111111111111111110000000000000
>    NegNaN_B:          1111111111110000000000000000011111111111111111111110000000000000
>    PosNaN_C:          0111111111110101010101010101010101010101010101010101010101010101
>    NegNaN_C:          1111111111110101010101010101010101010101010101010101010101010101
>    PosNaN = PosNaN_C
>    NegNaN = NegNaN_C
>
>
>
I have some code that reads and displays these as octets, which makes life
easier...

  11111111 11110101 01010101 01010101 01010101 01010101 01010101 01010101
I don't know if the spaces help much.  As the mantissa is from 0-51,
the exponent from 52-62, and the sign bit is at position 63, wouldn't
the best place for spaces be to separate out the components of the
IEEE-754 float, e.g.:

 1 11111111111 0101010101010101010101010101010101010101010101010101

IEEE-754 doesn't seem to be an octet based system.

This is true: IEEE-754 breaks the mantissa and exponent at a nibble boundary.

However, it is good to have an easily read structure that is less
monolithic, and the underlying structure of numbers on computers is octet
based. Most computer programmers think in octets/bytes at the basic level,
and when you manipulate the contents of NaNs you will tend to split them
into octets for bit manipulation. So octets seem a good idea...
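To compare the two layouts side by side, here is a short Python 3 sketch that
prints a double either as eight octets or as sign/exponent/mantissa fields.
The function names are made up for this example and are not part of float.py.

```python
import struct


def float_bits(value):
    """Return the 64 bit big-endian pattern of a float as a string."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", value))
    return format(as_int, "064b")


def by_octets(bits):
    """Group a 64 bit pattern into eight space-separated octets."""
    return " ".join(bits[i:i + 8] for i in range(0, 64, 8))


def by_fields(bits):
    """Group a 64 bit pattern as sign, 11 bit exponent and 52 bit mantissa."""
    return " ".join((bits[0], bits[1:12], bits[12:]))


print(by_octets(float_bits(-2.0)))
print(by_fields(float_bits(-2.0)))
```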


One (ugly) suggestion would be

0bs1 e1111111 1111 m0101 01010101 01010101 01010101 01010101 01010101 01010101

and then strip the leading 0b, all spaces and all letters. The addition of 0b
would allow you to use hex as well if you wanted. (I would add that most
system programmers much prefer hex to bit patterns. If you look at many
programs they don't have binary constants but hex constants, and languages
such as Python and C support hex constants but not binary ones [most
programmers find them too wordy].)
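A possible reading of that annotated format, sketched in Python 3. The
function name and the choice of 's', 'e' and 'm' as the only permitted marker
letters are assumptions made for this example. Hex input is not handled.

```python
import re
import struct


def parse_annotated_pattern(text):
    """Turn an annotated pattern such as '0bs1 e1111111 ...' into a float.

    Hypothetical helper: strips a leading '0b', whitespace and the field
    markers 's', 'e' and 'm', then requires exactly 64 binary digits.
    """
    if text.startswith("0b"):
        text = text[2:]
    bits = re.sub(r"[\ssem]", "", text)
    if len(bits) != 64 or set(bits) - {"0", "1"}:
        raise ValueError("expected 64 binary digits after stripping markers")
    return struct.unpack(">d", struct.pack(">Q", int(bits, 2)))[0]


one = parse_annotated_pattern("0bs0 e0111111 1111 m0000" + " 00000000" * 6)
```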


>+def bitpatternToFloat(string, endian='big'):
>+    """Convert a 64 bit IEEE-754 ascii bit pattern into a 64 bit Python float.
>+
>+    @param string:  The ascii bit pattern repesenting the IEEE-754 float.
>+    @type string:   str
>+    @param endian:  The endianness of the bit pattern (can be 'big' or 'little').
>+    @type endian:   str
>+    @return:        The 64 bit float corresponding to the IEEE-754 bit pattern.
>+    @returntype:    float
>+    @raise:         TypeError if 'string' is not a string, the length of the 'string' is not 64, or
>+        if 'string' does not consist solely of the characters '0' and '1'.
>+    """
>+
>+    # Test that the bit pattern is a string.
>+    if type(string) != str:
>+        raise TypeError, "The string argument '%s' is not a string." % string
>
>

I think this is too restrictive; other things can act as sequences...
Feel free to remove the test.  I can see how it would be restrictive.

will do


>+    # Test the length of the bit pattern.
>+    if len(string) != 64:
>+        raise TypeError, "The string '%s' is not 64 characters long." % string
>+
>+    # Test that the string consists solely of zeros and ones.
>+    for char in string:
>+        if char not in ['0', '1']:
>+            raise TypeError, "The string '%s' should be composed solely of the characters '0' and '1'." % string
>+
>+    # Reverse the byte order as neccessary.
>+    if endian == 'big' and sys.byteorder == 'little':
>+        string = string[::-1]
>+    elif endian == 'little' and sys.byteorder == 'big':
>+        string = string[::-1]
>+
>+
>
>

This could be better written as

# Reverse the byte order as necessary.
if endian != sys.byteorder:
    string = string[::-1]


Good point.  Again feel free to make the modification.

will do


and then add an assert, effectively of the form
endian.lower() in ('little', 'big')

I think you need to assert that 'big' or 'little' is passed in endian if this
is a public function. The general way to think about these things is that you
should test all data for validity at public interfaces. Then internal
functions don't have to test for validity (repeated testing would slow things
down). It's a bit like testing messages at a firewall before they enter the
DMZ. However, it is still wise to check argument validity in all functions in
debugging mode, e.g. with assert, as this will catch errors early.

As for the lower(), it just gives a bit more generality if you don't have to
worry about case.

Also, you asked about using > and <.  I personally much prefer 'big' and
'little': they are much more obvious and intuitive, whereas </> are open to
interpretation.

Is this necessary?  If the user supplies dud arguments, I'm not sure
what to do.  There should be Python standards for standard Python
modules somewhere which say how to properly handle incorrect user
input.

yep exceptions ;-)
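As a sketch of what "exceptions" could look like at the public interface, in
Python 3: the helper names below are hypothetical, and the choice of TypeError
for a wrong type and ValueError for a wrong value follows common Python
practice rather than anything decided in this thread.

```python
import sys


def normalise_endian(endian):
    """Validate an endianness argument at the public interface."""
    if not isinstance(endian, str):
        raise TypeError("endian must be a string, not %s" % type(endian).__name__)
    endian = endian.lower()  # accept 'Big', 'LITTLE', etc.
    if endian not in ("big", "little"):
        raise ValueError("endian must be 'big' or 'little', not %r" % endian)
    return endian


def needs_swap(endian):
    """Return True when the requested byte order differs from the host's."""
    return normalise_endian(endian) != sys.byteorder
```

Internal helpers can then trust the value they are handed, with an assert
left in place to catch mistakes while debugging.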


 Edward


regards
gary
