Re: Optimisation tests in the test suite.



Posted by Gary S. Thompson on October 20, 2006 - 16:47:
Edward d'Auvergne wrote:

On 10/20/06, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:

This is quite an interesting result ;-) If you are doing plain float maths
and don't call any higher level functions, the results from the two test
cases _ought_ to be the same if both platforms implement IEEE-754.
However, there are caveats: once you call higher level functions all bets
are off, as there can be implementation/algorithm dependencies (certainly
the C standard, on first reading, says nothing about what rounding sin etc.
use!). There are further complications, as some compilers use fused
multiply-add instructions and other optimisations which are not covered by
IEEE-754 and whose rounding behaviour is therefore not standardised
(except in C99, which has a pragma to control this sort of thing:
FP_CONTRACT).
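
Just to illustrate why bit-for-bit identity cannot be expected once
operations get reordered (Python only for convenience here; the same
effect occurs in C or Fortran):

    # Associativity does not hold exactly for IEEE-754 doubles, so the same
    # mathematical expression can differ in its last bits depending on how
    # the operations are grouped (e.g. by compiler optimisation or FMA).
    a, b, c = 0.1, 0.2, 0.3
    print((a + b) + c == a + (b + c))            # False
    print(repr((a + b) + c), repr(a + (b + c)))  # 0.6000000000000001 vs 0.6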


The Numeric function calls which use BLAS and LAPACK would, I'm sure, be
notorious here.


So here are my thoughts. What we have here are regression tests, so we either:

1. define a set of results for each test on each particular platform (you have a mode where someone can run the tests on a version we believe works and then, say, e-mails the results to us for inclusion); we then store those results and use them only for that platform, or

2. define a set of results for each test which encompasses worst-case performance (as long as it is reasonable), run the tests on a variety of platforms, and if a test fails on some platforms decide on a case by case basis whether the result is reasonable, downgrading your regression tests until they work everywhere.

I would go for 2. It's a lot easier to work with and much more likely to
be used by the user for testing their implementation.
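
To make option 2 concrete, here is a minimal sketch of what such a test
could look like. The reference value, tolerance and driver function are
placeholders I have made up for illustration, not actual relax code:

    import unittest

    CHI2_REF = 1.3928e-2   # hypothetical reference value from a trusted run
    REL_TOL = 1e-6         # worst-case relative tolerance, loosened case by
                           # case if a platform legitimately disagrees

    def run_optimisation():
        """Stand-in for the real optimisation driver in the test suite."""
        return 1.3928e-2

    class OptimisationRegression(unittest.TestCase):
        def test_chi2(self):
            chi2 = run_optimisation()
            self.assertTrue(abs(chi2 - CHI2_REF) <= REL_TOL * abs(CHI2_REF))

    if __name__ == '__main__':
        unittest.main()

The point is that a single set of reference values plus a documented
tolerance is all that needs to be maintained, rather than one reference
file per platform.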


I agree!  There are too many variables to sanely handle point 1.  The
model-free parameter tests should be tight but the optimisation stats
tests should be set to the hypothetical worst case.  The question is,
how would you initially define 'worst case' when building these tests?

1. Implement the test case and, if possible, calculate the correct results and use those as the reference.
2. If you can't do this (which happens in many cases):
a. write a test case and run the code in a state where you believe it to be fully functional and working.
b. get a result, check it to the best of your ability, and add a 'reasonable' amount of uncertainty (2-10 ulp [units in the last place] in many cases, but in some cases much more! See the sketch after this list.)
c. run it on some other architectures without changing the code; if the results are wildly different, investigate whether this is a real problem or just an implementation difference.
d. enshrine the results in the test case and ask people to report errors (if possible make it easy for them: dump to a file etc. in a clean format).
e. if it fails again, repeat step c, regressing if need be to the revision of the routine you had at that point, and see whether the code is failing or whether it is a platform problem.
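
For the ULP comparison in step b, here is a rough sketch of how the
distance could be computed; the bit manipulation assumes IEEE-754 doubles,
and the ulps helper is just my own illustration rather than anything
already in relax:

    import struct

    def ulps(a, b):
        """Distance between two doubles in units in the last place (ULPs)."""
        def ordinal(x):
            # Reinterpret the double as a 64-bit integer so that adjacent
            # floats map to adjacent integers (the sign bit needs flipping).
            u = struct.unpack('<Q', struct.pack('<d', x))[0]
            return (1 << 63) - u if u >> 63 else u
        return abs(ordinal(a) - ordinal(b))

    # A result within a handful of ULPs of the enshrined reference passes;
    # anything wildly different is investigated as in steps c-e.
    reference = 0.6
    observed = (0.1 + 0.2) + 0.3
    print(ulps(reference, observed))   # 1 on IEEE-754 platforms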


anyway, that is what I would do


regards gary

Edward




--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK             Tel. +44-113-3433024
email: garyt@xxxxxxxxxxxxxxx                   Fax  +44-113-2331407
-------------------------------------------------------------------




