Re: [bug #14182] System tests failures depend on the actual machine



Posted by Edward d'Auvergne on September 03, 2009 - 19:00:
Hi,

This is very strange, very strange indeed!  I've never seen anything
quite like this.  Is it only your laptop that is giving this variable
result?  I'm pretty sure that it's not related to a random seed
because the optimisation at no point uses random numbers - it is 100%
fixed, pre-determined, etc. and should never, ever vary (well on
different machines it will change, but never on the same machine).
What is the operating system on the laptop?  Can you run a RAM-checking
program or anything else to diagnose hardware failures?
Maybe the CPU is overheating?  Apart from hardware problems, since you
never recompile Python or numpy between these tests I cannot think of
anything else that could possibly cause this.
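One way to test the "same machine, same result" expectation is to run the same command several times and compare the output byte for byte.  The following is only a minimal sketch: the helper names are hypothetical, the command in the demo is a stand-in, and it assumes the program prints nothing time-dependent (timestamps, elapsed times) to stdout.

```python
import hashlib
import subprocess
import sys


def output_digests(cmd, trials=3):
    """Run cmd several times and return the SHA-256 digest of each run's stdout."""
    digests = []
    for _ in range(trials):
        # check=True raises CalledProcessError if the command exits non-zero.
        result = subprocess.run(cmd, capture_output=True, check=True)
        digests.append(hashlib.sha256(result.stdout).hexdigest())
    return digests


def is_reproducible(cmd, trials=3):
    """Return True if every run produced byte-identical stdout."""
    return len(set(output_digests(cmd, trials))) == 1


if __name__ == "__main__":
    # Stand-in command; for the case discussed here one would substitute
    # something like ["./relax", "scripts/optimisation_testing.py"].
    cmd = [sys.executable, "-c", "print(sum(range(10)))"]
    print(is_reproducible(cmd))
```

If this reports False for a calculation that uses no random numbers, the difference comes from the environment rather than the algorithm, which is why a hardware check is worth doing.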

Cheers,

Edward



2009/9/3 Sébastien Morin <sebastien.morin.1@xxxxxxxxx>:
Hi Ed,

I've just tried what you proposed and observed something quite strange...

Here are the results:


./relax scripts/optimisation_testing.py > /dev/null
 (stats from my laptop, different trials, see below)
   iter      161   147   151
   f_count   765   620   591
   g_count   168   152   158

./relax -s
 (stats from my laptop, different trials, see below)
   iter      146   159   160   159
   f_count   708   721   649   673
   g_count   152   166   167   166


Problem 1:
The results should be the same in both situations, right?

Problem 2:
The results should not vary when the test is done multiple times, right?
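To put a number on the variation, the trial counts listed above can be summarised as minimum, maximum and range.  This is just a sketch: the dictionary and function names are made up for illustration, and the numbers are copied from the two tables above.

```python
# Counts copied from the laptop trials reported above.
script_runs = {
    "iter": [161, 147, 151],
    "f_count": [765, 620, 591],
    "g_count": [168, 152, 158],
}
suite_runs = {
    "iter": [146, 159, 160, 159],
    "f_count": [708, 721, 649, 673],
    "g_count": [152, 166, 167, 166],
}


def spread(values):
    """Return (minimum, maximum, range) for a list of counts."""
    return min(values), max(values), max(values) - min(values)


def summarise(runs):
    """Map each statistic name to its (minimum, maximum, range)."""
    return {name: spread(values) for name, values in runs.items()}


if __name__ == "__main__":
    for label, runs in (("script", script_runs), ("suite", suite_runs)):
        for name, (low, high, width) in summarise(runs).items():
            print("%-7s %-8s min=%d max=%d range=%d" % (label, name, low, high, width))
```

With these numbers f_count spans 174 evaluations across the script runs and 72 across the test suite runs, while iter spans 14 in both cases.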


I have tested different things to find out why the tests give rise to
different results as a function of time...

./relax scripts/optimisation_testing.py > /dev/null
   If you modify the file "test_suite/system_tests/__init__.py", then
the result will be different. By modifying, I mean just commenting out a
few lines in the run() function. (I usually do that when I want to speed
up the process of testing a specific issue.) Maybe this behavior is
related to a random seed based on the code files...

./relax -s
   This one varies as a function of time without any change. Just
running the test several times in a row makes it vary... Maybe this
behavior is related to a random seed based on the date and time...


Any idea?

If you want, Ed, I could create you an account on one of these
strange-behaving computers...


Regards,


Séb




Edward d'Auvergne wrote:
Hi,

I've now written a script so that you can fix this.  Try running:

./relax scripts/optimisation_testing.py > /dev/null

This will give you all the info you need, formatted ready for copying
and pasting into the correct file.  This is currently only
'test_suite/system_tests/model_free.py'.  Just paste the pre-formatted
python comment into the correct test, and add the different values to
the list of values checked.
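The idea of checking against a list of per-machine values can be sketched as follows.  This is not the actual code in 'test_suite/system_tests/model_free.py'; the function name is hypothetical and the f_count values are only illustrative, taken from counts quoted elsewhere in this thread.

```python
def check_against_known(name, measured, known_values):
    """Raise AssertionError unless measured equals one of the known values."""
    if measured not in known_values:
        raise AssertionError(
            "%s = %r is not one of the known values %r"
            % (name, measured, sorted(known_values))
        )


# Illustrative list only: one entry per machine or setup seen so far.
KNOWN_F_COUNT = [386, 694, 743, 761]

if __name__ == "__main__":
    # Passes silently because 743 is in the list.
    check_against_known("f_count", 743, KNOWN_F_COUNT)
```

Each new machine that gives a different but valid result then only needs its value appended to the list, at the cost of the test no longer pinning down a single expected number.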

Cheers,

Edward


2009/9/3 Sébastien Morin <sebastien.morin.1@xxxxxxxxx>:
Hi Ed,

I just checked my original mail
(https://mail.gna.org/public/relax-devel/2009-05/msg00003.html).


For the failure "FAIL: Constrained BFGS opt, backtracking line search
{S2=0.970, te=2048, Rex=0.149}", the counts were initially as follows:
   f_count   386
   g_count   386
and are now:
   f_count   743   694   761
   g_count   168   172   164


For the failure "FAIL: Constrained BFGS opt, More and Thuente line
search {S2=0.970, te=2048, Rex=0.149}", the counts were initially as
follows:
   f_count   722
   g_count   164
and are now:
   f_count   375   322   385
   g_count   375   322   385


The different values given for the "just-measured" parameters correspond
to the 3 different computers I have access to that give rise to these
two annoying failures...

I wonder if the names of the tests in the original mail were mixed up,
as the numbers just measured in the second test seem closer to those
originally posted for the first test, and vice versa...

Anyway, the problem is that there are variations between the different
machines. Variations are also present for the other parameters (s2, te,
rex, chi2, iter).

Regards,


Séb  :)



Edward d'Auvergne wrote:
Hi,

Could you check and see if the numbers are exactly the same as in your
original email
(https://mail.gna.org/public/relax-devel/2009-05/msg00003.html)?
Specifically look at f_count and g_count.

Cheers,

Edward


2009/9/2 Sébastien Morin <sebastien.morin.1@xxxxxxxxx>:
Hi Ed,

I updated my svn copies to r9432 and checked if the problem was still
present.

Unfortunately, it is still present...

Regards,


Séb



Edward d'Auvergne wrote:
Hi,

Ah, yes, there is a reason.  I went through and fixed a series of
these optimisation difference issues - in my local svn copy.  I
collected these all together and committed them as one after I had
shut the bugs.  This was a few minutes ago at r9426.  If you update
and test now, it should work.

Cheers,

Edward



2009/9/2 Sébastien Morin <sebastien.morin.1@xxxxxxxxx>:

Hi Ed,

I just tested for the presence of this bug (1.3 repository, r9425)
and it seems it is still there...

Is there a reason why it was closed?
From the data I have, I guess this bug report should be re-opened.

Maybe I could try to give more details to help debugging...


Séb  :)



Edward d'Auvergne wrote:

Update of bug #14182 (project relax):

                  Status:               Confirmed => Fixed
             Assigned to:                    None => bugman
             Open/Closed:                    Open => Closed


    _______________________________________________________

Reply to this item at:

  <http://gna.org/bugs/?14182>





--
Sébastien Morin
PhD Student
S. Gagné NMR Laboratory
Université Laval & PROTEO
Québec, Canada






