Re: The multi-processor code is back!



Posted by Edward d'Auvergne on December 05, 2008 - 22:24:
Gary,

I've been investigating the failure of the STDOUT and STDERR capture,
and I think I have found the cause.  This was a difficult one to
trace.  It appears that the
multi.multi_processor_base.Multi_processor.post_run() method is not
run if an error is thrown.  This may only affect the test suite,
as relax would normally die, but it is still important.  What would
you see as the best way to make sure post_run() is always executed?
Should there be error-catching code using the 'try' statement so that,
if an error is encountered, post_run() is executed and then 'raise' is
run to throw the error again?
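
To make the question concrete, here is a minimal sketch of the pattern I
have in mind.  The class and function names are hypothetical stand-ins,
not the real Multi_processor code, and post_run() here only sets a flag
where the real method would do its clean up:

```python
class SketchProcessor:
    """Hypothetical stand-in for a processor fabric (not the relax class)."""

    def __init__(self):
        self.post_run_called = False

    def post_run(self):
        # Placeholder for the real clean up, e.g. restoring the IO streams.
        self.post_run_called = True

    def run(self, callback):
        try:
            callback()
        except:
            # Clean up first, then re-throw the original error unchanged.
            self.post_run()
            raise
        # Normal path:  no error was thrown.
        self.post_run()


def failing_callback():
    # Hypothetical job which always fails.
    raise RuntimeError("simulated failure")
```

A 'try' with a 'finally' clause calling post_run() would probably give
the same behaviour with a single call site, if that is preferred.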

Cheers,

Edward


On Thu, Nov 6, 2008 at 11:26 AM, Edward d'Auvergne <edward@xxxxxxxxxxxxx> 
wrote:
Hi,

The following is a status update of the multi-processor code.  You may
have noticed that I have recently ported Gary's code to the new design
within the 'multi_processor_merge' branch.  This code is now fully
functional again, but before merging back into the 1.3 main line, I
will probably need your help Gary.  I will try to keep this short by
using point form (typing one handed due to RSI is a pain).  So these
are the important issues left to resolve:

Code clean up - I have gone through all the code and significantly
cleaned up whitespace and formatting issues to make the code compliant
with the rest of relax's code style, neatness and clarity.  I've also
made many epydoc fixes revealed when running 'scons api_manual_html'.
There isn't much more to do here.

FIXMEs - there are still tonnes of these left in the code.  I have
fixed a number of them where I was able to work out what the problem
and solution were.  However there are many that I was unable to
decipher.  For example shifting class methods up and down.  I would
guess this means either abstraction and shifting to the parent class,
or specialisation and moving to the inheriting class.  Before merging
the code into the main line, these need to be cleaned up.

TODOs - There are a number of these around the code.  Solving these
may not be necessary for a merge.  Gary, could you check this?

CHECKMEs - I would guess these are quite important to resolve!

Dead code - In reading the code, I noticed a number of methods which
are no longer used due to the evolution during the original
development.  It would be good to kill all of this non-functional
code.

Docstrings - In a few modules, no docstrings exist to explain the
purpose of the module, class, function, or method.  It would be quite
useful to have these, especially so that someone implementing say SSH
tunneling for grid computing can then easily mimic the MPI code ideas
and quickly have this new multi-processor fabric functional ;)

STDOUT capture - As the relax test suite clearly shows, we lose
control over STDOUT redirection (compare the branch printout with the
1.3 line printout - they must be the same).  This is a bug that needs
to be fixed.  The multi-processor code is not restoring the IO streams
correctly, so the subsequent tests cannot capture STDOUT.  The test
suite can be run in both uni-processor and MPI modes, and will use
multiple processors in some of the system tests.
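
As a rough illustration of what restoring the IO streams correctly
means, the following sketch redirects STDOUT and always puts the
original stream back, even if the wrapped code fails.  The function
name run_with_capture() is hypothetical and this is not how the test
suite or the multi-processor code actually implements the capture:

```python
import io
import sys


def run_with_capture(func):
    """Run func() with sys.stdout redirected, then restore the stream."""
    original = sys.stdout
    buffer = io.StringIO()
    sys.stdout = buffer
    try:
        func()
    finally:
        # Executed on both success and failure, so later captures work.
        sys.stdout = original
    return buffer.getvalue()
```

If the restoration step is skipped on any code path, every later test
writes to the stale stream, which would match the symptom seen here.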

relax manual - Gary, could you write a paragraph or 2 in the manual
explaining how to use the multi-processing capabilities?  This is
extremely important - the user should not even have to think - just
read, maybe install an MPI implementation, and then just go.  I would
suggest using much material from your intro post at
https://mail.gna.org/public/relax-devel/2007-05/msg00000.html
(Message-id: <463B2E31.30503@xxxxxxxxxxxxxxx>), including the figure.
I would suggest adding something to chapter 2 - How to use relax,
possibly between the sample scripts and the test suite sections.  If
you accidentally write too much, then maybe it could be shifted into
its own chapter.  This shouldn't be necessary though.

I think this covers all the remaining issues.  Although a long list,
this should not be too much work.  Considering that I can't do any
releases for a little while due to injury, this code may make it to
relax 1.3.3.

Cheers,

Edward



