Re: r3236 - /branches/multi_processor/relax



Posted by Edward d'Auvergne on March 20, 2007 - 10:02:
On 3/20/07, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:
Edward d'Auvergne wrote:

>   For example
> the 'threading' user function class which sets up grid computing.  In
> the future I'll probably want to use my algorithm for handling very
> slow machines on a grid (which avoids relax having to wait for a slow
> machine to terminate) and the setting up of slave relax processes
> (using the 'relax --thread' invocation).  Although the grid computing
> code is currently broken, this is only because there is a problem with
> the handling of SSH tunnel breakages.  I also have in mind some
> optimisations for minimising data flow through the tunnel


can you outline this?

Well, currently data is sent to the slave grid processes via ssh tunnels. Prior to running the grid processes, relax saves the model-free results file and sends it to all machines. The slave processes read in the results file, run a single Monte Carlo simulation, save another model-free results file, and transfer this file back. The parent thread then reads the new results file into a temporary 'run' and copies the data to the real run.

This setup is slow, inefficient, and just plain sucks!  What I would
like to do is to set the slave processes to run in a state where they
wait for instructions.  The bare minimum data is then sent to them and
solely the optimised parameter vector and optimisation stats are
returned (i.e. the minimise() method of specific_fns.model_free
module).  Perhaps this is how you would like to set things up, with
MPI used as the communication interface?
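As a rough sketch of the wait-for-instructions idea (the names here are hypothetical, not relax's actual API), the slave could sit in a loop receiving a pickled command with only the bare minimum data, and return only the optimised parameter vector plus the optimisation stats:

```python
import pickle
from queue import Queue

def run_slave(recv, send, minimise):
    """Hypothetical slave loop (not code from relax): block until a
    pickled command arrives, run the minimisation, and send back only
    the optimised parameter vector and the optimisation statistics."""
    while True:
        command = pickle.loads(recv())
        if command["op"] == "stop":
            break
        # Only the bare minimum comes in: the starting parameter vector
        # and whatever data the target function needs.
        vector, stats = minimise(command["param_vector"], command["data"])
        # Only the optimised vector and the stats go back - no results file.
        send(pickle.dumps({"param_vector": vector, "stats": stats}))

# Stand-in transport and minimiser, so the sketch runs without ssh or MPI.
inbox, outbox = Queue(), Queue()

def fake_minimise(vector, data):
    return [v * 2.0 for v in vector], {"iterations": 1}

inbox.put(pickle.dumps({"op": "minimise", "param_vector": [1.0, 2.0], "data": None}))
inbox.put(pickle.dumps({"op": "stop"}))
run_slave(inbox.get, outbox.put, fake_minimise)
result = pickle.loads(outbox.get())
```

The transport here is a pair of in-process queues purely for illustration; the same loop would work with an ssh tunnel or MPI send/receive plugged in as the recv and send callables.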


> and Andrew
> Perry has had ideas about using heartbeats from the grid machine relax
> processes to probe for dead tunnels and processes.


Again, this would be interesting, but I would have to look at it quite carefully as MPI has some threading limitations, I believe...

I don't think these issues are related to MPI.
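For what a heartbeat might look like independently of MPI, here is a minimal sketch (all names are made up for illustration): a background thread on the slave stamps a clock at a fixed interval, and the master declares the process or tunnel dead when the last stamp goes stale.

```python
import threading
import time

class Heartbeat:
    """Hypothetical heartbeat monitor, not code from relax: a background
    thread stamps the time at a fixed interval, and the checking side
    declares the process dead once the last stamp is older than a timeout."""

    def __init__(self, interval=0.05):
        self.interval = interval
        self.last_beat = time.monotonic()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Stamp the clock until asked to stop.
        while not self._stop.wait(self.interval):
            self.last_beat = time.monotonic()

    def alive(self, timeout):
        # The master's check: has a beat been seen within 'timeout' seconds?
        return (time.monotonic() - self.last_beat) < timeout

    def stop(self):
        self._stop.set()
        self._thread.join()

hb = Heartbeat(interval=0.02)
hb.start()
time.sleep(0.1)
was_alive = hb.alive(0.5)
hb.stop()
time.sleep(0.2)
is_dead = not hb.alive(0.1)
```

In the grid case the stamp would of course travel back through the tunnel rather than sit in shared memory, but the staleness check on the master's side is the same.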


my thought on this was to turn the threading code into an implementation
of the multi code, so my plan was to rip it out and then put it back ;-)
The architecture devised should be able to cope with ssh tunnels just as
well as a thread or an MPI invocation, I think... If not, I would be
happy to adapt to any ideas you have.

From my understanding of MPI, threads need not be used. So the
threading and grid computing code could possibly be left untouched by
the MPI patch.  If you would like to make use of threads in the MPI
code and if there is overlap between MPI threads and grid computing
threads, we can worry about issues then.  Oh, what exactly is 'multi
code'?


The basic idea, as you can see, is to send pickled commands containing
data to the slave, which then returns a pickled result object, and I
think that this is quite possible down an ssh tunnel. If not, text
commands with the data embedded as text would also work.

These pickled objects would be much more efficient than the way grid computing is currently set up. I would actually like to have pickled objects sent to and from the grid slave processes, but you don't need to worry about that. The grid computing setup could be improved after the MPI patch makes it into the main line by using the same pickling interface.
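One detail the tunnel forces on this pickling interface is framing: an ssh tunnel is a raw byte stream, so each pickled command or result needs a message boundary. A common way to do this (the function names below are just for illustration) is a length prefix in front of each pickle:

```python
import io
import pickle
import struct

def send_object(stream, obj):
    """Write one pickled object with a 4-byte big-endian length prefix,
    so message boundaries survive a raw byte stream such as an ssh tunnel."""
    payload = pickle.dumps(obj)
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)

def recv_object(stream):
    """Read back exactly one length-prefixed pickled object."""
    (length,) = struct.unpack(">I", stream.read(4))
    return pickle.loads(stream.read(length))

# Round-trip two messages through an in-memory stream standing in for
# the tunnel.
tunnel = io.BytesIO()
send_object(tunnel, {"op": "minimise", "param_vector": [1.0, 2.0]})
send_object(tunnel, {"op": "stop"})
tunnel.seek(0)
first = recv_object(tunnel)
second = recv_object(tunnel)
```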

Cheers,

Edward



Powered by MHonArc, Updated Tue Mar 20 10:40:36 2007