Re: r3236 - /branches/multi_processor/relax



Posted by Gary S. Thompson on March 20, 2007 - 10:20:
Edward d'Auvergne wrote:

On 3/20/07, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:

Edward d'Auvergne wrote:


>   For example
> the 'threading' user function class which sets up grid computing.  In
> the future I'll probably want to use my algorithm for handling very
> slow machines on a grid (which avoids relax having to wait for a slow
> machine to terminate) and the setting up of slave relax processes
> (using the 'relax --thread' invocation).  Although the grid computing
> code is currently broken, this is only because there is a problem with
> the handling of SSH tunnel breakages.  I also have in mind some
> optimisations for minimising data flow through the tunnel


can you outline this?


Well, currently data is sent to the slave grid processes via ssh
tunnels.  Prior to running the grid processes relax saves the
model-free results file and sends it to all machines.  The slave
processes read in the results file, run a single Monte Carlo
simulation, save another model-free results file, and transfer this
file back.  The parent thread then reads the new results file into a
temporary 'run' and copies the data to the real run.

This setup is slow, inefficient, and just plain sucks!  What I would
like to do is to set the slave processes to run in a state where they
wait for instructions.  The bare minimum data is then sent to them and
only the optimised parameter vector and optimisation stats are
returned (i.e. from the minimise() method of the specific_fns.model_free
module).  This may be how you would like to set up MPI where MPI is
used as the communication interface?
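
A minimal sketch of the "slaves wait for instructions" idea described above. This is not relax code: toy_minimise(), slave_loop() and run_simulations() are hypothetical names, in-process threads and queues stand in for the real transport (ssh tunnel or MPI), and the toy optimiser is only a placeholder for the real minimise() call. The point is that only the starting data goes out and only the parameter vector and optimisation stats come back.

```python
import queue
import threading


def toy_minimise(start, target, steps=200, rate=0.1):
    """Placeholder optimiser: gradient descent on sum((x - target)**2)."""
    x = list(start)
    for _ in range(steps):
        x = [xi - rate * 2.0 * (xi - ti) for xi, ti in zip(x, target)]
    chi2 = sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    return x, {"iterations": steps, "chi2": chi2}


def slave_loop(tasks, results):
    """Block until a task arrives, run it, send back only params and stats."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: the master has no more work
            break
        task_id, start, target = task
        params, stats = toy_minimise(start, target)
        results.put((task_id, params, stats))


def run_simulations(jobs, n_slaves=2):
    """Master side: hand out jobs, collect one result per job, shut slaves down."""
    tasks, results = queue.Queue(), queue.Queue()
    slaves = [threading.Thread(target=slave_loop, args=(tasks, results))
              for _ in range(n_slaves)]
    for slave in slaves:
        slave.start()
    for task_id, (start, target) in enumerate(jobs):
        tasks.put((task_id, start, target))
    collected = {}
    for _ in jobs:
        task_id, params, stats = results.get()
        collected[task_id] = (params, stats)
    for _ in slaves:
        tasks.put(None)
    for slave in slaves:
        slave.join()
    return collected
```

Results are keyed by task id because slaves may finish in any order.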


yes



> and Andrew
> Perry has had ideas about using heartbeats from the grid machine relax
> processes to probe for dead tunnels and processes.


Again, this would be interesting, but I would have to look at it quite carefully, as I believe MPI has some threading limitations...


I don't think these issues are related to MPI.


There is some interest here, as clusters can also lose nodes sometimes. If you are running for, say, a day or so, that could be really annoying, so the idea of a heartbeat is of interest here as well.
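
One way the heartbeat idea could look, as a sketch only. HeartbeatMonitor and its methods are hypothetical names, not part of relax or of any MPI library, and how the beats actually travel (ssh tunnel, MPI message) is left out. Each slave reports in periodically, and the master treats any slave that has been silent for longer than a timeout as dead.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat seen from each slave (hypothetical helper)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout  # seconds of silence before a slave counts as dead
        self.clock = clock      # injectable so the logic can be tested without sleeping
        self.last_seen = {}

    def beat(self, slave_id):
        """Record that a heartbeat has just arrived from slave_id."""
        self.last_seen[slave_id] = self.clock()

    def dead_slaves(self):
        """Return the ids of slaves whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(slave for slave, seen in self.last_seen.items()
                      if now - seen > self.timeout)
```

The master would call dead_slaves() between collecting results and resubmit the work of any slave listed there.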



My thought on this was to turn the threading code into an implementation
of the multi code, so my plan was to rip it out and then put it back ;-)
The architecture devised should be able to cope with ssh tunnels just as
well as a thread or an MPI invocation, I think... If not, I would be happy
to adapt to any ideas you have.


From my understanding of MPI, threads need not be used. So the
threading and grid computing code could possibly be left untouched by
the MPI patch. If you would like to make use of threads in the MPI
code and if there is overlap between MPI threads and grid computing
threads, we can worry about issues then. Oh, what exactly is 'multi
code'?


The point of the multi code is that it is a __generic__ framework for carrying out multiprocessing. It will be able to use threads, ssh tunnels, or MPI, so the current threads architecture (with a bit of rejigging, but not much) should fit in nicely with MPI and the threads code as well.
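
To make "generic framework" concrete, here is a small sketch of what a backend-neutral interface might look like. The class names (Processor, SerialProcessor, ThreadProcessor) and the run_all() method are assumptions made for illustration, not the actual multi code. Only a serial and a thread backend are shown; an ssh tunnel or MPI backend would implement the same method.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class Processor(ABC):
    """Common interface: calling code never needs to know which backend runs the work."""

    @abstractmethod
    def run_all(self, func, items):
        """Apply func to every item and return the results in input order."""


class SerialProcessor(Processor):
    """Fallback backend: run everything in the calling process."""

    def run_all(self, func, items):
        return [func(item) for item in items]


class ThreadProcessor(Processor):
    """Thread backend built on the standard library thread pool."""

    def __init__(self, workers=4):
        self.workers = workers

    def run_all(self, func, items):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # Executor.map returns results in the order of the inputs.
            return list(pool.map(func, items))
```

Code written against Processor.run_all() can then switch backends by swapping one object.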



The basic idea, as you can see, is to send pickled commands containing
data to the slave, which then returns a pickled result object, and I
think that this is quite possible down an ssh tunnel. If not, text commands
with embedded text would also work.
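
A minimal sketch of the pickled command and result round trip, under some loud assumptions: HANDLERS, slave_handle() and master_call() are hypothetical names, commands are plain dicts rather than command classes (so the example does not rely on both ends importing the same class definitions), and the transport is a direct function call standing in for an ssh tunnel or MPI send/receive.

```python
import pickle


def _mean(values):
    return sum(values) / len(values)


# Hypothetical table of operations the slave knows how to carry out.
HANDLERS = {"mean": _mean, "total": sum}


def slave_handle(payload):
    """Slave side: unpickle a command, run it, return a pickled result."""
    command = pickle.loads(payload)
    handler = HANDLERS[command["name"]]
    result = {"name": command["name"], "value": handler(command["data"])}
    return pickle.dumps(result)


def master_call(name, data, transport=slave_handle):
    """Master side: pickle a command, push it through the transport, unpickle the reply."""
    reply = transport(pickle.dumps({"name": name, "data": data}))
    return pickle.loads(reply)
```

Because the payload is just bytes, the same two functions work over any channel that can carry bytes in both directions. Note that pickle.loads() should only ever be fed data from a trusted peer, since unpickling can execute arbitrary code.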


These pickled objects would be much more efficient than the way grid
computing is currently set up.  I would actually like to have pickled
objects sent to and from the grid slave processes, but you don't need
to worry about that.  The grid computing setup could be improved after
the MPI patch makes it into the main line by using the same pickling
interface.


see above ;-) the intent is to 'assimilate' the current ssh tunnel code ;-)


Cheers,

Edward




--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK             Tel. +44-113-3433024
email: garyt@xxxxxxxxxxxxxxx                   Fax  +44-113-2331407
-------------------------------------------------------------------






