Re: multi processing



Posted by Edward d'Auvergne on July 06, 2006 - 08:59:
Hi, thanks for this info.  As you can see from my previous e-mail I am
not going to build a dependency on MPI into relax directly, as I want
to have multiple backends.  But this is still useful, as I will use MPI
as my first transport for testing ;-)

Scientific is already a dependency, so if you use its MPI interface there should be no problems.

> On 5/5/06, Andrew Perry <ajperry@xxxxxxxxxxxxxx> wrote:
>
>>
>> >>SSH tunnels are probably not the best option for your system.  Do you
>> >>know anything about MPI?
>> >
>> >I have read about MPI but have not implemented anything __YET__ ;-).
>> >Also I have compiled some MPI based programs.  It seems to be a bit of
>> >a pig and I don't think the low hanging fruit necessarily require that
>> >degree of fine grained distribution...
>>
>> If this is any help, I've done what I think is some fairly exhaustive
>> searching for python+mpi implementations recently. Note that I've never
>> _actually_ used any of them for a project yet.
>>
>
> Thanks, the info should help.
>
>> Scientific Python has an MPI interface, which is handy since it is
>> already a relax dependency.  The drawback is that its documentation
>> seems very geared toward those who already understand MPI reasonably
>> well.  The other drawback is that it seems to be only able to pass
>> Numpy arrays and strings between nodes, which would mean some relax
>> data structures would probably need to be 'repackaged' for sending
>> via MPI.
>> http://starship.python.net/~hinsen/ScientificPython/ScientificPythonManual/Scientific_27.html
>>
>>
>
I have got this one up and this is what I am going to use for my
default implementation.  The good news is that the apparent drawbacks
don't seem too bad, as you can dump arbitrary data to in-memory files
via cPickle and then send it over the MPI link as a byte array.
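
A rough sketch of the idea (Python 2 era cPickle and Numeric, as used by
relax at the time; the Scientific.MPI calls are only indicated in a
comment since their exact signatures should be checked against the
manual page linked above):

    import cPickle
    import Numeric

    def pack(obj):
        """Serialise an arbitrary Python object into a Numeric byte array."""
        data = cPickle.dumps(obj, 2)            # binary pickle to a string
        return Numeric.fromstring(data, 'c')    # wrap as a character array

    def unpack(array):
        """Rebuild the original object from a received byte array."""
        return cPickle.loads(array.tostring())

    # Scientific.MPI can only pass Numeric arrays and strings between
    # nodes, so the master would send pack(relax_data) (e.g. via the
    # communicator's send()) and the node would call unpack() on whatever
    # arrives before starting its calculation.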

> If the code is implemented at the analysis specific level, for example
> the minimise() function in the file 'specific_fns/model_free.py', then
> almost all of the data structures are already converted to Numeric
> arrays.
>
that's neat ;-) and useful information

>> Another one is:
>>  MYMPI - http://peloton.sdsc.edu/~tkaiser/mympi/ (and
>> http://grid-devel.sdsc.edu/gridsphere/gridsphere?cid=mympi ) - syntax
>> intended to match the C MPI API closely, and much like Scientific.MPI
>> it only has direct support for some basic data types, not arbitrary
>> python objects.
>>
>> Most other implementations (below) support transmission of any python
>> object that can be pickled, and so may take less code to implement in
>> relax.  However, sending the whole data object when only select parts
>> of it are required for the calculation could be more inefficient than
>> you would like, and so 'repackaging' and sending just what is needed
>> may be better anyway.  I wonder which is worse in this case .. the
>> network overhead of sending a large-ish python object, or the extra
>> load on the 'master' node as it repackages it into smaller Numpy
>> arrays ..??  Guess it all depends on whether things are carved up
>> 'batchwise' or more fine-grained (inner loop/function level).
>>
>
certainly for a first implementation I am going to go batchwise

Batchwise may actually require more changes to the code and be more
difficult to set up.  Then again, batchwise may be the best option for
MPI and may remove the need for threading.  As I don't know the first
thing about MPI, do you have any opinions Andrew?
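
For reference, a rough sketch of the batchwise idea being discussed,
i.e. handing each node a whole batch of residue minimisations rather
than farming out individual calls (the transport object, its
send_batch()/receive_results() methods, and the per-residue batching
are hypothetical placeholders rather than existing relax code):

    def run_batchwise(residues, nodes, transport):
        """Split the per-residue minimisations into one batch per node."""
        batch_size = (len(residues) + len(nodes) - 1) / len(nodes)
        batches = [residues[i:i + batch_size]
                   for i in range(0, len(residues), batch_size)]

        # Coarse grained: each node gets a whole batch up front.
        for node, batch in zip(nodes, batches):
            transport.send_batch(node, batch)

        # Each node returns only the minimisation results (parameter
        # vector, iteration/function/gradient/Hessian counts, warnings).
        results = []
        for node, batch in zip(nodes, batches):
            results.extend(transport.receive_results(node))
        return results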

> What data needs to be sent depends on what level the threading will be
> implemented on.  If each call to minimise() in
> 'specific_fns/model_free.py' is threaded, then only the data which is
> packaged within that function will need to be sent.  The node can then
> return solely the minimisation results (parameter vector, iteration
> count, function count, gradient count, hessian count, and warnings).
> My threading code is a little higher up in the chain within the
> minimise() function of the generic code (generic_fns/minimise.py)
> which calls the specific model-free minimise() function.  This code
> currently only works for Monte Carlo simulations.

>
> The repackaging overhead by the master node should be tiny compared to
> the calculation time.  The cost of sending data could become quite
> high if the threading is fine grained enough.  What really needs to be
> determined is what will be threaded.  Will individual model-free
> minimisation instances be threaded?  If the diffusion tensor is fixed
> then individual residue minimisations will be threaded.  If the
> diffusion tensor parameters are optimised, either with or without the
> model-free parameters, then there is one single instance.  If the
> local tm parameter is included, then again individual residues are
> optimised.  Using this fine grained approach communication to and from
> the nodes will likely be expensive.


yep I don't intend to go fine grain at the moment, though my general design could be expanded that way if needed

>
> The second thing which could be threaded is the runs themselves.  For
> example if models m1 to m9 are optimised normally using a Python loop
> these could be threaded so that, assuming individual residue
> minimisations are threaded, then model m2 calculations could start
> while instances of model m1 are still being calculated on nodes.  This
> could cause significant speed ups if the protein has more residues
> than the cluster has nodes.  Otherwise each run could be sent to a
> different node (the amount of data sent would be much larger).


again I would prefer not to optimise by models, but by residues at the moment, as the management is just so much easier

The single point in the code I talked about in the last message would be the best target.


> Finally Monte Carlo simulations are the highest level and most obvious
> target.  This is the part of model-free analysis which takes the
> longest.
>
>> MMPI - http://www.penzilla.net/mmpi/ - looks to be actively developed,
>> good documentation with examples, including sending of python objects
>> via pickling.
>>
>> pyPar - http://datamining.anu.edu.au/~ole/work/software/pypar/ -
>> sends arbitrary python objects, only two GPL licensed files so it
>> would be very easy to package directly with relax rather than make
>> users chase dependencies.
>
> We could import the dependency with a 'try:' statement so that MPI is
> only a dependency for those wishing to use multiple machines.  It
> looks like Pypar is dependent on a C MPI library anyway.


again the use of backends (which would be classes on the PYTHONPATH)
will solve this problem.  Instantiation of plugins which lack the
required C libraries should obviously raise useful exceptions...

It shouldn't be an issue if Scientific MPI is used.
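
A minimal sketch of what such a guard might look like, combining the
'try:' import idea quoted above with a backend class that fails with a
useful exception (the TransportError and MPITransport names are
hypothetical, invented here for illustration; only the Scientific.MPI
import reflects the actual dependency discussed in this thread):

    class TransportError(Exception):
        """Raised when a transport backend cannot be instantiated."""

    class MPITransport:
        """Hypothetical MPI transport backend for relax."""

        def __init__(self):
            # Only require Scientific's MPI support (and the underlying
            # C MPI library) when this backend is actually instantiated.
            try:
                from Scientific import MPI
            except ImportError, message:
                raise TransportError("The MPI transport requires "
                    "Scientific Python with MPI support: %s" % message)
            # Keep hold of the default communicator for later sends.
            self.communicator = MPI.world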


>> There are also two which are parallel python interpreters that require
>> recompilation, and seem to work a bit differently (still getting my
>> head around exactly how these are meant to be used).
>>
>> http://www.cimec.org.ar/python/ - a parallel interpreter as well as
>> some MPI bindings for python.  I tested the interpreter with relax and
>> LAM/MPI, seemed to spawn off lots of processes and run.
>>
>> pyMPI - http://pympi.sourceforge.net/index.html - a parallel python
>> interpreter, decent docs at
>> ( http://heanet.dl.sourceforge.net/sourceforge/pympi/pyMPI.pdf ),
>> seems mature despite the out of date website.
>>
>> There is also:
>>
>> MPY - http://mpy.sourceforge.net/index.html (seems abandoned since
>> 2004)
>>
>> Hope this helps ...
>>
>> Andrew
>
> Would you know which of these implementations are the most mature or
> the most used?  Stability would be better than fancy features.

if we use transport backends as I discuss in my previous message we can
change the implementation relatively trivially and so avoid long-term
dependency problems.

Edward


