Great work with the multi-processor branch! -- May 11, 2007

On Fri, 2007-05-04 at 13:59 +0100, Gary S. Thompson wrote:

garyt@xxxxxxxxxxxxxxx wrote:

Author: varioustoxins
Date: Fri May  4 01:07:07 2007
New Revision: 3280

URL: http://svn.gna.org/viewcvs/relax?rev=3280&view=rev
Log:
threaded multi processor support written, multi is now beta! 
multi now supports three substrates: mpi4py, uniprocessor, and threads

Added:
    branches/multi_processor/multi/prependStringIO.py
      - copied, changed from r3275, 
branches/multi_processor/multi/PrependStringIO.py
Removed:
    branches/multi_processor/multi/PrependStringIO.py
Modified:
    branches/multi_processor/multi/commands.py
    branches/multi_processor/multi/mpi4py_processor.py
    branches/multi_processor/multi/multi_processor.py
    branches/multi_processor/multi/processor.py
    branches/multi_processor/multi/uni_processor.py

[snip]

Following on from the above commit this is an overview of  the multi
processor branch, a review of the state of the branch,  how to use it,
outstanding items and changes required.


Awesome work with this branch Gary!  The code is shaping up nicely and I
think that a multi-processor relax release and announcement should be
imminent (or maybe some pre-release candidate packages).  I'll talk
about this more later on.

Architecture 
--------------

The processor extensions act as a wrapper around the core of relax and
with relativley minimal changes (see for example
multi.commands.MF_minimise_command and 
specific_fns.model_free.minimse()) which allows relax to distribute
distribute computational tasks in a mater slave manner on a  number of
different processor 'fabrics'.  The a fabric is thus  defines a
mechanism of distributing computational tasks. The three fabrics
currently supported are


When the multi-processor code is ported back to the 1.3 branch, once
that branch is closer to completion, the multi.commands.MF_* classes
should be moved into specific_fns.model_free.multi.MF_* where the file
is called 'specific_fns/model_free/multi.py'.  I'm in the process of
splitting up the model_free.py file into multiple, more manageable
files.  The code of the MF_* classes is specific to model-free analysis
and should really be placed in the 'specific_fns/model_free' directory.
This code is also all minimisation code so maybe it could go into the
future 'specific_fns/model_free/minimise.py' file.

The minimal changes required is very useful for other parts of relax.
This will make it very easy for others to parallelise other
computationally intensive parts of the program.  It's great that the
multi-processor framework makes it so easy to utilise!

- uni a simple single processor fabric that doesn't operate relax in
parallel and replicates the results that a normal relax tag 1.3.0
session would produce. This is the default fabric and is present to
provide a unified relax for both parallel and non parallel
architectures. (this may seem redundant as running the thread fabric
with 1 slave processor gives the same result; however, python can be
compiled without thread support and this provides an implementation
for these python configurations).
- thread this implimentation runs calculations on parallel threads
within the same machine and is ideal for a shared memory processor
such as one of the latest workstaions with multiple chips with
multiple cores
-mpi4py this implimentation uses the mpi4py library to communicate
using  MPI (message passing interface) to communicate between a
cluster of processes either on the same machine or on a disjoint set
of machines. This is one of the common methods  used to link computers
in beowulf clusters


It's awesome that you have all three implemented and working!  Btw, is
fabric the best word to describe these?  Would these be better described
as 'modes of execution'?

there are other ways that processes can communicate , and the
architecture of the multi module is such that adding a different
processor fabric implimentation is relativley simple. (see for example
how multi.mpi4py_processor is written as thin venear over the top of
multi.multi_processor and have very few lines of code (~120) [note the
name of multi_processor.py will change soon to
multi_processor_base] ). processor fabrics (multi.uni_prcoessor,
multi,mpi4py_processor) are loaded dynamically as plugins depending on
the command line option --multi/-m and so new processor fabrics are
easily created and loaded.

Processor fabrics which are obvious targets for implimentation
include 

- other implimentations using  different python mpi libraries (pypar
etc)
- use of ssh tunnels for parallel programming
- use of the twisted frame work for communication
http://twistedmatrix.com/projects/
- the parallel virtual machine (pvm) via pypvm
http://pypvm.sourceforge.net


Again it's great that you have created a framework where it will be so
easy to implement these different modes!

compilation
-------------


(Note, this isn't the compilation of relax ;)

How to get and use the current implimentation. The currrent
implimentation is in a branch of the relax project and can be accessed
with the follwing subversion command:



Checkout over SVN protocol (TCP 3690):
svn co svn://svn.gna.org/svn/relax/branches/multi_processor relax_multi
Checkout over http:
svn co http://svn.gna.org/svn/relax/branches/multi_processor relax_multi


We really should have a packaged release instead.

the implimentaton has no extra dependencies from a vanilla relax
installation [ http://www.nmr-relax.com/download.html]  apart from a
requirement for mpi4py if you are going to use the mpi4py processor
fabric. mpi4py can be obtained from the python cheese shop
[cheeseshop.python.org/pypi/] at
http://cheeseshop.python.org/pypi/mpi4py. You will need to compile it
against an mpi implimentation (I used lam: www.lam-mpi.org. Though
other mpi impilmentations should work I have not tried them)


I am using Mandriva 2007.1 linux and by using the software manager to
install 'mpi4py', I also end up with the 'mpich2' mpi software
dependency installed.  I wouldn't recommend doing this though because
Mandriva 2007.1 is fubar - it comes with python 2.5 but the mpi4py is
installed as a site-package for python 2.4 (among many other issues)!!!
Other linux versions will probably have greater success with their
software installers and dependencies and hence compilation may not be
necessary.

Three important  points to note when compiling  the mpi4py code  are
that

1. the mpi4py code must be compiled against a copy of the mpi
libraries whicht are in a shared object. so for example for lam when
you compuile it  you need to use 
 ./configure --enable-shared before you us 'make' and 'make install'
so that you get a lib<xxx>.so library as well as a lib<xxx>.a  ater
compilation where <xxx> is the name of you mpi library (mpich, lam
etc)
2. I believe  the code for you mpi installation and needs to be
position independant when compiled on x86_64 machines; so you need to
use the -FPIC flag
3. step 2 precludes the use of compilation with the portland groups
compilers as they don't seem to cope well with shared objects
(allegedly and in my hands)
4. [ok i lied] I have compiled the code for mpi4py and tested under
linux with both 32 bit and 64 bit processors (in my case I used a
single processor machine setup to run as a 6 task lam mpi box for
basic sanity testing, for real testing I used a 148 processor
cluster). I have not tried things out on windows or on osx so you
mileage may vary
5. [ok i lied alot, see douglas adams for examples of how to do this
sort of thing in real style;-)]  I believe you mpi implimentation
should be compiled with the same compiler as was used for your python
installation

command line and running the code
---------------------------------------

The multi branch adds two command line switches -m/--multi and
-n/--processors

  -m MULTIPROCESSOR, --multi=MULTIPROCESSOR
                        set multi processor method
  -n N_PROCESSORS, --processors=N_PROCESSORS
                        set number of processors (may be ignored)

--multi <MULTIPROCESSOR> specifies the multi processor implimentations
to use and <MULTIPROCESSOR> defaults to 'uni'  The name of the
processor to use should be the same as the first part of one of the
processor impimentation files in multi ie the correspondences are

'-m uni' loads: multi.uni_processor.Uni_processor from
uni_processor.py
'-m thread'  loads multi.thread_processor.Thread_processor from
thread_processor.py
'-m mpi4py'  loads multi.mpi4py_processor.Mpi4py_processor from
mpi4py_processor.py

--processors  sets the number of slave processors to use for
calculation (there is currently always one extra master processor that
allocates jobs and sevrices i/o for the thread and mpi4py processor
fabrics) and is only supported by the thread implimentation.


Could this value be sent to the MPI software so that instead of typing:

mpirun -np 6 relax --multi mpi4py test_small.py

you would instead type:

relax --multi mpi4py -n 6 test_small.py

?  This isn't particularily important, may take a large amount of time,
and may even be impossible.  It would standardise the command line usage
though.

 Uniprocessor always only has one processor and the mpi
implimentations use the number of processors allocated to them by the
mpi environment.

as an example of using the mpi4py version here are the commands I use
to run a 6 processor run [1 master and 5 slaves] on my linux box:

lamboot
mpirun -np 6 relax --multi mpi4py test_small.py
lamhalt
#lamclean if lam halt returns failure

the lamhalt may give errors, however, sometimes if you don't stop and
start lam cleanly you can get strange results

I'll start a second email responding to the rest.  This will keep this
post from getting too long.

Cheers,

Edward

Great work with the multi-processor branch!

Header

Content

Related Messages