Re: multi processing -- May 04, 2006

--- Begin Message ---

To: Edward d'Auvergne <edward.dauvergne@xxxxxxxxx>

Date: Tue, 18 Apr 2006 09:46:47 +0100

Message-id: <4444A777.4000403@bmb.leeds.ac.uk>

References: <443E3032.8040907@bmb.leeds.ac.uk> <7f080ed10604140305tc4036f9y3bd3f1e822a45580@mail.gmail.com>

User-agent: Mozilla Thunderbird 1.0 (X11/20041206)
Edward d'Auvergne wrote:
Whoa, that's a big supercomputer. You are most welcome to give it a go, it should speed up your model-free runs using relax. The changes will necessarily be extensive and will cause breakages while development occurs, so Gary if you decide to go forwards with it, I will probably fork relax and create an unstable development branch called 1.3 where all new developments will go. It might even be a good idea to create a private branch for your changes from 1.3. I will then reserve 1.2 for bug fixes only.
Yep that seems like a good idea, however, read on;-)
I've always planned on adding support for clusters and I have a basic framework in place which might be a good platform to start from. The other idea I've had in the back of my mind is the conversion of the all the model-free function code in the directory 'maths_fns' to C (while still retaining the Python code as an option),
This seems reasonable, when I do a wc | sort -nr on maths_fns I get
12149  48347 493475 total
 3857  20572 174665 jw_mf.py
 2966  10359 153396 mf.py
 1314   3520  39824 ri_comps.py
  924   2434  22280 correlation_time.py
  836   2937  23114 weights.py
  732   2476  24964 jw_mf_comps.py
  599   2748  24435 direction_cosine.py
  470   1269  12150 ri_prime.py
  175    700   6129 ri.py
  109    519   4614 chi2.py
  106    448   4185 jw_mapping.py
   33    177   1922 __init__.py
   28    188   1797 test.c_chi.py
and I guess mf.py would be the one to hit first... The  questions are
1. do we need to do all of it or could we just wrap the maths intensive parts and leave the object creation and management in python 2. Is there a low level test suite so conformity of python and C code can be verified 3. would it be better to do it in pytrex rather than straight C? I guess the thing to do would be to test it out and see what the quality of the C code is like

which may give potential gains of 10 to 20 times increased performance. This code is by far the most CPU intensive, the minimisation code isn't anywhere near as expensive.

yep seems logical, the only question is have you profiled? Chris was trying to do some before the break and there didn't seem to be any really hot spots.. but I maybe misreading the rumour mill (He is of course a gargantuan 5 feet way much of the time ;-) Chris any comments?

The framework currently in place is the threading code. The way the threading code works is through SSH tunnels. It starts a new instance of relax on the remote machine (or local if there are a number of CPUs or CPU cores), that instance gets data sent to it, does the calculation, and returns the result. It does work, although it's not very good at catching failures. I haven't used it lately so I don't know if it's broken.

Thats generally the idea I had, i.e. a fairly course grained approach. My thought was to add constructs to the top level commands (if needed) to allow allow subsets of a set of calculations to be run from a script. i.e. part of a grid search or a few monte carlo runs or a subset of minimisations for a set of residues. Then the real script would generate the required subscripts plus embedded data on the fly. I think this provides a considerable degree of flexibility. Thus for instance our cluster which runs grid engine needs a master script to start all the sub processes rather than a set of separate password less ssh logons which a cluster of workstations would require. In general I thought that catching failures other than a failure to start is not required...

SSH tunnels is probably not the best option for your system. Do you know anything about MPI?

I have read about MPI but have not implimented anything __YET__;-). Also I have compiled some MPI based programs. It seems to a bit of a pig and I don't think the low hanging fruit necessarily require that degree of fine grained distribution...

There are a number of options available for distributed calculations, but it will need to have a clean and stable Python interface.

obviously a stable interface with as little change to the current top level functions and as little suprise as possible is to be desired. I thought it might be a good idea to have some form of facade, so that the various forms of coarse grained multi processing looks the same, whichever one you are using. The idea would be only to have the setup and dispach code different.

Which ever system is decided upon, threading inside the program will probably be necessary so that each thread can be sent to a different machine. This requires calculations which can be parallelised. As minimisation is an iterative process with each iteration requiring the results of the previous, and as it's not the most CPU intensive part anyway, I can't see too many gains in modifying that code.
Agreed
I've already parallelised the Monte Carlo
simulations for the threading code as those calculations are the most
obvious target.
They are a time hog
But all residue specific calculations could be parellelised as well. This is probably where you can get the best speed ups.
Yes that and grid searches seem obvious candidates
I have a few more comments below.
On 4/13/06, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:
Dear Ed
   I was we have a 148 processor beowolf cluster ;-) I was thinking of
having a go at developing a distributed version of relax... are you ok
with that or do you have plans of your own?
The general idea was to have scripts look almost as they are but
1. have a command to register multi processor handlers
The user function class 'threading' is probably close to what you want.
I shall have a look at it
2. have a command to add machines and parameters to the multi processor pool
threading.add() is probably a good template.
again I shall have a read
3. add code to the generic functions/or replace the generic funcntions if the multiprocessing is setup to batch up components of calculations and pass them out to the compute servers

'generic/minimise.py' is the best bet. Otherwise there is 'maths_fns/mf.py' which can be hacked.
more reading ;-)
4. add code to multiplex the results back together again
That should be pretty straight forward.
obviously this would just be a prototype at first but it could be rather
useful
regards
gary
Bye,
Edward
.
Thanks
gary
--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK             Tel. +44-113-3433024
email: garyt@xxxxxxxxxxxxxxx                   Fax  +44-113-2331407
-------------------------------------------------------------------
--- End Message ---

Re: multi processing

Header

Content

Related Messages