Re: how to parallelise Monte Carlo and grid searches




Posted by Edward d'Auvergne on March 30, 2007 - 11:29:
For the parallelisation of the Monte Carlo simulations, I would target
the minimise() function again.  I'll get to the grid search in a
second post.  The grid computing code that I wrote targeted the file
'generic_fns/minimise.py'.  You can see this code executed in the
minimise() method of the Minimise class and in the two threading
classes at the end of the module.  This could be a good place to
target the parallelisation because that minimise() function is where
the looping over the MC sims occurs.  If this is the target, then
there is the question of whether the main loop of the model-free
minimise() function should also be parallelised.  Obviously the slave
nodes can't act as a parent node sending out calculations to other
nodes.  Therefore if the generic minimise() function is targeted, the
specific minimise() function would need to operate in single processor
mode.  Unfortunately there will need to be a lot of data transmitted -
essentially one entire data pipe in the relax data storage object will
need to be pickled and transmitted to all nodes.  That is unless the
specific simulation data is pulled out each time and treated as normal
non-simulation data.
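To illustrate the trade-off between the two options, here is a minimal
sketch. It assumes a toy data pipe built from plain dictionaries; the
names make_pipe(), extract_sim() and the keys 'relax_data',
'relax_error' and 'relax_sim_data' are hypothetical stand-ins for
illustration, not the actual relax data storage object. It only
compares the pickled size of a whole pipe with that of one simulation
pulled out and dressed up as normal data.

```python
import pickle
import random


def make_pipe(n_res=50, n_sims=200, n_points=6, seed=0):
    """Build a toy data pipe: per-residue data plus Monte Carlo simulation data.

    Hypothetical layout, assumed only for this sketch.
    """
    rng = random.Random(seed)
    pipe = {}
    for res in range(n_res):
        pipe[res] = {
            "relax_data": [rng.random() for _ in range(n_points)],
            "relax_error": [rng.random() for _ in range(n_points)],
            "relax_sim_data": [
                [rng.random() for _ in range(n_points)] for _ in range(n_sims)
            ],
        }
    return pipe


def extract_sim(pipe, res, sim):
    """Pull out one simulation so that it looks like ordinary non-simulation data."""
    spin = pipe[res]
    return {
        "relax_data": spin["relax_sim_data"][sim],
        "relax_error": spin["relax_error"],
    }


pipe = make_pipe()

# Option 1: pickle the entire pipe and send it to every node.
whole_size = len(pickle.dumps(pipe))

# Option 2: send only the data needed for a single simulation.
single_size = len(pickle.dumps(extract_sim(pipe, 0, 0)))

print("whole pipe: %d bytes, one simulation: %d bytes" % (whole_size, single_size))
```

The second option sends far less per job, at the price of more code on
the master to cut the data up and stitch the results back together.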

Alternatively, the looping over simulations could be brought into the
specific model-free code.  This will require greater changes to the
original relax code base.  In this option, again it should be decided
if individual optimisation instances should be parallelised together
with the simulations.  My opinion is that communication overhead would
be significantly decreased by parallelising only the simulation loop.
The reason is that the optimisation of each simulation starts from a
parameter vector very close to the minimum, so it is orders of
magnitude quicker than the original optimisation (and there is no grid
search).  So if both the simulations and the individual optimisations
were parallelised, I would guess that the speed would be limited
almost entirely by inter-node communication.
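As a rough sketch of parallelising only the simulation loop, each job
below is one complete optimisation of one simulation, so a node gets a
single small message in and sends a single result back.  Everything
here is an assumption made for illustration: optimise() is a toy 1-D
minimiser, run_simulations() is a hypothetical name, and a thread pool
stands in for whatever inter-node transport is eventually used.  None
of this is relax code.

```python
from concurrent.futures import ThreadPoolExecutor


def optimise(start, target, step=0.25, tol=1e-10, max_iter=1000):
    """Toy minimiser of (x - target)**2 using fixed-step gradient descent.

    Returns the final position and the number of iterations used.
    """
    x = start
    for k in range(max_iter):
        grad = 2.0 * (x - target)
        if abs(grad) < tol:
            return x, k
        x -= step * grad
    return x, max_iter


def run_simulations(sim_targets, start, max_workers=4):
    """Farm out one whole optimisation per simulation; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda target: optimise(start, target), sim_targets))


# Each simulation starts from the already optimised value (1.0 here) and
# its own minimum lies close by.
sim_targets = [1.0 + 0.01 * i for i in range(-5, 6)]
results = run_simulations(sim_targets, start=1.0)
```

The inner optimisation stays serial on each node, so the master never
has to coordinate anything finer than one simulation.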

Cheers,

Edward


On 3/30/07, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:
Do you have any thoughts on how to parallelise the grid searches and
Monte Carlo runs, and where to dig into the code base?


regards gary

Note: I have already realised one wrinkle in parallelising the
Monte Carlo runs. We need to have consistent (!) use of random numbers on
the various processors. Therefore I am going to create a command in
processor that will create and return a block of random numbers (I
guess we also need to think about what distributions we need as well).
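
One way the block-of-random-numbers idea could look is sketched below.
It assumes the master draws every deviate from a single seeded
generator and deals the simulations out with their index attached, so
that simulation i receives the same numbers however many processors
are used.  The names random_block(), split() and simulate() are
hypothetical, and only Gaussian deviates are covered.

```python
import random


def random_block(seed, n_sims, n_points):
    """Master side: draw all standard normal deviates for all simulations up front."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_points)] for _ in range(n_sims)]


def split(block, n_procs):
    """Deal the simulations out round-robin, keeping each simulation's index."""
    return [
        [(i, block[i]) for i in range(p, len(block), n_procs)]
        for p in range(n_procs)
    ]


def simulate(values, errors, deviates):
    """Slave side: randomise the data using the deviates it was handed."""
    return [v + e * d for v, e, d in zip(values, errors, deviates)]


def run(n_procs, values, errors, seed=42, n_sims=10):
    """Generate, distribute and gather; returns the simulated data keyed by index."""
    block = random_block(seed, n_sims, len(values))
    gathered = {}
    for jobs in split(block, n_procs):  # each entry would go to one processor
        for index, deviates in jobs:
            gathered[index] = simulate(values, errors, deviates)
    return gathered


values = [1.0, 2.0, 3.0]
errors = [0.1, 0.1, 0.2]
same = run(2, values, errors) == run(3, values, errors)
print(same)
```

Because the slaves never draw their own numbers, the result does not
depend on how the jobs happen to be scheduled.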



regards
gary

--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK             Tel. +44-113-3433024
email: garyt@xxxxxxxxxxxxxxx                   Fax  +44-113-2331407
-------------------------------------------------------------------





