Re: how to parallelise model_free minimise -- March 27, 2007

On 3/27/07, Edward d'Auvergne <edward.dauvergne@xxxxxxxxx> wrote:

On 3/27/07, gary thompson <garyt@xxxxxxxxxxxxxxx> wrote:
> On 3/26/07, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
> > On 3/27/07, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:

[snip]

> > > e.g. it would be nice to have
> > >
> > > for residue in  all residues:
> > >     for model in models:
> > >              do_stuff-(tm)
> > >
> > >
> > > as opposed to
> > >
> > > for model in models: #currently at the user level
> > >     for residue in  all residues:
> > >              do_stuff-(tm)
> > >
> > > now that might need something of the form
> > >
> > >         # Set the run names (also the names of preset model-free models).
> > >         if local_tm:
> > >             self.runs = ['tm0', 'tm1', 'tm2', 'tm3', 'tm4', 'tm5',
> > > 'tm6', 'tm7', 'tm8', 'tm9']
> > >         else:
> > >             self.runs = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7',
> > > 'm8', 'm9']
> > >
> > > run.create_composite('super')
> > > for name in self.runs:
> > >     run.create(name, 'mf')
> > >     composite_add('super', name)
> > > minimise('newton', run='super')
> > >
> > >
> > > which would minimise all runs in parallel...
> > >
> > > and I understand from chris that we are planning to do
> > >
> > >
> > >        # Set the run names (also the names of preset model-free models).
> > >         if local_tm:
> > >             self.runs = ['tm0', 'tm1', 'tm2', 'tm3', 'tm4', 'tm5',
> > > 'tm6', 'tm7', 'tm8', 'tm9']
> > >         else:
> > >             self.runs = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7',
> > > 'm8', 'm9']
> > >
> > >
> > >      minimise('newton', runs=self.runs)
> > >
> > >
> > > which would also work
> > >
> > >
> > > now comes the tricky bit
> > >
> > >
> > > all the minimisations etc would now become functions to set up
> > > minimisations and, say, submit them to a queue with a suitable object to
> > > allow the results to be sorted out later.
> > >
> > > then at the end of minimise('newton', runs=self.runs) you would collect
> > > in all the results from all calculations and complete the calculation so
> > > we have something like
> > >
> > > for residue in residues:
> > >     for run in runs:
> > >        calculation_instance = setup_calculation(residue, run)
> > >        queue.submit(calculation_instance)
> > > while queue.not_complete():
> > >     result = queue.get_result()
> > >     result.record(self.relax.data)
> > >
> > > This will allow the maximum number of calculations to be conducted in
> > > parallel and will intrinsically load balance as well as we can get
> >
> > There are a number of very important issues with this approach.  The
> > most important is that the loop over the data pipes corresponding to
> > the model-free models (the 'runs') is deliberately not part of the
> > relax codebase.  In Chris' implementation of the 'runs' argument
> > (which will need to be renamed) the loop will be at the highest level
> > of the code so that for the generic_fns.minimise code onwards nothing
> > changes.  This high level loop would probably be a very difficult
> > target for MPI as the whole relax data storage object will need to be
> > sent between nodes.  This multi-megabyte transfer per node, per
> > calculation is not ideal.
> >
> no, you wouldn't have to put the whole thing over the wire, as long
> as you add the calculations to a queue at the low level and then
> request that the calculations be completed at the end of the high level
> function. In the end the user and the program see no difference; it's
> a bit like how an optimising compiler works, I guess....

I'm not talking about your suggested implementation but Chris'
implementation (the runs argument) which we have already decided upon.
 Your suggestion affects this decision (as well as the whole relax UI,
I'll get to this later).


> > Secondly, and very importantly, relax doesn't loop over residues in
> > the model-free minimise() function.  relax loops over minimisation
> > instances.  For the 'mf' and 'local_tm' parameter sets, this is a loop
> > over the spin systems (i.e. molecules first, residues second, and spin
> > systems last).  For the 'diff' and 'all' parameter sets the number of
> > minimisation instances is one and hence the loop runs once and then
> > that's it.  Looping over these followed by looping over the data pipes
> > (ex-runs) is insane!  That is essentially first looping over the
> > finest grained level followed by the coarsest.
>
> I do not quite follow where the insanity comes from ;-)
>
> It is not a problem...  What is required is to pass as few chunks of
> data as possible over the wire, each as large as possible, with the
> computations well balanced...  Essentially I want to (effectively, not
> literally) build a list of residues, divide the residues out roughly
> by processor, find all the models required for each residue, and set
> up the whole set of calculations. Then chunk the whole list into, say,
> the number of processors * 3, put all these calculations on a queue,
> collect the results, and put the results where they need to be.
> Basically I am saying that in many cases minimisation instances and
> runs are disjoint sets and so can be calculated at the same time, e.g.
> the result of residue 3 run tm0 does not affect the result of residue 3
> run tm1 etc ....
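
The chunking described in the quoted paragraph above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: build_chunks and chunks_per_processor are hypothetical names, not part of relax, and each (residue, model) pair is assumed to be an independent calculation.

```python
from itertools import product


def build_chunks(residues, models, n_processors, chunks_per_processor=3):
    """Split all (residue, model) pairs into roughly equal chunks.

    Hypothetical helper, not part of relax.
    """
    # Every residue/model pair is treated as an independent calculation.
    tasks = list(product(residues, models))
    # Aim for n_processors * chunks_per_processor chunks, but never more
    # chunks than tasks and never fewer than one.
    n_chunks = max(1, min(len(tasks), n_processors * chunks_per_processor))
    # Hand out the tasks so that chunk sizes differ by at most one.
    base, extra = divmod(len(tasks), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(tasks[start:start + size])
        start += size
    return chunks
```

Using a few more chunks than processors is what gives the rough load balancing: a processor that finishes a cheap chunk early can pick up another one.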

The insanity is from the fact that the suggestion of the looping over
residues first and then looping over the data pipes breaks the most
fundamental premise of the relax UI (user interface) design - the data
pipes and how the user interacts with them.  I cannot stress how bad
this is!


Actually, on further reflection, as long as the low level and high level
commands cooperate very slightly, I think it doesn't make a lot of
difference which runs as the outer loop. The main component of what I
am suggesting is that the pipes and everything remain the same but we
effectively use them as a means of scheduling (please feel free to
shoot this down as well ;-))

so basically as long as you have a function of the form

runs = [mf1, mf2, ...]

minimise('newton', runs)

it can all work with almost no architecture change


all you do is have model_free.minimise doing what it is doing in the
multi branch and adding commands to the multiprocessor queue along
with the attached commands to store the data back to the right place
in the relax data structures when each command has completed
processing.

if you then loop over all pipes in minimise('newton',runs), do
all the submission, and then ask for the processing to occur at the end of
minimise('newton',runs), everything occurs in the right place and
without inconvenient overlaps etc. Now if the processor queue wants to reorder
the queue, that's its problem, because we know that everything that goes
to minimise is compatible, i.e. all the calculations are disjoint and no
one call within minimise can affect another
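
A minimal Python sketch of this submit-first, process-at-the-end idea is given below. It is not relax code: DeferredQueue, fake_minimise and minimise_all are hypothetical names, a plain dictionary stands in for the relax data structures, and a thread pool stands in for whatever multiprocessor back end would really be used.

```python
from concurrent.futures import ThreadPoolExecutor


class DeferredQueue:
    """Collects independent calculations and runs them all on request."""

    def __init__(self):
        self._jobs = []

    def submit(self, func, args, store):
        # Nothing runs yet: just remember the work and where its result goes.
        self._jobs.append((func, args, store))

    def run_all(self, max_workers=4):
        # The executor may run the jobs in any order; this is safe only
        # because the jobs are assumed to be independent of one another.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [(pool.submit(func, *args), store)
                       for func, args, store in self._jobs]
            # Results are stored back in the calling thread, one at a time.
            for future, store in futures:
                store(future.result())
        self._jobs = []


def fake_minimise(pipe, spin):
    # Stand-in for a real minimisation of one spin in one data pipe.
    return "%s:%s" % (pipe, spin)


def minimise_all(pipes, spins, results):
    queue = DeferredQueue()
    for pipe in pipes:  # the outer loop stays over the data pipes
        for spin in spins:
            key = (pipe, spin)
            queue.submit(fake_minimise, (pipe, spin),
                         lambda value, key=key: results.__setitem__(key, value))
    queue.run_all()  # processing happens once, at the end
```

Each job carries its own store-back callback, so the queue is free to reorder or batch the work without the caller needing to know.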

please do go ahead and shoot me if needed. I don't wish to cause havoc,
just to understand what the best way to go is

regards
gary
