Re: how to parallelise model_free minimise



Posted by Gary S. Thompson on March 27, 2007 - 18:08:
Edward d'Auvergne wrote:
On 3/27/07, gary thompson <garyt@xxxxxxxxxxxxxxx> wrote:
On 3/26/07, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
> > >> i.e. for six residues and 3 nodes
> > >>
> > >> node 1 calculates
> > >> res 1 [m1 m2 m3...]
> > >> res 2 [m1 m2 m3...]
> > >>
> > >> node 2 calculates
> > >> res 3 [m1 m2 m3...]
> > >> res 4 [m1 m2 m3...]
> > >>
> > >> node 3 calculates
> > >> res 5 [m1 m2 m3...]
> > >> res 6 [m1 m2 m3...]
> > >>
> > >> this obviously places some limitations on the design of the
> > >> minimisation function, as it might need to have setup and teardown
> > >> regions that cope with this batched data....
> > >
> > > The minimise() main loop is the finest grain parallelisation you can
> > > get without writing a specific parallelised optimisation algorithm.
> > >
> >
> > actually what i was aiming for was as coarse grained and well load
> > balanced a set of calculations as possible (i.e. minimum of
> > communication overhead (i.e. bigger chunks of data with similar
> > computations times going over the wire))
>
> The coarse grained target of the looping over data pipes, which the
> user must specifically ask for, will limit the number of active nodes
> to the number of data pipes, which could be a specific list of
> model-free models (assuming this is the only target). The next level
> is the main loop over the minimisation instances. Finally, the finest
> grain is from the parallelised grid search of grid searches (as
> discussed in the thread starting at
> https://mail.gna.org/public/relax-devel/2007-03/msg00088.html,
> Message-id: <45FEAC46.8020307@xxxxxxxxxxxxxxx>). Can you see anything
> else which could be targeted for parallelisation?
>
No, see above: if you combine the calculations from the various models
with those by spin etc., you get the largest number of calculations that
need to (or can) be carried out together...

The model-free minimisations for the different model-free models need not be carried out together. There is no efficiency gain there. I don't have any numbers because this was a long time ago, but I have tested this approach out before. Otherwise I would have designed the relax UI to fit this approach.

Why isn't there an efficiency gain? All the calculations are independent, and for a limited set of processors you can transfer larger blocks of data, so there is less communication overhead. Furthermore, there are two other advantages:

1. if the rate of calculation between minimisations is not particularly constant you could, for example, end up with processors idle at the end of minimising model 1, so starting on model 2 straight away will improve the rate of calculation (especially if your grid computer is heterogeneous)
2. if you have an insane number of processors (which will come soon) you can end up with situations where you have 1 spin per processor, which would be highly inefficient
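
To make the idea of combining the models with the spins concrete, here is a minimal sketch of pooling every (model, spin) minimisation into one task list and dealing it out in batches. The function name make_batches, the model labels, and n_procs are made up for illustration; this is not code from relax or the multi_processor branch, and round-robin dealing is just one simple way to even out the load.

```python
from itertools import product


def make_batches(models, spins, n_procs):
    """Split all (model, spin) minimisations into n_procs batches.

    Hypothetical helper: every combination is an independent task, so
    the tasks can be pooled and dealt out round-robin.
    """
    tasks = list(product(models, spins))
    # Batch i receives tasks i, i + n_procs, i + 2 * n_procs, ...
    return [tasks[i::n_procs] for i in range(n_procs)]


models = ["m1", "m2", "m3"]   # illustrative model-free model names
spins = [1, 2, 3, 4, 5, 6]    # six residues, as in the example above
batches = make_batches(models, spins, n_procs=4)

for rank, batch in enumerate(batches):
    print(rank, len(batch), batch)
```

With 18 pooled tasks no processor sits idle while another finishes the last spin of a model, and the batch sizes differ by at most one task.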

> > just as another comment (or two)
> >
> > 1. why do we send no arguments to the functions e.g.
> > -#-num_frq [2]
> > -#-frq [[750800000.0, 599.71900000000005]]
>
> Sorry, again I don't understand the question.  I'll try to answer it
> anyway.  The num_frq value is used to deconvolute the 'remap_table',
> 'frq', 'ri_labels', and 'relax_data'.  This can vary for different
> spin systems.  The frequencies are essential for the calculation of
> the spectral density values (and hence the chi-squared value).
>


What is the difference between num_frq and the number of values in frq [[750800000.0, 599.71900000000005]]?

Nothing, it's historical baggage (with the added computational benefit of not executing the len() builtin function during calculations).

One len() call at the start of a minimisation should be a tiny overhead, and I guess that arrays store their length in a variable anyway, so the cost should be negligible.
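
As a small illustration of that point, num_frq could be derived from frq once during setup rather than being passed around separately. This is only a sketch: it assumes the outer list of frq holds one entry per spin system, and it reuses the values quoted above rather than real relax data structures.

```python
# Values as quoted earlier in this thread.
frq = [[750800000.0, 599.71900000000005]]

# Computed once at setup, outside the minimisation loop, so len() is
# never called during the chi-squared calculations themselves.
num_frq = [len(spin_frq) for spin_frq in frq]

print(num_frq)
```

For a Python list, len() just reads a stored size, so it does not scan the elements.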

> > 2. how does the data return from minimisation work (specifically, why is
> > param_vector an instance variable of Model_free)?
> > e.g. we have
> >
> > self.param_vector, self.func, iter, fc, gc, hc, self.warning = results
> >
> > ....
> >
> > # Disassemble the parameter vector.
> > self.disassemble_param_vector(index=index, sim_index=sim_index)
> >
> >
> >
> > inside a tight loop. So even though self.param_vector is an instance
> > variable, it doesn't contain state for a Model_free object; it just keeps
> > being overwritten by the latest contents of the result
> >
> > so why not
> >
> > param_vector, self.func, iter, fc, gc, hc, self.warning = results
> >
> > ....
> >
> > # Disassemble the parameter vector.
> > self.disassemble_param_vector(param_vector, index=index,
> > sim_index=sim_index)
>
> Yes, this is probably how this should be done. Grep for
> 'self.param_vector' in 'specific_fns/model_free.py' to see why I have
> been reluctant to make this change. If necessary for the
> MPI code, please try to keep the number of changes to this setup
> as small as possible.


I will

Sorry about the legacy code, but because of the nature
> of the multi_processor branch you may need to work with it.

of course, I have made my bed and (be it nails or feathers) am having
a good go at lying in it!

Hopefully
> I'll have all these legacy issues completely removed in the 1.3 line
> once the data model redesign is complete.  That being said, if you
> need to, please feel free to make the above changes.
>

I don't need to change this ;-) it was more a question trying to
understand the way things are, ask the obvious question, and check
that I am not missing something deep...

I was planning on fixing all this up anyway. I like clean, readable, and simple code bases to allow anyone to dive straight into it.
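
For reference, the param_vector change discussed above might look roughly like the sketch below. ModelFreeSketch is a hypothetical stand-in written for this example, not the Model_free class from specific_fns/model_free.py, and the result tuples are invented numbers that only mimic the shape of the unpacking shown earlier.

```python
class ModelFreeSketch:
    """Hypothetical stand-in for Model_free; not the relax class."""

    def __init__(self):
        self.params = {}

    def disassemble_param_vector(self, param_vector, index=None, sim_index=None):
        # Store the optimised values against the spin index.
        self.params[index] = list(param_vector)

    def minimise(self, results_per_spin):
        for index, results in enumerate(results_per_spin):
            # param_vector is a local name here, so no stale copy is
            # left on the object between iterations.
            param_vector, self.func, iter_count, fc, gc, hc, self.warning = results
            self.disassemble_param_vector(param_vector, index=index)


mf = ModelFreeSketch()
mf.minimise([
    ([0.8, 1e-11], 12.3, 20, 40, 20, 5, None),   # invented results
    ([0.9, 5e-11], 8.1, 15, 30, 15, 4, None),
])
print(mf.params)
```

Passing the vector explicitly keeps each iteration self-contained, which should also make it easier to hand the loop body to another processor.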


regards
gary

Regards,

Edward




--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK             Tel. +44-113-3433024
email: garyt@xxxxxxxxxxxxxxx                   Fax  +44-113-2331407
-------------------------------------------------------------------




