Re: how to parallelise model_free minimise



Posted by Edward d'Auvergne on March 27, 2007 - 17:01:
On 3/27/07, gary thompson <garyt@xxxxxxxxxxxxxxx> wrote:
On 3/26/07, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
> > >> i.e. for six residues and 3 nodes
> > >>
> > >> node 1 calculates
> > >> res 1 [m1 m2 m3...]
> > >> res 2 [m1 m2 m3...]
> > >>
> > >> node 2 calculates
> > >> res 3 [m1 m2 m3...]
> > >> res 4 [m1 m2 m3...]
> > >>
> > >> node 3 calculates
> > >> res 5 [m1 m2 m3...]
> > >> res 6 [m1 m2 m3...]
> > >>
> > >> this obviously places some limitations on the design of the
> > >> minimisation function, as it might need to have a setup and
> > >> tear-down region that copes with this batched data....
> > >
> > > The minimise() main loop is the finest grain parallelisation you can
> > > get without writing a specific parallelised optimisation algorithm.
> > >
> >
> > Actually, what I was aiming for was as coarse grained and well load
> > balanced a set of calculations as possible, i.e. a minimum of
> > communication overhead (bigger chunks of data with similar
> > computation times going over the wire).
>
> The coarse grained target of the looping over data pipes, which the
> user must specifically ask for, will limit the number of active nodes
> to the number of data pipes, which could be a specific list of
> model-free models (assuming this is the only target).  The next level
> is the main loop over the minimisation instances.  Finally, the finest
> grain is from the parallelised grid search of grid searches (as
> discussed in the thread starting at
> https://mail.gna.org/public/relax-devel/2007-03/msg00088.html,
> Message-id: <45FEAC46.8020307@xxxxxxxxxxxxxxx>).  Can you see anything
> else which could be targeted for parallelisation?
>
No, see above: if you combine the calculations from the various models
with those by spin etc., you get the largest number of calculations that
need to, or can, be carried out together...

The model-free minimisations for the different model-free models need not be carried out together. There is no efficiency gain there. I don't have any numbers because this was a long time ago, but I have tested this approach out before. Otherwise I would have designed the relax UI to fit this approach.
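
For reference, the residue batching from the six-residue, three-node example earlier in this thread can be sketched in Python as follows. This is only an illustration under stated assumptions: batch_residues() is a hypothetical helper, not part of relax or of the multi_processor branch, and the residue and model labels are placeholders.

```python
def batch_residues(residues, num_nodes):
    """Split residues into num_nodes contiguous batches of near-equal size.

    Hypothetical helper for illustration only; not a relax function.
    """
    base, extra = divmod(len(residues), num_nodes)
    batches = []
    start = 0
    for node in range(num_nodes):
        # The first 'extra' nodes take one additional residue.
        size = base + (1 if node < extra else 0)
        batches.append(residues[start:start + size])
        start += size
    return batches


residues = ["res %d" % i for i in range(1, 7)]
models = ["m1", "m2", "m3"]  # placeholder model-free model names

for node, batch in enumerate(batch_residues(residues, 3), start=1):
    print("node %d calculates" % node)
    for res in batch:
        print("  %s %s" % (res, models))
```

With six residues and three nodes each node receives two residues. When the count does not divide evenly, the first nodes take one extra residue, so the batch sizes differ by at most one.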


> > just as another comment (or two)
> >
> > 1. Why do we send no arguments to the functions, e.g.
> > -#-num_frq [2]
> > -#-frq [[750800000.0, 599.71900000000005]]
>
> Sorry, again I don't understand the question.  I'll try to answer it
> anyway.  The num_frq value is used to deconvolute the 'remap_table',
> 'frq', 'ri_labels', and 'relax_data'.  This can vary for different
> spin systems.  The frequencies are essential for the calculation of
> the spectral density values (and hence the chi-squared value).
>


What is the difference between num_frq and the number of numbers in frq [[750800000.0, 599.71900000000005]]?

Nothing, it's historical baggage (with the added computational benefit of not executing the len() builtin function during calculations).
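
A minimal sketch of that idea, assuming that frq holds one list of frequencies per spin system and that num_frq is filled in once during setup. The layout is inferred from the printout above and is not taken from the relax source; loop_over_frq() is a hypothetical function.

```python
# Values copied from the printout quoted above.
# Assumption: one inner list per spin system.
frq = [[750800000.0, 599.71900000000005]]

# Cached once during setup, so that the calculation loop never calls len().
num_frq = [len(spin_frq) for spin_frq in frq]


def loop_over_frq(spin_index):
    """Visit every frequency of one spin system using the cached count."""
    visited = []
    for j in range(num_frq[spin_index]):
        visited.append(frq[spin_index][j])
    return visited


print(num_frq)
print(loop_over_frq(0) == frq[0])
```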


> > 2. How does the data return from minimisation work? Specifically, why
> > is param_vector an instance variable of Model_free?
> > E.g. we have
> >
> > self.param_vector, self.func, iter, fc, gc, hc, self.warning = results
> >
> > ....
> >
> > # Disassemble the parameter vector.
> > self.disassemble_param_vector(index=index, sim_index=sim_index)
> >
> >
> >
> > inside a tight loop. So even though self.param_vector is an instance
> > variable, it doesn't contain state for a Model_free object; it just
> > keeps being overwritten by the latest contents of the result.
> >
> > so why not
> >
> > param_vector, self.func, iter, fc, gc, hc, self.warning = results
> >
> > ....
> >
> > # Disassemble the parameter vector.
> > self.disassemble_param_vector(param_vector, index=index,
> > sim_index=sim_index)
>
> Yes, this is probably how this should be done.  Grep for
> 'self.param_vector' in 'specific_fns/model_free.py' to see why I have
> been reluctant to make this change.  If changes are necessary for the
> MPI code, please try to keep the number of changes to this setup as
> small as possible.

I will

> Sorry about the legacy code, but because of the nature
> of the multi_processor branch you may need to work with it.

of course, I have made my bed and (be it nails or feathers)  am having
a good go at lying in it!

> Hopefully
> I'll have all these legacy issues completely removed in the 1.3 line
> once the data model redesign is complete.  That being said, if you
> need to, please feel free to make the above changes.
>

I don't need to change this ;-) it was more a question trying to
understand the way things are, ask the obvious question, and check
that I am not missing something deep...

I was planning on fixing all this up anyway. I like clean, readable, and simple code bases to allow anyone to dive straight into it.
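
The change discussed above, passing the parameter vector as an argument instead of storing it on the instance, could look roughly like the sketch below. ModelFreeSketch, its param_names attribute, its values dictionary, and the example results tuple are all hypothetical stand-ins; only the unpacking line and the disassemble_param_vector() call follow the snippets quoted in this thread.

```python
class ModelFreeSketch:
    """Toy stand-in for Model_free; not the relax implementation."""

    def __init__(self, param_names):
        self.param_names = param_names  # hypothetical, e.g. ['S2', 'te']
        self.values = {}                # hypothetical per-spin storage

    def disassemble_param_vector(self, param_vector, index=None, sim_index=None):
        # The vector arrives as an argument, so nothing is left behind on self.
        self.values[(index, sim_index)] = dict(zip(self.param_names, param_vector))

    def minimise(self, results_per_spin):
        for index, results in enumerate(results_per_spin):
            # param_vector is a local variable, overwritten on every pass.
            param_vector, self.func, iter_count, fc, gc, hc, self.warning = results

            # Disassemble the parameter vector.
            self.disassemble_param_vector(param_vector, index=index, sim_index=None)


# Made-up results tuple, purely to exercise the sketch.
mf = ModelFreeSketch(["S2", "te"])
mf.minimise([([0.8, 1e-11], 12.5, 10, 20, 5, 2, None)])
print(mf.values)
```

Because param_vector is local to the loop, a batch of results returned from another node can be unpacked without touching shared instance state.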

Regards,

Edward


