Re: how to parallelise model_free minimise -- March 26, 2007

>> i.e for six residues and 3 nodes
>>
>> node 1 calculates
>> res 1 [m1 m2 m3...]
>> res 2 [m1 m2 m3...]
>>
>> node 2 calculates
>> res 3 [m1 m2 m3...]
>> res 4 [m1 m2 m3...]
>>
>> node 3 calculates
>> res 5 [m1 m2 m3...]
>> res 6 [m1 m2 m3...]
>>
>> this obviously places some limitations on the design of the
>> minimisation function, as it might need to have a setup and teardown
>> region that copes with this batched data....
>
> The minimise() main loop is the finest grain parallelisation you can
> get without writing a specific parallelised optimisation algorithm.
>

Actually, what I was aiming for was as coarse-grained and well
load-balanced a set of calculations as possible, i.e. a minimum of
communication overhead (bigger chunks of data with similar
computation times going over the wire).
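
A minimal sketch of the batching described above, assuming every residue
costs roughly the same to minimise (real load balancing would need a cost
estimate per residue).  The function name batch_residues() and the labels
are hypothetical and are not part of relax:

```python
def batch_residues(residues, num_nodes):
    """Split residues into contiguous, near-equal batches, one per node."""
    base, extra = divmod(len(residues), num_nodes)
    batches = []
    start = 0
    for node in range(num_nodes):
        # The first 'extra' nodes take one additional residue each.
        size = base + (1 if node < extra else 0)
        batches.append(residues[start:start + size])
        start += size
    return batches


# Six residues and three nodes, as in the example quoted above.
residues = ["res %d" % i for i in range(1, 7)]
models = ["m1", "m2", "m3"]

for node, batch in enumerate(batch_residues(residues, 3), start=1):
    print("node %d calculates" % node)
    for res in batch:
        print(res, models)
```

Each node receives one batch in a single message, so the communication
cost is paid once per node rather than once per residue.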


The coarsest-grained target, the loop over data pipes, which the
user must specifically ask for, will limit the number of active nodes
to the number of data pipes, which could be a specific list of
model-free models (assuming this is the only target).  The next level
is the main loop over the minimisation instances.  Finally, the finest
grain is from the parallelised grid search of grid searches (as
discussed in the thread starting at
https://mail.gna.org/public/relax-devel/2007-03/msg00088.html,
Message-id: <45FEAC46.8020307@xxxxxxxxxxxxxxx>).  Can you see anything
else which could be targeted for parallelisation?

just as another comment (or two)

1. Why do we send no arguments to the functions, e.g.
-#-num_frq [2]
-#-frq [[750800000.0, 599.71900000000005]]


Sorry, again I don't understand the question.  I'll try to answer it
anyway.  The num_frq value is used to deconvolute the 'remap_table',
'frq', 'ri_labels', and 'relax_data'.  This can vary for different
spin systems.  The frequencies are essential for the calculation of
the spectral density values (and hence the chi-squared value).

2. How does the data return from minimisation work?  Specifically, why is
param_vector an instance variable of Model_free?
E.g. we have

self.param_vector, self.func, iter, fc, gc, hc, self.warning = results

....

# Disassemble the parameter vector.
self.disassemble_param_vector(index=index, sim_index=sim_index)

inside a tight loop.  So even though self.param_vector is an instance
variable, it doesn't contain state for a Model_free object; it just keeps
being overwritten by the latest contents of the results.

so why not

param_vector, self.func, iter, fc, gc, hc, self.warning = results

....

# Disassemble the parameter vector.
self.disassemble_param_vector(param_vector, index=index,
sim_index=sim_index)


Yes, this is probably how this should be done.  Grep for
'self.param_vector' in 'specific_fns/model_free.py' to see why I have
been slow to make this change.  If necessary for the
MPI code, please try to keep the number of changes to this setup
as small as possible.  Sorry about the legacy code, but because of the nature
of the multi_processor branch you may need to work with it.  Hopefully
I'll have all these legacy issues completely removed in the 1.3 line
once the data model redesign is complete.  That being said, if you
need to, please feel free to make the above changes.
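
As a rough illustration of the change discussed above, here is a minimal
sketch in which the parameter vector is a local variable passed explicitly
to the disassembly step.  The class ModelFreeSketch, its 'values' store and
the store_results() method are hypothetical stand-ins and are not the relax
Model_free class:

```python
class ModelFreeSketch:
    """Hypothetical stand-in for Model_free, not the relax class."""

    def __init__(self, param_names):
        self.param_names = param_names
        # Maps (index, sim_index) to a dict of parameter name -> value.
        self.values = {}

    def disassemble_param_vector(self, param_vector, index=None, sim_index=None):
        # The vector arrives as an argument, so nothing is read from self.
        self.values[(index, sim_index)] = dict(zip(self.param_names, param_vector))

    def store_results(self, results, index=None, sim_index=None):
        # Unpack into locals only; no self.param_vector is created.
        param_vector, func, iter_count, fc, gc, hc, warning = results
        self.disassemble_param_vector(param_vector, index=index, sim_index=sim_index)
        return func, warning


mf = ModelFreeSketch(["S2", "te"])
mf.store_results(([0.8, 1e-11], 1.5, 10, 20, 5, 0, None), index=0)
mf.store_results(([0.6, 5e-11], 2.5, 12, 25, 6, 0, None), index=1)
```

Because no shared attribute is overwritten, results returned from different
nodes could be unpacked in any order.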

Cheers,

Edward
