Edward d'Auvergne wrote:
On 3/24/07, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:
Dear Ed and all
I have started looking at how to parallelise the calls to
model_free.minimise as discussed in our previous message but am having
some problems with the function....
The first is that it is huge and seems to have a large amount of special
casing and checking built in.
Yep, it is quite complex. However this code complexity simplifies the
execution of model-free minimisations and grid searches.
The second one is to work out how many different modes it can be called
in. From what I can see I need to look at param_set and only
parallelise if self.param_set is one of mf or local_tm (diff and all
being either too trivial or too hard to optimise respectively ;-)
There is no need as this param_set is handled by the code to produce
the main loop of that model-free 'minimise()' function. The part to
parallelise of this function is this main loop over the minimisation
instances. You can find the loop on line 2118 of
'specific_fns/model_free.py' of the 'multi_processor' branch (or
search for the comment "# Loop over the minimisation instances.").
Then there are the parameters passed.
First there are Mf and generic_minimise, which seem to take a huge number
of parameters:
    self.mf = Mf(init_params=self.param_vector,
                 param_set=self.param_set, diff_type=diff_type,
                 diff_params=diff_params,
                 scaling_matrix=self.scaling_matrix, num_res=num_res,
                 equations=equations, param_types=param_types,
                 param_values=param_values,
                 relax_data=relax_data, errors=relax_error,
                 bond_length=r, csa=csa, num_frq=num_frq,
                 frq=frq, num_ri=num_ri,
                 remap_table=remap_table, noe_r1_table=noe_r1_table,
                 ri_labels=ri_labels, gx=self.relax.data.gx,
                 gh=self.relax.data.gh,
                 g_ratio=self.relax.data.g_ratio,
                 h_bar=self.relax.data.h_bar,
                 mu0=self.relax.data.mu0, num_params=num_params,
                 vectors=xh_unit_vectors)
and generic_minimise:
    results = generic_minimise(func=self.mf.func,
                               dfunc=self.mf.dfunc, d2func=self.mf.d2func, args=(),
                               x0=self.param_vector,
                               min_algor=min_algor, min_options=min_options,
                               func_tol=func_tol,
                               grad_tol=grad_tol, maxiter=max_iterations,
                               full_output=1,
                               print_flag=print_flag, A=A, b=b)
If it is the main loop over the minimisation instances which is
parallelised for MPI, etc., then this code won't need modification.
The questions here are
1. What are all these parameters? Either I am misreading things or else not
understanding, because I couldn't find definitions for everything.
All of the arguments for the instantiation of the Mf class are setup
at the start of the main minimisation loop, i.e. between lines 2119
and 2324. This is most of the code of the 'minimise()' function.
Indeed, but I am not clear in all cases as to what they contain...
2. what is going to change between runs or even over runs of the relax
program.
For each iteration of the main loop, these arguments and parameters
will change.
Not necessarily? Certainly things such as remap_table, ri_labels, etc. do
not seem to change between passes through the loop.
Clearly some things don't change at all, and it could even be asked why,
for example, h_bar is a parameter to Mf (there may be something deep I am
missing here?)
h_bar in the new 1.3 code need not be sent in. I have created the
module 'physical_constants' from which it can be imported. Every
argument to the Mf instantiation, except for h_bar and mu0, will be
different for each minimisation 'instance'.
That's good.
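A minimal sketch of that arrangement, assuming a module laid out along the lines of the 'physical_constants' module mentioned above. The class name MfSketch and its two-argument signature are illustrative stand-ins, not the real Mf interface; the h_bar value is the one printed in the debug output later in this message.

```python
from math import pi

# Module-level constants, as they might appear in a 'physical_constants'
# style module (layout assumed for illustration).
h_bar = 1.05457159642e-34   # value taken from the debug output in this thread
mu0 = 4.0 * pi * 1e-7       # agrees with the mu0 printed in the debug output


class MfSketch:
    """Hypothetical stand-in for Mf: only per-instance data is passed in."""

    def __init__(self, relax_data, errors):
        self.relax_data = relax_data
        self.errors = errors
        # Constants are read from module scope rather than sent as arguments.
        self.h_bar = h_bar
        self.mu0 = mu0


mf = MfSketch(relax_data=[0.8293, 12.85], errors=[0.023, 0.13])
```

With the constants out of the argument list, everything left in the call is data that belongs to one minimisation instance, which is also what would have to travel to a remote node.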
For example if the
param_set is 'all', then the 'relax_data' argument will be the
relaxation data of all selected spin systems. If param_set is 'mf',
then 'relax_data' will be the relaxation data of a single spin.
Some other things may only change at specific points in the program. For
example the vectors of the molecules should only change when the vectors
function of structure or pdb is called.
Other things are per residue, but which of them???
All but h_bar and mu0.
And other things differ by type of model and minimisation....
Yep, see above.
As a note, typical input parameters for a local tm calculation are:
-#-initialise Mf residue - 3 LEU
-#--------------
-#-
-#-init_params [ 1000.]
-#-param_set local_tm
-#-diff_type sphere
-#-diff_params None
-#-scaling_matrix [ [ 1.00000000e-12]]
-#-num_res 1
-#-equations ['mf_orig']
-#-param_types [['local_tm']]
-#-param_values None
-#-relax_data [array([ 0.8293, 12.85 , 0.9528, 12.57 , 0.0983])]
-#-errors [array([ 0.023 , 0.13 , 0.0253, 0.171 , 0.0278])]
-#-bond_length [1.0200000000000001e-10]
-#-csa= [-0.00017199999999999998]
-#-num_frq [2]
-#-frq [[750800000.0, 599.71900000000005]]
-#-num_ri [5]
-#-remap_table [[0, 0, 1, 1, 1]]
-#-noe_r1_table [[None, None, None, None, 2]]
-#-ri_labels [['R1', 'R2', 'R1', 'R2', 'NOE']]
-#-gx -27126000.0
-#-gh 267522212.0
-#-g_ratio -9.862206444
-#-h_bar 1.05457159642e-34
-#-mu0 1.25663706144e-06
-#-num_params [1]
-#-vectors [None]
-#-
-#-generic minimisation residue - 3 LEU
-#-----------------------------
-#-
-#-constraints 1
-#-func <bound method Mf.func_local_tm of <maths_fns.mf.Mf instance at 0x4079fc0c>>
-#-dfunc <bound method Mf.dfunc_local_tm of <maths_fns.mf.Mf instance at 0x4079fc0c>>
-#-d2func <bound method Mf.d2func_local_tm of <maths_fns.mf.Mf instance at 0x4079fc0c>>
-#-args ()
-#-print x0 [ 1000.]
-#-print min_algor Method of Multipliers
-#-min_options ('newton',)
-#-func_tol 1e-25
-#-grad_tol None
-#-maxiter 10000000
-#-full_output 1
-#-print_flag 1
-#-constrained
-#-A [[ 1.]
-#-b [ 0. -200000.]
As an aside, when the redesign of the spin_loops and minimise/model
loops cuts in, it would be a good idea (from the parallel point of view)
to have the spin loop running faster than the minimise/model loop.
Sorry, I wasn't quite clear here: it's not computational speed I am
talking about but the speed of the 'loop counter',
e.g. it would be nice to have
    for residue in all residues:
        for model in models:
            do_stuff-(tm)
as opposed to
    for model in models:  # currently at the user level
        for residue in all residues:
            do_stuff-(tm)
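The two nesting orders can be compared with a small sketch. The residue labels are hypothetical and only three of the preset model names are used; the point is only that both orders visit the same (residue, model) pairs while grouping them differently.

```python
from itertools import product

residues = ["res1", "res2", "res3"]   # hypothetical residue labels
models = ["m0", "m1", "m2"]           # a subset of the preset model names

# Residue loop outermost: all models for one residue are generated together.
residue_major = list(product(residues, models))

# Model loop outermost (the current user-level ordering described above).
model_major = [(res, mod) for mod, res in product(models, residues)]

# Both orderings cover exactly the same (residue, model) pairs.
assert sorted(residue_major) == sorted(model_major)
```

With the residue loop outermost, the work for one residue arrives as a contiguous batch, which is the shape needed if a batch per residue is to be handed to a node.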
Now that might need something of the form
    # Set the run names (also the names of preset model-free models).
    if local_tm:
        self.runs = ['tm0', 'tm1', 'tm2', 'tm3', 'tm4', 'tm5',
                     'tm6', 'tm7', 'tm8', 'tm9']
    else:
        self.runs = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7',
                     'm8', 'm9']
    run.create_composite('super')
    for name in self.runs:
        run.create(name, 'mf')
        composite_add('super', name)
    minimise('newton', run='super')
which would minimise all runs in parallel...
and I understand from Chris that we are planning to do
    # Set the run names (also the names of preset model-free models).
    if local_tm:
        self.runs = ['tm0', 'tm1', 'tm2', 'tm3', 'tm4', 'tm5',
                     'tm6', 'tm7', 'tm8', 'tm9']
    else:
        self.runs = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7',
                     'm8', 'm9']
    minimise('newton', runs=self.runs)
which would also work.
Now comes the tricky bit:
all the minimisations etc. would now become functions to set up
minimisations and, say, submit them to a queue with a suitable object to
allow the results to be sorted out later.
Then at the end of minimise('newton', runs=self.runs) you would collect
in all the results from all calculations and complete the calculation, so
we have something like
    for residue in residues:
        for run in runs:
            calculation_instance = setup_calculation(residue, run)
            queue.submit(calculation_instance)
    while queue.not_complete():
        result = queue.get_result()
        result.record(self.relax.data)
This will allow the maximum number of calculations to be conducted in
parallel and will intrinsically load balance as well as we can get.
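The submit-then-collect pattern can be sketched with Python's standard concurrent.futures module. This is only an illustration under stated assumptions: a thread pool stands in for whatever MPI or cluster queue is eventually used, and setup_calculation, minimise_instance, and the fake chi-squared value are hypothetical placeholders rather than relax functions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def setup_calculation(residue, run):
    """Hypothetical setup step: bundle everything one minimisation needs."""
    return {"residue": residue, "run": run}


def minimise_instance(calc):
    """Placeholder for the real optimisation; returns a fake chi-squared."""
    chi2 = float(len(calc["residue"]) + len(calc["run"]))
    return calc["residue"], calc["run"], chi2


residues = ["3 LEU", "4 GLY"]     # illustrative residue labels
runs = ["m0", "m1", "m2"]
results = {}                      # stands in for the relax data store

with ThreadPoolExecutor(max_workers=3) as pool:
    # Submit every (residue, run) calculation up front.
    futures = [pool.submit(minimise_instance, setup_calculation(res, run))
               for res in residues for run in runs]
    # Record results in whatever order they finish.
    for future in as_completed(futures):
        residue, run, chi2 = future.result()
        results[(residue, run)] = chi2
```

Because each result carries its own (residue, run) key, the collection step does not depend on completion order.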
That's guaranteed. The speed of the spin loop (the spin_loop()
function of the 'generic_fns.selection' module of the 1.3 line) will
be limited by the internal Python looping speed. The main loop of the
minimise() function is limited by the call to generic_minimise() which
should be many orders of magnitude slower.
So you could split by residue for parallelising but send off all the
required model minimisations for each residue at the same time, which would
give implicit load balancing and coarser gains on homogeneous parallel
computers:
The minimise() main loop does all of this. Splitting by residue only
makes sense for the 'mf' and 'local_tm' param_set values. All
residues are involved in the diffusion tensor optimisation 'diff' and
the complete optimisation 'all'.
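The distinction between the four param_set values can be written down as a small sketch. The function name minimisation_instances is hypothetical and the grouping below only restates the rule given in this thread; it is not the code in 'specific_fns/model_free.py'.

```python
def minimisation_instances(param_set, residues):
    """Group residues into minimisation instances (illustrative only)."""
    if param_set in ("mf", "local_tm"):
        # One independent instance per residue: splittable across nodes.
        return [[res] for res in residues]
    if param_set in ("diff", "all"):
        # A single instance that involves every residue.
        return [list(residues)]
    raise ValueError("unknown param_set: %r" % (param_set,))


per_residue = minimisation_instances("mf", [1, 2, 3])
global_fit = minimisation_instances("all", [1, 2, 3])
```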
i.e. for six residues and 3 nodes
node 1 calculates
    res 1 [m1 m2 m3...]
    res 2 [m1 m2 m3...]
node 2 calculates
    res 3 [m1 m2 m3...]
    res 4 [m1 m2 m3...]
node 3 calculates
    res 5 [m1 m2 m3...]
    res 6 [m1 m2 m3...]
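A minimal sketch of that batching, assuming residues are simply split into contiguous, near-equal blocks (the partition helper is hypothetical; any real scheme might weight residues by expected cost instead):

```python
def partition(items, num_nodes):
    """Split items into num_nodes contiguous, near-equal batches."""
    base, extra = divmod(len(items), num_nodes)
    batches, start = [], 0
    for node in range(num_nodes):
        # The first 'extra' nodes take one additional item each.
        size = base + (1 if node < extra else 0)
        batches.append(items[start:start + size])
        start += size
    return batches


residues = [1, 2, 3, 4, 5, 6]
batches = partition(residues, 3)
```

Each node would then run every requested model for the residues in its batch.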
This obviously places some limitations on the design of the
minimisation function, as it might need to have a setup and teardown
region that copes with this batched data....
The minimise() main loop is the finest grain parallelisation you can
get without writing a specific parallelised optimisation algorithm.
Actually what I was aiming for was as coarse grained and well load
balanced a set of calculations as possible, i.e. a minimum of
communication overhead (bigger chunks of data with similar
computation times going over the wire).
regards
gary
Cheers,
Edward
Just as another comment (or two):
1. Why do we send no arguments to the functions, e.g.
-#-num_frq [2]
-#-frq [[750800000.0, 599.71900000000005]]
2. How does the data return from minimisation work (specifically, why is
param_vector an instance variable of Model_free)?
e.g. we have
    self.param_vector, self.func, iter, fc, gc, hc, self.warning = results
    ....
    # Disassemble the parameter vector.
    self.disassemble_param_vector(index=index, sim_index=sim_index)
inside a tight loop. So even though self.param_vector is an instance
variable, it doesn't contain state for a Model_free object; it just keeps
being overwritten by the latest contents of the result.
So why not
    param_vector, self.func, iter, fc, gc, hc, self.warning = results
    ....
    # Disassemble the parameter vector.
    self.disassemble_param_vector(param_vector, index=index,
                                  sim_index=sim_index)
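Taken one step further, the unpacking could avoid instance state altogether. The sketch below is an assumption-laden illustration: both function names and the returned dictionary layout are hypothetical, and only the seven-element shape of the results tuple is taken from the code quoted above.

```python
def disassemble_param_vector(param_vector, param_names):
    """Hypothetical stateless version: map values onto parameter names."""
    return dict(zip(param_names, param_vector))


def record_result(results, param_names):
    """Unpack one 7-element results tuple without touching shared state."""
    param_vector, func_value, iter_count, fc, gc, hc, warning = results
    return {
        "params": disassemble_param_vector(param_vector, param_names),
        "chi2": func_value,
        "iterations": iter_count,
        "warning": warning,
    }


# Fake results tuple for a local_tm fit (the numbers are made up).
fake_results = ([1000.0], 12.5, 20, 25, 21, 20, None)
record = record_result(fake_results, ["local_tm"])
```

Nothing here is overwritten between loop passes, so results returned from several nodes could be recorded in any order.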
regards
gary
--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK Tel. +44-113-3433024
email: garyt@xxxxxxxxxxxxxxx Fax +44-113-2331407
-------------------------------------------------------------------