Re: r3237 - in /branches/multi_processor: multi/mpi4py_processor.py relax


Posted by Gary S. Thompson on March 20, 2007 - 12:31:
Edward d'Auvergne wrote:

On 3/20/07, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:

Edward d'Auvergne wrote:

> Hi,
>
> Just a quick point, it would be good to either start a new thread for
> these types of questions or change the subject field (unless you use
> the Gmail web interface like I am at the moment).  People may miss
> important discussions with such scary subject lines!
>
>
> On 3/20/07, Gary S. Thompson <garyt@xxxxxxxxxxxxxxx> wrote:
>
>> garyt@xxxxxxxxxxxxxxx wrote:


[snip]

>> Now a question: what is the best way to get an eternally running relax
>> interpreter I can just fire commands at (for the slaves)?
>
>
> The prompt based interface (as well as the script interface) is only
> one way of invoking relax.  An important question is how should we
> present relax to the user when using MPI.  Should the parent process
> present a functional interpreter or should operation be solely
> dictated by a script?


I already have a prompt running on the master; my idea is that the relax user should see no difference (apart from performance) when using the parallel version.


I've just played around with that and it does look like the best
option for user flexibility.


> Or should a completely different mechanism of
> operation be devised for the interface of the parent.  For the grid
> computing code the parent UI is either the prompt or the script while
> the slaves use the interface started by 'relax --thread'.  The slaves
> use none of the user functions and only really invoke the number
> crunching code.

Now this is what I couldn't follow. How much of the relax environment is
essential for a slave/thread, and if we don't want to do it all by
complete pickles of relax data structures (how big is a complete relax
data structure for a typical set of runs?), where should I start?


My post at https://mail.gna.org/public/relax-devel/2007-03/msg00097.html
(Message-id: <1174384147.29205.20.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>)
hopefully answers these questions.



> For the MPI slaves (I'm assuming these are separate processes with
> different PIDs running on different nodes) we should avoid the
> standard UI interfaces as these are superfluous.  In this case a
> simple MPI interface should probably be devised - it accepts the MPI
> commands and returns data back to the parent.  Is this though stdin,
> stdout, and stderr in MPI?  My knowledge of MPI is very limited.
>


1. via pickled objects which will integrate the results back into the master copy, or


If implemented at the 'minimise_mpi()' model-free method, then this
would probably be the best option.

2. as text strings which are printed with a process number in front (so
that complete parallel log files can be grepped out of the main output).


However if you see advantages with this option, then maybe this is better.
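
Purely as an illustration of option 2, a slave could prefix every line of its
output with its process number so that per-slave logs can be grepped back out
of the combined output (e.g. "grep '^2: ' relax.log").  The helper name below
is an assumption, not code from mpi4py_processor.py:

    def print_with_rank(rank, text):
        # Prefix each line with the originating process number.
        for line in text.splitlines():
            print '%d: %s' % (rank, line)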


I don't currently assume that the compute node has available or usable
or shared disk space.  Everything comes back to the master.



I expected MPI to operate in this way. This means that none of the grid computing code will be of use to you.

I only assumed that this would be a restriction for the MPI stuff; other code can fit into the framework and access disks via NFS or in other ways if needed ;-)




The place to look at in mpi4py_processor is:

lines 59-71, which send a command from the master and either print or
execute the resulting object (checks for exceptions, repeated feedback,
and command completion are still to come; a rough sketch of those checks
follows the code below):

        # Broadcast the command object to every slave (rank 0 is the master,
        # so start at 1).
        for i in range(1, MPI.size):
            MPI.COMM_WORLD.Send(buf=command, dest=i)

        # Collect one reply from each slave: either an object to execute on
        # the master or text to print.
        for i in range(1, MPI.size):
            elem = MPI.COMM_WORLD.Recv(source=i)
            if hasattr(elem, 'run'):
                # An object to run with the master's relax instance and processor.
                elem.run(relax_instance, relax_instance.processor)
            else:
                #FIXME can't cope with multiple lines
                print i, elem
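
As a purely hypothetical sketch (not the repository code), the planned checks
for exceptions, repeated feedback and command completion could be layered onto
the receive loop roughly like this, where COMPLETED is an assumed well-known
sentinel:

    COMPLETED = '__command_completed__'

    for i in range(1, MPI.size):
        while True:
            elem = MPI.COMM_WORLD.Recv(source=i)
            if elem == COMPLETED:
                # The slave has finished this command; move to the next slave.
                break
            elif isinstance(elem, Exception):
                # Re-raise the slave's failure locally on the master.
                raise elem
            elif hasattr(elem, 'run'):
                # An object to execute on the master (e.g. result unpacking).
                elem.run(relax_instance, relax_instance.processor)
            else:
                # Repeated textual feedback from the slave.
                print i, elem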

and lines 92-94, where all the commands are received and executed on the
slaves:

        while not self.do_quit:
            command = MPI.COMM_WORLD.Recv(source=0)
            command.run(self.relax_instance, self.relax_instance.processor)
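
A minimal sketch of how the do_quit flag could be driven by a well-known quit
command, assuming the processor passed to run() is the same object whose
do_quit attribute controls the loop above (the class name is an assumption):

    class QuitCommand:
        def run(self, relax_instance, processor):
            # Flip the flag so the slave's receive loop exits cleanly.
            processor.do_quit = True

On the master, shutting the slaves down would then just be one more send of a
QuitCommand() to each rank.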

It is the intention that the command protocol be very minimal:

0. send a command plus data to run on the remote slave as an object
which will get its run method executed with the local relax and
processor as arguments (master: communicator.run_command)


Maybe the run method should be part of the 'minimise()' model-free method?

Well, actually, in the end the model-free method would, in multiprocessor mode, create objects and send them. The commands at the moment appear to be part of the implementation of mpi4py_processor, but they aren't; they are generic objects.
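
A sketch of what such a generic command object might look like, following the
protocol being described; the class name and the calculate() hook are
illustrative assumptions, only run() and return_object() come from this thread:

    class SlaveCommand:
        def __init__(self, data):
            # Whatever the slave needs for its share of the number crunching.
            self.data = data

        def run(self, relax_instance, processor):
            # Executed on the slave with the local relax instance and processor.
            result = self.calculate(relax_instance)
            processor.return_object(result)

        def calculate(self, relax_instance):
            raise NotImplementedError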




1. the slave then writes a series of objects to the processor method
return_object


The model-free parameter vector and optimisation stats?

yep so this is outgoing



2. the master receives data, either (a dispatch sketch follows at the end of
this list):
   a. objects to execute on the master, which will also be given the
master relax and processor instances


Unpack the minimisation data and place it into the appropriate
location in the relax data storage object?

and this is incoming


   b. string objects to print


The optimisation print outs?

yes


   c. a command completion indicator back from the slave (a well-known
object)


Does the slave then die?

no!


   d. an exception (raising of a local exception on the master, which will
do stack trace printing for the master and the slave)


Ah, I didn't think of that one!


   e. None, a void return


???

A return if there is no result, but I guess I could also just use c., the command completion indicator.
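
Pulling the five return types together, a hypothetical master-side dispatch
could look like the following; the Completed class is an assumed name for the
well-known completion object, the rest follows a.-e. as described:

    class Completed:
        # Well-known completion marker sent by the slave (name is an assumption).
        pass

    def handle_returned(elem, relax_instance, processor):
        # Return True once the slave signals that the command has completed.
        if elem is None:
            # e. a void return - nothing to do.
            return False
        if isinstance(elem, Completed):
            # c. command completion indicator (the slave itself keeps running).
            return True
        if isinstance(elem, Exception):
            # d. re-raise locally so stack traces are printed on the master.
            raise elem
        if isinstance(elem, str):
            # b. a string object to print on the master.
            print elem
            return False
        # a. an object to execute on the master with the master's relax and
        # processor instances.
        elem.run(relax_instance, processor)
        return False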




>> do I modify interpreter.run to take a quit variable set to False, so
>> that run_script can be run with quit = False?
>
>
> Avoid the interpreter and wait at the optimisation steps - the only
> serious number crunching code in relax.


I agree! Interpreters are not required on the slave, just the relax data structures in a clean and usable state.


If you're working at the level of the model-free 'minimise()'
function, don't bother with the relax data structures!  See my
previous post mentioned above.

I follow now; my only worry is that the processing will be fairly fine-grained, causing a greater ratio of network traffic to processing.




One other question: how well behaved are the relax functions with regard to not gratuitously modifying global state? E.g. could I share one relax instance between several threads? The reason I ask is that, if they are well behaved, many of the data transfer operations in a threaded environment with a single memory space would become no-ops ;-) Nice!


I wouldn't share the state.  Again if you work at the 'minimise()'
model-free method level, copying it and renaming it to
'minimise_mpi()', that new function could be made to not touch the
relax data storage object.  Maybe there should be a
'minimise_mpi_master()' that contains the setup code and a
'minimise_mpi_slave()' which contains the optimisation code and the
unpacking code.  This should be very simple to copy and modify from
the current code!

Cheers,

Edward


Actually there shouldn't be anything labelled MPI outside the specific instance of a processor (that uses MPI); everything else should be generic (a rough sketch follows the list below):


create remote command
send remote command
work with results from remote command...
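
A sketch of that generic flow, deliberately free of MPI-specific names;
run_command comes from the thread (master: communicator.run_command), while
GridSearchCommand, unpack_result, and the return value of run_command are all
illustrative assumptions:

    # Create the remote command from the model-free setup code.
    command = GridSearchCommand(data)
    # Send it via whatever processor/communicator is in use (MPI or otherwise).
    results = relax_instance.processor.run_command(command)
    # Work with the results returned from the remote command.
    for result in results:
        unpack_result(result, relax_instance)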



Regards, Gary

--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK             Tel. +44-113-3433024
email: garyt@xxxxxxxxxxxxxxx                   Fax  +44-113-2331407
-------------------------------------------------------------------




