Re: Pre-sending data in the multi-processor package. -- March 21, 2012

Hi Gary,

I think I'll start to modify the design of the multi-processor
package.  What is required is a data storage container within each
Processor instance (on each node).  As the Processor is a singleton
with only one instance per node, this container would be unique on
that node.  There would need to be a function within the
multi-processor API that calling code on the master can use to send
data to all slaves to be stored in this data container.  As the
parallelisation code is at the level of the function call, almost all
data used by the slaves is identical, the only difference being a few
parameters.  The function could therefore be used at two levels: at
the initialisation of the target function class to send invariant
data once at the start, and at the target function call to send data
that changes per function call (i.e. the model parameters).  The
slave_command objects will then be sent to the slaves, and the slaves
can access the data within both these command objects and the
Processor.data_container objects, again probably via an API function.
If you don't think this is a good idea, or if you can see that you
have implemented something similar that I have missed, please say.
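
As a rough illustration of the idea, here is a minimal single-process
sketch.  It assumes the data container is a plain dictionary on the
Processor singleton; the Slave_command class, its run() method and the
'bond_vectors' key are hypothetical names used only for this example,
and no real data transport is involved.

```python
class Processor:
    """Stand-in for the per-node Processor singleton (assumed layout)."""

    _instance = None

    def __new__(cls):
        # Create the single instance on first use, then always return it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.data_container = {}
        return cls._instance


class Slave_command:
    """Hypothetical command carrying only the per-call parameters."""

    def __init__(self, params):
        self.params = params

    def run(self):
        # Invariant data comes from the node-local store, not the command.
        vectors = Processor().data_container['bond_vectors']
        return sum(p * v for p, v in zip(self.params, vectors))


# Stored once, before optimisation starts.
Processor().data_container['bond_vectors'] = [1.0, 2.0, 3.0]

# Each function call then only needs to carry the changing parameters.
result = Slave_command(params=[0.5, 0.5, 0.5]).run()
```

The command object stays small because it only holds the parameters,
while the large invariant structures sit in the store on each node.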

For the API (multi/__init__.py), I am thinking of the following pair
of optional functions:

def data_fetch(name=None):
    """API function for obtaining data from the Processor instance's data store.

    This is for fetching data from the data store of the Processor instance.


    @keyword name:  The name of the data structure to fetch.
    @type name:     str
    @return:        The value of the associated data structure.
    @rtype:         anything
    """


def data_upload(name=None, value=None, rank=None):
    """API function for sending data to be stored on the Processor of the given rank.

    This can be used for transferring data from Processor instance i to the data store of Processor instance j.


    @keyword name:  The name of the data structure to store.
    @type name:     str
    @keyword value: The data structure.
    @type value:    anything
    @keyword rank:  An optional argument to send data only to the Processor of the given rank.  If None, then the data will be sent to all Processor instances.
    @type rank:     None or int
    """

The parallelised model-free code will be unaffected as the
parallelisation is at a much higher level and does not need this
mechanism.  Any feedback would be appreciated.

Cheers,

Edward



On 14 March 2012 16:17, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi Gary,

Before I start hacking into the multi-processor package, I was
wondering if you know of a way of pre-sending data to slave processors
using the current design?  The reason is that I would like to have
the parallelisation at the lowest level of the target function.  But
there is a massive quantity of data which doesn't change at the target
function level which would be better to transmit to and store on the
slaves prior to optimisation (atomic positions, bond vectors, base NMR
data, missing data flags, etc.).  This is required to keep the data
transmission of the slave_command objects from killing scalability.
Any ideas?

Cheers,

Edward
