Hi Gary,

I think I'll start to modify the design of the multi-processor package. What is required is a data storage container within each Processor instance (on each node). As the Processor is a singleton, with only one per node, this container would be unique to the node. The multi-processor API would need a function that calling code on the master can use to send data to all slaves for storage in this container. As the parallelisation is at the level of the function call, almost all data used by the slaves is identical, the only difference being a few parameters. The mechanism could be used both at the initialisation of the target function class, to send invariant data once at the start, and at the level of the target function call, to send data that changes per call (i.e. the model parameters). The slave_command objects will then be sent to the slaves, and the slaves can access the data within these command objects and the Processor.data_container objects, again probably via an API function. If you don't think this is a good idea, or if you can see that you have already implemented something similar that I have missed, please say.

For the API (multi/__init__.py), I am thinking of the following pair of optional functions:

    def data_fetch(name=None):
        """API function for obtaining data from the Processor instance's data store.

        This is for fetching data from the data store of the Processor instance.

        @keyword name:  The name of the data structure to fetch.
        @type name:     str
        @return:        The value of the associated data structure.
        @rtype:         anything
        """

    def data_upload(name=None, value=None, rank=None):
        """API function for sending data to be stored on the Processor of the given rank.

        This can be used for transferring data from Processor instance i to the data store of Processor instance j.

        @keyword name:  The name of the data structure to store.
        @type name:     str
        @keyword value: The data structure.
        @type value:    anything
        @keyword rank:  An optional argument to send data only to the Processor of the given rank.  If None, then the data will be sent to all Processor instances.
        @type rank:     None or int
        """

The parallelised model-free code will be unaffected, as its parallelisation is at a much higher level and does not need this mechanism.

Any feedback would be appreciated.

Cheers,

Edward


On 14 March 2012 16:17, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
> Hi Gary,
>
> Before I start hacking into the multi-processor package, I was wondering
> if you know of a way of pre-sending data to the slave processors using the
> current design?  The reason is that I would like to have the
> parallelisation at the lowest level of the target function.  However,
> there is a massive quantity of data which does not change at the target
> function level and which would be better transmitted to, and stored on,
> the slaves prior to optimisation (atomic positions, bond vectors, base NMR
> data, missing data flags, etc.).  This is required to keep the data
> transmission of the slave_command objects from killing scalability.
>
> Any ideas?
>
> Cheers,
>
> Edward
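To make the proposal above concrete, here is a minimal, purely illustrative sketch of the semantics I have in mind for the per-Processor data container and the data_fetch()/data_upload() pair.  None of these names exist in the current multi package, and the rank-to-rank transport is faked with a plain dictionary rather than real inter-node communication, so this only shows the intended behaviour, not an implementation:

```python
# Hypothetical sketch only - the Processor class, data_store attribute, and
# the data_fetch()/data_upload() functions are proposed names, not part of
# the current multi package.  The inter-node transport is simulated by a
# module-level dictionary standing in for the per-node singletons.

_processors = {}  # Stand-in for the per-node Processor singletons, keyed by rank.

class Processor:
    """Fake per-node Processor singleton holding the proposed data container."""

    def __init__(self, rank):
        self.rank = rank
        self.data_store = {}  # The proposed per-Processor data container.
        _processors[rank] = self

def data_upload(name=None, value=None, rank=None):
    """Store value under name on the Processor of the given rank.

    If rank is None, the data is sent to all Processor instances.
    """
    targets = _processors.values() if rank is None else [_processors[rank]]
    for proc in targets:
        proc.data_store[name] = value

def data_fetch(name=None, rank=0):
    """Fetch the named structure from the Processor's data store.

    The rank argument exists here only to select a fake 'local' Processor;
    in the real API the slave would simply read its own singleton.
    """
    return _processors[rank].data_store[name]

# Master side:  broadcast the invariant data once, prior to optimisation.
for r in range(3):
    Processor(rank=r)
data_upload(name='bond_vectors', value=[[0.0, 0.0, 1.0]])    # To all slaves.
data_upload(name='params', value={'tm': 1e-8}, rank=2)       # To rank 2 only.

# Slave side:  the target function pulls the data from its local store.
print(data_fetch(name='bond_vectors', rank=1))
print(data_fetch(name='params', rank=2))
```

The point of the broadcast case (rank=None) is the scalability argument from the earlier message: the large invariant structures cross the wire once, while the per-call slave_command objects carry only the few changing parameters.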