Pickling problems with the relax data storage singleton.



Posted by Chris MacRaild on November 27, 2007 - 00:17:
On Nov 26, 2007 9:37 PM, Edward d'Auvergne <edward.dauvergne@xxxxxxxxx> wrote:
On Nov 26, 2007 9:39 AM, gary thompson <garyt.and.sarahb@xxxxxxxxx> wrote:




On Nov 26, 2007 2:32 AM, Chris MacRaild <macraild@xxxxxxxxxxx> wrote:

On Nov 23, 2007 7:14 PM, Gary Thompson <garyt@xxxxxxxxxxxxxxx> wrote:


[snip]


Another (better?) option would be to do the saving and restoring of
state at a lower level. So instead of simply pickling the whole Data
object, we have save and restore methods of the Data class that do the
pickling in a more controlled way. This seems to me more true to the
intent of the singleton pattern, avoiding the complications Gary
refers to. Also, the control over what gets saved and how might be
useful.

A quick sketch of the sort of thing I'm thinking:

from pickle import Pickler, Unpickler

class Data(dict):
    ...
    def _save(self, file):
        P = Pickler(file)
        dont_save = []  # a list of attributes that don't need saving, e.g. methods
        for name, attr in self.__dict__.items():
            if name not in dont_save:
                P.dump((name, attr))

    def _restore(self, file):
        U = Unpickler(file)
        while True:
            try:
                name, attr = U.load()
            except EOFError:
                break
            setattr(self, name, attr)

Then the user commands save_state and restore_state are just
front-ends to Data._save and Data._restore. Pickle needn't worry
itself with our unusual namespace, because only attributes of Data are
pickled, not Data itself. Save and restore functions are methods of
the Singleton object, so there is no risk of breaking the Singleton
pattern. Finally, we have the basis of a mechanism there to control
what gets saved/restored and how.
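To make that concrete, here is a self-contained round-trip of the scheme,
with hypothetical save_state/restore_state front-ends (the front-end names
and file handling are my assumptions, not actual relax user commands; the
Data sketch is repeated so the example runs on its own):

```python
from pickle import Pickler, Unpickler


class Data(dict):
    """Minimal stand-in for the relax data store (a sketch, not the real class)."""

    def _save(self, file):
        P = Pickler(file)
        dont_save = []  # black list of attributes that need no saving, e.g. methods
        for name, attr in self.__dict__.items():
            if name not in dont_save:
                P.dump((name, attr))

    def _restore(self, file):
        U = Unpickler(file)
        while True:
            try:
                name, attr = U.load()
            except EOFError:
                # Unpickler.load() raises EOFError once the stream is exhausted.
                break
            setattr(self, name, attr)


def save_state(data, file_name):
    """Hypothetical user-command front-end to Data._save."""
    with open(file_name, 'wb') as file:
        data._save(file)


def restore_state(data, file_name):
    """Hypothetical user-command front-end to Data._restore."""
    with open(file_name, 'rb') as file:
        data._restore(file)
```

Because each attribute is dumped as a separate (name, value) pair, the
Data instance itself is never pickled, so the singleton machinery never
enters the pickle stream.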

Cheers,

Chris





This seems good, but I have some other possibly interesting questions:

1. Should data be a singleton?  What happens if I need to load two data
hierarchies, for example for comparison between two runs?  (If I am
barking up the wrong tree about the current design here, please
disregard this.)

The data storage singleton can handle this.  It is a DictionaryType
object, its keys corresponding to individual PipeContainer objects.
These PipeContainers are the data pipes, the embodiment of the morphed
'runs' concept.  So the change was to put everything into a
PipeContainer, and the comparisons will be between the data contents of
any set of PipeContainers.  As relax runs, the contents of each data
pipe are modified and molded.  The links between these pipes include
functions for switching between them, copying data between them, and
merging the contents together and placing the result into a new pipe
(e.g. model selection, hybridisation, etc.).  These are just very
basic plumbing concepts ;)
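As a toy illustration of that layout (the class and attribute names here
are inferred from the description above, not taken from the relax
sources):

```python
import copy


class PipeContainer:
    """One data pipe: everything belonging to a single 'run'."""

    def __init__(self):
        self.mol = []      # e.g. molecular data
        self.diff = None   # e.g. a diffusion tensor


class Data(dict):
    """Singleton dictionary mapping pipe names to PipeContainer objects."""

    _instance = None

    def __new__(cls):
        # Every call hands back the one shared store.
        if cls._instance is None:
            cls._instance = dict.__new__(cls)
        return cls._instance


# Basic plumbing on the shared store:
store = Data()
store['run_1'] = PipeContainer()
store['run_2'] = PipeContainer()

# 'Switching' pipes is just picking a different key,
current = store['run_1']
# and copying data between pipes is an explicit deep copy.
store['run_2'].mol = copy.deepcopy(store['run_1'].mol)
```

Two runs then coexist as two keys of the one singleton, so comparison
between runs never requires a second Data instance.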


2. We have to be careful we don't pickle any python state (what is
python state can change between versions).  How do we identify what is
python state - create an empty object?

The saved states are not very portable.  I'm hoping the unit tests
will pull out these issues, as there will be a few saved states in the
shared data directory you created.


Pickling can do nothing other than save python state, which is why
pickle-based saved states are not portable. In principle we could try
to ensure that we only pickle 'standard library' python state (i.e.
only objects that are defined in the standard library). Then saved
states would be portable between relax versions (but not necessarily
python versions, or architectures, I suspect). This would require
much more complicated and high-maintenance save/restore code that
recursed down the attributes of the Data object and reduced them to
standard library objects. Better, I think, to accept that saved state
is not the right tool for portable data storage, and therefore not
care too much about pickling state that might change in future relax
(or python) versions.
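For illustration only, the recursive reduction being argued against might
look roughly like this (a sketch under my own assumptions, not a proposal;
note it is lossy, e.g. tuples come back as lists, which hints at why it
would be high-maintenance):

```python
def to_builtin(obj):
    """Recursively reduce an object graph to standard-library types only."""
    if isinstance(obj, (bool, int, float, str, bytes, type(None))):
        return obj
    if isinstance(obj, (list, tuple, set)):
        # Note: tuples and sets are flattened to lists here, losing type info.
        return [to_builtin(item) for item in obj]
    if isinstance(obj, dict):
        return {key: to_builtin(value) for key, value in obj.items()}
    # Any custom class is reduced to its instance dictionary.
    return {key: to_builtin(value) for key, value in vars(obj).items()}
```

Every relax-specific class would need its reconstruction handled on the
restore side as well, which is exactly the maintenance burden the
paragraph above argues is not worth carrying.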


3. Chris's approach will add an extra maintenance requirement in the
future, as all new fields have to be registered for saving.

That is true.  Hopefully when new objects are added, a corresponding
unit test is created to catch them.  But yes, this is a danger.  So
rather than a white list of objects to include, a black list of
objects to exclude would be better, as then new objects will
automatically be saved.


No. I proposed registering attributes that need *not* be saved,
precisely to avoid this issue. Also keep in mind that the status quo
is to pickle the entire Data object, so even if we do forget to
exclude attributes, we are in no worse position than we currently are.

Chris



Powered by MHonArc, Updated Tue Nov 27 09:41:47 2007