Re: Pickling problems with the relax data storage singleton.




Posted by Edward d'Auvergne on November 26, 2007 - 11:37:
On Nov 26, 2007 9:39 AM, gary thompson <garyt.and.sarahb@xxxxxxxxx> wrote:




On Nov 26, 2007 2:32 AM, Chris MacRaild <macraild@xxxxxxxxxxx> wrote:

On Nov 23, 2007 7:14 PM, Gary Thompson <garyt@xxxxxxxxxxxxxxx> wrote:


[snip]

Another (better?) option would be to do saving and restoring state at
a lower level. So instead of simply pickling the whole Data object, we
have save and restore methods of the Data class that do the pickling
in a more controlled way. This seems to me more true to the intent of
the singleton pattern, avoiding the complications Gary refers to. Also
the control over what gets saved and how might be useful.

A quick sketch of the sort of thing I'm thinking:

from pickle import Pickler, Unpickler

class Data(dict):
   ...
   def _save(self, file):
       P = Pickler(file)
       # A list of attributes that don't need saving, e.g. methods.
       dont_save = []
       for name, attr in self.__dict__.items():
           if name not in dont_save:
               # Each attribute is dumped as its own (name, value) record.
               P.dump((name, attr))

   def _restore(self, file):
       U = Unpickler(file)
       while True:
           try:
               name, attr = U.load()
           except EOFError:
               # End of file: every saved record has been read.
               break
           setattr(self, name, attr)

Then the user commands save_state and restore_state are just
front-ends to Data._save and Data._restore. Pickle needn't worry itself
with our unusual namespace, because only attributes of Data are
pickled, not Data itself. Save and restore functions are methods of
the Singleton object, so there is no risk of breaking the Singleton
pattern. Finally, we have the basis of a mechanism there to control
what gets saved/restored and how.

Cheers,

Chris





This seems good, but I have some other possibly interesting questions:

1. Should data be a singleton? What happens if I need to load two data
hierarchies, for example for a comparison between two runs? (If I am barking
up the wrong tree about the current design here, please disregard this.)

The data storage singleton can handle this.  It is a DictionaryType
object, its keys corresponding to individual PipeContainer objects.
These PipeContainers are the data pipes, the embodiment of the morphed
'runs' concept.  So the change was to put everything into a
PipeContainer and the comparisons will be between the data contents of
any sets of PipeContainers.  As relax runs, the contents of each data
pipe are modified and molded.  The links between these pipes include
functions for switching between them, copying data between them, and
merging the contents together and placing the result into a new pipe
(e.g. model selection, hybridisation, etc.).  These are just very
basic plumbing concepts ;)
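
To illustrate the layout described above, here is a minimal sketch in
modern Python 3.  It is not the actual relax code: the names DataStore,
add and copy_pipe are assumptions made up for this example, and
PipeContainer is reduced to an empty attribute holder.

```python
import copy


class PipeContainer:
    """Hypothetical stand-in for a data pipe; it just holds attributes."""


class DataStore(dict):
    """Singleton dictionary mapping pipe names to PipeContainer objects."""

    _instance = None

    def __new__(cls, *args, **kwargs):
        # Always hand back the same instance (one common singleton idiom).
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def add(self, name):
        """Create an empty pipe under the given name (hypothetical helper)."""
        self[name] = PipeContainer()
        return self[name]

    def copy_pipe(self, src, dest):
        """Copy the full contents of one pipe into a new pipe."""
        self[dest] = copy.deepcopy(self[src])
        return self[dest]


store = DataStore()
run_a = store.add("run_a")
run_a.chi2 = 12.5

# A second "run" lives side by side with the first in the same singleton.
run_b = store.copy_pipe("run_a", "run_b")
run_b.chi2 = 9.0
```

Two runs can then be compared by looking at store["run_a"] and
store["run_b"], while DataStore() still always returns the one object.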


2. We have to be careful that we don't pickle any Python state (what counts
as Python state can change between versions). How do we identify what is
Python state? Create an empty object?

The saved states are not very portable.  I'm hoping the unit tests
will pull out these issues, as there will be a few saved states in the
shared data directory you created.


3. Chris's approach will add an extra maintenance requirement in the future,
as all new fields have to be registered for saving.

That is true.  Hopefully when new objects are added, a corresponding
unit test is created to catch it.  But yes, this is a danger.  So
rather than having a white list of objects to include, a black list of
objects to exclude will be better, as then new objects will
automatically be saved.
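
As a rough sketch of the black list idea (Python 3, not the actual relax
implementation), something like the following would do.  The _dont_save
set and the attribute names are assumptions for illustration only, and
the singleton machinery is left out to keep the example short.  As in
Chris's sketch, only the attributes of the object are pickled, not the
object itself.

```python
import io
import pickle


class Data(dict):
    # Hypothetical black list: attribute names that should never be saved.
    _dont_save = {"_scratch"}

    def save(self, file):
        """Pickle every instance attribute that is not black listed."""
        state = {
            name: attr
            for name, attr in self.__dict__.items()
            if name not in self._dont_save
        }
        pickle.dump(state, file)

    def restore(self, file):
        """Set the saved attributes back onto this object."""
        for name, attr in pickle.load(file).items():
            setattr(self, name, attr)


old = Data()
old.version = "1.3"
old.new_field = [1, 2, 3]    # never registered anywhere, but still saved
old._scratch = "temporary"   # black listed, so it is skipped

buf = io.BytesIO()
old.save(buf)
buf.seek(0)

new = Data()
new.restore(buf)
```

A newly added attribute such as new_field is picked up without touching
the save code, whereas a white list would silently drop it until
somebody remembered to register it.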


4. What are the implications for cross-version compatibility? This may not
be a requirement, but if it is, then we will have to implement code to shim
old formats into newer ones.

The save state is a temporary construct.  Portability sucks!  It's
awful.  Again hopefully the unit tests I have already written (and
future state unit tests) will catch and identify the issues.  The
results file is the way to go with portability.  It should contain all
the contents of a given PipeContainer.  And then loading the results
file should restore the PipeContainer.  The only issue is the PDB
object files (Scientific, etc) which are not saved in the results
file.  Btw, your unit test framework is awesome for debugging!!!  It
appears to be catching all my typos and mistakes.  And for coding it's
great as well.  Just write the unit tests first to cover all the
behavior of the functions, and then code until the tests pass.  It's
making my life so easy (and much more bug free).
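
For what it's worth, a test-first check of a state round trip might look
roughly like the sketch below (Python 3).  The save_state and
restore_state helpers here are hypothetical stand-ins written for this
example, not the relax functions, and a plain dictionary plays the role
of the data store.

```python
import io
import pickle
import unittest


def save_state(store, file):
    """Hypothetical helper: pickle the store's contents as a plain dict."""
    pickle.dump(dict(store), file)


def restore_state(store, file):
    """Hypothetical helper: replace the store's contents with a saved state."""
    store.clear()
    store.update(pickle.load(file))


class TestState(unittest.TestCase):
    def test_round_trip(self):
        """Saving and then restoring should reproduce the original contents."""
        store = {"pipe_a": {"chi2": 12.5}}
        buf = io.BytesIO()
        save_state(store, buf)
        buf.seek(0)

        restored = {"stale": None}   # old contents must be thrown away
        restore_state(restored, buf)
        self.assertEqual(restored, store)


# Run the test case in-process and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestState)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

The test is written against the behaviour wanted, and the two helpers
are then filled in until result.wasSuccessful() is true.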


5. If you put an XP hat on, you would go with the simple but crude method
for the moment and put in place a more complicated methodology if and when
we need it. (Could we design an interface that could deal with both, and
then implement the more complicated strategy if and when we need it?)

Unfortunately I couldn't make the first simple method work.

Regards,

Edward


