Re: Redesign of the relax data model: A HOWTO for breaking relax. -- October 11, 2006

On Wed, 2006-10-11 at 17:02 +1000, Edward d'Auvergne wrote:

This post is a proposal for the redesign of the relax data model.  This will
affect how data is input into the program, how data is selected, how
molecular structures are handled, how spin systems are handled, and how
many other parts of relax function.  Importantly, the internal structure
of 'self.relax.data' will completely change.  These modifications will
essentially break every part of relax (the isolated code in the
directories 'minimise', 'maths_fns', and 'docs' will be safe from the
carnage, as will a few files in the base directory).  If you have any
ideas for extending or improving the proposed data model, can see any
short-comings, deficiencies, or flaws, are familiar with the PDB
conventions, etc., your input is very much sought after.  The changes
should occur in the 1.3 line of the repository.  1.2 versions will be
unaffected - scripts will remain compatible and the 1.2 line will
continue to be supported with bug fixes, etc.

I have to apologise in advance for the size of this proposal.  To
simplify it I have divided the text into numbered sections.  Once this
initial parent message has been sent I will respond to it with the text
of the 4 major sections.  This will allow 4 major threads to branch off
from this message on the mailing list archive
(https://mail.gna.org/public/relax-devel).  If you have an opinion,
idea, etc. about a specific section, could you please post a separate
message in response to the relevant major section post?  Also if you
have unrelated ideas for one of these sections, could you post these as
separate messages as well?  For example if you have separate points
about sections 3.1 and 3.5.1, two different posts responding to the
parent Section 3 post would be appreciated.  Thanks.  This will help to
focus each discussion point into specific threads.

Edward



Redesign of the relax data model

Index:
1.  Why change?
    1.1  The runs
    1.2  The molecules
    1.3  The residues
    1.4  The spins
2.  A new run concept
    2.1  Parcelling up an abstract space
    2.2  The run data model
    2.3  The pipe concept
3.  Molecules, residues, and spins
    3.1  The spin data model
    3.2  The data selection concept - identifying spin systems
        3.2.1  Function arguments
        3.2.2  NH data of a single protein macromolecule
        3.2.3  A single organic molecule (non-polymeric)
        3.2.4  A single RNA or DNA macromolecule
        3.2.5  Complexes
    3.3  Regular expression
    3.4  The spin loop
    3.5  Molecule, sequence, and spin user function classes
        3.5.1  The 'molecule' user function class
        3.5.2  The 'sequence' user function class
        3.5.3  The 'spin' user function class
    3.6  The input and output files
4.  Conclusion




Before reading this post, please read the previous posts:

* The parent message 'Redesign of the relax data model:  A HOWTO for
breaking relax.' located at
https://mail.gna.org/public/relax-devel/2006-10/msg00053.html
(Message-id:
<1160550133.9523.54.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).

* Section 1 'Redesign of the relax data model:  1.  Why change?' located
at https://mail.gna.org/public/relax-devel/2006-10/msg00054.html
(Message-id:
<1160551172.9523.60.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).



2.  A new run concept

2.1  Parcelling up an abstract space

The general idea is to further increase the prominence of the 'run'.
Rather than relax executing in an abstract space where the 'run' is
passed into each user function as necessary, the idea is that relax
executes within a space dedicated to a certain 'run'.  So at the
relax prompt, you could type a user function such as:

relax> run.current()
'm8'

By working in the 'm8' run space, each user function can be executed
without the need for the 'run' argument.  Other user functions, such as
'run.switch()', can be used to change between runs.
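
The idea of a dedicated run space can be sketched in a few lines of
Python.  This is only an illustration and not relax code: the 'Session'
class, its attributes, and the 'set_value()' method are hypothetical
names invented for the example, and modern Python syntax is assumed.

```python
class Session:
    """Hypothetical stand-in for the relax interpreter state."""

    def __init__(self, runs):
        # One independent data space per run name.
        self.runs = {name: {} for name in runs}
        self.current_run = None

    def switch(self, name):
        """Make 'name' the current run (sketch of 'run.switch()')."""
        if name not in self.runs:
            raise KeyError(f"no such run: {name!r}")
        self.current_run = name

    def current(self):
        """Return the current run name (sketch of 'run.current()')."""
        return self.current_run

    def set_value(self, key, value):
        """Hypothetical user function: note the missing 'run' argument."""
        if self.current_run is None:
            raise RuntimeError("no current run has been selected")
        self.runs[self.current_run][key] = value


session = Session(["m8", "m9"])
session.switch("m8")
session.set_value("example", 1)   # Lands in 'm8' only.
```

The point of the sketch is that 'set_value()' never sees a run name; the
run is chosen once with 'switch()' and is implied from then on.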


2.2  The run data model

The current run name could be stored in the single data structure
'self.relax.run'.  The relax data structure could then be accessed by
typing 'self.relax.data[self.relax.run]'.  I.e. 'self.relax.data' is a
DictType object (it has key-value pairs) in which the run name key is
associated with a specific data container.  As most data structures in
the current relax data model are associated with a run (e.g.
'self.relax.data.diff[self.run]', 'self.relax.data.res[self.run]',
'self.relax.data.pdb[self.run]', etc.), the data model is significantly
simplified.
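
To make the difference concrete, here is a minimal sketch contrasting
the two layouts.  It is an assumption-laden illustration rather than the
real relax structures: 'SimpleNamespace' stands in for the relax data
containers, and the stored values are placeholder strings.

```python
from types import SimpleNamespace

# Current layout: each structure is itself a dictionary keyed by run name.
old_data = SimpleNamespace(
    diff={"m8": "diffusion data"},
    res={"m8": "residue data"},
    pdb={"m8": "structural data"},
)

# Proposed layout: the run name is the only key, and it maps to a single
# container holding everything that belongs to that run.
new_data = {
    "m8": SimpleNamespace(
        diff="diffusion data",
        res="residue data",
        pdb="structural data",
    ),
}

current_run = "m8"                 # Stands in for 'self.relax.run'.
container = new_data[current_run]  # 'self.relax.data[self.relax.run]'.

# The same information is reached with one run lookup instead of three.
same_diff = container.diff == old_data.diff[current_run]
same_res = container.res == old_data.res[current_run]
same_pdb = container.pdb == old_data.pdb[current_run]
```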

More information about the data model change is given in the message
located at https://mail.gna.org/public/relax-devel/2006-05/msg00008.html
(Message-id:
<7f080ed10605232038j5036278dg39136d75a05a9904@xxxxxxxxxxxxxx>) and the
response located at
https://mail.gna.org/public/relax-devel/2006-05/msg00010.html
(Message-id:
<7f080ed10605241912i7c35f574i94f139588c5fa16b@xxxxxxxxxxxxxx>).


2.3  The pipe concept

A single run can be thought of as a pipe where data is input, processed,
or output as user functions are called.  There are different types of
pipe for different analyses, e.g. a reduced spectral density mapping
pipe, a model-free pipe, an exponential curve-fitting pipe, etc.  When
running relax you choose which run (or pipe) you are currently in and
the 'run.switch()' user function allows you to jump between multiple
runs (or pipes).  The modification of user functions in which runs are
combined or branched (which can be thought of as the pipes merging or
splitting) would be straightforward.  For example, the
'model_selection()' user function currently accepts the following
arguments:

model_selection(self, method=None, modsel_run=None, runs=None)

In this case the 'modsel_run' can be dropped and the results of model
selection placed into the current run (or pipe).  The 'run' user
function class could contain the following user functions for pipe
manipulation:

run.copy()        # Create a new run (or pipe) with the current contents
                  # of another run (or pipe).
run.create()      # Create a new run (or pipe).  Switch to this pipe by
                  # default.
run.current()     # Print the current run (or pipe).
run.delete()      # Delete the given run (or pipe).
run.delete_all()  # Delete all runs.  Essentially deleting
                  # 'self.relax.data'.
run.hybridise()   # Fuse two runs (or pipes) into the current run (or
                  # pipe).  Overlapping data in the two runs must be
                  # identical!
run.list()        # Print all runs (or pipes).
run.switch()      # Switch to another run (or pipe).
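
One possible shape for these user functions is sketched below.  Nothing
here is settled: the 'RunStore' class, the argument names, the use of
plain dictionaries for the per-run data, and the exact merge behaviour
of 'hybridise()' are all assumptions made for the sake of the example.

```python
from copy import deepcopy


class RunStore:
    """Hypothetical sketch of the proposed 'run' user function class."""

    def __init__(self):
        self.data = {}        # Run name -> data for that run.
        self.current_name = None

    def create(self, name, switch=True):
        if name in self.data:
            raise ValueError(f"the run {name!r} already exists")
        self.data[name] = {}
        if switch:            # Switch to the new run by default.
            self.current_name = name

    def copy(self, source, target):
        if target in self.data:
            raise ValueError(f"the run {target!r} already exists")
        self.data[target] = deepcopy(self.data[source])

    def current(self):
        return self.current_name

    def delete(self, name):
        del self.data[name]
        if self.current_name == name:
            self.current_name = None

    def delete_all(self):
        self.data.clear()
        self.current_name = None

    def hybridise(self, first, second):
        """Fuse two runs into the current run."""
        if self.current_name is None:
            raise RuntimeError("no current run has been selected")
        a, b = self.data[first], self.data[second]
        # Overlapping data in the two runs must be identical.
        for key in a.keys() & b.keys():
            if a[key] != b[key]:
                raise ValueError(f"the runs disagree on {key!r}")
        target = self.data[self.current_name]
        target.update(deepcopy(a))
        target.update(deepcopy(b))

    def list(self):
        return sorted(self.data)

    def switch(self, name):
        if name not in self.data:
            raise KeyError(f"no such run: {name!r}")
        self.current_name = name
```

In this sketch 'hybridise()' refuses to merge as soon as the two source
runs disagree on any shared key, which is one reading of the requirement
that overlapping data be identical.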

One evolutionary path of the run concept which could be followed with
this set of proposed changes is to completely replace it with the pipe
concept.  All instances of 'run' in relax would be renamed to 'pipe'.
For example 'run.create()' will become 'pipe.create()',
'self.relax.data[self.relax.run]' will become
'self.relax.data[self.relax.pipe]', etc.  I believe that the name 'pipe'
is a better representation of the run concept than 'run'.  What do you
think of the idea?

The hypothetical ideas of this paragraph are not part of the current
proposals, however they further illustrate the pipe concept.  The pipe
concept is highly amenable to the creation of a Qt GUI.  Program
execution could be directed by a graphical 'pipe' construction (possibly
in 3D using OpenGL).  Elements of the pipe, equivalent to the user
functions of the prompt and script interfaces, could be dragged from
toolbars and dropped into a canvas.  These could be linked together by
moving the element with the mouse and having it click into other
elements.  For example 'run.delete()' (alternatively 'pipe.delete()')
could be represented as a cap added to the end of a pipe - its execution
removes all the data of that pipe from memory.  This pictorial
representation of execution would be very powerful and intuitive.
Scripts could be imported into the GUI and represented as a network of
interconnected pipes and vice versa.  Execution of relax could even be
animated as semi-transparent pipes filling up bit by bit as each user
function executes.  Imagination is the only limit!
