Redesign of the relax data model: 4. Conclusion -- October 11, 2006

On Wed, 2006-10-11 at 17:02 +1000, Edward d'Auvergne wrote:

This post is a proposal for the redesign of the relax data model.  This
will affect how data is input into the program, how data is selected, how
molecular structures are handled, how spin systems are handled, and how
many other parts of relax function.  Importantly, the internal structure
of 'self.relax.data' will completely change.  These modifications will
essentially break every part of relax (the isolated code in the
directories 'minimise', 'maths_fns', and 'docs' will be safe from the
carnage, as will a few files in the base directory).  If you have any
ideas for extending or improving the proposed data model, can see any
short-comings, deficiencies, or flaws, are familiar with the PDB
conventions, etc., your input is very much sought after.  The changes
should occur in the 1.3 line of the repository.  1.2 versions will be
unaffected - scripts will remain compatible and the 1.2 line will
continue to be supported with bug fixes, etc.

I have to apologise in advance for the size of this proposal.  To
simplify it, I have divided the text into numbered sections.  Once this
initial parent message has been sent I will respond to it with the text
of the 4 major sections.  This will allow 4 major threads to branch off
from this message on the mailing list archive
(https://mail.gna.org/public/relax-devel).  If you have an opinion,
idea, etc. about a specific section, could you please post a separate
message in response to the relevant major section post?  Also if you
have unrelated ideas for one of these sections, could you post these as
separate messages as well?  For example if you have separate points
about sections 3.1 and 3.5.1, two different posts responding to the
parent Section 3 post would be appreciated.  Thanks.  This will help to
focus each discussion point into specific threads.

Edward



Redesign of the relax data model

Index:
1.  Why change?
    1.1  The runs
    1.2  The molecules
    1.3  The residues
    1.4  The spins
2.  A new run concept
    2.1  Parcelling up an abstract space
    2.2  The run data model
    2.3  The pipe concept
3.  Molecules, residues, and spins
    3.1  The spin data model
    3.2  The data selection concept - identifying spin systems
        3.2.1  Function arguments
        3.2.2  NH data of a single protein macromolecule
        3.2.3  A single organic molecule (non-polymeric)
        3.2.4  A single RNA or DNA macromolecule
        3.2.5  Complexes
    3.3  Regular expression
    3.4  The spin loop
    3.5  Molecule, sequence, and spin user function classes
        3.5.1  The 'molecule' user function class
        3.5.2  The 'sequence' user function class
        3.5.3  The 'spin' user function class
    3.6  The input and output files
4.  Conclusion




Before reading this post, please read the previous posts:

* The parent message 'Redesign of the relax data model:  A HOWTO for
breaking relax.' located at
https://mail.gna.org/public/relax-devel/2006-10/msg00053.html
(Message-id:
<1160550133.9523.54.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).

* Section 1 'Redesign of the relax data model:  1.  Why change?' located
at https://mail.gna.org/public/relax-devel/2006-10/msg00054.html
(Message-id:
<1160551172.9523.60.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).

* Section 2 'Redesign of the relax data model:  2.  A new run
concept' located at
https://mail.gna.org/public/relax-devel/2006-10/msg00056.html
(Message-id:
<1160555137.9523.70.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).

* Section 3 'Redesign of the relax data model:  3.  Molecules, residues,
and spins' located at
https://mail.gna.org/public/relax-devel/2006-10/msg00057.html
(Message-id:
<1160557041.9523.74.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).



4.  Conclusion

This proposal will significantly change how relax operates.  To start
with, the modifications will render the 1.3 repository line completely
dysfunctional.  As the changes are so extensive, it may take a while for
relax to migrate to this new data model and for the 1.3 line to
stabilise.  All suggestions relating to this proposal would be greatly
appreciated.  And if anyone could help in the coding to accelerate the
migration process, that too would be much appreciated :)  The migration
could occur as a sweeping process, each file being migrated one at a
time.  Bugs are highly likely to be encountered, and these can be
reported to the relax bug tracker to be fixed in both the 1.2 and 1.3
lines.

The following is a summary of the proposal.

The run concept is to be significantly simplified:  runs will function
more like pipes, and everything will be renamed from 'runs' to 'pipes'.

Molecule level:
  Data name:
    data.mol[0]
  User function class:
    molecule
  Identifiers (function arguments):
    1. mol_num    # NMR model number.
    2. mol_name   # Chain or segment ID.
  Data structures:
    1. data.mol[0].num
    2. data.mol[0].name

Residue level:
  Data name:
    data.mol[0].res[0]
  User function class:
    sequence
  Identifiers (function arguments):
    1. res_num
    2. res_name
  Data structures:
    1. data.mol[0].res[0].num
    2. data.mol[0].res[0].name

Spin level:
  Data name:
    data.mol[0].res[0].spin[0]
  User function class:
    spin
  Identifiers (function arguments):
    1. atom_num
    2. atom_name
  Data structures:
    1. data.mol[0].res[0].spin[0].num
    2. data.mol[0].res[0].spin[0].name
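
To make the three levels summarised above concrete, here is a minimal
sketch of the nesting in Python.  It is only an illustration and not the
relax implementation:  the class names 'Data', 'Molecule', 'Residue',
and 'Spin', the helper 'spin_loop', and the example values are all
hypothetical, and present-day dataclasses are used purely for brevity.
Only the attribute paths (data.mol[i].res[j].spin[k] with 'num' and
'name' at each level) come from the summary.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Spin:
    """Spin level (hypothetical class name)."""
    num: Optional[int] = None    # corresponds to the atom_num identifier
    name: Optional[str] = None   # corresponds to the atom_name identifier


@dataclass
class Residue:
    """Residue level (hypothetical class name)."""
    num: Optional[int] = None    # corresponds to the res_num identifier
    name: Optional[str] = None   # corresponds to the res_name identifier
    spin: List[Spin] = field(default_factory=list)


@dataclass
class Molecule:
    """Molecule level (hypothetical class name)."""
    num: Optional[int] = None    # corresponds to the mol_num identifier
    name: Optional[str] = None   # corresponds to the mol_name identifier
    res: List[Residue] = field(default_factory=list)


@dataclass
class Data:
    """Top-level container holding the list of molecules."""
    mol: List[Molecule] = field(default_factory=list)


def spin_loop(data: Data) -> Iterator[Tuple[Molecule, Residue, Spin]]:
    """Yield every (molecule, residue, spin) triple in nesting order."""
    for mol in data.mol:
        for res in mol.res:
            for spin in res.spin:
                yield mol, res, spin


# Made-up example values:  one molecule, one residue, one spin.
data = Data(mol=[
    Molecule(num=1, name="A", res=[
        Residue(num=5, name="GLY", spin=[Spin(num=42, name="N")]),
    ]),
])

# The access paths match those listed in the summary.
first_spin = data.mol[0].res[0].spin[0]
spin_ids = [(m.name, r.num, s.name) for m, r, s in spin_loop(data)]
```

Keeping 'num' and 'name' at every level means the same pair of
identifiers can be used to select a molecule, a residue, or a spin.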
