On Wed, 2006-10-11 at 17:02 +1000, Edward d'Auvergne wrote:
This post is a proposal for the redesign of the relax data model. This will affect how data is input into the program, how data is selected, how molecular structures are handled, how spin systems are handled, and how many other parts of relax function. Importantly, the internal structure of 'self.relax.data' will completely change. These modifications will essentially break every part of relax (the isolated code in the directories 'minimise', 'maths_fns', and 'docs' will be safe from the carnage, as will a few files in the base directory). If you have any ideas for extending or improving the proposed data model, can see any shortcomings, deficiencies, or flaws, are familiar with the PDB conventions, etc., your input is very much sought after.

The changes should occur in the 1.3 line of the repository. 1.2 versions will be unaffected - scripts will remain compatible and the 1.2 line will continue to be supported with bug fixes, etc.

I have to apologise in advance for the size of this proposal. To simplify it, I have divided the text into numbered sections. Once this initial parent message has been sent I will respond to it with the text of the 4 major sections. This will allow 4 major threads to branch off from this message on the mailing list archive (https://mail.gna.org/public/relax-devel). If you have an opinion, idea, etc. about a specific section, could you please post a separate message in response to the relevant major section post? Also, if you have unrelated ideas for one of these sections, could you post these as separate messages as well? For example, if you have separate points about sections 3.1 and 3.5.1, two different posts responding to the parent Section 3 post would be appreciated. This will help to focus each discussion point into specific threads. Thanks.

Edward


Redesign of the relax data model

Index:

1. Why change?
    1.1 The runs
    1.2 The molecules
    1.3 The residues
    1.4 The spins
2. A new run concept
    2.1 Parcelling up an abstract space
    2.2 The run data model
    2.3 The pipe concept
3. Molecules, residues, and spins
    3.1 The spin data model
    3.2 The data selection concept - identifying spin systems
        3.2.1 Function arguments
        3.2.2 NH data of a single protein macromolecule
        3.2.3 A single organic molecule (non-polymeric)
        3.2.4 A single RNA or DNA macromolecule
        3.2.5 Complexes
    3.3 Regular expression
    3.4 The spin loop
    3.5 Molecule, sequence, and spin user function classes
        3.5.1 The 'molecule' user function class
        3.5.2 The 'sequence' user function class
        3.5.3 The 'spin' user function class
    3.6 The input and output files
4. Conclusion

Before reading this post, please read the previous posts:

* The parent message 'Redesign of the relax data model: A HOWTO for breaking relax.' located at https://mail.gna.org/public/relax-devel/2006-10/msg00053.html (Message-id: <1160550133.9523.54.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).
* Section 1 'Redesign of the relax data model: 1. Why change?' located at https://mail.gna.org/public/relax-devel/2006-10/msg00054.html (Message-id: <1160551172.9523.60.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).
* Section 2 'Subject: Redesign of the relax data model: 2. A new run concept' located at https://mail.gna.org/public/relax-devel/2006-10/msg00056.html (Message-id: <1160555137.9523.70.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).
* Section 3 'Redesign of the relax data model: 3. Molecules, residues, and spins' located at https://mail.gna.org/public/relax-devel/2006-10/msg00057.html (Message-id: <1160557041.9523.74.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>).


4. Conclusion

This proposal will significantly change how relax operates. To start with, the modifications will render the 1.3 repository line completely dysfunctional. As the changes are so extensive, it may take a while for relax to migrate to this new data model and for the 1.3 line to stabilise. All suggestions relating to this proposal would be greatly appreciated. And if anyone could help with the coding to accelerate the migration process, that too would be much appreciated :)

The migration could occur as a sweeping process, with the files being migrated one at a time. Bugs are highly likely to be encountered, and these can be reported to the relax bug tracker to be fixed in both the 1.2 and 1.3 lines.

The following is a summary of the proposal. The run concept is to be significantly simplified to function more like pipes, with everything renamed from 'runs' to 'pipes'.

Molecule level:
    Data name:  data.mol[0]
    User function class:  molecule
    Identifiers (function arguments):
        1. mol_num     # NMR model number.
        2. mol_name    # Chain or segment ID.
    Data structures:
        1. data.mol[0].num
        2. data.mol[0].name

Residue level:
    Data name:  data.mol[0].res[0]
    User function class:  sequence
    Identifiers (function arguments):
        1. res_num
        2. res_name
    Data structures:
        1. data.mol[0].res[0].num
        2. data.mol[0].res[0].name

Spin level:
    Data name:  data.mol[0].res[0].spin[0]
    User function class:  spin
    Identifiers (function arguments):
        1. atom_num
        2. atom_name
    Data structures:
        1. data.mol[0].res[0].spin[0].num
        2. data.mol[0].res[0].spin[0].name
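
The three-level hierarchy summarised above could be sketched in Python roughly as follows. This is only an illustration under assumptions, not the actual relax implementation: the class names (PipeData, MoleculeContainer, ResidueContainer, SpinContainer), the spin_loop() helper, and the example values are all hypothetical. Only the attribute names 'mol', 'res', 'spin', 'num', and 'name' are taken from the summary.

```python
# Hypothetical sketch of the proposed molecule -> residue -> spin nesting.
# None of these class or function names are taken from the relax source.

class SpinContainer:
    """Spin level: data.mol[i].res[j].spin[k]."""
    def __init__(self, num=None, name=None):
        self.num = num      # Corresponds to the atom_num identifier.
        self.name = name    # Corresponds to the atom_name identifier.


class ResidueContainer:
    """Residue level: data.mol[i].res[j]."""
    def __init__(self, num=None, name=None):
        self.num = num      # Corresponds to the res_num identifier.
        self.name = name    # Corresponds to the res_name identifier.
        self.spin = []      # List of SpinContainer objects.


class MoleculeContainer:
    """Molecule level: data.mol[i]."""
    def __init__(self, num=None, name=None):
        self.num = num      # Corresponds to the mol_num identifier.
        self.name = name    # Corresponds to the mol_name identifier.
        self.res = []       # List of ResidueContainer objects.


class PipeData:
    """Top-level store holding the list of molecules (assumed layout)."""
    def __init__(self):
        self.mol = []


def spin_loop(data):
    """Yield (molecule, residue, spin) for every spin in the data store."""
    for mol in data.mol:
        for res in mol.res:
            for spin in res.spin:
                yield mol, res, spin


# Example: one molecule with a single residue carrying two spins.
# The numbers and names below are made-up illustrative values.
data = PipeData()
data.mol.append(MoleculeContainer(num=1, name='A'))
data.mol[0].res.append(ResidueContainer(num=10, name='GLY'))
data.mol[0].res[0].spin.append(SpinContainer(num=100, name='N'))
data.mol[0].res[0].spin.append(SpinContainer(num=101, name='H'))

for mol, res, spin in spin_loop(data):
    print(mol.name, res.num, res.name, spin.num, spin.name)
```

Plain lists are used at each level so that the indexing matches the 'data.mol[0].res[0].spin[0]' notation of the summary. A real implementation would likely need extra bookkeeping, for example for data selection by name or number.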