Re: To all relax developers: presenting relax at the ENC conference and preliminary BMRB NMR-STAR file creation for model-free results. -- March 23, 2009

Hi,

A few comments below. I wonder if it would be best to model the data as it exists now, but try to do it in a way that can be extended in the future. Do you think you will be able to send another example output? Do the current saveframes and tags fit the existing data (for example, the XML output sent earlier)?


Cheers,
Eldon


Edward d'Auvergne wrote:

Hi,

I'm pretty sure that we cannot finalise everything before the ENC, but
the design is close and relax should be producing the NMR-STAR
formatted files soon afterwards.  There are still a number of issues
which would be good to sort out.


On Wed, Mar 18, 2009 at 5:34 AM, Eldon Ulrich <elu@xxxxxxxxxxxxx> wrote:

Hi,

Sorry to be late on getting this to you. Directly below is a list of the
current relevant tags for the '_Model_free' loop (table) as I have the data
modeled. This is still open to discussion and refinement. I have tried to
answer your latest comments further below. I will be traveling the next two
days, but will try to stay in contact.

Eldon

loop_
_Model_free.ID
_Model_free.Assembly_ID
_Model_free.Assembly_atom_ID
_Model_free.Entity_assembly_ID
_Model_free.Entity_ID
_Model_free.Comp_index_ID
_Model_free.Comp_ID
_Model_free.Obs_atom_ID
_Model_free.Obs_atom_type
_Model_free.Obs_atom_isotope_number
_Model_free.Attached_atom_ID
_Model_free.Attached_atom_type
_Model_free.Attached_atom_isotope_number


This is a useful addition, but might be too relax specific - this atom
may not necessarily be directly attached (in almost all cases it will
be, but who knows).  I don't know what a better name would be though,
as this atom could cause dipolar, quadrupolar, or other relaxation
(but not CSA).  Also, it might be useful to note that Pavel Kaderavek
and Petr Novak are developing relax for RNA work and are adding
multiple dipole relaxation support.  I'm not exactly sure what this
entails yet, but I would assume you would need multiple of these
Attached_atom_* values and have these associated with vectors like in
the 'bond_vectors' tags.  Oh, these Attached_atom_* tags might be
better in the relaxation data saveframes (or additionally added
there).

Would '_Model_free.Effector_atom_ID' be a more appropriate tag? I am wondering, as you indicate, if this information is more specific to the relaxation data used in calculating the order parameters and may not be needed here?

_Model_free.S2_val
_Model_free.S2_val_err
_Model_free.S2f_val
_Model_free.S2f_val_err
_Model_free.S2s_val
_Model_free.S2s_val_err
_Model_free.Local_tau_c_val
_Model_free.Local_tau_c_val_err
_Model_free.Tau_e_val
_Model_free.Tau_e_val_err
_Model_free.Tau_f_val
_Model_free.Tau_f_val_err
_Model_free.Tau_s_val
_Model_free.Tau_s_val_err
_Model_free.Rex_val
_Model_free.Rex_val_err


This Rex_val tag needs an associated spectrometer frequency (MHz for
proton, T, etc.) and its units (unless assumed to be rad/s).  This is
very important.

How would this 'spectrometer frequency' value differ from the one captured with the relaxation data sets (the input data sets)?

Comment #1: The 'General_relaxation' saveframe is intended for standard two
spin relaxation data, but would include more than just R1, R1rho, and R2 as
you can see below. I think it would be best to reserve this saveframe for
rates and not include time constants. It would not include cross-correlation
data.

I don't know if any programs would generate time constants, T1, T2,
etc., but some people are still very old school.  I thought the units
tag covered this though?

The units tag could cover this, but in the end, if all agree, I think it
would be best to separate the time constant data from the rate data.


That's no problem for me.  relax can handle it either way.

Then, I suggest that relax output separate 'General_relaxation' saveframes for each set of standard two spin relaxation data sets.

save__General_relaxation_list.Relaxation_coherence_type

  loop_
    _item_enumeration_value
    _item_enumeration_description

 Iz                  'R1              longitudinal'
 Sz                  'R1              longitudinal'

 I+                  'R1rho           transverse'
 I-                  'R1rho           transverse'
 S+                  'R1rho           transverse'
 S-                  'R1rho           transverse'
 (I+)+(I-)           'R1rho           transverse'
 (S+)+(S-)           'R1rho           transverse'

 I+                  'R2              transverse'
 I-                  'R2              transverse'
 S+                  'R2              transverse'
 S-                  'R2              transverse'
 (I+)+(I-)           'R2              transverse'
 (S+)+(S-)           'R2              transverse'

 I-S+             'ZQ relaxation'
 I+S-             'ZQ relaxation'
 (I+S-)+(I-S+)    'ZQ relaxation'

 IzSz             'longitudinal spin order'

 I+Sz             'single quantum antiphase'
 I-Sz             'single quantum antiphase'
 IzS+             'single quantum antiphase'
 IzS-             'single quantum antiphase'
 ((I+)+(I-))Sz     'single quantum antiphase'
 Iz((S+)+(S-))     'single quantum antiphase'

 I+S+             'DQ relaxation'
 I-S-             'DQ relaxation'
 (I+S+)+(I-S-)      'DQ relaxation'

  stop_

save_

This is missing the NOE?  This could either be sigma_NOE, the
cross-relaxation rate, or the measured steady state NOE:

NOE = 1 + gH/gN * sigma_NOE/R1
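
For illustration, the relation above can be sketched in Python as follows. This is only a sketch and not code from relax: the function names are hypothetical, and the gyromagnetic ratios for 1H and 15N are approximate values assumed here.

```python
# Approximate gyromagnetic ratios in rad s^-1 T^-1 (assumed values).
G_H = 26.7522212e7   # 1H
G_N = -2.7126e7      # 15N (negative sign)


def steady_state_noe(sigma_noe, r1, g_h=G_H, g_n=G_N):
    """Steady-state NOE from the cross-relaxation rate and R1.

    Implements NOE = 1 + gH/gN * sigma_NOE/R1, with both rates in the
    same units (e.g. s^-1).
    """
    return 1.0 + (g_h / g_n) * sigma_noe / r1


def sigma_from_noe(noe, r1, g_h=G_H, g_n=G_N):
    """Inverse relation: cross-relaxation rate from the measured NOE and R1."""
    return (noe - 1.0) * r1 * g_n / g_h
```

Either quantity can therefore be reconstructed from the other as long as the R1 value measured at the same field strength is stored alongside it.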

Should General_relaxation include NOE relaxation rather than leaving
it in the Heteronucl_NOEs saveframe category?

I have mixed feelings on this. Heteronucl and Homonucl NOEs usually
represent relaxation between two specified nuclei. Tags are needed to define
the two nuclei. In most cases, R1, R1rho, and R2 values represent the
relaxation of one nucleus as influenced by one or more usually unspecified
nuclei. This is more or less the reason separate saveframes were originally
created. Sometimes trying to stuff too many things into one box causes
confusion even when the items are clearly related. I am open to a discussion
on this.

I'm not sure if having separate saveframe categories is necessary, as
this introduces a split in the relaxation data that no one else makes.
 It would work but wouldn't be very elegant or flexible if someone
does collect some non-standard relaxation data.  This might be more
elegant if 'General_relaxation' is renamed to 'Auto_relaxation' and a
new category duplicating but extending 'General_relaxation' is made
and called 'Cross_relaxation'.  The 'Cross_relaxation' category may
not be necessary as people really only measure the heteronuclear
steady-state NOE for this.

As for the heteronuclear vs. homonuclear split, this again seems
artificial.  For dipolar relaxation, there are two spin 1/2 nuclei.
They could be 1H-1H, 1H-15N, 15N-15N, 15N-13C, 13C-13C, etc.  This
only needs the 2 nuclei specified in 2 tags, and then the split
disappears.  One nucleus is the observed spin of interest
'_Model_free.Obs_atom_*'.  The other nucleus is the one causing dipolar
or quadrupolar relaxation, and could in many cases consist of a group
of nuclei surrounding the spin of interest in space.  So maybe have
one tag such as:

_General_relaxation_list.Obs_atom_type 1H

Is this tag actually for the observed atom or to describe the type of atoms that are responsible for the relaxation of the observed atom?

and something matching the proposed 'Bond_vector' saveframe for the
different relaxing nuclei.  Then maybe 'Bond_vector' could cover
multiple relaxing spins in one model, as well as multiple models?

If it is useful to point to the 'Bond_vector' list from more than one relaxation or model_free saveframe then it may be best to put this loop_ construct in the 'Assembly' saveframe where the atoms, bonds, bond angles, and dihedral angles in the assembly can be provided.

I think the current split is confusing as both the steady state NOE
and the deconvoluted Nz->Hz sigma_NOE cross relaxation rate could be
reported.  The R1 auto rate is Iz to Iz relaxation and is calculated
using Iz to Iz in the relaxation superoperator.  The sigma_NOE is
simply Iz to Sz relaxation in the same superoperator.

Therefore with tags for the observed nucleus and the surrounding n
nuclei and tags for the 2 operators involved in relaxation (and tags
specifying the specifics for the collected data, e.g. rotating frame
relaxation instead of lab frame, steady-state NOE instead of the
sigma_NOE, etc.), then 'homonucl_NOEs', 'heteronucl_NOEs',
'heteronucl_T1_relaxation', 'heteronucl_T1rho_relaxation',
'heteronucl_T2_relaxation', and 'heteronucl_T2rho_relaxation' all
collapse into one 'General_relaxation' saveframe category.

A similar collapse of the cross correlated rates could also be possible.

I agree, there are many issues that need to be addressed here. It would definitely be best to reduce confusion.

Note that none of this is really important for data storage, or
conversion to and from relax, with the current state of the relax,
Modelfree4, and Dasha programs.  But it may increase the flexibility
of the BMRB for storing strange data (that may not be so strange in
the future).

As for the diffusion tensor, I've looked into the BMRB for saveframes
and there appears to be nothing I can use.  This is actually quite
complex, as the diffusion tensor can be spherical (with just a tau_c),
spheroidal, and ellipsoidal.  The ellipsoidal diffusion tensor is
composed of 6 parameters, {Diso, Da, Dr, alpha, beta, gamma}.
Alternatively you can specify this using the eigenvalues {Dx, Dy, Dz,
alpha, beta, gamma}.  The Euler angles in relax are in the zyz
notation (not always though), and are defined from 0 from their axes
(this changes things too).  They are folded between:

0 <= alpha <= 2pi,
0 <= beta <= pi,
0 <= gamma <= 2pi.

The spheroidal tensor is composed of 4 parameters, {Diso, Da, theta,
phi} (or {Dpar, Dper, theta, phi}).  Theta and phi are the polar
angles, and are defined in relax starting at 0 to pi and 2pi
respectively.  This is all described in the relax manual on page 163
(http://download.gna.org/relax/manual/1.3/relax.pdf).  relax can dump
all of these parameters, including all permutations of parameters, if
you wish.
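
As a sketch of how the alternative parameter sets relate to each other, the following Python functions convert between them. The definitions Diso = (Dx + Dy + Dz)/3, Da = Dz - (Dx + Dy)/2 and Dr = (Dy - Dx)/(2 Da) for the ellipsoid, and Diso = (Dpar + 2 Dper)/3 and Da = Dpar - Dper for the spheroid, are assumptions of this sketch and should be checked against the relax manual. The function names are hypothetical and this is not relax code.

```python
def eigenvalues_to_ellipsoid(d_x, d_y, d_z):
    """Convert {Dx, Dy, Dz} to {Diso, Da, Dr} (assumed definitions)."""
    d_iso = (d_x + d_y + d_z) / 3.0
    d_a = d_z - (d_x + d_y) / 2.0
    # Dr is undefined for Da == 0 (the spherical limit).
    d_r = (d_y - d_x) / (2.0 * d_a)
    return d_iso, d_a, d_r


def ellipsoid_to_eigenvalues(d_iso, d_a, d_r):
    """Convert {Diso, Da, Dr} back to {Dx, Dy, Dz}."""
    d_x = d_iso - d_a / 3.0 * (1.0 + 3.0 * d_r)
    d_y = d_iso - d_a / 3.0 * (1.0 - 3.0 * d_r)
    d_z = d_iso + 2.0 * d_a / 3.0
    return d_x, d_y, d_z


def par_per_to_spheroid(d_par, d_per):
    """Convert {Dpar, Dper} to {Diso, Da} (assumed definitions)."""
    return (d_par + 2.0 * d_per) / 3.0, d_par - d_per


def spheroid_to_par_per(d_iso, d_a):
    """Convert {Diso, Da} back to {Dpar, Dper}."""
    return d_iso + 2.0 * d_a / 3.0, d_iso - d_a / 3.0
```

Because the sets are interconvertible, storing one set together with the angles and a statement of the convention used would in principle be enough to recover the others.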

Things can get more complicated though because of what I have called
hybrid diffusion models in relax.  A good example would be a 2-domain
protein.  Each domain would have its own diffusion tensor, each being
one of the ellipsoid, spheroid, and sphere.  Alternatively, a stretch
of residues could be described by the local_tau_m parameter, and the
rest belonging to a rigid core of the protein being described by a
single or multiple diffusion tensors.  Therefore for the global
diffusion tensor info, quite a bit needs to be stored, including a
list of the spin systems that the tensor covers.  I can also guarantee
that much more complex global diffusion models will exist in the
future (I'll talk to you again later once I develop these).

We should discuss this at the ENC and see if there is a reasonable way to
model all of these options.


Ok.  Next week then!

 loop_
   _Model_free_input.Saveframe_category_ID
   _Model_free_input.Saveframe_label
   _Model_free_experiment.Model_free_list_ID

 stop_

I'm still not sure what I'm supposed to put into the Model_free_input
saveframe?

I think I did not construct the above loop correctly and it should look like
this with example values for the tags:

 loop_
   _Model_free_input.Saveframe_category
   _Model_free_input.Saveframe_ID
   _Model_free_input.Saveframe_label
   _Model_free_input.Model_free_list_ID

  'general_relaxation'   1   $general_relaxation_R1_600MHz    1
  'general_relaxation'   2   $general_relaxation_R2_600MHz    1
  'Heteronucl_NOE'       1   $Heteronucl_NOE_600MHz           1
  'general_relaxation'   3   $general_relaxation_R1_800MHz    1
  'general_relaxation'   4   $general_relaxation_R2_800MHz    1
  'Heteronucl_NOE'       2   $Heteronucl_NOE_800MHz           1

 stop_

Entries may have many sets (saveframes) of R1, R2, heteroNOE, and other
data. A particular set of order parameters will be derived from specific
sets of input data. There may be more than one set of reported order
parameters. It is not appropriate to assume that the order parameters are
derived from all of the data in the entry, and therefore the above loop or
table is needed to define which sets of relaxation data were used to derive
the specific set of order parameters. Of course, this assumes that a set of
results is derived from sets of input data and it could be argued that this
should be modeled at the level of data items and not data sets. This could
be done, but becomes much more complex requiring at least two additional
loops or tables.

In the above example the combination of 'Saveframe_category' and
'Saveframe_ID' is redundant with 'Saveframe_label'. You can choose the one
you would prefer to program.
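
A minimal Python sketch of how a program might write the '_Model_free_input' loop shown above is given here. It only formats text. The function name `format_loop` is hypothetical, the values are written exactly as passed in (no quoting rules are applied), and the tags and rows are simply the example values from the loop above.

```python
def format_loop(tags, rows):
    """Return an NMR-STAR style loop as text; values are written as given."""
    for row in rows:
        if len(row) != len(tags):
            raise ValueError("each row needs exactly one value per tag")
    # Column widths, so that the values line up under each other.
    widths = [max((len(str(row[i])) for row in rows), default=0)
              for i in range(len(tags))]
    lines = [" loop_"]
    lines += ["   " + tag for tag in tags]
    lines.append("")
    for row in rows:
        cells = [str(value).ljust(width) for value, width in zip(row, widths)]
        lines.append("  " + "   ".join(cells).rstrip())
    lines.append("")
    lines.append(" stop_")
    return "\n".join(lines)


TAGS = [
    "_Model_free_input.Saveframe_category",
    "_Model_free_input.Saveframe_ID",
    "_Model_free_input.Saveframe_label",
    "_Model_free_input.Model_free_list_ID",
]

ROWS = [
    ("'general_relaxation'", 1, "$general_relaxation_R1_600MHz", 1),
    ("'general_relaxation'", 2, "$general_relaxation_R2_600MHz", 1),
    ("'Heteronucl_NOE'", 1, "$Heteronucl_NOE_600MHz", 1),
    ("'general_relaxation'", 3, "$general_relaxation_R1_800MHz", 1),
    ("'general_relaxation'", 4, "$general_relaxation_R2_800MHz", 1),
    ("'Heteronucl_NOE'", 2, "$Heteronucl_NOE_800MHz", 1),
]

if __name__ == "__main__":
    print(format_loop(TAGS, ROWS))
```

If only 'Saveframe_label' were kept, the same function could be called with two tags and two values per row.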

I also have assumed that the NOE data will continue to be recorded in a
unique type of saveframe and not be recorded in a 'general_relaxation'
saveframe. This point is still up for discussion.


Discussed above, hopefully.

Yes, I am becoming more agreeable to your suggestion, but I think there are a number of details to work out.

   _Model_free.Bond_length_val
   _Model_free.Bond_length_val_err
   _Model_free.S2_val
   _Model_free.S2_val_err
   _Model_free.S2f_val
   _Model_free.S2f_val_err
   _Model_free.S2s_val
   _Model_free.S2s_val_err
   _Model_free.Local_tau_c_val
   _Model_free.Local_tau_c_val_err
   _Model_free.Tau_e_val
   _Model_free.Tau_e_val_err
   _Model_free.Tau_f_val
   _Model_free.Tau_f_val_err
   _Model_free.Tau_s_val
   _Model_free.Tau_s_val_err
   _Model_free.Rex_val
   _Model_free.Rex_val_err
   _Model_free.Chi_squared_val
   _Model_free.Model_fit
   _Model_free.Model_free_list_ID


 stop_

 loop_
   _Bond_vector.ID
   _Bond_vector.Model_free_ID
   _Bond_vector.X_val
   _Bond_vector.Y_val
   _Bond_vector.Z_val
   _Bond_vector.Model_free_list_ID


 stop_

Do I put this Bond_vector saveframe into the Model_free saveframe
category, or is it its own saveframe category?

If you prefer this 'Bond_vector' table with the '.Model_free_ID' foreign
key, then this would be a loop within the Model_free saveframe and the
'_Model_free.Bond_vector_?_val' tags would be removed.


I think this still needs a bit of refinement.  The bond length value
could also change in RNA or other systems where multiple surrounding
spins are causing dipolar relaxation.  So maybe the length
_Model_free.Bond_length_val could go into here too?  Would you like
multiple vectors per spin in one model and multiple vectors per model
to be in 1 or 2 different saveframes?

If we need to deal with many-to-many relationships then the modeling becomes more complex, but it can be done.

Regarding the enumerations below for the '.Model_fit' tag, we have found
that if we do not list as complete a list of enumerations as possible,
depositors seem to think that if their data is not in the enumerated list
BMRB will not accept it. I duplicated the RELAX strings with those you
provided for Art's ModelFree program only because I was afraid that the
order and spelling of the variables in the string might be software
dependent and therefore affect the interpretation of the deposited data
sets.


That is true.  A list that would be completely software independent,
matching the model-free theories exactly and 100%, would then be:

No internal model-free motions:
        ''                                                .
        'Rex'                                              .

Original Lipari-Szabo (Lipari and Szabo, 1982), plus additional params
such as Rex:
        'S2'                                               .
        'S2, te'                                           .
        'S2, Rex'                                          .
        'S2, te, Rex'                                      .

Extended model-free (Clore et al., 1990), plus additional params such as Rex:
        'S2f, S2, ts'                                       .
        'S2f, S2s, ts'                                       .
        'S2f, tf, S2, ts'                                  .
        'S2f, tf, S2s, ts'                                  .
        'S2f, S2, ts, Rex'                                 .
        'S2f, S2s, ts, Rex'                                 .
        'S2f, tf, S2, ts, Rex'                             .
        'S2f, tf, S2s, ts, Rex'                             .

This software independent list then allows BMRB depositions from
different software packages to be compatible with each other.  The above list
covers every model used by relax, Modelfree4, Dasha, Tensor2,
DYNAMICS, etc.  If other non-model-free parameters are added in the
future by any software package, then the list could be extended in a
software independent way.  And if someone invents a new and different
model-free equation, then this list could again be easily extended.
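
To illustrate how a program could produce these strings independently of its internal parameter ordering, here is a small Python sketch. The enumeration is copied from the list above. The fixed parameter order and the function name `model_fit_string` are assumptions of this sketch, and no mapping from software-specific model names is attempted.

```python
# Order in which parameters appear in the enumeration strings (assumed).
PARAM_ORDER = ["S2f", "tf", "S2", "S2s", "te", "ts", "Rex"]

# The software independent enumeration proposed above.
MODEL_FIT_ENUM = [
    "", "Rex",
    "S2", "S2, te", "S2, Rex", "S2, te, Rex",
    "S2f, S2, ts", "S2f, S2s, ts",
    "S2f, tf, S2, ts", "S2f, tf, S2s, ts",
    "S2f, S2, ts, Rex", "S2f, S2s, ts, Rex",
    "S2f, tf, S2, ts, Rex", "S2f, tf, S2s, ts, Rex",
]


def model_fit_string(params):
    """Return the enumeration string for a collection of parameter names.

    Raises ValueError for unknown parameter names or for combinations
    that are not in the enumeration.
    """
    selected = set(params)
    unknown = selected - set(PARAM_ORDER)
    if unknown:
        raise ValueError("unknown parameters: %s" % sorted(unknown))
    label = ", ".join(p for p in PARAM_ORDER if p in selected)
    if label not in MODEL_FIT_ENUM:
        raise ValueError("combination not in the enumeration: %r" % label)
    return label
```

For example, a program that stores the parameters in the order Rex, S2, te would still emit 'S2, te, Rex', and extending the list later only requires adding entries to the two tables.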


Ok, we can go with these enumerations for the 'model_fit' tag.

Regards,

Edward
