Re: Inconsistencies in the v2.1 (or v2.1.1) BMRB model-free entries.


Posted by Edward d'Auvergne on February 03, 2011 - 14:01:
Hi,

Actually, it wasn't so bad.  Only the bmr4096.str entry has a sequence
mismatch.  Residue 65 appears as both Asn and Gln within the entry, and
the same is true of residue 146.  I have assumed that the
monomeric_polymer (entity) saveframe is correct and that both should
be Gln.  The diff is below.
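
As an aside, a check of this kind can be sketched in a few lines of
Python.  This is only an illustration and not how relax or bmrblib
detect the problem: the regular expression, the function name and the
hand-built reference dictionary are all assumptions, and the row pattern
is naive enough that it should only be applied to data table rows.

```python
import re

# Naive pattern (an assumption for this sketch): a residue number followed
# by a three-letter upper-case residue code, e.g. "65 ASN N 0.85 0.02".
ROW = re.compile(r"\b(\d+)\s+([A-Z]{3})\b")


def find_mismatches(reference, lines):
    """Compare table rows against a reference sequence.

    reference maps residue number -> expected residue name (for example,
    as taken from the entity saveframe).  Returns a list of tuples
    (residue number, name found in the row, expected name).
    """
    problems = []
    for line in lines:
        match = ROW.search(line)
        if not match:
            continue
        number, name = int(match.group(1)), match.group(2)
        expected = reference.get(number)
        # Residues absent from the reference are skipped, not reported.
        if expected is not None and name != expected:
            problems.append((number, name, expected))
    return problems
```

Only the first number/name pair on each row is checked, which is enough
for the tables in this entry as the residue is repeated on the coupling
constant rows.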

Cheers,

Edward



[edau@localhost bmr2.1_files_mod1]$ cat diff
diff -ur ./bmr4096.str ../bmr2.1_files/bmr4096.str
--- ./bmr4096.str       2011-01-28 05:56:00.000000000 +0100
+++ ../bmr2.1_files/bmr4096.str 2011-02-03 13:22:27.000000000 +0100
@@ -1031,7 +1031,7 @@
       3JHNHA  60 ALA H  60 ALA HA 3.96 0.03
       3JHNHA  63 LEU H  63 LEU HA 3.13 0.08
       3JHNHA  64 ALA H  64 ALA HA 3.50 0.03
-      3JHNHA  65 ASN H  65 ASN HA 4.94 0.03
+      3JHNHA  65 GLN H  65 GLN HA 4.94 0.03
       3JHNHA  66 ILE H  66 ILE HA 4.18 0.05
       3JHNHA  67 GLY H  67 GLY HA 4.82 0.04
       3JHNHA  68 VAL H  68 VAL HA 4.36 0.03
@@ -1108,7 +1108,7 @@
       3JHNHA 143 SER H 143 SER HA 3.33 0.03
       3JHNHA 144 GLY H 144 GLY HA 4.40 0.06
       3JHNHA 145 LEU H 145 LEU HA 3.06 0.09
-      3JHNHA 146 ASN H 146 ASN HA 6.99 0.02
+      3JHNHA 146 GLN H 146 GLN HA 6.99 0.02
       3JHNHA 147 SER H 147 SER HA 5.56 0.02

    stop_
@@ -1222,7 +1222,7 @@
        62 VAL N 0.88 0.03
        63 LEU N 0.85 0.03
        64 ALA N 0.85 0.02
-       65 ASN N 0.85 0.02
+       65 GLN N 0.85 0.02
        66 ILE N 0.88 0.03
        67 GLY N 0.8  0.03
        68 VAL N 0.91 0.02
@@ -1302,7 +1302,7 @@
       143 SER N 0.86 0.02
       144 GLY N 0.91 0.03
       145 LEU N 0.88 0.04
-      146 ASN N 0.84 0.02
+      146 GLN N 0.84 0.02
       147 SER N 1    0.01

    stop_
@@ -1416,7 +1416,7 @@
        62 VAL N 13.3  0.25
        63 LEU N 15.64 0.29
        64 ALA N 16.38 0.22
-       65 ASN N 15.93 0.25
+       65 GLN N 15.93 0.25
        66 ILE N 14.28 0.29
        67 GLY N 15.87 0.3
        68 VAL N 16.72 0.25
@@ -1496,7 +1496,7 @@
       143 SER N 17.48 0.23
       144 GLY N 15.64 0.27
       145 LEU N 16.42 0.48
-      146 ASN N 15.59 0.24
+      146 GLN N 15.59 0.24
       147 SER N 12.32 0.08

    stop_
@@ -1605,7 +1605,7 @@
        62 VAL N 14.24 0.33
        63 LEU N 14.94 0.41
        64 ALA N 15.47 0.26
-       65 ASN N 15    0.32
+       65 GLN N 15    0.32
        66 ILE N 14.23 0.39
        67 GLY N 14.62 0.35
        68 VAL N 14.26 0.28
@@ -1684,7 +1684,7 @@
       143 SER N 14.75 0.28
       144 GLY N 14.06 0.34
       145 LEU N 15.19 0.51
-      146 ASN N 15.11 0.34
+      146 GLN N 15.11 0.34
       147 SER N 12.35 0.14

    stop_
@@ -1800,7 +1800,7 @@
        62 VAL N   0.85 .
        63 LEU N   0.86 .
        64 ALA N   0.86 .
-       65 ASN N   0.86 .
+       65 GLN N   0.86 .
        66 ILE N   0.85 .
        67 GLY N   0.88 .
        68 VAL N   0.85 .
@@ -1881,7 +1881,7 @@
       143 SER N   0.84 .
       144 GLY N   0.85 .
       145 LEU N   0.87 .
-      146 ASN N   0.83 .
+      146 GLN N   0.83 .
       147 SER N   0.72 .

    stop_
@@ -1981,7 +1981,7 @@
        62 VAL N 0.9138 0.0186   .       .      .      .     10.3361 S2
        63 LEU N 0.9735 0.0195   .       .      .      .      1.1268 S2
        64 ALA N 0.9355 0.0394   .       .     1.8352 0.9506  2.772  S2,Rex
-       65 ASN N 0.9732 0.017    .       .      .      .      4.3623 S2
+       65 GLN N 0.9732 0.017    .       .      .      .      4.3623 S2
        66 ILE N 0.9249 0.0197   .       .      .      .      2.8354 S2
        67 GLY N 0.9543 0.0187   .       .      .      .      4.6845 S2
        68 VAL N 0.9474 0.0158   .       .      .      .      1.5478 S2
@@ -2060,7 +2060,7 @@
       143 SER N 0.9746 0.0139   .       .      .      .      3.6387 S2
       144 GLY N 0.9384 0.0179   .       .      .      .      0.4807 S2
       145 LEU N 0.9867 0.0185   .       .      .      .      3.3007 S2
-      146 ASN N 0.9742 0.0187   .       .      .      .      5.23   S2
+      146 GLN N 0.9742 0.0187   .       .      .      .      5.23   S2
       147 SER N 0.8163 0.0094 23.7407  4.1317  .      .     17.4326 S2,te

    stop_
@@ -2173,7 +2173,7 @@
        62 VAL H 8.68E-07 .        .        1.63E-03
        63 LEU H .        .        1.00E-08 .
        64 ALA H 3.90E-07 .        .        3.19E-03
-       65 ASN H 3.90E-07 .        .        3.19E-03
+       65 GLN H 3.90E-07 .        .        3.19E-03
        66 ILE H 1.92E-06 .        .        7.71E-04
        67 GLY H .        .        1.00E-08 .
        68 VAL H 5.83E-06 .        .        1.88E-04
@@ -2240,7 +2240,7 @@
       143 SER H 2.00E-05 .        .        7.11E-07
       144 GLY H 4.69E-06 .        .        2.58E-04
       145 LEU H .        .        1.00E-08 .
-      146 ASN H 1.38E-03 .        .        4.02E-05
+      146 GLN H 1.38E-03 .        .        4.02E-05
       147 SER H 1.90E-02 .        .        9.51E-04

    stop_
@@ -2316,7 +2316,7 @@
        61 LYS H 3.87E-02 1.47E+04 .
        62 VAL H 1.15E-02 7.95E+05 .
        64 ALA H 2.63E-02 4.05E+06 .
-       65 ASN H 4.86E-02 7.48E+06 .
+       65 GLN H 4.86E-02 7.48E+06 .
        66 ILE H 1.29E-02 4.02E+05 .
        68 VAL H 1.29E-02 1.33E+05 .
        71 SER H 7.16E-02 1.38E+04 .
@@ -2359,7 +2359,7 @@
       140 ALA H 6.25E-02 4.10E+04 .
       143 SER H 5.83E-02 1.75E+05 .
       144 GLY H 1.56E-01 1.99E+06 .
-      146 ASN H 3.02E-02 1.32E+03 .
+      146 GLN H 3.02E-02 1.32E+03 .
       147 SER H 3.22E-03 1.02E+01 .

    stop_





On 3 February 2011 13:25, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
Hi Eldon,

I've now implemented support for reading the monomeric_polymer
saveframe of NMR-STAR v2.1.  I have assumed that this is equivalent to
the entity saveframe in v3.x.  This is turning up some more sequence
problems in the v2.1 files.  I'll send these inconsistencies as a diff
once I have them all sorted out.

Regards,

Edward



On 2 February 2011 14:35, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
Hi Eldon,

I don't know if this is the best channel for this information.  Is
there a BMRB mailing list that would be better suited to it?

Ok, this is how I have found these inconsistencies.  I have used relax
to read in the BMRB NMR-STAR formatted files.  This uses bmrblib which
I wrote (http://gna.org/projects/bmrblib/).  This library is pretty
close to complete for relaxation data and model-free data, and would
be very easy to extend to handle the entirety of the NMR-STAR
dictionary.  It can both read and write valid NMR-STAR formatted files
in versions 2.1, 3.0, and 3.1 (a little debugging might be still
required, and expansion to different revisions such as 2.1.1 is also
possible).  This Python library is an abstraction of the underlying
file format.  The very low level reading and writing of the STAR
format is handled by Jurgen F. Doreleijers' pystarlib (jurgenfd att
gmail dott com, http://code.google.com/p/pystarlib/).

To read the entire BMRB model-free data content, I have done the
following.  I have downloaded all of the files listed at
http://www.bmrb.wisc.edu/search/query_grid/query_1_46.html using the
link http://www.bmrb.wisc.edu/ftp/pub/bmrb/compress/query_1_46.tar.gz.
These are all in the version 2.1 or 2.1.1 format.  Then, using the
file names, I have downloaded all of the corresponding v3.1 files from
http://www.bmrb.wisc.edu/ftp/pub/bmrb/entry_lists/nmr-star3.1/.  It
looks like roughly 30% of the old-format files have been converted to
the newer format so far.  I will write 2 subsequent emails explaining
the problems with the version 2.1 files and the 3.1 files separately.
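
The matching of old and new files by name can be sketched as follows.
This is a rough illustration only: it assumes that both sets of files
carry the entry number in the form bmrNNNN somewhere in the file name,
and the function names are made up for the example.

```python
import re

# Assumption: the entry number appears as "bmr<digits>" in every file name.
ENTRY_ID = re.compile(r"bmr(\d+)")


def entry_ids(file_names):
    """Extract the numeric entry IDs from names such as 'bmr4096.str'."""
    ids = set()
    for name in file_names:
        match = ENTRY_ID.search(name)
        if match:
            ids.add(int(match.group(1)))
    return ids


def conversion_coverage(v21_names, v31_names):
    """Return the converted entry IDs and the fraction of v2.1 entries
    for which a v3.1 file was also found."""
    old = entry_ids(v21_names)
    new = entry_ids(v31_names)
    converted = old & new
    fraction = len(converted) / len(old) if old else 0.0
    return sorted(converted), fraction
```

Feeding it the two directory listings gives the fraction of old files
that already have a v3.1 counterpart.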

In this mail, I would like to describe general problems.  The first is
that pystarlib cannot handle the semi-colon notation in non-free
looping tag categories, e.g.:

 loop_
    _Vendor.Name
    _Vendor.Address
    _Vendor.Electronic_address
    _Vendor.Entry_ID

    _Vendor.Software_ID

    'J. Patrick Loria' .
;
http://xbeams.chem.yale.edu/~loria/
patrick.loria@xxxxxxxx
; 15097 1
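
For reference, a minimal sketch of how the values of such a loop could
be split up, including the semi-colon delimited text, is given here.
This is not the pystarlib code and is not a complete STAR tokenizer
(comments, for instance, are ignored).  The function name and the
regular expression are assumptions made for the illustration.

```python
import re

# A quoted value ends at a quote followed by whitespace or end of line;
# anything else is a run of non-whitespace characters.
TOKEN = re.compile(r"""'(.*?)'(?=\s|$)|"(.*?)"(?=\s|$)|(\S+)""")


def tokenize_star_values(text):
    """Split the data part of a STAR loop into a flat list of values.

    A line starting with ';' opens a multi-line value which runs until
    the next line starting with ';'.  The remainder of the closing line
    is parsed as ordinary values.
    """
    values = []
    block = None  # Lines of the multi-line value being collected, if any.
    for line in text.splitlines():
        if line.startswith(";"):
            if block is None:
                # Opening delimiter, text may follow on the same line.
                block = [line[1:]] if line[1:] else []
                continue
            # Closing delimiter: store the value, then parse the rest.
            values.append("\n".join(block))
            block = None
            line = line[1:]
        elif block is not None:
            block.append(line)
            continue
        for match in TOKEN.finditer(line):
            values.append(next(g for g in match.groups() if g is not None))
    return values
```

Applied to the data rows of the Vendor loop above, this gives five
values, one for each of the five tags, with the two-line address kept
together as a single value.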

The Vendor loop above is from the v3.1 file bmr15097.str.  The basic
pystarlib functionality probably needs to be fixed, assuming this
construct is valid STAR.  The second problem is that the bmr4970.str
entry is not parsable.  This file has multiple 15N S2_parameters
saveframes:

save_S2_parameters_15N_22C
save_S2_parameters_15N_35C
save_S2_parameters_15N_47C
save_S2_parameters_15N_60C
save_S2_parameters_15N_73C


But these all have:

  loop_
     _Sample_label

     $sample_one

  stop_

  _Sample_conditions_label    $sample_conditions_one

They might be the same sample, but the sample conditions are different
as the temperature is changing.  By eye, this is obvious, but for the
automatic parsing of this data, the file has to be blacklisted and
skipped.
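
A rough sketch of how such files could be flagged for manual inspection
before parsing is given below.  It simply groups the S2_parameters
saveframes by their _Sample_conditions_label value.  The function names
and the line-based scanning are assumptions for the example, and a
shared label is not necessarily an error, so the result is only a hint
for where to look.

```python
import re
from collections import defaultdict

SAVEFRAME = re.compile(r"^\s*save_(\S+)")
CONDITIONS = re.compile(r"^\s*_Sample_conditions_label\s+(\S+)")


def frames_by_conditions(text, prefix="S2_parameters"):
    """Group saveframe names starting with prefix by conditions label."""
    groups = defaultdict(list)
    current = None  # Name of the saveframe currently being read.
    for line in text.splitlines():
        frame = SAVEFRAME.match(line)
        if frame:
            current = frame.group(1)
            continue
        if line.strip() == "save_":
            # A bare "save_" closes the current saveframe.
            current = None
            continue
        label = CONDITIONS.match(line)
        if label and current and current.startswith(prefix):
            groups[label.group(1)].append(current)
    return dict(groups)


def shared_conditions(text):
    """Return only the labels used by more than one saveframe."""
    groups = frames_by_conditions(text)
    return {label: names for label, names in groups.items() if len(names) > 1}
```

For bmr4970.str this would list all five saveframes under
$sample_conditions_one, which is the situation described above.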

Cheers,

Edward




