Re: [model-free] handling of missing data -- October 04, 2012

Hi Martin,

I'll answer below:

what is the best way of handling relaxation data that have no corresponding 
residues in the crystal structure?


You answered that yourself with the 3 points below :)

My first step is to read the PDB file and load the spins (e.g. residues 7 to 
61):

structure.read_pdb(file='./mypdb.pdb', dir=None, read_mol=None, 
set_mol_name=None, read_model=None, set_model_num=None, parser='internal')
structure.load_spins(spin_id='@N*', ave_pos=True)
structure.load_spins(spin_id='@H*', ave_pos=True)


If I try to load my relaxation data sets (ranging from residue 6 to 62), 
relax complains that there is no spin set up for residue 6 (and 62):

relax_data.read(ri_id='R1_600', ri_type='R1', frq=600179910.0, 
file='./600/rx_t1.out', res_num_col=2, res_name_col=3, spin_num_col=4, 
spin_name_col=5, data_col=6, error_col=7)
Opening the file './600/rx_t1.out' for reading.
RelaxError: The spin ':6@N' does not exist.


This is a problem!

I can imagine the following workarounds:

a) delete spin relaxation data for res. 6 and 62 from all data files,


You can do this, but it is an ugly solution which can cause problems
in the future.  I would never recommend it.  relax should be able to
handle this data (otherwise I would classify this as a bug).

b) create residues 6 and 62 by hand (residue.create, spin.create) and 
deselect them, or


I would suggest this method, but without deselecting them.  They will
be automatically deselected during the optimisation for the diffusion
models requiring bond vector orientations, but will be active for the
local tm models and spherical diffusion.  You could in the end use the
results from the local tm model for just these spins, combining this
with the solution from whichever model-free protocol you use.
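
As a rough sketch of how option b) could be prepared (plain Python,
not part of relax), the residues which have relaxation data but no
spin in the structure can be found with a set difference and turned
into script lines.  The helper names are hypothetical, and the keyword
arguments res_num, res_name and spin_name are assumptions which should
be checked against the residue.create and spin.create documentation:

```python
def missing_residues(data_res_nums, structure_res_nums):
    """Return the sorted residue numbers that have data but no structure."""
    return sorted(set(data_res_nums) - set(structure_res_nums))


def create_commands(res_nums, res_name=None, spin_name='N'):
    """Build relax script lines creating one residue and one spin per number.

    The keyword names used in the generated lines are assumptions.
    """
    lines = []
    for num in res_nums:
        lines.append("residue.create(res_num=%d, res_name=%r)" % (num, res_name))
        lines.append("spin.create(res_num=%d, spin_name=%r)" % (num, spin_name))
    return lines


# Residues 6 to 62 have relaxation data, residues 7 to 61 are in the PDB file.
missing = missing_residues(range(6, 63), range(7, 62))
commands = create_commands(missing)
print("\n".join(commands))
```

The printed lines can then be pasted into the script after the
structure.load_spins calls and before relax_data.read.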

c) read a sequence file containing all missing N spins, and "None" spins 
for all the spins I want to load from the pdb file.


If b) is too much work, then a sequence file containing just the spins
missing from the PDB could be used.  I would assume that relax would
complain if you try to read the spin data twice.
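
A minimal sketch of generating such a file in plain Python follows.
The three column layout (residue number, residue name, spin name) and
the residue names GLY and ALA are only placeholders for illustration.
The column arguments given to relax when reading the sequence would
have to match whatever layout is actually used:

```python
import io


def write_sequence(out, spins):
    """Write one whitespace-separated line per spin: res_num res_name spin_name."""
    for res_num, res_name, spin_name in spins:
        out.write("%-8d %-8s %-8s\n" % (res_num, res_name, spin_name))


# Hypothetical residue names; only residues 6 and 62 are missing from the PDB.
missing_spins = [(6, 'GLY', 'N'), (62, 'ALA', 'N')]

# Write to an in-memory buffer here; a real file object works the same way.
buffer = io.StringIO()
write_sequence(buffer, missing_spins)
sequence_text = buffer.getvalue()
print(sequence_text)
```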

So far I have been successful with a) but I don't like fiddling with my 
input files too much, as it quickly becomes a mess of different versions 
etc.


This option is not the best way.

Option b) gave me weird errors after t_m fitting ("deepcopy(spin2) 
IndexError: list assignment index out of range"), which I guess are related 
to the additional spins.


This is quite strange.  Your bug report (https://gna.org/bugs/?20213)
is the correct way to handle this.  I'll respond to that next.

I tried using c) but I ran into problems with the spin setup: Obviously 
relax set up spins of type "0" and wants me to set up the isotope type of 
it.


For option c), you will have to specify the isotope, but you'll have to
do this for the data from a PDB file as well.  You would also need to
specify the element:

spin.element(element='N', spin_id='@N*')

Though you might need this for PDB files as well because often the PDB
element column is missing.

So, what kind of approach do you have?


Both b) and c) should be fine.  If they don't work, then that is
likely to be due to a bug.  Unless there is a RelaxError complaining
about missing data, in which case you might need the spin.isotope
and/or spin.element user functions.

Regards,

Edward
