Code sharing in the automatic analyses. -- January 24, 2010

Hi Michael,

I've noticed that in the GUI, almost all of the code from the sample
scripts has been duplicated.  This was obviously the best first
approach.  However, I am going to work on an alternative, as code
should never be duplicated (unless the second copy is heavily modified
and has a different purpose).  The reason for avoiding code
duplication is best demonstrated by an example.  The
'full_analysis.py' script currently performs AIC model selection.
This is crucial for solving the universal solution equation I laid
out in:

d'Auvergne E. J., Gooley P. R. (2007). Set theory formulation of the
model-free problem and the diffusion seeded model-free paradigm. Mol.
Biosyst., 3(7), 483-494. (http://dx.doi.org/10.1039/b702202f).

As a side note, BIC model selection cannot be used as it does not,
and cannot, solve this equation!  But there is a far more advanced
model selection technique which may be better than AIC.  This is the
information complexity criterion ICOMP.  Therefore in the future, the
analysis might switch to ICOMP.  In properly designed code, this
switch should only occur in one place, and both the
'full_analysis.py' script and the GUI should automatically use the
change.  We should not have to change this complex protocol in two
places - this is really bad form.
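
As a rough illustration of the single point of change, here is a
minimal sketch.  All of the names in it (MODEL_SELECTION_METHOD,
run_protocol, script_entry_point, gui_entry_point) are hypothetical
and are not part of relax; it only shows the idea of both front ends
calling the same shared code.

```python
# Hypothetical sketch, not relax code: one shared protocol function
# owns the choice of model selection criterion.

# The single place where the criterion is chosen.
MODEL_SELECTION_METHOD = 'AIC'


def run_protocol(models, method=None):
    """Return the (criterion, models) pair the shared protocol would use."""
    # Fall back to the shared default when the caller does not override it.
    if method is None:
        method = MODEL_SELECTION_METHOD
    return method, list(models)


def script_entry_point():
    """Stand-in for the sample script."""
    return run_protocol(['m0', 'm1', 'm2'])


def gui_entry_point():
    """Stand-in for the GUI."""
    return run_protocol(['m0', 'm1', 'm2'])
```

With this layout, switching the criterion means editing
MODEL_SELECTION_METHOD once, and neither entry point needs to be
touched.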

Therefore I will make the following changes.  The code part of
'full_analysis.py' will be shifted into a relax module called
auto_analyses.dauvergne_protocol
(auto_analyses/dauvergne_protocol.py).  The Dauvergne_protocol class
in this module can then be imported and all the variables currently
at the top of the script passed into it, for example:

Dauvergne_protocol(diff_model=DIFF_MODEL, mf_models=MF_MODELS,
local_tm_models=LOCAL_TM_MODELS, pdb_file=PDB_FILE, seq_args=SEQ_ARGS,
het_name=HET_NAME, relax_data=RELAX_DATA, unres=UNRES,
exclude=EXCLUDE, bond_length=BOND_LENGTH, csa=CSA, hetnuc=HETNUC,
proton=PROTON, grid_inc=GRID_INC, min_algor=MIN_ALGOR, mc_num=MC_NUM,
conv_loop=CONV_LOOP)

Both the script and the GUI will then initialise this class for their
implementation of the automatic model-free analysis.  I will make the
change to the full_analysis.py script and then eliminate the
duplicated code from the GUI modules.  I would like the same thing to
happen with all the automatic analyses implemented both in the sample
scripts and the GUI.  This could also be useful if the protocol is
parallelised, as then both the sample script and the GUI can instantly
make better use of running on clusters.
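
To make the sharing concrete, the following is a minimal sketch of
how the script and the GUI might both drive the same class.  The
class body, the settings() method and the gui_fields dictionary are
assumptions for illustration only, just a subset of the arguments
listed above is shown, and the values are placeholders rather than
recommended settings.

```python
# Hypothetical sketch of the shared class; the real relax module may differ.

class Dauvergne_protocol:
    """Stand-in for the class in auto_analyses/dauvergne_protocol.py."""

    def __init__(self, diff_model=None, mf_models=None, mc_num=None,
                 conv_loop=None):
        # Store the settings handed in by the caller.
        self.diff_model = diff_model
        self.mf_models = list(mf_models or [])
        self.mc_num = mc_num
        self.conv_loop = conv_loop

    def settings(self):
        """Return the stored settings as a dictionary."""
        return {
            'diff_model': self.diff_model,
            'mf_models': self.mf_models,
            'mc_num': self.mc_num,
            'conv_loop': self.conv_loop,
        }


# Sample script side: variables defined at the top of the script.
# The values are placeholders for illustration only.
DIFF_MODEL = 'sphere'
MF_MODELS = ['m0', 'm1']
MC_NUM = 500
CONV_LOOP = True

from_script = Dauvergne_protocol(diff_model=DIFF_MODEL, mf_models=MF_MODELS,
                                 mc_num=MC_NUM, conv_loop=CONV_LOOP)

# GUI side: the same class, fed from values collected from
# (hypothetical) input fields.
gui_fields = {'diff_model': 'sphere', 'mf_models': ['m0', 'm1'],
              'mc_num': 500, 'conv_loop': True}
from_gui = Dauvergne_protocol(**gui_fields)
```

Both front ends end up with an identically configured object, so any
later change to the protocol itself lives only in the shared class.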

Note that the code in auto_analyses will go through the user function
interface (not as a script though) to provide print outs from the user
function calls.  For better GUI interaction in the future, these will
be easily replaceable.

Regards,

Edward