Re: AIC or BIC




Posted by Edward d'Auvergne on May 22, 2007 - 10:45:
Hi,

The answer to your question is a matter of opinion.  Model selection
is a large statistical field and there may be even better techniques
for model-free analysis.  For instance the information complexity
(ICOMP) techniques may perform even better, although I never had a
chance to test these.  The important question is, how do you measure
the performance of these techniques?

In my paper on model-free model selection, I found that AIC and BIC
perform equally well when data from two field strengths are used
(specifically tested on 500 and 600 MHz data).  When single field
strength data was used, AIC performed slightly better than BIC (note
the word slightly!).  The way I measured performance was to compare
the results of model selection to the theoretical 'expected
discrepancy' EDelta.  This theoretical value, which can never be
measured, is what all the frequentist model selection techniques try
to estimate (AIC, BIC, cross validation, bootstrap, AICc, ICOMP, etc).
Because I used synthetic data, I knew what the true model-free
dynamics behind the relaxation data were and hence I could directly
calculate EDelta and use it for model selection.  This then gives the
gold standard for comparing frequentist model selection techniques.
It is not a standard for Bayesian model selection (which does not
include BIC, as BIC is a frequentist technique) or for hypothesis
testing model selection.
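
For reference, in the chi-squared form normally used for model-free
analysis the two criteria are AIC = chi2 + 2k and BIC = chi2 + k*ln(n),
where k is the number of model parameters and n is the number of
relaxation data points.  Below is a minimal Python sketch of selecting
a model with each criterion.  The function names, the model labels and
the chi-squared values are all made up for illustration; this is not
the relax API and the numbers are not from any paper.

```python
import math


def aic(chi2, k):
    """AIC in its chi-squared form: chi2 + 2k."""
    return chi2 + 2.0 * k


def bic(chi2, k, n):
    """BIC in its chi-squared form: chi2 + k*ln(n)."""
    return chi2 + k * math.log(n)


def select_model(fits, criterion):
    """Return the name of the model with the lowest criterion value.

    fits maps a model name to a (chi2, k) tuple.
    """
    return min(fits, key=lambda name: criterion(*fits[name]))


# Hypothetical optimised fits for a single spin: name -> (chi2, k).
fits = {"m1": (12.0, 1), "m2": (9.5, 2), "m4": (9.4, 3)}

# Assumed data set: R1, R2 and NOE at two field strengths.
n = 6

best_aic = select_model(fits, aic)
best_bic = select_model(fits, lambda chi2, k: bic(chi2, k, n))
print("AIC selects", best_aic, "and BIC selects", best_bic)
```

Note that the BIC penalty per parameter, ln(n), is smaller than the
AIC penalty of 2 whenever n is 7 or less, which covers both the 3 and
6 data point cases.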

In the Wright paper (Chen et al., 2004) AIC, BIC, and hypothesis
testing model selection were compared with one another.  It should be
noted that the hypothesis testing model selection utilised was not
that of Mandel et al. (1995).  The details are in the first column on
page 247 of the paper.  The alpha critical levels chosen are in the
second column of page 250.  The technique has been significantly
modified to prevent under-fitting and hence probably, yet
unintentionally, forced to closely replicate the results of AIC model
selection.

Hypothesis testing model selection is very subjective: through
careful construction of the sequence in which tests are carried out,
careful selection of the alpha critical levels, and the choice of
where chi-squared versus F-tests are used, many different results can
be had.  For
example by using a step up procedure - starting the tests at the
simplest model and ending at the most complex - the final results will
deliberately under-fit.  If you use a step down procedure - starting
at the most complex model and finishing at the simplest - the results
will deliberately be over-fit.  By careful construction of the
hypothesis testing selection procedure I could closely replicate the
results of many of the frequentist model selection techniques.  This
could be one of the reasons why many people say that you can tweak
statistics to pull out any result you want.
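
To make the dependence on test order concrete, here is a small Python
sketch of a step up and a step down procedure run over the same fits.
It is only an illustration under my own assumptions: the models are
nested, each one adds exactly one parameter to the previous one, and a
chi-squared difference test with one degree of freedom at alpha = 0.05
is used at every step.  It is not the scheme of Mandel et al. (1995)
nor that of Chen et al. (2004), and the chi-squared values are
invented.

```python
# Chi-squared critical value for 1 degree of freedom at alpha = 0.05.
CRIT_1DOF_5PCT = 3.841


def step_up(chi2s, crit=CRIT_1DOF_5PCT):
    """Start at the simplest model and move to the next more complex
    one only while the chi-squared improvement is significant."""
    i = 0
    while i + 1 < len(chi2s) and chi2s[i] - chi2s[i + 1] > crit:
        i += 1
    return i


def step_down(chi2s, crit=CRIT_1DOF_5PCT):
    """Start at the most complex model and move to the next simpler
    one only while the chi-squared loss is not significant."""
    i = len(chi2s) - 1
    while i > 0 and chi2s[i - 1] - chi2s[i] <= crit:
        i -= 1
    return i


# Hypothetical chi-squared values, ordered from simplest to most complex.
chi2s = [20.0, 18.0, 10.0]

print("step up selects model index", step_up(chi2s))
print("step down selects model index", step_down(chi2s))
```

With these invented values the first extra parameter does not help
much on its own, so the step up procedure stops at the simplest model,
whereas the step down procedure refuses to leave the most complex one.
Same data, same critical value, different answer.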

In the Chen et al. (2004) paper a 10 ns MD simulation was assumed to
be the true, and hence known, dynamics of the system and this data was
used for validation.  BIC was reasoned to be better than AIC and
hypothesis testing.  This conclusion is mainly from Figure 6 and Table
1.  In this case, the modified hypothesis testing was used as the
standard by which the techniques are compared!  It should be noted
from Figure 6a that the differences aren't huge.  The fact that the
hypothesis testing model selection is closely replicating the results
of AIC is quite likely due to the implementation details of that model
selection scheme.  It would be interesting to see how the original
technique of Mandel et al. (1995) compares in this study as this is
the technique which everyone is using.

Now, in later work I found that model-free models can fail (requiring
model elimination) and that optimisation can fail, preventing the
true dynamics from being found.  Both problems will influence model
selection.  My original work and that of Chen et al. (2004) are both
biased by these issues and hence the subtle differences in the
conclusions could completely be due to these problems rather than
anything to do with the subtle performance differences of the
techniques within model-free analysis!  So, using AIC or BIC is a
matter of opinion and is completely your choice (relax will do both).

Sorry for making this more complicated than you were probably expecting.

Regards,

Edward



On 5/22/07, Hongyan Li <hylichem@xxxxxxxxxxxx> wrote:
Dear relax-users,
I wonder whether BIC should be chosen for model selection for relaxation
data obtained at a single field, and AIC for data obtained at two field
strengths, when using the relax program. I just read the Wright PE paper
in the Journal of Biomolecular NMR 29: 243–257, 2004.

Cheers,

Hongyan

Dr. Hongyan Li
Department of Chemistry
The University of Hong Kong
Pokfulam Road
Hong Kong


_______________________________________________
relax (http://nmr-relax.com)

This is the relax-users mailing list
relax-users@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users



