Re: r25491 - /trunk/specific_analyses/relax_disp/data.py


Posted by Edward d'Auvergne on September 01, 2014 - 12:07:
Hi,

For the DPL94 model, the symbolic gradient, Hessian, and Jacobian
matrices are very basic.  These could be added to
lib.dispersion.dpl94, and then maybe exposed as the target function
class methods dfunc_DPL94(), d2func_DPL94(), and jacobian_DPL94().
For a permanent reference, it would be great to have these added to
the relax manual, maybe as a new section in Chapter 14, "Optimisation
of relaxation data - values, gradients, and Hessians".  The symbolic
derivatives for all the other analytic models should also not be too
complicated, I hope.  Anyway, if you are interested in having this
functionality, we could incorporate Numdifftools into relax to
provide slow numerical estimates for the other dispersion models
(https://code.google.com/p/numdifftools/).  One day I might
incorporate this into minfx as well, so that minfx can automatically
use numerical gradients and Hessians for optimisation when the user
does not provide them.
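
For illustration, here is a sketch of what these symbolic derivatives
might look like, assuming the DPL94 equation R1rho = R1*cos(theta)**2
+ (R1rho' + phi_ex*kex / (kex**2 + omega_eff**2))*sin(theta)**2 with
R1rho', phi_ex, and kex as the fitted parameters.  The function name
and signature are hypothetical, not the actual lib.dispersion.dpl94
interface:
-------------------------------------------------
from numpy import array, sin

def dpl94_jacobian(r1rho_prime, phi_ex, kex, theta, omega_eff):
    """The partial derivatives of the DPL94 R1rho equation with
    respect to the three fitted parameters R1rho', phi_ex, and kex."""

    sin2 = sin(theta)**2
    denom = kex**2 + omega_eff**2

    # d(R1rho)/d(R1rho'):  the exchange-free term scales with sin^2(theta).
    d_r1rho_prime = sin2

    # d(R1rho)/d(phi_ex).
    d_phi_ex = sin2 * kex / denom

    # d(R1rho)/d(kex), via the quotient rule.
    d_kex = sin2 * phi_ex * (omega_eff**2 - kex**2) / denom**2

    return array([d_r1rho_prime, d_phi_ex, d_kex])
-------------------------------------------------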

Cheers,

Edward



On 1 September 2014 11:49, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Hi Edward.

I think you don't randomize for R1.

This looks like a bug.
Ugh.

Will you submit this?


If R1 is fitted, then one can just take the fitted values.

I have just derived the Jacobian and Hessian for DPL94:
wiki.nmr-relax.com/DPL94_derivatives

When the Jacobians are defined like this, the only thing necessary is:
-------------------------------------------------
    def func_chi2_grad(self, params=None, times=None, values=None, errors=None):
        """Target function for the chi-squared gradient to minfx, for the exponential fit.

        This requires transpose and zeros from numpy, and dchi2 from target_functions.chi2.

        @keyword params:    The vector of parameter values.
        @type params:       numpy rank-1 float array
        @keyword times:     The time points.
        @type times:        numpy array
        @keyword values:    The measured intensity values.
        @type values:       numpy array
        @keyword errors:    The standard deviations of the measured intensity values per time point.
        @type errors:       numpy array
        @return:            The chi-squared gradient - the 'm' rows of function derivatives per 'n' columns of parameters, summed over the data points into a rank-1 array of length 'n'.
        @rtype:             numpy rank-1 float array
        """

        # Back calculate the function values.
        back_calc = self.func(params=params)

        # The Jacobian of the exponential function, i.e. the partial derivatives with respect to r2eff and i0.
        grad = self.func_grad(params=params)

        # Transpose, to obtain one row of function derivatives per parameter.
        grad_t = transpose(grad)

        # The number of fitted parameters.
        n = len(params)

        # The array to hold the chi-squared gradient.
        jacobian_chi2_minfx = zeros([n])

        # Calculate the chi-squared gradient, summing over all data points.
        dchi2(dchi2=jacobian_chi2_minfx, M=n, data=values, back_calc_vals=back_calc, back_calc_grad=grad_t, errors=errors)

        # Return the chi-squared gradient.
        return jacobian_chi2_minfx
----------------------------------------------------
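
For reference, what dchi2() fills in should be equivalent to the
direct numpy computation below.  This is a sketch only, assuming the
standard weighted least-squares chi-squared chi2 = sum_i ((y_i - f_i)
/ sigma_i)**2 and a grad_t array with one row per parameter and one
column per data point:
-------------------------------------------------
from numpy import sum as npsum

def chi2_gradient(data, back_calc, grad_t, errors):
    """dchi2/dtheta_j = -2 * sum_i (y_i - f_i) / sigma_i**2 * df_i/dtheta_j."""

    residuals = data - back_calc
    weights = 1.0 / errors**2
    return -2.0 * npsum(weights * residuals * grad_t, axis=1)
-------------------------------------------------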




2014-09-01 10:51 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
Hi,

Please see below:

On 1 September 2014 10:20, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Yes.

That was a seriously hard bug to find.

Especially when you consider the MC simulations to be the "gold standard".
And then the "gold standard" is wrong...

Ouch.

True!  But there are bugs everywhere, and you should never assume
that any part of relax, or any software in general, is bug free.
Never trust black boxes!  This is a good lesson ;)  All software has
bugs, and this will not be the last one in relax.


Should we set the GUI to have exp_sim = -1?
There is no reason to assume that 100,000 simulations of exponential
fits are better than the covariance estimate.

In case someone has peak intensity data which is much harder to
optimise, for example extremely noisy data or data which is not
exactly mono-exponential, this is not a good idea.  It is better to
spend the time to obtain the best result rather than a quick result
which works in most cases but is known to fail theoretically.  You
don't want to be in that theoretical failure group and not know about
it.  So the users can set it themselves, but they should compare the
result to MC simulations anyway to be sure.

Note that the curvature of the optimisation space for the dispersion
models is far more complicated than that of the 2-parameter
exponential curves.  For the dispersion models, the covariance matrix
approach will not be anywhere near as good.  For these models, most big names
in the field would never consider the covariance matrix approach.
Many people are wary of the edge-case failures of this technique.  For
the best results you would always pick the best technique which, by
statistical theory, is Monte Carlo simulations by far.


Btw.

Can we check the Monte Carlo simulations for the dispersion models?

That's a great idea!  This is probably also untested in the test
suite.  The covariance matrix approach is perfect for checking that
the Monte Carlo results are reasonable.  However, you do require the
Jacobian matrix, which has not been derived for any dispersion model.
No gradients have been derived either, though this could be done
numerically in the test_suite/shared_data/dispersion directories
using the very useful Numdifftools package
(https://pypi.python.org/pypi/Numdifftools).
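
A numerical Jacobian could be obtained along these lines.  This is a
sketch only, reusing the DPL94 equation from above with made-up
values, not actual relax code:
-------------------------------------------------
import numdifftools as nd
from numpy import array, cos, sin

# Fixed experimental quantities (illustrative values only).
r1 = 1.5
theta = array([1.2, 1.4, 1.5])                # Rotating-frame tilt angles.
omega_eff = array([1000.0, 2000.0, 5000.0])   # Effective field strengths (rad/s).

def dpl94(params):
    """Back calculate the R1rho values for the parameter vector."""

    r1rho_prime, phi_ex, kex = params
    rex = phi_ex * kex / (kex**2 + omega_eff**2)
    return r1 * cos(theta)**2 + (r1rho_prime + rex) * sin(theta)**2

# The numerical Jacobian - one row per dispersion point, one column per parameter.
jacobian = nd.Jacobian(dpl94)(array([10.0, 50000.0, 2000.0]))
-------------------------------------------------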

Or an even better way would be to create the
error_analysis.covariance_matrix user function which, like the
pipe_control.monte_carlo module, uses the specific API to obtain, in
this case, the Jacobian and weighting matrix via one new method,
calls lib.statistics.multifit_covar() to create the covariance
matrix, and then calls the API again to set the errors via the
already existing api.set_error() API method.  Then you could use the
covariance matrix approach for all the dispersion models.  Due to the
licensing of Numdifftools, we could even bundle it with relax in the
extern package and use numerical Jacobian estimates, so that even the
numeric dispersion models can have a covariance matrix.
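
The covariance calculation itself is standard weighted least-squares
theory.  Here is a minimal numpy sketch of what a multifit_covar()
style function computes, assuming a Jacobian J with one row per data
point and one column per parameter, together with the per-point
errors:
-------------------------------------------------
from numpy import diag, dot, sqrt
from numpy.linalg import inv

def covariance_matrix(J, errors):
    """Estimate the parameter covariance matrix as inv(J^T.W.J).

    J is the m x n Jacobian of the model at the minimum, and W the
    diagonal weighting matrix built from the measurement errors.
    """

    W = diag(1.0 / errors**2)
    return inv(dot(J.T, dot(W, J)))

def parameter_errors(J, errors):
    """The parameter errors as the square roots of the covariance diagonal."""

    return sqrt(diag(covariance_matrix(J, errors)))
-------------------------------------------------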


Where is that performed?

The specific analysis API.  See the functions in the
pipe_control.monte_carlo module.  The API object is obtained as:

    # The specific analysis API object.
    api = return_api()

Then you can see the methods called in
specific_analyses.relax_disp.api as, for example in the
pipe_control.monte_carlo module:

        # Create the Monte Carlo data.
        if method == 'back_calc':
            data = api.create_mc_data(data_index)

You will therefore find the create_mc_data() method in the dispersion
API module.  If you search for all of the api.*() calls, then you'll
find all those methods in the API object (or the default with a
RelaxImplementError in the API base class).  It's rather simple.
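
As a sketch of that base class pattern (illustrative only, and
assuming the RelaxImplementError class lives in lib.errors - this is
not the actual relax source):
-------------------------------------------------
from lib.errors import RelaxImplementError

class Api_base:
    """Sketch of the API base class - every method raises by default."""

    def create_mc_data(self, data_index=None):
        # Each specific analysis must override this method.
        raise RelaxImplementError

class Api_relax_disp(Api_base):
    """Sketch of a specific analysis overriding the default."""

    def create_mc_data(self, data_index=None):
        # Back calculate the dispersion data for the given index (details omitted).
        pass
-------------------------------------------------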


Do you randomize only R1rho', or do you also randomize R1?

This is performed in the pipe_control.monte_carlo.create_data()
function.  See if you can trace the API method calls back and find
the source of this!  It would be good if you checked as well.

Cheers,

Edward


