On 5 June 2014 15:15, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Hi Edward.

So, I have tried to directly implement the infrastructure data format for NO * NM * NS * NE, and the speed-up is 4.1x-4.5x. I think that is a very nice message for the release list. It is obvious that the largest speed-up will be gained by getting rid of the NS loop. Could one just reshape the numpy arrays in the target function?
Yes! OK, you will need a little more than that. The reshape will be inside the lib.dispersion modules, where we currently have the code:

back_calc[:] = R2eff

This would need to be replaced with:

back_calc[:] = R2eff.reshape(NE, NS, NM, NO, ND)

Or maybe keep the experiments separate, i.e. don't delete that loop in target_functions.relax_disp, as different experiments are sometimes associated with different lib.dispersion models, and then use:

back_calc[:] = R2eff.reshape(NS, NM, NO, ND)

As in your script (http://thread.gmane.org/gmane.science.nmr.relax.devel/6022/focus=6028), you will need to increase the dimensionality of some data structures:

g_ncyc = array(ncyc_list*100)

So, just as you have done in your script, you calculate one large 1D array of R2eff values for all spins, all magnetic field strengths, all offsets, and all dispersion points. Difficulties might occur for cases with missing data, but that is why I have implemented a number of system tests checking what happens when data is missing :)

The large speedups are not just for large spin clusters. You will also see large speedups for R1rho experiments where many offsets are collected, and there is a nice speedup when you have data at 2 or 3 magnetic fields.

Anyway, for testing, your script could be expanded to the multi-field and multi-offset cases. Or maybe make a new script for that profiling in test_suite/shared_data/dispersion/profiling/.

Regards,

Edward
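[Editor's note: the reshape idea discussed above can be sketched as follows. This is a minimal illustration only; the dimension sizes and the flat R2eff array are hypothetical stand-ins for what the vectorised lib.dispersion code would compute, not the actual relax implementation.]

```python
import numpy as np

# Hypothetical dimension sizes: NE experiments, NS spins, NM magnetic
# fields, NO offsets, ND dispersion points (illustrative values only).
NE, NS, NM, NO, ND = 1, 100, 2, 1, 10

# Stand-in for the large 1D array of R2eff values computed in a single
# vectorised call, covering all spins, fields, offsets and points.
R2eff = np.arange(NE * NS * NM * NO * ND, dtype=np.float64)

# Pre-allocated back-calculated data structure, as in the target function.
back_calc = np.empty((NE, NS, NM, NO, ND))

# The key step from the email: fill back_calc in place by reshaping the
# flat result, instead of looping over spins (the NS loop).
back_calc[:] = R2eff.reshape(NE, NS, NM, NO, ND)

print(back_calc.shape)
print(back_calc[0, 1, 0, 0, 0])  # first point of the second spin
```

Note that numpy's reshape on a C-contiguous array is a view, not a copy, so the assignment into the pre-allocated back_calc is the only data movement here, which is part of why dropping the per-spin loop pays off.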