Re: r24282 - /branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py -- June 24, 2014

Hi Troels,

I just checked this change using the master profiling script (1
iteration and only the 'NS MMQ 2-site' model):

"""
$ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2
/data/relax/branches/disp_spin_speed
[snip]

New relax version:  relax repository checkout r24288
svn+ssh://bugman@xxxxxxxxxxx/svn/relax/branches/disp_spin_speed
Old relax version:  relax repository checkout r24274
svn+ssh://bugman@xxxxxxxxxxx/svn/relax/branches/disp_spin_speed

Execution iteration 1

$ python profiling_ns_mmq_2site.py /data/relax/branches/disp_spin_speed2
     1000    0.026    0.000   25.812    0.026
relax_disp.py:1450(func_ns_mmq_2site)
       10    0.003    0.000   25.587    2.559
relax_disp.py:1450(func_ns_mmq_2site)
$ python profiling_ns_mmq_2site.py /data/relax/branches/disp_spin_speed
     1000    0.025    0.000   25.860    0.026
relax_disp.py:1427(func_ns_mmq_2site)
       10    0.002    0.000   25.504    2.550
relax_disp.py:1427(func_ns_mmq_2site)

[snip]

100 single spins analysis:
NS MMQ 2-site:             258.600+/-0.000 -> 258.120+/-0.000,   1.002x 
faster.

Cluster of 100 spins analysis:
NS MMQ 2-site:             255.040+/-0.000 -> 255.870+/-0.000,   0.997x 
faster.
"""

This is strange!  I guess that the numpy.einsum() function is
performing its own internal Python looping, not using the highly
optimised and multi-threaded BLAS/LAPACK libraries, and hence this
change does not influence the speed.  This is nevertheless very useful
and important for the eventual elimination of the Python loops in the
lib.dispersion.ns_mmq_2site module.  But it is interesting that
numpy.einsum() is performing the same as Python looping in this case.
I guess that future numpy versions will have far greater optimisation
for numpy.einsum() so that this change will be significant later on.

Cheers,

Edward


On 24 June 2014 15:27,  <tlinnet@xxxxxxxxxxxxx> wrote:

Author: tlinnet
Date: Tue Jun 24 15:27:51 2014
New Revision: 24282

URL: http://svn.gna.org/viewcvs/relax?rev=24282&view=rev
Log:
Speeded up ns mmq 2site, by moving the forming of evolution matrix out of 
the for loops, and preform it.

Task #7807 (https://gna.org/task/index.php?7807): Speed-up of dispersion 
models for Clustered analysis.

Modified:
    branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py

Modified: branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py
URL: 
http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py?rev=24282&r1=24281&r2=24282&view=diff
==============================================================================
--- branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py     (original)
+++ branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py     Tue Jun 24 
15:27:51 2014
@@ -51,7 +51,7 @@

 # Python module imports.
 from math import floor
-from numpy import array, conj, complex64, dot, float64, log, multiply, sum
+from numpy import array, conj, complex64, dot, einsum, float64, log, 
multiply, sum

 # relax module imports.
 from lib.float import isNaN
@@ -197,8 +197,20 @@
     M2_mat = matrix_exponential_rank_NS_NM_NO_ND_x_x(m2_mat, 
dtype=complex64)

     # The complex conjugates M1* and M2*
+    # Equivalent to D+*.
     M1_star_mat = conj(M1_mat)
+    # Equivalent to Z-*.
     M2_star_mat = conj(M2_mat)
+
+    # Repetitive dot products (minimised for speed).
+    M1_M2_mat = einsum('...ij,...jk', M1_mat, M2_mat)
+    M2_M1_mat = einsum('...ij,...jk', M2_mat, M1_mat)
+    M1_M2_M2_M1_mat = einsum('...ij,...jk', M1_M2_mat, M2_M1_mat)
+    M2_M1_M1_M2_mat = einsum('...ij,...jk', M2_M1_mat, M1_M2_mat)
+    M1_M2_star_mat = einsum('...ij,...jk', M1_star_mat, M2_star_mat)
+    M2_M1_star_mat = einsum('...ij,...jk', M2_star_mat, M1_star_mat)
+    M1_M2_M2_M1_star_mat = einsum('...ij,...jk', M1_M2_star_mat, 
M2_M1_star_mat)
+    M2_M1_M1_M2_star_mat = einsum('...ij,...jk', M2_M1_star_mat, 
M1_M2_star_mat)

     # Loop over spins.
     for si in range(NS):
@@ -210,79 +222,68 @@

                 # Loop over the time points, back calculating the R2eff 
values.
                 for i in range(num_points_i):
-                    # The M1 and M2 matrices.
-                    # Equivalent to D+.
-                    M1_i = M1_mat[si, mi, oi, i]
-                    # Equivalent to Z-.
-                    M2_i = M2_mat[si, mi, oi, i]
-
-                    # The complex conjugates M1* and M2*
-                    # Equivalent to D+*.
-                    M1_star_i = M1_star_mat[si, mi, oi, i]
-                    # Equivalent to Z-*.
-                    M2_star_i = M2_star_mat[si, mi, oi, i]
-
-                    # Repetitive dot products (minimised for speed).
-                    M1_M2 = dot(M1_i, M2_i)
-                    M2_M1 = dot(M2_i, M1_i)
-                    M1_M2_M2_M1 = dot(M1_M2, M2_M1)
-                    M2_M1_M1_M2 = dot(M2_M1, M1_M2)
-                    M1_M2_star = dot(M1_star_i, M2_star_i)
-                    M2_M1_star = dot(M2_star_i, M1_star_i)
-                    M1_M2_M2_M1_star = dot(M1_M2_star, M2_M1_star)
-                    M2_M1_M1_M2_star = dot(M2_M1_star, M1_M2_star)
+                    # Extract data from array.
+                    power_i = power[si, mi, oi, i]
+                    M1_M2_i = M1_M2_mat[si, mi, oi, i]
+                    M1_M2_star_i = M1_M2_star_mat[si, mi, oi, i]
+                    M2_M1_i = M2_M1_mat[si, mi, oi, i]
+                    M2_M1_star_i = M2_M1_star_mat[si, mi, oi, i]
+                    M1_M2_M2_M1_i = M1_M2_M2_M1_mat[si, mi, oi, i]
+                    M2_M1_M1_M2_star_i = M2_M1_M1_M2_star_mat[si, mi, oi, 
i]
+                    M2_M1_M1_M2_i = M2_M1_M1_M2_mat[si, mi, oi, i]
+                    M1_M2_M2_M1_star_i = M1_M2_M2_M1_star_mat[si, mi, oi, 
i]

                     # Special case of 1 CPMG block - the power is zero.
-                    if power[si, mi, oi, i] == 1:
+                    if power_i == 1:
                         # M1.M2.
-                        A = M1_M2
+                        A = M1_M2_i

                         # M1*.M2*.
-                        B = M1_M2_star
+                        B = M1_M2_star_i

                         # M2.M1.
-                        C = M2_M1
+                        C = M2_M1_i

                         # M2*.M1*.
-                        D = M2_M1_star
+                        D = M2_M1_star_i

                     # Matrices for even number of CPMG blocks.
-                    elif power[si, mi, oi, i] % 2 == 0:
+                    elif power_i % 2 == 0:
                         # The power factor (only calculate once).
-                        fact = int(floor(power[si, mi, oi, i] / 2))
+                        fact = int(floor(power_i / 2))

                         # (M1.M2.M2.M1)^(n/2).
-                        A = square_matrix_power(M1_M2_M2_M1, fact)
+                        A = square_matrix_power(M1_M2_M2_M1_i, fact)

                         # (M2*.M1*.M1*.M2*)^(n/2).
-                        B = square_matrix_power(M2_M1_M1_M2_star, fact)
+                        B = square_matrix_power(M2_M1_M1_M2_star_i, fact)

                         # (M2.M1.M1.M2)^(n/2).
-                        C = square_matrix_power(M2_M1_M1_M2, fact)
+                        C = square_matrix_power(M2_M1_M1_M2_i, fact)

                         # (M1*.M2*.M2*.M1*)^(n/2).
-                        D = square_matrix_power(M1_M2_M2_M1_star, fact)
+                        D = square_matrix_power(M1_M2_M2_M1_star_i, fact)

                     # Matrices for odd number of CPMG blocks.
                     else:
                         # The power factor (only calculate once).
-                        fact = int(floor((power[si, mi, oi, i] - 1) / 2))
+                        fact = int(floor((power_i - 1) / 2))

                         # (M1.M2.M2.M1)^((n-1)/2).M1.M2.
-                        A = square_matrix_power(M1_M2_M2_M1, fact)
-                        A = dot(A, M1_M2)
+                        A = square_matrix_power(M1_M2_M2_M1_i, fact)
+                        A = dot(A, M1_M2_i)

                         # (M1*.M2*.M2*.M1*)^((n-1)/2).M1*.M2*.
-                        B = square_matrix_power(M1_M2_M2_M1_star, fact)
-                        B = dot(B, M1_M2_star)
+                        B = square_matrix_power(M1_M2_M2_M1_star_i, fact)
+                        B = dot(B, M1_M2_star_i)

                         # (M2.M1.M1.M2)^((n-1)/2).M2.M1.
-                        C = square_matrix_power(M2_M1_M1_M2, fact)
-                        C = dot(C, M2_M1)
+                        C = square_matrix_power(M2_M1_M1_M2_i, fact)
+                        C = dot(C, M2_M1_i)

                         # (M2*.M1*.M1*.M2*)^((n-1)/2).M2*.M1*.
-                        D = square_matrix_power(M2_M1_M1_M2_star, fact)
-                        D = dot(D, M2_M1_star)
+                        D = square_matrix_power(M2_M1_M1_M2_star_i, fact)
+                        D = dot(D, M2_M1_star_i)

                     # The next lines calculate the R2eff using a two-point 
approximation, i.e. assuming that the decay is mono-exponential.
                     A_B = dot(A, B)
@@ -353,6 +354,11 @@
     A_pos_mat =  matrix_exponential_rank_NS_NM_NO_ND_x_x(m1_mat, 
dtype=complex64)
     A_neg_mat =  matrix_exponential_rank_NS_NM_NO_ND_x_x(m2_mat, 
dtype=complex64)

+    # The evolution for one n.
+    evol_block_mat = einsum('...ij,...jk', A_neg_mat, A_pos_mat)
+    evol_block_mat = einsum('...ij,...jk', A_neg_mat, evol_block_mat)
+    evol_block_mat = einsum('...ij,...jk', A_pos_mat, evol_block_mat)
+
     # Loop over spins.
     for si in range(NS):
         # Loop over the spectrometer frequencies.
@@ -364,15 +370,12 @@

                 # Loop over the time points, back calculating the R2eff 
values.
                 for i in range(num_points_i):
-                    # The A+/- matrices.
-                    A_pos_i = A_pos_mat[si, mi, oi, i]
-                    A_neg_i = A_neg_mat[si, mi, oi, i]
-
-                    # The evolution for one n.
-                    evol_block = dot(A_pos_i, dot(A_neg_i, dot(A_neg_i, 
A_pos_i)))
+                    # Extract data from array.
+                    power_i = power[si, mi, oi, i]
+                    evol_block_i = evol_block_mat[si, mi, oi, i]

                     # The full evolution.
-                    evol = square_matrix_power(evol_block, power[si, mi, 
oi, i])
+                    evol = square_matrix_power(evol_block_i, power_i)

                     # The next lines calculate the R2eff using a two-point 
approximation, i.e. assuming that the decay is mono-exponential.
                     Mx = dot(F_vector, dot(evol, M0))


_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-commits mailing list
relax-commits@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-commits

Re: r24282 - /branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py

Header

Content

Related Messages