Re: Why does one have to loop over the dispersion points?



Posted by Troels Emtekær Linnet on May 09, 2014 - 16:58:
Before I do anything, because I don't understand: "keep the returning
of the data structure for the
branch!".

I will:
Kill the looping in the library functions of B14, CR72 and TSMFK01.
Return one large array filled with 1e100 if there is any violation of the math domain.

I will add a return to the library function, and then update the
class object.
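A minimal sketch of that pattern (the function and attribute names here are
hypothetical, not the actual relax API):

from numpy import isfinite, ones

def lib_back_calc(params, cpmg_frqs):
    # Hypothetical library function: build and return the whole R2eff array.
    back_calc = params / cpmg_frqs    # placeholder for the real dispersion maths
    if not isfinite(back_calc).all():
        # Math domain violation somewhere - return one large array instead.
        return ones(cpmg_frqs.shape) * 1e100
    return back_calc

class Dispersion:
    # Hypothetical calling class: store whatever the library function returns.
    def calc(self, params, cpmg_frqs):
        self.back_calc = lib_back_calc(params, cpmg_frqs)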

Best
troels



2014-05-09 16:07 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
That might work - but it will require testing.  I really don't know
what will happen.  The current test suite should be sufficient for the
testing.  Also, keep the returning of the data structure for the
branch!

Regards,

Edward

On 9 May 2014 16:04, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Would you agree on discarding the whole loop and returning one array with
all values set to 1e100, rather than having some elements be 1e100 and
others rather high values?

Best
Troels

2014-05-09 15:58 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
Hi,

To set up a branch, just read the 3 small sections of the 'Branches'
section of the user manual
(http://www.nmr-relax.com/manual/Branches.html).  Using git here is
fatal.  But all the commands you need are listed there.  Try the CR72
optimisations first though.  Those will then make the API changes much
easier.

Regards,

Edward



On 9 May 2014 15:34, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
How do I setup a branch? :-)

Best
Troels

2014-05-09 15:31 GMT+02:00 Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx>:
Hi Edward.

I really think that, for my case, a 25% speed-up is a big deal!
I have so much data to crunch that a 25% speed-up is absolutely perfect.

I would only optimise this for CR72 and TSMFK01, since these are the
ones I need now.
And the code change is only 3-5 lines?

And I was thinking of one thing more.

CR72 always goes through this loop:

-----------
    # Loop over the time points, back calculating the R2eff values.
    for i in range(num_points):
        # The full eta+/- values.
        etapos = etapos_part / cpmg_frqs[i]
        etaneg = etaneg_part / cpmg_frqs[i]

        # Catch large values of etapos going into the cosh function.
        if etapos > 100:
            back_calc[i] = 1e100
            continue

        # The arccosh argument - catch invalid values.
        fact = Dpos * cosh(etapos) - Dneg * cos(etaneg)
        if fact < 1.0:
            back_calc[i] = r20_kex
            continue

        # The full formula.
        back_calc[i] = r20_kex - cpmg_frqs[i] * arccosh(fact)
------------
I would rather do:

etapos = etapos_part / cpmg_frqs

and then check for NaN values.
If any are present, just return the whole array filled with 1e100
instead of setting single values.
That would replace the loop with a single check.
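A minimal sketch of what that vectorised version could look like (the variable
names follow the quoted CR72 snippet, the function signature itself is
hypothetical, and, as proposed here, any invalid point turns the whole returned
array into 1e100):

from numpy import arccosh, cos, cosh, ones

def r2eff_cr72_vectorised(r20_kex, etapos_part, etaneg_part, Dpos, Dneg, cpmg_frqs):
    # Whole-array versions of the per-point maths.
    etapos = etapos_part / cpmg_frqs
    etaneg = etaneg_part / cpmg_frqs

    # Catch large etapos values that would overflow cosh(), for any point.
    if (etapos > 100).any():
        return ones(cpmg_frqs.shape) * 1e100

    # Catch invalid arccosh arguments, for any point.
    fact = Dpos * cosh(etapos) - Dneg * cos(etaneg)
    if (fact < 1.0).any():
        return ones(cpmg_frqs.shape) * 1e100

    # The full formula, evaluated for all dispersion points at once.
    return r20_kex - cpmg_frqs * arccosh(fact)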

Best
Troels


2014-05-09 14:58 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
Hi,

This approach can add a little speed.  You really need to stress test
it and look at profiling timings to understand what is happening.  You should
also try different Python versions (2 and 3), because each implementation is different.
You can sometimes have a speed up in Python 2 which does nothing in
Python 3 (due to Python 3 being more optimised).  There can also be
huge differences between numpy versions.  Anyway, here is a powerful
test which shows 3 different implementation ideas for the
back-calculated R2eff data in the dispersion functions:

"""
import cProfile as profile
from numpy import array, cos, float64, sin, zeros
import pstats

def in_place(values, bc):
    x = cos(values) * sin(values)
    for i in range(len(bc)):
        bc[i] = x[i]

def really_slow(values, bc):
    for i in range(len(bc)):
        x = cos(values[i]) * sin(values[i])
        bc[i] = x

def return_bc(values):
    return cos(values) * sin(values)

def test_in_place(inc=None, values=None, values2=None, bc=None):
    for i in range(inc):
        in_place(values, bc[0])
        in_place(values2, bc[1])
    print(bc)

def test_really_slow(inc=None, values=None, values2=None, bc=None):
    for i in range(inc):
        really_slow(values, bc[0])
        really_slow(values2, bc[1])
    print(bc)

def test_return_bc(inc=None, values=None, values2=None, bc=None):
    for i in range(inc):
        bc[0] = return_bc(values)
        bc[1] = return_bc(values2)
    print(bc)

def test():
    values = array([1, 3, 0.1], float64)
    values2 = array([0.1, 0.2, 0.3], float64)
    bc = zeros((2, 3), float64)
    inc = 1000000
    test_in_place(inc=inc, values=values, values2=values2, bc=bc)
    test_really_slow(inc=inc, values=values, values2=values2, bc=bc)
    test_return_bc(inc=inc, values=values, values2=values2, bc=bc)

def print_stats(stats, status=0):
    pstats.Stats(stats).sort_stats('time', 'name').print_stats()
profile.Profile.print_stats = print_stats
profile.runctx('test()', globals(), locals())
""""


Try running this in Python 2 and 3.  If the cProfile import does not
work on one version, try simply "import profile".  You should create
such scripts for testing out code optimisation ideas.  Knowing how to
profile is essential.  For Python 3, I see:

"""
$ python3.4 edward.py
[[ 0.45464871 -0.13970775  0.09933467]
 [ 0.09933467  0.19470917  0.28232124]]
[[ 0.45464871 -0.13970775  0.09933467]
 [ 0.09933467  0.19470917  0.28232124]]
[[ 0.45464871 -0.13970775  0.09933467]
 [ 0.09933467  0.19470917  0.28232124]]
         10001042 function calls (10001036 primitive calls) in 39.303 seconds

   Ordered by: internal time, function name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2000000   19.744    0.000   19.867    0.000 edward.py:10(really_slow)
  2000000    9.128    0.000    9.286    0.000 edward.py:5(in_place)
  2000000    5.966    0.000    5.966    0.000 edward.py:15(return_bc)
        1    1.964    1.964    7.931    7.931 edward.py:30(test_return_bc)
        1    1.119    1.119   20.987   20.987 edward.py:24(test_really_slow)
        1    1.099    1.099   10.385   10.385 edward.py:18(test_in_place)
  4000198    0.281    0.000    0.281    0.000 {built-in method len}
       27    0.001    0.000    0.001    0.000 {method 'reduce' of 'numpy.ufunc' objects}
"""

Here you can see that test_return_bc() takes about 80% of the time of
test_in_place().  The 'cumtime' column is the important number; it is the
total amount of time spent in that function and everything it calls.  So
the speed up is not huge.  For Python 2:

"""
$ python2.7 edward.py
[[ 0.45464871 -0.13970775  0.09933467]
 [ 0.09933467  0.19470917  0.28232124]]
[[ 0.45464871 -0.13970775  0.09933467]
 [ 0.09933467  0.19470917  0.28232124]]
[[ 0.45464871 -0.13970775  0.09933467]
 [ 0.09933467  0.19470917  0.28232124]]
         14000972 function calls (14000966 primitive calls) in 38.625 seconds

   Ordered by: internal time, function name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2000000   18.373    0.000   19.086    0.000 edward.py:10(really_slow)
  2000000    8.798    0.000    9.576    0.000 edward.py:5(in_place)
  2000000    5.937    0.000    5.937    0.000 edward.py:15(return_bc)
        1    1.839    1.839    7.785    7.785 edward.py:30(test_return_bc)
  4000021    1.141    0.000    1.141    0.000 {range}
        1    1.086    1.086   10.675   10.675 edward.py:18(test_in_place)
        1    1.070    1.070   20.165   20.165 edward.py:24(test_really_slow)
  4000198    0.379    0.000    0.379    0.000 {len}
"""

Hmmm, Python 2 is faster than Python 3 for this example!  See for
yourself.  If you really think that making the code 1.25 times faster,
as shown in these tests, is worth your time, then this must be done in
a subversion branch (http://svn.gna.org/viewcvs/relax/branches/).
That way we can have timing tests between the trunk and the branch.
As this affects all dispersion models, the changes will be quite
disruptive.  And if the implementation is not faster or if it breaks
everything, then the branch can be deleted.  Whatever you do, please
don't use a git-svn branch.

Regards,

Edward




On 9 May 2014 14:07, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Hi Edward.

How about this script?
Here I try to pass back the r2eff values, and then set them in the
back-calculated class object.
Will this work?

Or else, I found this post about updating values:
http://stackoverflow.com/questions/14916284/in-python-class-object-how-to-auto-update-attributes
They talk about
@property
and setters, which I don't get yet. :-)
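For reference, a minimal sketch of what @property with a setter does (the class
and attribute names here are purely illustrative):

class BackCalc(object):
    def __init__(self, x):
        self.x = x              # assignment goes through the setter below

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        # Recompute the derived attribute every time x is assigned.
        self._x = value
        self.x_doubled = [2 * v for v in value]

bc = BackCalc([1, 2, 3])
print(bc.x_doubled)    # [2, 4, 6]
bc.x = [4, 5]
print(bc.x_doubled)    # [8, 10]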

Best
Troels


---------------


def loop_rep(x, nr):
    # Copy the values element by element - modifies the caller's list.
    y = [98, 99]
    for i in range(nr):
        x[i] = y[i]

def not_loop_rep(x, nr):
    # Rebind the local name only - the caller's list is untouched.
    y = [98, 99]
    x = y

def not_loop_rep_new(x, nr):
    # Rebind locally, but return the new list so the caller can store it.
    y = [98, 99]
    x = y
    return x


class MyClass:
    def __init__(self, x):
        self.x = x
        self.nr = len(x)

    def printc(self):
        print self.x, self.nr

    def calc_loop_rep(self, x=None, nr=None):
        loop_rep(x=self.x, nr=self.nr)

    def calc_not_loop_rep(self, x=None, nr=None):
        not_loop_rep(x=self.x, nr=self.nr)

    def calc_not_loop_rep_new(self, x=None, nr=None):
        self.x = not_loop_rep_new(x=self.x, nr=self.nr)

print("For class where we loop replace ")
"Create object of class"
t_rep = MyClass([0, 1])
"Print object of class"
t_rep.printc()
"Calc object of class"
t_rep.calc_loop_rep()
" Then print"
t_rep.printc()

print("\nFor class where we not loop replace ")
" Now try with replace "
t = MyClass([3, 4])
t.printc()
t.calc_not_loop_rep()
t.printc()

print("\nFor class where we not loop replace ")
t_new = MyClass([5, 6])
t_new.printc()
t_new.calc_not_loop_rep_new()
t_new.printc()

2014-05-05 19:07 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
:)  It does slow it down a little, but that's unavoidable.  It's also
unavoidable in C, Fortran, Perl, etc.  As long as the number of
operations in that loop is minimal, then it's the best you can do.  If
it worries you, you could run a test where you call the target
function say 1e6 times, with and without the loop, to see the timing
difference.  Or simply run in Python 2:

for i in xrange(1000000):
    x = 1

Then try:

for i in xrange(100000000):
    x = 2

These two demonstrate the slowness of the Python loop.  But the second
case is extreme and you won't encounter that much looping in these
functions.  So while it is theoretically slower than C and Fortran
looping, you can probably see that no one would care :)  Here is
another test, with Python 2 code:

"""
import cProfile as profile

def loop_1e6():
    for i in xrange(int(1e6)):
        x = 1

def loop_1e8():
    for i in xrange(int(1e8)):
        x = 1

def sum_conv():
    for i in xrange(100000000):
        x = 2 + 2.

def sum_normal():
    for i in xrange(100000000):
        x = 2. + 2.

def test():
    loop_1e6()
    loop_1e8()
    sum_normal()
    sum_conv()

profile.runctx('test()', globals(), locals())
"""

Running this on my system shows:

"""
         7 function calls in 6.707 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    6.707    6.707 <string>:1(<module>)
        1    2.228    2.228    2.228    2.228 aaa.py:11(sum_conv)
        1    2.228    2.228    2.228    2.228 aaa.py:15(sum_normal)
        1    0.000    0.000    6.707    6.707 aaa.py:19(test)
        1    0.022    0.022    0.022    0.022 aaa.py:3(loop_1e6)
        1    2.228    2.228    2.228    2.228 aaa.py:7(loop_1e8)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
"""

That should be self-explanatory.  The better optimisation targets are
the repeated maths operations and the maths operations that can be
shifted into the target function or the target function
initialisation.  Despite the numbers above, which prove my int-to-float
speed argument to be utter nonsense, it might still be good to remove the
int-to-float conversions, if only to match the other functions.
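As a toy illustration of that kind of hoisting (the function and variable
names are made up, not relax code), the repeated factor is computed once
instead of once per point:

from math import pi

def per_point(times, kex):
    # The factor 2.0 * pi * kex is recomputed for every time point.
    return [2.0 * pi * kex * t for t in times]

def hoisted(times, kex):
    # Compute the repeated factor once, outside the per-point work.
    factor = 2.0 * pi * kex
    return [factor * t for t in times]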

Regards,

Edward




On 5 May 2014 18:45, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
The reason why I ask is that I am afraid this for loop slows
everything down.

What do you think?

2014-05-05 18:41 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
This is not Python specific though :)  As far as I know, C uses
pass-by-value for arguments, unless they are arrays or other funky
objects, functions, etc.  This is the same behaviour as Python.
Pass-by-reference and pass-by-value is something that needs to be
mastered in all languages, whether or not you have pointers to play
with.

Regards,

Edward



On 5 May 2014 18:30, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
This reminds me:

http://combichem.blogspot.dk/2013/08/you-know-what-really-grinds-my-gears-in.html

2014-05-05 17:52 GMT+02:00 Edward d'Auvergne <edward@xxxxxxxxxxxxx>:
Hi,

This is an important difference.  In the first case (back_calc[i] =
Minty[i]), what is happening is that you are copying the data into a
pre-existing structure.  In the second case (back_calc = Minty), the
local back_calc name is simply rebound to the Minty structure.  Therefore
the back_calc structure in the calling code will be modified in the first
case but not the second.  Here is some demo code:

def mod1(x):
    # Modify the caller's list in place.
    x[0] = 1

def mod2(x):
    # Rebind the local name only - the caller's list is untouched.
    x = [3, 2]

x = [0, 2]
print(x)    # [0, 2]
mod1(x)
print(x)    # [1, 2]
mod2(x)
print(x)    # still [1, 2]

I don't know of a way around this.

Regards,

Edward


On 5 May 2014 17:42, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Hi Edward.

In the library function of b14.py, I am looping over
the dispersion points to put in the data.

    for i in range(num_points):

        # The full formula.
        back_calc[i] = Minty[i]

Why can I not just set:
back_calc = Minty



