Re: r27845 - /trunk/multi/processor.py -- June 08, 2015

On 27 May 2015 at 03:09,  <tlinnet@xxxxxxxxxxxxx> wrote:

Author: tlinnet
Date: Wed May 27 03:09:59 2015
New Revision: 27845

URL: http://svn.gna.org/viewcvs/relax?rev=27845&view=rev
Log:
Suggestion for fix 2, where jobs are continuously replenished when other 
jobs are finished.

Bug #23618: (https://gna.org/bugs/index.php?23618): queuing system for 
multi processors is not well designed.

Modified:
    trunk/multi/processor.py

Modified: trunk/multi/processor.py
URL: http://svn.gna.org/viewcvs/relax/trunk/multi/processor.py?rev=27845&r1=27844&r2=27845&view=diff
==============================================================================
--- trunk/multi/processor.py    (original)
+++ trunk/multi/processor.py    Wed May 27 03:09:59 2015
@@ -585,6 +585,8 @@

         running_set = set()
         idle_set = set([i for i in range(1, self.processor_size()+1)])
+        all_jobs = list(reversed(xrange(1, len(queue)+1)))
+        completed_jobs = []

         if self.threaded_result_processing:
             result_queue = Threaded_result_queue(self)
@@ -606,8 +608,9 @@
             while len(running_set) != 0:
                 # Debugging printout.
                 if verbosity.level():
-                    print('\nIdle set:    %s' % idle_set)
-                    print('Running set: %s' % running_set)
+                    print('\n')
+                    print('Running nr of jobs: %i' % len(running_set))
+                    print('Completed jobs: %s' % len(completed_jobs))

                 # Get the result.
                 result = self.master_receive_result()
@@ -616,6 +619,13 @@
                 if result.completed:
                     idle_set.add(result.rank)
                     running_set.remove(result.rank)
+                    completed_jobs.append(all_jobs.pop())
+                    if len(queue) != 0:
+                        # Add new to que
+                        command = queue.pop()
+                        dest = result.rank
+                        self.master_queue_command(command=command, dest=dest)
+                        running_set.add(dest)

                 # Add to the result queue for instant or threaded processing.
                 result_queue.put(result)


Hi Troels,

Are you sure these changes to Gary's multi-processor code have the
intended result?  From my timings before and after this change, with
the bug.py and bug.bz2 files attached to https://gna.org/bugs/?23618
and the command "mpirun -np 6 /data/relax/relax-trunk/relax -d
--multi='mpi4py' bug.py", there are no real time differences.  But
that is probably because all my 8 CPU cores run at the same speed.
Maybe a better test than MC simulations would be for a per-residue
parallelisation where each calculation for each residue takes a
different amount of time to complete.  Does this work if the chunked
operation is restored (
http://thread.gmane.org/gmane.science.nmr.relax.scm/25596/focus=7593
)?
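To illustrate why jobs of unequal length would make a better test, here is
a minimal, idealised sketch that compares the two scheduling strategies on
paper.  It does not use relax or MPI; the function names and the job
durations are hypothetical, and communication overhead is ignored.

```python
import heapq


def makespan_batched(durations, n_workers):
    """Total time if a new batch starts only after the whole previous batch ends."""
    total = 0.0
    for start in range(0, len(durations), n_workers):
        batch = durations[start:start + n_workers]
        # The batch is only finished when its slowest job is finished.
        total += max(batch)
    return total


def makespan_replenished(durations, n_workers):
    """Total time if each worker receives the next job as soon as it is free."""
    # Min-heap of the times at which each worker becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    for duration in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + duration)
    return max(free_at)
```

With ten jobs of equal length on five workers, both functions give the
same total time, which would be consistent with the MC simulation timings
showing no difference.  With the durations [4, 1, 1, 1, 1, 1] on two
workers, the batched total is 6.0 and the replenished total is 5.0.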

Note a few more points:

- The xrange() function should not be used.  It does not exist in
Python 3, so this kills the multi-processor there.
- The print('\n') call outputs 2 newlines (the '\n' itself plus the
line ending added by print), which is probably not the intent here.
- I find seeing the running and idle sets printed out in debugging
mode to be very useful.
- Maybe change "Running nr of jobs:..." to "Running jobs" to match the
syntax of "Completed jobs".
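
The points above could be sketched roughly as follows.  This is only an
illustration and not the code in multi/processor.py: job_numbers(),
print_status() and captured() are hypothetical helpers, and the exact
label text is an assumption.

```python
import io
from contextlib import redirect_stdout


def job_numbers(n_jobs):
    """Job numbers n_jobs..1.  range() exists on both Python 2 and 3."""
    return list(reversed(range(1, n_jobs + 1)))


def print_status(idle_set, running_set, completed_jobs):
    """Hypothetical debugging printout keeping the sets and adding counts."""
    # The leading '\n' adds one extra newline before the text, whereas a
    # bare print('\n') would output two newlines.
    print('\nIdle set:       %s' % sorted(idle_set))
    print('Running set:    %s' % sorted(running_set))
    print('Running jobs:   %i' % len(running_set))
    print('Completed jobs: %i' % len(completed_jobs))


def captured(func, *args):
    """Return everything func(*args) writes to stdout (Python 3 only)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue()
```

For example, captured(print, '\n') returns the two-character string
'\n\n', which shows the double newline.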

Gary might have some memory as to why the running set is not
replenished until after all results in the set are complete.  There
might be other reasons for this behaviour.

Cheers,

Edward
