Thanks for confirming this! I'll go ahead and release the new 1.3.14 version of relax (http://www.nmr-relax.com/download.html) with the fix. I don't think I can do much more to decrease the memory usage, but I'll play around a little anyway, as this should increase the scaling efficiency of relax on a cluster or grid of computers running in the mpi4py multi-processor mode.

Cheers,

Edward

On 16 March 2012 17:30, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:
Hi Edward,

Just confirming that this has indeed fixed the problem. Thank you very much.

Hugh

On 15 March 2012 19:25, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Ok, I think I have fixed the problem. You'll either need a new release of relax, when I make one, or the subversion copy of the main 1.3 line of relax for this. If you would like the changes without waiting for a new release, the new code can be checked out from the repository by typing:

$ svn co svn://svn.gna.org/svn/relax/1.3 relax-1.3

or, if this doesn't work:

$ svn co http://svn.gna.org/svn/relax/1.3 relax-1.3

If you already have a checked-out copy, try typing:

$ svn up

This requires installation of the subversion program (http://subversion.tigris.org/).

Regards,

Edward

On 15 March 2012 19:19, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi,

I'm debugging this right now. From what I can see, the multi-processor code is clearly at fault. The issue is in the generic_fns/minimise.py file and the lines:

    # Get the Processor box singleton (it contains the Processor instance).
    processor_box = Processor_box()

    # Execute the queued elements.
    processor_box.processor.run_queue()

This is repeated twice, but the problem is in the minimise() function. The run_queue() call causes all the queued calculations to execute. Prior to this call, all calculations will have been queued up, either on the multi-processor or uni-processor stack. Previously, each calculation was executed serially, i.e. one after the other. The serial calculations are not a problem because, at the end of each one, the Python garbage collector destroys all the data structures used by that calculation, freeing up the memory. But with the multi-processor code, this memory-freeing operation can only occur at the end of all calculations. The merger of the multi-processor relax branch eliminated the serial calculations. The solution I am working on is shifting the run_queue() call higher up into the model-free code.
That way, the calculations for each Monte Carlo simulation are run serially. The parallelisation occurs at the residue level, not the Monte Carlo simulation level. Say there are 200 residues: then I am aiming at having only 200 calculations queued at once (one per spin), rather than 200 * 500 = 100,000 queued calculations.

I am in the testing phase, so I have to make sure my fix doesn't break any other parts of relax. But you can test it yourself if you like. Delete all of the 'processor_box' lines from the generic_fns/minimise.py file. Then, at the very end of the specific_fns/model_free/mf_minimise.py file, the last lines should look like:

    # Pass in the data and optimisation parameters.
    command.store_data(deepcopy(data_store), deepcopy(opt_params))

    # Set up the model-free memo and add it to the processor queue.
    memo = MF_memo(model_free=self, model_type=data_store.model_type, spin=spin, sim_index=sim_index, scaling=scaling, scaling_matrix=data_store.scaling_matrix)
    processor.add_to_queue(command, memo)

    # Get the Processor box singleton (it contains the Processor instance).
    processor_box = Processor_box()

    # Execute the queued elements.
    processor_box.processor.run_queue()

This fix is temporary and incomplete, but it might just solve your problem. With these changes, my testing indicates that virtual memory usage drops from 1871 Mb down to 648 Mb! I will continue with the testing, and hopefully tomorrow I should have a permanent fix for you.

Regards,

Edward

On 15 March 2012 18:46, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

This is really encouraging, thank you for your continued interest! I'm intrigued as to why this has not been seen before; do you have any feeling for this? It doesn't seem like a particularly large data set. Unfortunately, running 450 sims seems to have succumbed as well, so I will try your temporary fix and await something more permanent.
Cheers

Hugh

On 15 March 2012 17:25, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi,

Using this results file, I can reproduce the problem. Well, I can at least see where the problem is. Running on my machine, a 64-bit laptop, the total memory usage (virtual memory) reaches 1871 Mb. As my laptop only has 2 Gb of RAM, this results in a lot of swapping while calculating the parameter errors, but it does not cause the crashes you are seeing. The swapping is a very, very slow process, and the relax CPU usage is almost 0% during this time. You can see this in the program top, whereby memory switches between 'RES' and 'SWAP'. So I see the same memory and CPU usage that you reported. For the testing, I am running the simple script:

-----
state.load('results')
fix('diff')

# Simulations.
monte_carlo.setup(number=500)
monte_carlo.create_data()
monte_carlo.initial_values()
minimise('newton', max_iterations=1)
eliminate()
monte_carlo.error_analysis()

# Create results files and plots of the data.
state.save('mc', force=True)
-----

On your 32-bit systems, this might be too much memory usage. I'm not sure why relax is using so much memory at this stage. You don't have much control over memory in the Python programming language, which makes things more difficult. But I'll have a look and see if I can decrease the memory footprint. Thanks for sending the results file; this should allow me to fix, or at least minimise, this problem.

Cheers,

Edward

P.S. If you are desperate to get results out, you might be able to avoid running out of memory by running your computer with the minimum number of programs running. You could switch to a VT (Ctrl-Alt-F1) and run 'init 3' as root. This will kill X and all X programs. You could then run relax in the terminal. This might just give you enough memory to survive. Otherwise, you could wait until I have some fixes.
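The queue-batching fix Edward describes earlier in the thread (drain the queue once per Monte Carlo simulation, so only one job per spin is ever held at a time) can be sketched in plain Python. This is an illustrative toy, not the actual relax multi-processor API: the Processor class, the run_simulations() helper, and the job counts are stand-ins.

```python
import gc

class Processor:
    """Toy stand-in for a processor with a job queue (hypothetical API)."""

    def __init__(self):
        self.queue = []

    def add_to_queue(self, command):
        self.queue.append(command)

    def run_queue(self):
        # Execute every queued command, then drop all references so the
        # commands' data structures become garbage-collectable at once.
        results = [command() for command in self.queue]
        self.queue = []
        gc.collect()
        return results

def run_simulations(n_sims, n_spins, processor):
    """Queue one optimisation per spin, but drain the queue once per
    simulation, so at most n_spins jobs (not n_sims * n_spins) are
    ever queued at the same time."""
    peak_queued = 0
    for sim in range(n_sims):
        for spin in range(n_spins):
            # Each command closes over only its own (small) inputs.
            processor.add_to_queue(lambda s=sim, p=spin: (s, p))
        peak_queued = max(peak_queued, len(processor.queue))
        processor.run_queue()  # per-simulation drain frees memory here
    return peak_queued

processor = Processor()
print(run_simulations(n_sims=500, n_spins=200, processor=processor))  # prints 200
```

With the drain inside the simulation loop, the peak queue length is 200 rather than the 100,000 that results from queuing every simulation's jobs before a single run_queue() call.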
On 15 March 2012 17:11, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Hi Edward,

Attached is the results.bz2 file from ellipsoid/round_14/opt/. I will try to see what happens if I set the number of simulations at 450.

Hugh

On 15 March 2012 11:14, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi,

This is a very peculiar problem. As I cannot reproduce it, I can only guess what is going on. How big is the results.bz2 file for the final round of the selected diffusion tensor? Maybe if you send it in a private mail, I could then try to replicate the problem on a 32-bit virtual machine image. That way I'd be able to chase down the source of the problem. It could be the RAM, the virtual memory on 32-bit, the operating system, the specific Python version, or relax. Although I am pretty sure that the problem lies outside of relax, it would be good to track down which exact operation, down to the source code line number in relax, is triggering the problem. That way I might be able to come up with some tricks for minimising the memory usage.

As a temporary alternative, maybe you could try 450 simulations to see if you can get the results out. The results would be less precise, but still reasonable. I'm not sure the results would remain reasonable if you decreased the number much lower than that, though. In the meantime, I might try to look into minimising memory usage at the Monte Carlo simulation optimisation stage. Your results file would also significantly help in optimising this.

Regards,

Edward

On 14 March 2012 18:41, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Hi Edward,

A couple of comments: as I've said before, this problem occurs on the 3 different machines that I have tried it on, so I don't think that memory corruption or other computer-specific issues are the cause. You mentioned our running on the 32-bit version of the OS; I guess this is because we have various machines (some new, some old) networked together, all running the same system.
Perhaps some of the older machines can't handle the 64-bit version.

I tried decreasing the number of MC simulations to 250, and surprisingly this works! Dissatisfying a solution as it may be, as a quick fix (I need to start writing my thesis very soon): is there a way of running this twice, loading in both sets of simulations, and doing the error analysis on the combined back-calculated datasets? I have only run relax using the dauvergne_protocol, so I may need guiding through this!

For a more proper fix, I have monitored the memory usage in top, as you suggested. The CPU usage during the minimisations is 100%, and the memory usage slowly climbs to, e.g., 11% by sim 200 and 22% by sim 500. At this point, the memory usage increases at about triple the rate seen during the minimisations, up to a point that depends on how many simulations have been done: 18% if 100 sims, 33% if 200, etc. If the program is not destined to crash (no. of sims < 250), at this stage the CPU usage is still 100% and stays very high until the program finishes. As soon as the program enters the "MC sim elimination" phase, the memory usage stops rising and stays constant until the program has finished.

If I set the no. of sims to 500, the memory usage following minimisation of all 500 simulations increases until it gets to ~90%, where it fluctuates between 87-91% for a long time until eventually the program crashes. Crucially, during this time the CPU usage is almost nothing. At no point does the program enter the elimination phase, so something is clearly happening at the end of the minimisation, and the rising memory usage even when the no. of sims is low would also suggest this. This result is backed up by the behaviour of the program when lines of dauvergne_protocol.py in auto_analyses are commented out. If you comment out the entire MC simulation part, the program works fine.
Additionally, removing the self.interpreter.eliminate(), self.interpreter.monte_carlo.error_analysis(), or final self.write_results() lines (individually or at the same time) is insufficient to rescue the program. Only if the self.interpreter.minimise command (and therefore the subsequent lines concerning MC simulations) is commented out does the program finish. So, as I said above, it is something taking place at the end of the minimisation which is causing the problem. Below is the output if only the self.interpreter.eliminate() line is removed:

Simulation 490
Simulation 491
Simulation 492
Simulation 493
Simulation 494
Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500

debug> Execution lock: Release by 'script UI' ('script' mode).
debug> Execution lock: Release by 'script UI' ('script' mode).

Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 136, in run
Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/relax", line 7, in <module>
    relax.start()
  File "/home1/hugh/installs/relax-1.3/relax.py", line 103, in start
    processor.run()
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 139, in run
    self.callback.handle_exception(self, e)
  File "/home1/hugh/installs/relax-1.3/multi/__init__.py", line 227, in default_handle_exception
    _traceback.print_exc(file=_sys.stderr)
  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 125, in print_exception
    print_tb(tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 69, in print_tb
    line = linecache.getline(filename, lineno, f.f_globals)
  File "/usr/lib/python2.6/linecache.py", line 14, in getline
    lines = getlines(filename, module_globals)
  File "/usr/lib/python2.6/linecache.py", line 40, in getlines
    return updatecache(filename, module_globals)
  File "/usr/lib/python2.6/linecache.py", line 136, in updatecache
    lines = fp.readlines()
MemoryError
2745.731u 413.433s 5:12:02.27 16.8% 0+0k 169003312+64io 4488658pf+0w
--------

If both the self.interpreter.eliminate() and self.interpreter.monte_carlo.error_analysis() lines are removed:

Simulation 490
Simulation 491
Simulation 492
Simulation 493
Simulation 494
Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500

debug> Execution lock: Release by 'script UI' ('script' mode).
debug> Execution lock: Release by 'script UI' ('script' mode).

Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 136, in run
  File "/home1/hugh/installs/relax-1.3/multi/__init__.py", line 240, in default_init_master
  File "/home1/hugh/installs/relax-1.3/relax.py", line 174, in run
  File "/home1/hugh/installs/relax-1.3/prompt/interpreter.py", line 300, in run
  File "/home1/hugh/installs/relax-1.3/prompt/interpreter.py", line 610, in run_script
  File "/home1/hugh/installs/relax-1.3/prompt/interpreter.py", line 495, in interact_script
  File "/home1/hugh/installs/relax-1.3/prompt/interpreter.py", line 383, in exec_script
  File "/usr/lib/python2.6/runpy.py", line 140, in run_module
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
  File "/home1/hugh/data/pgm298bq/relax/dauvergne_protocol_repos.py", line 216, in <module>
  File "/home1/hugh/installs/relax-1.3/auto_analyses/dauvergne_protocol.py", line 223, in __init__
  File "/home1/hugh/installs/relax-1.3/auto_analyses/dauvergne_protocol.py", line 701, in execute
  File "/home1/hugh/installs/relax-1.3/prompt/minimisation.py", line 294, in minimise
  File "/home1/hugh/installs/relax-1.3/generic_fns/minimise.py", line 221, in minimise
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 149, in run_queue
  File "/home1/hugh/installs/relax-1.3/specific_fns/model_free/multi_processor_commands.py", line 130, in run
  File "/home1/hugh/installs/relax-1.3/specific_fns/model_free/multi_processor_commands.py", line 113, in optimise
  File "/usr/local/lib/python2.6/dist-packages/minfx/generic.py", line 321, in generic_minimise
  File "/usr/local/lib/python2.6/dist-packages/minfx/newton.py", line 44, in newton
  File "/usr/local/lib/python2.6/dist-packages/minfx/base_classes.py", line 271, in minimise
  File "/usr/local/lib/python2.6/dist-packages/minfx/newton.py", line 215, in update_newton
  File "/home1/hugh/installs/relax-1.3/maths_fns/mf.py", line 925, in d2func_mf
  File "/home1/hugh/installs/relax-1.3/maths_fns/jw_mf.py", line 2309, in calc_S2_te_d2jw_dte2
Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/relax", line 7, in <module>
    relax.start()
  File "/home1/hugh/installs/relax-1.3/relax.py", line 103, in start
    processor.run()
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 139, in run
    self.callback.handle_exception(self, e)
  File "/home1/hugh/installs/relax-1.3/multi/__init__.py", line 227, in default_handle_exception
    _traceback.print_exc(file=_sys.stderr)
  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 126, in print_exception
    lines = format_exception_only(etype, value)
MemoryError
2702.244u 398.632s 5:12:15.23 16.5% 0+0k 165775728+64io 4501002pf+0w
------------

And finally, if the self.interpreter.eliminate(), self.interpreter.monte_carlo.error_analysis(), and final self.write_results() lines are all removed:

Simulation 490
Simulation 491
Simulation 492
Simulation 493
Simulation 494
Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500

debug> Execution lock: Release by 'script UI' ('script' mode).
debug> Execution lock: Release by 'script UI' ('script' mode).
Traceback (most recent call last):
Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/relax", line 7, in <module>
    relax.start()
  File "/home1/hugh/installs/relax-1.3/relax.py", line 103, in start
    processor.run()
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 139, in run
    self.callback.handle_exception(self, e)
  File "/home1/hugh/installs/relax-1.3/multi/__init__.py", line 227, in default_handle_exception
    _traceback.print_exc(file=_sys.stderr)
  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 125, in print_exception
    print_tb(tb, limit, file)
MemoryError
2684.091u 374.487s 5:00:55.96 16.9% 0+0k 163383960+64io 4258400pf+0w
-----------

If there are any other tests you would like me to do, let me know, but as I've said, this is rather time-consuming! I hope this helps.

Hugh

On 7 March 2012 11:33, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi,

This is a very difficult problem to debug, as it currently appears to be triggered outside of relax. For another test, could you remove the lines:

    # Write the final results.
    ##########################

    # Create results files and plots of the data.
    self.write_results()

after the Monte Carlo simulation user function calls in the auto_analyses/dauvergne_protocol.py module and retest? The strange thing is that it is printing out simulation number 500, which should mean that the simulation code has terminated. But it hasn't printed out the results.write user function message, which means that it hasn't even started to create the results file (well, that's what the above check tests). If this is the case, the error occurs precisely between the self.interpreter.monte_carlo.error_analysis() and self.write_results() calls in the analysis, and not during the simulations or results file creation. But the strange thing is that relax is doing absolutely nothing at that point!
Another test for this would be to load your molecule twice, but as two different molecules, and duplicate all the relaxation data loading, effectively doubling your data size, and then see if relax dies at around simulation number 250. If it is the Monte Carlo code causing the out-of-memory problem, then you should not be able to reach simulation 500. But that is only if the problem is relax using too much memory, which needs to be checked via top, etc. Hopefully we can get to the bottom of this problem and design a workaround.

Another interesting test would be to deactivate your swap partition (as system administrator) and re-run relax. The command 'free -m' should show that the swap size is zero or non-existent. This will cut the amount of virtual memory available, hopefully triggering the MemoryError prior to simulation 500. The slowing down of your computer towards the end is caused by Linux heavily swapping relax from RAM to the swap partition and back again, so if you have no swap, this slowdown will be avoided. That should speed up the process.

Another speed-up would be to change line 701 of auto_analyses/dauvergne_protocol.py from:

    self.interpreter.minimise(self.min_algor, func_tol=self.opt_func_tol, max_iterations=self.opt_max_iterations)

to:

    self.interpreter.minimise(self.min_algor, func_tol=self.opt_func_tol, max_iterations=3, constraints=False)

This will massively speed up the simulations, though the resultant errors will be underestimated. That is no issue for these tests, though.

Regards,

Edward

On 7 March 2012 11:46, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Hi Edward,

I haven't yet asked your questions about our set-up, but I have run another couple of tests. If I remove the Monte Carlo simulation commands from dauvergne_protocol.py in auto_analyses, then the program runs fine without any problems at all.
If I leave those commands in, the error message I get with the repository version of the program is below:

Simulation 490
Simulation 491
Simulation 492
Simulation 493
Simulation 494
Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500

debug> Execution lock: Release by 'script UI' ('script' mode).
debug> Execution lock: Release by 'script UI' ('script' mode).

Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 136, in run
Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/relax", line 7, in <module>
    relax.start()
  File "/home1/hugh/installs/relax-1.3/relax.py", line 103, in start
    processor.run()
  File "/home1/hugh/installs/relax-1.3/multi/uni_processor.py", line 139, in run
    self.callback.handle_exception(self, e)
  File "/home1/hugh/installs/relax-1.3/multi/__init__.py", line 227, in default_handle_exception
    _traceback.print_exc(file=_sys.stderr)
  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 125, in print_exception
    print_tb(tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 69, in print_tb
    line = linecache.getline(filename, lineno, f.f_globals)
  File "/usr/lib/python2.6/linecache.py", line 14, in getline
    lines = getlines(filename, module_globals)
  File "/usr/lib/python2.6/linecache.py", line 40, in getlines
    return updatecache(filename, module_globals)
  File "/usr/lib/python2.6/linecache.py", line 136, in updatecache
    lines = fp.readlines()
MemoryError
8539.337u 606.393s 10:03:45.92 25.2% 0+0k 247727448+0io 7012268pf+0w
----------

Hugh

On 6 March 2012 17:47, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Hi Edward,

Thank you for your suggestions. Unfortunately I am not in a position to answer many of your questions, so I will respond once I have had time to consult somebody with more understanding of our system.
The behaviour of the machine fits with your initial suggestion of a recursive exception, so monitoring the memory use during the analysis is definitely a good idea. In the meantime, I will try removing the Monte Carlo commands from the dauvergne_protocol script and see what happens. Given that the error does not occur when the number of simulations is reduced, I am confident that it is the simulations which are causing the error. I will get back to you on this. Unfortunately, given the speed of the machine, it is necessary to wait for one run to give an error before starting another, and this can take several hours!

Hugh

On 6 March 2012 17:39, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

According to http://en.wikipedia.org/wiki/Pentium_4, the dual-core Pentium 4s should be 64-bit. Is there a reason why the computer is running the 32-bit version of Ubuntu rather than the 64-bit version? Is this running directly on the CPU or in a virtual machine? It could be that you are simply limited to 4 Gb of total memory, as this is a 32-bit machine (2**32 is 4 Gb). Can you monitor relax from around simulation 495 until the crash using top? Or maybe monitor the virtual memory size with:

$ grep VmSize /proc/xxxx/status

replacing 'xxxx' with the PID of the relax process.

Cheers,

Edward

On 6 March 2012 17:59, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Ah, that's better. Well, you have a total of 8 Gb of memory for relax to sit in, which is plenty, so this is quite strange. I've loaded much more data than that before on systems with less total memory, and it hasn't been a problem. The Python MemoryError appears only once you have run out of total memory, and should kick in before the Linux OOM killer kills the program. Could there be any programs or settings limiting the amount of memory individual programs or users are allowed on your machine? If you run 'dmesg', is there anything at the end about memory issues?
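The one-shot grep of VmSize suggested above can be wrapped in a small polling helper to log the process's memory growth over a run. A minimal sketch (Linux-only, since it reads /proc; the function names and the one-second default interval are my own choices, not part of relax):

```python
import time

def vmsize_kb(pid):
    """Return the VmSize field of /proc/<pid>/status in kB (Linux only)."""
    with open("/proc/%d/status" % pid) as status:
        for line in status:
            if line.startswith("VmSize:"):
                return int(line.split()[1])  # e.g. "VmSize:  1916028 kB"
    raise ValueError("no VmSize entry for PID %d" % pid)

def monitor(pid, interval=1.0):
    """Print the virtual memory size of a process once per interval
    until the process exits."""
    while True:
        try:
            size = vmsize_kb(pid)
        except (IOError, OSError, ValueError):
            break  # the process has gone away
        print("VmSize: %d kB" % size)
        time.sleep(interval)

# Usage: monitor(1234), where 1234 is the relax PID found via top.
```

Logging the output of monitor() alongside the "Simulation N" lines makes it easy to see at which simulation the growth accelerates.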
And does it always say "Simulation 500" followed by the Traceback messages? I wonder if this could be a memory corruption issue. This is quite a strange error. Do you have RAM testing programs available? For another test, could you open 'auto_analyses/dauvergne_protocol.py' and delete the lines:

    # Simulations.
    self.interpreter.monte_carlo.setup(number=self.mc_sim_num)
    self.interpreter.monte_carlo.create_data()
    self.interpreter.monte_carlo.initial_values()
    self.interpreter.minimise(self.min_algor, func_tol=self.opt_func_tol, max_iterations=self.opt_max_iterations)
    self.interpreter.eliminate()
    self.interpreter.monte_carlo.error_analysis()

With these removed, it will test whether it is the Monte Carlo simulations triggering the MemoryError.

Cheers,

Edward

On 6 March 2012 17:43, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

I certainly thought I had run it in the right directory, but having done it again it has just reported that info.py has been updated, so perhaps not; my apologies! 'free -m' gives the following output:

-----
             total       used       free     shared    buffers     cached
Mem:          2011       1958         53          0          2         41
-/+ buffers/cache:       1914         97
Swap:         5889        927       4962
-----

And 'relax -i' gives the following:

relax repository checkout

Molecular dynamics by NMR data analysis

Copyright (C) 2001-2006 Edward d'Auvergne
Copyright (C) 2006-2012 the relax development team

This is free software which you are welcome to modify and redistribute under the conditions of the GNU General Public License (GPL). This program, including all modules, is licensed under the GPL and comes with absolutely no warranty. For details type 'GPL' within the relax prompt.

Assistance in using the relax prompt and scripting interface can be accessed by typing 'help' within the prompt.

ImportError: relaxation curve fitting is unavailable, the corresponding C modules have not been compiled.

Processor fabric: Uni-processor.
Hardware information:
    Machine: i686
    Processor:
    Endianness: little
    Total RAM size: 2011 Mb
    Total swap size: 5889 Mb

Operating system information:
    System: Linux
    Release: 2.6.32-38-generic
    Version: #83-Ubuntu SMP Wed Jan 4 11:13:04 UTC 2012
    GNU/Linux version: Ubuntu 10.04 lucid
    Distribution: Ubuntu 10.04 lucid
    Full platform string: Linux-2.6.32-38-generic-i686-with-Ubuntu-10.04-lucid

Python information:
    Architecture: 32bit ELF
    Python version: 2.6.5
    Python branch: tags/r265
    Python build: r265:79063, Apr 16 2010 13:09:56
    Python compiler: GCC 4.4.3
    Libc version: glibc 2.4
    Python implementation: CPython
    Python revision: 79063
    Python executable: /usr/bin/python
    Python flags: sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, tabcheck=0, verbose=0, unicode=0, bytes_warning=0)
    Python float info: sys.floatinfo(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.2204460492503131e-16, radix=2, rounds=1)
    Python module path: ['', '/home1/hugh/installs/relax-1.3', '/home1/hugh/programs/mattfit', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages/PIL', '/usr/lib/python2.6/dist-packages/gst-0.10', '/usr/lib/pymodules/python2.6', '/usr/lib/python2.6/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.6/gtk-2.0', '/usr/local/lib/python2.6/dist-packages', '/home1/hugh/installs/relax-1.3/extern/scientific_python/linux2']

Python packages (most are optional):
    Package      Installed   Version   Path
    minfx        True        Unknown   /usr/local/lib/python2.6/dist-packages/minfx
    bmrblib      False
    numpy        True        1.3.0     /usr/lib/python2.6/dist-packages/numpy
    scipy        True        0.7.0     /usr/lib/python2.6/dist-packages/scipy
    wxPython     False
    mpi4py       False
    epydoc       False
    optparse     True        1.5.3     /usr/lib/python2.6/optparse.pyc
    readline     True                  /usr/lib/python2.6/lib-dynload/readline.so
    profile      True                  /usr/lib/python2.6/profile.pyc
    bz2          True                  /usr/lib/python2.6/lib-dynload/bz2.so
    gzip         True                  /usr/lib/python2.6/gzip.pyc
    os.devnull   True                  /usr/lib/python2.6/os.pyc

Compiled relax C modules:
    Relaxation curve fitting: False
-----------

Alas, the computer is running extremely slowly, so I expect I will get the MemoryError at some point again.

Cheers

Hugh

On 6 March 2012 16:19, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Did you run 'svn up' in the base relax directory? And did you see a message that some files were updated? What happens if you type 'free -m' on your system?

Cheers,

Edward

On 6 March 2012 15:59, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Okay, made those changes, and it is now running. Thanks. I've run "svn up" and got the message "Skipped '.'", which I guess I can ignore. Output of relax -i:

relax repository checkout

Molecular dynamics by NMR data analysis

Copyright (C) 2001-2006 Edward d'Auvergne
Copyright (C) 2006-2012 the relax development team

This is free software which you are welcome to modify and redistribute under the conditions of the GNU General Public License (GPL). This program, including all modules, is licensed under the GPL and comes with absolutely no warranty. For details type 'GPL' within the relax prompt.

Assistance in using the relax prompt and scripting interface can be accessed by typing 'help' within the prompt.

ImportError: relaxation curve fitting is unavailable, the corresponding C modules have not been compiled.

Processor fabric: Uni-processor.
Hardware information:
    Machine: i686
    Processor:
    Endianness: little

Operating system information:
    System: Linux
    Release: 2.6.32-38-generic
    Version: #83-Ubuntu SMP Wed Jan 4 11:13:04 UTC 2012
    GNU/Linux version: Ubuntu 10.04 lucid
    Distribution: Ubuntu 10.04 lucid
    Full platform string: Linux-2.6.32-38-generic-i686-with-Ubuntu-10.04-lucid

Python information:
    Architecture: 32bit ELF
    Python version: 2.6.5
    Python branch: tags/r265
    Python build: r265:79063, Apr 16 2010 13:09:56
    Python compiler: GCC 4.4.3
    Libc version: glibc 2.4
    Python implementation: CPython
    Python revision: 79063
    Python executable: /usr/bin/python
    Python flags: sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, tabcheck=0, verbose=0, unicode=0, bytes_warning=0)
    Python float info: sys.floatinfo(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.2204460492503131e-16, radix=2, rounds=1)
    Python module path: ['', '/home1/hugh/installs/relax-1.3', '/home1/hugh/programs/mattfit', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages/PIL', '/usr/lib/python2.6/dist-packages/gst-0.10', '/usr/lib/pymodules/python2.6', '/usr/lib/python2.6/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.6/gtk-2.0', '/usr/local/lib/python2.6/dist-packages', '/home1/hugh/installs/relax-1.3/extern/scientific_python/linux2']

Python packages (most are optional):
    Package      Installed   Version   Path
    minfx        True        Unknown   /usr/local/lib/python2.6/dist-packages/minfx
    bmrblib      False
    numpy        True        1.3.0     /usr/lib/python2.6/dist-packages/numpy
    scipy        True        0.7.0     /usr/lib/python2.6/dist-packages/scipy
    wxPython     False
    mpi4py       False
    epydoc       False
    optparse     True        1.5.3     /usr/lib/python2.6/optparse.pyc
    readline     True                  /usr/lib/python2.6/lib-dynload/readline.so
    profile      True                  /usr/lib/python2.6/profile.pyc
    bz2          True                  /usr/lib/python2.6/lib-dynload/bz2.so
    gzip         True                  /usr/lib/python2.6/gzip.pyc
    os.devnull   True                  /usr/lib/python2.6/os.pyc

Compiled relax C modules:
    Relaxation curve fitting: False
---------

Hugh

On 6 March 2012 14:54, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Oh, that's a recent change as well. There are a few changes in the main line which will require small changes to the relax input scripts. I have standardised the value.set and related user functions across the different analysis types. The 'bond_length' parameter needs to be replaced with 'r'. You might encounter a few of these:

    'bond_length' -> 'r'
    'heteronucleus' -> 'heteronuc_type'
    'proton' -> 'proton_type'

I think that's all you'll need to modify in the script; the rest should be handled internally within relax. As for the memory error, I have updated the relax information printout to show more details. Could you run 'svn up' and resend the output of 'relax -i'?

Cheers,

Edward

On 6 March 2012 15:29, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Thanks for that, relax does now open properly. However, I now get the following error when trying to run the dauvergne_protocol.py script:

relax> relax_data.read(ri_id='NOE_600', ri_type='NOE', frq=600133000.0, file='noe_600', dir=None, spin_id_col=None, mol_name_col=None, res_num_col=1, res_name_col=None, spin_num_col=None, spin_name_col=None, data_col=3, error_col=4, sep=None, spin_id=None)

Opening the file 'noe_600' for reading.

relax> value.set(val=1.0200000000000001e-10, param='bond_length', spin_id=None)

debug> Execution lock: Release by 'script UI' ('script' mode).
Traceback (most recent call last):
  File "/home1/hugh/installs/relax-1.3/prompt/interpreter.py", line 383, in exec_script
    runpy.run_module(module, globals)
  File "/usr/lib/python2.6/runpy.py", line 140, in run_module
    fname, loader, pkg_name)
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "/home1/hugh/data/pgm298bq/relax/dauvergne_protocol.py", line 205, in <module>
    value.set(1.02 * 1e-10, 'bond_length')
  File "/home1/hugh/installs/relax-1.3/prompt/value.py", line 239, in set
    value.set(val=val, param=param, spin_id=spin_id)
  File "/home1/hugh/installs/relax-1.3/generic_fns/value.py", line 356, in set
    set_param_values(param=param, value=val, spin_id=spin_id, force=force)
  File "/home1/hugh/installs/relax-1.3/specific_fns/model_free/main.py", line 2316, in set_param_values
    raise RelaxError("The parameter '%s' is unknown." % mf_params[i])
RelaxError: RelaxError: The parameter 'bond_length' is unknown.

Hugh

On 6 March 2012 13:43, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Ok, the message is taking longer than normal to appear. The float128 problem was a recent change in relax. I had just forgotten that float128 is absent from 32-bit numpy, so on your machine you have no access to such high-precision values. This is only a problem if you do your own code and analysis development within relax, as no part of relax currently uses float128.

Regards,

Edward

On 6 March 2012 14:34, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi Hugh,

Just quickly, I have fixed the float128 import problem. See my commit message at https://mail.gna.org/public/relax-commits/2012-03/msg00025.html (you might have to wait a few minutes for the post to be archived and the link to work). Just type 'svn up' and the problem will be gone. I'll look at the other problem now.
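[For anyone else hitting the arg_check.py import error mentioned below: numpy.float128 only exists on builds that support extended precision, and is absent from 32-bit numpy as on the machine above. A defensive probe, sketched here as a hypothetical helper rather than relax's actual fix, avoids the hard import:]

```python
try:
    import numpy
except ImportError:
    numpy = None  # fall back to plain Python floats below


def best_float_dtype():
    """Return the highest-precision float type available.

    numpy.float128 is only present on platforms/builds that support
    it (32-bit numpy lacks it), so probe for it instead of importing
    it directly, and fall back through lower precisions.
    """
    if numpy is not None:
        for name in ("float128", "float64"):
            if hasattr(numpy, name):
                return getattr(numpy, name)
    return float  # plain Python float as a last resort
```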
Regards,

Edward

On 6 March 2012 14:19, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Hi Edward,

I have removed the suggested lines from uni_processor.py and the error given is below:

Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500
debug> Execution lock: Release by 'script UI' ('script' mode).
Traceback (most recent call last):
  File "/progs/Linux/bin/relax13", line 7, in <module>
    relax.start()
  File "/progs/relax-1.3.13/relax.py", line 100, in start
    processor.run()
  File "/progs/relax-1.3.13/multi/uni_processor.py", line 135, in run
    self.callback.init_master(self)
  File "/progs/relax-1.3.13/multi/processor.py", line 263, in default_init_master
    self.master.run()
  File "/progs/relax-1.3.13/relax.py", line 171, in run
    self.interpreter.run(self.script_file)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 300, in run
    return run_script(intro=self.__intro_string, local=locals(), script_file=script_file, quit=self.__quit_flag, show_script=self.__show_script, raise_relax_error=self.__raise_relax_error)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 610, in run_script
    return console.interact(intro, local, script_file, quit, show_script=show_script, raise_relax_error=raise_relax_error)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 495, in interact_script
    exec_script(script_file, local)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 383, in exec_script
    runpy.run_module(module, globals)
  File "/usr/lib/python2.6/runpy.py", line 140, in run_module
    fname, loader, pkg_name)
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "/home1/hugh/data/pgm298bq/relax/dauvergne_protocol.py", line 216, in <module>
    dAuvergne_protocol(pipe_name=name, diff_model=DIFF_MODEL, mf_models=MF_MODELS, local_tm_models=LOCAL_TM_MODELS, grid_inc=GRID_INC, min_algor=MIN_ALGOR, mc_sim_num=MC_NUM, conv_loop=CONV_LOOP)
  File "/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 230, in
__init__
    status.exec_lock.release()
MemoryError
3171.454u 7.344s 53:10.23 99.6% 0+0k 16400+0io 14pf+0w

---------------

At the same time (as the computer hangs for hours each time I try to test this), I thought I would try to run the most up-to-date version of relax, but this has proved problematic. We have installed subversion and downloaded the latest repository as you described. We then had to install "minfx", which was not required for the release version 1.3.13. Is this correct? After this, when trying to run relax, arg_check.py returns an error trying to import "float128" from numpy. It may be that we are running an old version; I will look into this this afternoon.

Hugh

On 6 March 2012 12:04, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

One other point is that I've recently been working on cleaning up, simplifying, and fixing a few IO stream bugs in the multi-processor package in the 1.3 line of the relax repository since I tagged and released the 1.3.13 version. So there is a slight chance that I may accidentally have fixed the problem already. But you'll need to check out the most up-to-date repository code with the subversion program to test this.

Regards,

Edward

On 6 March 2012 12:58, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Actually, looking at the code, it appears as though the multi-processor error handling is failing. That means there are probably two bugs here: one is causing the program to fail, and the second, in the multi-processor error handling, is causing the memory error, hiding the first problem. Could you replace the run() function in the multi/uni_processor.py code? The original code should be:

    def run(self):
        try:
            self.pre_run()
            self.callback.init_master(self)
            self.post_run()
        except Exception, e:
            self.callback.handle_exception(self, e)

Could you replace it with:

    def run(self):
        self.pre_run()
        self.callback.init_master(self)
        self.post_run()

and see what the error message is?
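[The masking effect described above can be reproduced in a few self-contained lines. This is an illustrative sketch, not relax code; all names below are made up. When the handler called from an except clause itself fails, the user only ever sees the handler's error, never the original one:]

```python
def handle_exception(exc):
    # Stand-in for the processor's error handler.  In the real bug,
    # printing the traceback inside the handler raised MemoryError.
    raise MemoryError("error raised while handling the error")


def run_wrapped(job):
    # Analogue of the original run(): the first bug is caught, but the
    # handler's own failure is what the user ends up seeing.
    try:
        job()
    except Exception as e:
        handle_exception(e)


def run_bare(job):
    # Analogue of the stripped-down run(): the first bug surfaces directly.
    job()


def buggy_job():
    raise ValueError("the first, hidden bug")
```

Calling run_wrapped(buggy_job) ends in a MemoryError, while run_bare(buggy_job) raises the underlying ValueError, which is exactly the unmasking the edited run() is meant to achieve.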
If what I said above is correct, then this should uncover the first bug (which then triggers the second). By the way, how long does it take to test this problem?

Cheers,

Edward

On 6 March 2012 12:49, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:

Hi,

Thank you for all the details. That really helps in narrowing down the bug! From all the info, the bug is without doubt within the multi-processor package. Cheers. If you have a little time, we can work together and fix this. The changes/fixes will go into the repository version, so you'll need a copy of that for testing. Do you have the subversion program installed? If so, you can obtain the most up-to-date copy from the repository by typing:

$ svn co svn://svn.gna.org/svn/relax/1.3 relax-1.3

or, if this doesn't work:

$ svn co http://svn.gna.org/svn/relax/1.3 relax-1.3

If you already have a checked-out copy, you can update to the newest copy by typing:

$ svn up

I'll look at the second bug you've identified later. It would be appreciated if you created a second bug report for that problem too. I would not recommend reverting to earlier relax versions due to the number of bug fixes and other problems solved since then. This should not affect the model-free results, but the bugs could bite elsewhere. Hopefully I can fix this problem quickly.

Cheers,

Edward

P.S. For reference, the bug report is https://gna.org/bugs/?19528.

On 6 March 2012 12:18, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Hi Edward,

Your description sounds very likely to be the cause of the problem: during the time when no output is being produced, the computer gets gradually slower and slower before finally giving up. The error is reproducible in that I have tried it on a couple of different machines and it has failed several times at the same stage. The error messages tend to vary a little, however.
Here are another two of the outputs given when the program has failed (I should clarify that all of these messages came from runs done on the same machine, and the second was run with the "-d" option, though it hasn't helped very much):

Simulation 492
Simulation 493
Simulation 494
Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500
Traceback (most recent call last):
  File "/usr/local/relax-1.3.13/multi/uni_processor.py", line 136, in run
    self.callback.init_master(self)
  File "/usr/local/relax-1.3.13/multi/processor.py", line 263, in default_init_master
Traceback (most recent call last):
  File "/usr/local/bin/relax", line 7, in <module>
    relax.start()
  File "/usr/local/relax-1.3.13/relax.py", line 100, in start
    processor.run()
  File "/usr/local/relax-1.3.13/multi/uni_processor.py", line 139, in run
    self.callback.handle_exception(self, e)
  File "/usr/local/relax-1.3.13/multi/processor.py", line 250, in default_handle_exception
    traceback.print_exc(file=sys.stderr)
  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 125, in print_exception
    print_tb(tb, limit, file)
  File "/usr/lib/python2.6/traceback.py", line 69, in print_tb
    line = linecache.getline(filename, lineno, f.f_globals)
  File "/usr/lib/python2.6/linecache.py", line 14, in getline
    lines = getlines(filename, module_globals)
  File "/usr/lib/python2.6/linecache.py", line 40, in getlines
    return updatecache(filename, module_globals)
  File "/usr/lib/python2.6/linecache.py", line 136, in updatecache
    lines = fp.readlines()
MemoryError
9203.219u 258.488s 8:05:09.46 32.5% 0+0k 90962440+0io 2215895pf+0w

------------------

Simulation 489
Simulation 490
Simulation 491
Simulation 492
Simulation 493
Simulation 494
Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500
debug> Execution lock: Release by 'script UI' ('script' mode).
debug> Execution lock: Release by 'script UI' ('script' mode).
Traceback (most recent call last):
  File "/progs/Linux/bin/relax13", line 7, in <module>
    relax.start()
  File "/progs/relax-1.3.13/relax.py", line 100, in start
    processor.run()
  File "/progs/relax-1.3.13/multi/uni_processor.py", line 139, in run
    self.callback.handle_exception(self, e)
  File "/progs/relax-1.3.13/multi/processor.py", line 250, in default_handle_exception
    traceback.print_exc(file=sys.stderr)
  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
MemoryError
8006.268u 542.873s 8:34:11.81 27.7% 0+0k 225824840+0io 6192344pf+0w

------------------

If the number of MC simulations is dropped to even as little as 100, the program finishes the fitting successfully, though I then get an error message to do with the grace files (I've not been using them, so I'm not bothered about this, though it will no doubt be of interest to you):

Data pipe 'final': The ts value of 2.6285e-08 is greater than 1.9714e-08, eliminating simulation 94 of spin system ':218@N'.
Data pipe 'final': The ts value of 2.6285e-08 is greater than 1.9714e-08, eliminating simulation 95 of spin system ':218@N'.

relax> monte_carlo.error_analysis(prune=0.0)

relax> results.write(file='results', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final', compress_type=1, force=True)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/results.bz2' for writing.

relax> grace.write(x_data_type='spin', y_data_type='s2', spin_id=None, plot_data='value', file='s2.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/s2.agr' for writing.
relax> grace.write(x_data_type='spin', y_data_type='s2f', spin_id=None, plot_data='value', file='s2f.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/s2f.agr' for writing.

relax> grace.write(x_data_type='spin', y_data_type='s2s', spin_id=None, plot_data='value', file='s2s.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/s2s.agr' for writing.

relax> grace.write(x_data_type='spin', y_data_type='te', spin_id=None, plot_data='value', file='te.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/te.agr' for writing.

relax> grace.write(x_data_type='spin', y_data_type='tf', spin_id=None, plot_data='value', file='tf.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/tf.agr' for writing.

relax> grace.write(x_data_type='spin', y_data_type='ts', spin_id=None, plot_data='value', file='ts.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/ts.agr' for writing.

relax> grace.write(x_data_type='spin', y_data_type='rex', spin_id=None, plot_data='value', file='rex.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/rex.agr' for writing.

debug> Execution lock: Release by 'script UI' ('script' mode).
debug> Execution lock: Release by 'script UI' ('script' mode).
Traceback (most recent call last):
  File "/ld10c/progs/relax-1.3.13/prompt/interpreter.py", line 383, in exec_script
    runpy.run_module(module, globals)
  File "/usr/lib/python2.6/runpy.py", line 140, in run_module
    fname, loader, pkg_name)
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "/ld10c/home1/hugh/data/pgm298bq/relax/dauvergne_protocol_lessMC.py", line 216, in <module>
    dAuvergne_protocol(pipe_name=name, diff_model=DIFF_MODEL, mf_models=MF_MODELS, local_tm_models=LOCAL_TM_MODELS, grid_inc=GRID_INC, min_algor=MIN_ALGOR, mc_sim_num=MC_NUM, conv_loop=CONV_LOOP)
  File "/ld10c/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 223, in __init__
    self.execute()
  File "/ld10c/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 710, in execute
    self.write_results()
  File "/ld10c/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 837, in write_results
    self.interpreter.grace.write(x_data_type='spin', y_data_type='rex', file='rex.agr', dir=dir, force=True)
  File "/ld10c/progs/relax-1.3.13/prompt/grace.py", line 103, in write
    grace.write(x_data_type=x_data_type, y_data_type=y_data_type, spin_id=spin_id, plot_data=plot_data, file=file, dir=dir, force=force, norm=norm)
  File "/ld10c/progs/relax-1.3.13/generic_fns/grace.py", line 366, in write
    write_xy_header(sets=len(data[0]), file=file, data_type=[x_data_type, y_data_type], seq_type=seq_type, set_names=set_names, norm=norm)
  File "/ld10c/progs/relax-1.3.13/generic_fns/grace.py", line 600, in write_xy_header
    units = return_units(data_type[i])
  File "/ld10c/progs/relax-1.3.13/specific_fns/model_free/main.py", line 2394, in return_units
    raise RelaxNoSpinSpecError
RelaxNoSpinSpecError: RelaxError: The spin system must be specified.
3510.479u 20.741s 59:07.76 99.5% 0+0k 0+3368io 0pf+0w

------------------

Finally, this is the output from relax --info as requested:

relax 1.3.13

Molecular dynamics by NMR data analysis

Copyright (C) 2001-2006 Edward d'Auvergne
Copyright (C) 2006-2011 the relax development team

This is free software which you are welcome to modify and redistribute under the conditions of the GNU General Public License (GPL). This program, including all modules, is licensed under the GPL and comes with absolutely no warranty. For details type 'GPL' within the relax prompt.

Assistance in using the relax prompt and scripting interface can be accessed by typing 'help' within the prompt.

Processor fabric: Uni-processor.

Hardware information:
  Machine:                 i686
  Processor:

System information:
  System:                  Linux
  Release:                 2.6.32-37-generic
  Version:                 #81-Ubuntu SMP Fri Dec 2 20:35:14 UTC 2011
  GNU/Linux version:       Ubuntu 10.04 lucid
  Distribution:            Ubuntu 10.04 lucid
  Full platform string:    Linux-2.6.32-37-generic-i686-with-Ubuntu-10.04-lucid

Software information:
  Architecture:            32bit ELF
  Python version:          2.6.5
  Python branch:           tags/r265
  Python build:            r265:79063, Apr 16 2010 13:09:56
  Python compiler:         GCC 4.4.3
  Python implementation:   CPython
  Python revision:         79063
  Numpy version:           1.3.0
  Libc version:            glibc 2.4

Python packages (most are optional):
  Package      Installed  Version  Path
  minfx        True       Unknown  /ld10c/progs/relax-1.3.13/minfx
  bmrblib      True       Unknown  /ld10c/progs/relax-1.3.13/bmrblib
  numpy        True       1.3.0    /usr/lib/python2.6/dist-packages/numpy
  scipy        True       0.7.0    /usr/lib/python2.6/dist-packages/scipy
  wxPython     False
  mpi4py       False
  epydoc       False
  optparse     True       1.5.3    /usr/lib/python2.6/optparse.pyc
  readline     True                /usr/lib/python2.6/lib-dynload/readline.so
  profile      True                /usr/lib/python2.6/profile.pyc
  bz2          True                /usr/lib/python2.6/lib-dynload/bz2.so
  gzip         True                /usr/lib/python2.6/gzip.pyc
  os.devnull   True                /usr/lib/python2.6/os.pyc

Compiled relax C modules:
  Relaxation curve fitting:  True

------------------

Apologies for all the detail, but
I'm not really sure what to do here. If it is the multi-processor part that is failing, is installing relax 1.3.11 an option? I previously had 1.3.10 installed and the commands seem to have changed quite a lot since then. What is your opinion on the validity of error estimates based on 100 simulations?

Thanks,

Hugh

On 5 March 2012 08:33, Edward d'Auvergne <edward.dauvergne@xxxxxxxxx> wrote:

Hi Hugh,

I'm pretty sure this error has not been encountered before; it at least hasn't been reported. I've never seen anything close to this before, but I would guess that this is an infinitely recursive exception (the error is being caught but, in the process, the error occurs again, being caught a second time, then the third error occurs, is caught a third time, with this continuing until your computer runs out of RAM and swap space and relax is killed by the operating system). The error seems to occur within the error handling portion of Gary Thompson's multi-processor framework (you are using the uni-processor fabric of the framework here), so maybe Gary might know a solution?

Is this error reproducible? For testing, can you drop the number of Monte Carlo simulations down to, say, 5? Running relax with the debug flag might also help:

$ relax --debug

or:

$ relax -d

Are you using the GUI or the scripting user interface? The output of:

$ relax --info

might also be useful. As for your data set being too large, relax has been used on much bigger systems before, so this should not be an issue. One last thing: would you be able to create a bug report for this error (https://gna.org/bugs/?func=additem&group=relax)? All of the info/log files can then be pasted/attached there, and it is a useful future reference for anyone who encounters the same or a similar bug.

Cheers,

Edward

On 2 March 2012 12:33, Hugh RW Dannatt <h.dannatt@xxxxxxxxxxxxxxx> wrote:

Dear All,

Having completed the fitting of one dataset without any problems, I am now moving onto another.
Everything has worked fine until I change the DIFF_MODEL to "final" and try to run the program again to get error estimates on my fitted parameters. The program successfully re-opens all the results files and selects the diffusion model. Then all 500 simulations are done without issue, but as soon as the program has finished this, it stops outputting anything to the screen for a long time (>12 hrs). During this time, the CPU and memory use is very high and the computer runs slowly. Eventually I get a "Memory Error" and a whole load of messages output to the screen, which I have pasted below. I should emphasize that all the stages of running this program with different diffusion models have run fine, and the computer I'm using is a relatively fast machine (dual core Pentium 4, 2 GB RAM). Has anyone had a similar problem? This dataset is larger than the previous one, which fit without issue (the current one has 6 measurements per 176 residues), but I can't imagine this being the cause of the problem.
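[The "infinitely recursive exception" that Edward suggests elsewhere in this thread as the cause can be mimicked in a few lines of plain Python. This is purely illustrative, with an artificial depth cap so the sketch terminates; in the hypothesised bug there is no such cap, so each failed handling attempt piles up more state until RAM and swap are exhausted and the process dies with a MemoryError:]

```python
def recursive_handler(error, depth=0, max_depth=10):
    """Illustrate an error handler that re-triggers the error it handles.

    Each call "handles" the error by raising it again, which is then
    handled the same way, and so on.  The max_depth argument exists only
    so this sketch terminates; the hypothesised bug had no escape hatch.
    """
    if depth >= max_depth:
        return depth  # artificial cap: report how deep the recursion went
    try:
        raise RuntimeError(error)  # handling the error raises it again
    except RuntimeError:
        return recursive_handler(error, depth + 1, max_depth)
```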
Thanks,

Hugh

----

Simulation 485
Simulation 486
Simulation 487
Simulation 488
Simulation 489
Simulation 490
Simulation 491
Simulation 492
Simulation 493
Simulation 494
Simulation 495
Simulation 496
Simulation 497
Simulation 498
Simulation 499
Simulation 500
Traceback (most recent call last):
  File "/progs/relax-1.3.13/multi/uni_processor.py", line 136, in run
    self.callback.init_master(self)
  File "/progs/relax-1.3.13/multi/processor.py", line 263, in default_init_master
    self.master.run()
  File "/progs/relax-1.3.13/relax.py", line 171, in run
    self.interpreter.run(self.script_file)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 300, in run
    return run_script(intro=self.__intro_string, local=locals(), script_file=script_file, quit=self.__quit_flag, show_script=self.__show_script, raise_relax_error=self.__raise_relax_error)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 610, in run_script
    return console.interact(intro, local, script_file, quit, show_script=show_script, raise_relax_error=raise_relax_error)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 495, in interact_script
    exec_script(script_file, local)
  File "/progs/relax-1.3.13/prompt/interpreter.py", line 383, in exec_script
    runpy.run_module(module, globals)
  File "/usr/lib/python2.6/runpy.py", line 140, in run_module
    fname, loader, pkg_name)
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "/home1/hugh/data/pgm298bq/relax/dauvergne_protocol.py", line 216, in <module>
    dAuvergne_protocol(pipe_name=name, diff_model=DIFF_MODEL, mf_models=MF_MODELS, local_tm_models=LOCAL_TM_MODELS, grid_inc=GRID_INC, min_algor=MIN_ALGOR, mc_sim_num=MC_NUM, conv_loop=CONV_LOOP)
  File "/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 223, in __init__
Traceback (most recent call last):
  File "/progs/Linux/bin/relax13", line 7, in <module>
    relax.start()
  File "/progs/relax-1.3.13/relax.py", line 100, in start
    processor.run()
  File
"/progs/relax-1.3.13/multi/uni_processor.py", line 139, in run self.callback.handle_exception(self, e) File "/progs/relax-1.3.13/multi/processor.py", line 250, in default_handle_exception traceback.print_exc(file=sys.stderr) File "/usr/lib/python2.6/traceback.py", line 227, in print_exc print_exception(etype, value, tb, limit, file) File "/usr/lib/python2.6/traceback.py", line 125, in print_exception print_tb(tb, limit, file) File "/usr/lib/python2.6/traceback.py", line 69, in print_tb line = linecache.getline(filename, lineno, f.f_globals) File "/usr/lib/python2.6/linecache.py", line 14, in getline lines = getlines(filename, module_globals) File "/usr/lib/python2.6/linecache.py", line 40, in getlines return updatecache(filename, module_globals) File "/usr/lib/python2.6/linecache.py", line 136, in updatecache lines = fp.readlines() MemoryError 9078.655u 666.933s 10:55:29.66 24.7% 0+0k 241482000+0io 6665721pf+0w _______________________________________________ relax (http://nmr-relax.com) This is the relax-users mailing list relax-users@xxxxxxx To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-users-- Hugh Dannatt PhD Student Researcher Prof. Jon Waltho Lab Department of Molecular Biology & Biotechnology University of Sheffield Firth Court Western Bank Sheffield S10 2TN 0114 222 2729-- Hugh Dannatt PhD Student Researcher Prof. Jon Waltho Lab Department of Molecular Biology & Biotechnology University of Sheffield Firth Court Western Bank Sheffield S10 2TN 0114 222 2729-- Hugh Dannatt PhD Student Researcher Prof. 
Jon Waltho Lab
Department of Molecular Biology & Biotechnology
University of Sheffield
Firth Court
Western Bank
Sheffield S10 2TN
0114 222 2729