Re: crash during multi-processor grid search for 'sphere' model




Posted by Jack Skinner on August 02, 2013 - 18:44:
I was able to get some more details about the problem by following your suggestion to create a mini dataset.

First of all, the problem doesn't occur when using a single processor and only occurs for DIFF_MODEL = 'sphere' when using mpirun.

I also didn't get the problem with mpirun when I used a subset of my data, suggesting that the problem is with my input. I found an anomaly in my 500 MHz NOE data that appears to be the culprit. Here is what that file looks like for the subset that did cause a crash:
# Parameter description:  The NOE.
# mol_name    res_num    res_name    spin_num    spin_name    value                 error
None          5          MET         None        None         None                  None
DesG          7          THR         15          N            0.718432367479729     0.0326572234546261
DesG          8          TYR         29          N            0.65401446462264      0.033499533240881
DesG          5          MET         -1          N            -0.420574006604573    0.037900692900274

This happened because of how I loaded my spins. The PDB file I used is missing the N-terminus, so I added the MET using residue.create(res_num=5, res_name='MET'), which I later realized should be residue.create(res_num=5, res_name='MET', mol_name='DesG').
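
For anyone who wants to check their own files for the same anomaly, here is a minimal sketch in plain Python. It is not part of relax: the function name find_suspect_rows is made up for illustration, and it assumes the seven-column layout shown above with the literal string 'None' marking missing entries.

```python
# Hypothetical helper, not a relax function: flag rows in a relaxation
# data file that have no molecule name or no value/error.
def find_suspect_rows(lines):
    """Return (line_number, reason) pairs for rows that look inconsistent."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.lstrip().startswith('#'):
            continue
        fields = line.split()
        # Assumed layout: mol_name res_num res_name spin_num spin_name value error
        if len(fields) != 7:
            suspects.append((number, 'expected 7 columns, found %d' % len(fields)))
            continue
        mol_name, value, error = fields[0], fields[5], fields[6]
        if mol_name == 'None':
            suspects.append((number, 'no molecule name'))
        elif value == 'None' or error == 'None':
            suspects.append((number, 'no value or error'))
    return suspects


noe_file = """\
# mol_name res_num res_name spin_num spin_name value error
None 5 MET None None None None
DesG 7 THR 15 N 0.718432367479729 0.0326572234546261
DesG 8 TYR 29 N 0.65401446462264 0.033499533240881
DesG 5 MET -1 N -0.420574006604573 0.037900692900274
"""

suspects = find_suspect_rows(noe_file.splitlines())
for number, reason in suspects:
    print('line %d: %s' % (number, reason))
```

On the subset above this flags only the stray MET row that was created without a molecule name.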

So now it's working for me. I have created bug report #21001 with relevant files attached in case you want to track down the "bug" to prevent future user error.


On Fri, Aug 2, 2013 at 8:12 AM, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:
Hi Jack,

Welcome to the relax mailing lists!  Wow, it's rare to see bugs in the
model-free part of relax nowadays!  Though this problem has been seen
before by Martin Ballaschk, see the thread at:

http://thread.gmane.org/gmane.science.nmr.relax.user/1291

The thread continues at:

http://thread.gmane.org/gmane.science.nmr.relax.user/1312

Unfortunately you will see that no solution was found in the end,
apart from a problem with the input relaxation data.  But you will see
that the problem is identical to what you see.  Would you be able to
perform the checks I mention in my response at:

http://thread.gmane.org/gmane.science.nmr.relax.user/1291/focus=1292

If you can create a mini data set of 1 or 2 residues, as mentioned in
that link, which reproduce the bug, I will then be able to add that to
the relax test suite and eliminate the bug within a very short time.
Files can be attached directly to your bug report
(http://gna.org/bugs/?21001).  With a good truncated data set and
script, it usually takes me about 5-10 min to eliminate the bug.  If I
can replicate it, I can quickly eliminate it.
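
One possible way to cut a relaxation data file down to a few residues is sketched below in plain Python. This is only an illustration and not a relax function: the name truncate_to_residues is hypothetical, and it assumes whitespace-separated columns with the residue number in the second column, as in the usual relax data file layout.

```python
# Hypothetical helper, not part of relax: keep only the rows for the
# chosen residue numbers, plus any comment or blank lines.
def truncate_to_residues(lines, keep_res_nums):
    keep = {str(num) for num in keep_res_nums}
    kept = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith('#'):
            # Preserve headers, comments and blank lines.
            kept.append(line)
        elif len(fields) > 1 and fields[1] in keep:
            # Assumed layout: the residue number is the second column.
            kept.append(line)
    return kept


full = [
    '# mol_name res_num res_name spin_num spin_name value error',
    'None 5 MET None None None None',
    'DesG 7 THR 15 N 0.718432367479729 0.0326572234546261',
    'DesG 8 TYR 29 N 0.65401446462264 0.033499533240881',
    'DesG 5 MET -1 N -0.420574006604573 0.037900692900274',
]

mini = truncate_to_residues(full, [5])
for row in mini:
    print(row)
```

The same residue numbers would need to be kept in every relaxation data file and in the structure or sequence input, so that the truncated set stays consistent.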

Cheers,

Edward



On 1 August 2013 22:30, Jack Skinner <skinnerj@xxxxxxxxxxxx> wrote:
> I am having a problem with the dauvergne_protocol.py script crashing when I
> set DIFF_MODEL = 'sphere'
> All other models run fine except for 'final', which complains about the
> missing 'sphere' results.
>
> I am running this script with 8 processors using mpirun
> mpirun -np 8 relax --multi='mpi4py' -n 7 --tee rnd2.log relax_rnd2.py
>
> The error message starts:
> relax> grid_search(lower=None, upper=None, inc=11, constraints=True,
> verbosity=1)
>
> Over-fit spin deselection:
> No spins have been deselected.
> Only diffusion tensor parameters will be used.
> Parallelised diffusion tensor grid search.
> Traceback (most recent call last):
>   File "/export/home/skinnerj/mypackages/relax-2.2.5/multi/processor.py",
> line 479, in run
> ...many lines that might not be helpful...
> then:
> Capturing_exception:
>   File "/export/home/skinnerj/mypackages/relax-2.2.5/multi/processor.py",
> line 522, in run
>     command.run(self, completed)
>   File
> "/export/home/skinnerj/mypackages/relax-2.2.5/specific_fns/model_free/multi_processor_commands.py",
> line 129, in run
>     results = self.optimise()
>   File
> "/export/home/skinnerj/mypackages/relax-2.2.5/specific_fns/model_free/multi_processor_commands.py",
> line 175, in optimise
>     results = grid_point_array(func=self.mf.func, args=(),
> points=self.opt_params.subdivision, verbosity=self.opt_params.verbosity)
>   File "/export/home/skinnerj/mypackages/relax-2.2.5/minfx/grid.py", line
> 264, in grid_point_array
>     n = len(points[0])
>
> Nested Exception from sub processor
> Rank: 1 Name: kff4-pid14220
> Exception type: IndexError
> Message: index out of bounds
>
>
> Any suggestions will be appreciated. Thanks!
>
>
> John "Jack" Skinner, Ph. D. | Postdoctoral Fellow | University of Chicago
> Lab: 773.834.0658 | GCIS Room W107E, 929 E. 57th St. Chicago, IL 60637
>
> _______________________________________________
> relax (http://www.nmr-relax.com)
>
> This is the relax-users mailing list
> relax-users@xxxxxxx
>
> To unsubscribe from this list, get a password
> reminder, or change your subscription options,
> visit the list information page at
> https://mail.gna.org/listinfo/relax-users
>



--
John "Jack" Skinner, Ph. D. | Postdoctoral Fellow | University of Chicago
Lab: 773.834.0658 | GCIS Room W107E, 929 E. 57th St. Chicago, IL 60637
