Re: Using multi-processor for model_free



Posted by Mahdi, Sam on September 30, 2016 - 01:12:
Hi Troels,

I will upload a bug report with the pdb file, the script I use, and the
data I'm using.

Sincerely,
Sam

On Thu, Sep 29, 2016 at 2:25 PM, Troels Emtekær Linnet <
tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam.

Hm...

I had a look in: pipe_control/minimise.py

The trouble starts at the line
"elif values[i] in [None, {}, []]:"

where the index "i" runs out of bounds.

The index is drawn from range(n), with:
n = len(names)
names = api.get_param_names(model_info)
values = api.get_param_values(model_info)

So "something" is not aligned well in the data structures.
It seems that there are more parameter names than parameter values.
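To illustrate the kind of mismatch I mean, here is a minimal Python sketch. It is not the relax code: the function names and the example parameter names ("s2", "te") are only made up for the illustration. The first function loops the same way as the failing line, the second checks the alignment first and gives a clearer message.

```python
def unset_parameters_unguarded(names, values):
    """Loop over the names and look up the values by index, like the failing line."""
    unset = []
    for i in range(len(names)):
        # Raises IndexError as soon as 'values' is shorter than 'names'.
        if values[i] in [None, {}, []]:
            unset.append(names[i])
    return unset


def unset_parameters(names, values):
    """Same loop, but refuse to run when the two lists are not aligned."""
    if len(names) != len(values):
        raise ValueError(
            "%d parameter names but %d parameter values: %r vs %r"
            % (len(names), len(values), names, values)
        )
    return unset_parameters_unguarded(names, values)
```

With aligned lists both functions return the names that still have no value. With an empty list of values, the unguarded version fails with an IndexError, while the guarded version reports how many names and values it was given.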


One guess is, that the selection of
* diff_model
* mf_models
* local_tm_models

is not correctly set. But I reach my limit of being able to help you.
Edward is the expert here, but he is on paternity leave.


Another possibility is that some of the spins are in "select" mode, where
they maybe should be in "deselect" mode.

Maybe the spins do not carry any data from before, and somehow relax
expects this.

It's very tricky to figure out!

A bug report, some minimum data, and a script which makes the bug occur can
solve this.

Then I can write a system test, if it is relax which is failing.

Best
Troels




2016-09-29 0:23 GMT+02:00 Mahdi, Sam <sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,
Update on both proteins: So for protein 1, I can upload all the spins (H
and N), but then I receive an error. This is the error I received for
protein 2 as well. These are both dimer pdb files, meaning they have 2 sets,
set (A) and set (B) (e.g. http://www.rcsb.org/pdb/explore/explore.do?structureId=1DJ8
this pdb protein has 4 sets, A, B, C, and D; ours only have A and B). For both
these proteins I receive this error:
  File "/home/crowlab/relax-4.0.2/multi/processor.py", line 494, in run
    self.callback.init_master(self)
  File "/home/crowlab/relax-4.0.2/multi/__init__.py", line 318, in default_init_master
    self.master.run()
  File "/home/crowlab/relax-4.0.2/relax.py", line 199, in run
    self.interpreter.run(self.script_file)
  File "/home/crowlab/relax-4.0.2/prompt/interpreter.py", line 279, in run
    return run_script(intro=self.__intro_string, local=locals(), script_file=script_file, show_script=self.__show_script, raise_relax_error=self.__raise_relax_error)
  File "/home/crowlab/relax-4.0.2/prompt/interpreter.py", line 585, in run_script
    return console.interact(intro, local, script_file, show_script=show_script, raise_relax_error=raise_relax_error)
  File "/home/crowlab/relax-4.0.2/prompt/interpreter.py", line 484, in interact_script
    exec_script(script_file, local)
  File "/home/crowlab/relax-4.0.2/prompt/interpreter.py", line 363, in exec_script
    runpy.run_module(module, globals)
  File "/usr/lib64/python2.7/runpy.py", line 180, in run_module
    fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/crowlab/relax-4.0.2/RGS4_modelfree_sample_script.py", line 31, in <module>
    dAuvergne_protocol(pipe_name=name,pipe_bundle=pipe_bundle,diff_model=DIFF_MODEL,mf_models=MF_MODELS,local_tm_models=LOCAL_TM_MODELS,grid_inc=GRID_INC,min_algor=MIN_ALGOR,mc_sim_num=MC_NUM,conv_loop=CONV_LOOP)
  File "/home/crowlab/relax-4.0.2/auto_analyses/dauvergne_protocol.py", line 246, in __init__
    self.execute()
  File "/home/crowlab/relax-4.0.2/auto_analyses/dauvergne_protocol.py", line 600, in execute
    self.multi_model(local_tm=True)
  File "/home/crowlab/relax-4.0.2/auto_analyses/dauvergne_protocol.py", line 888, in multi_model
    self.interpreter.minimise.grid_search(inc=self.grid_inc)
  File "/home/crowlab/relax-4.0.2/prompt/uf_objects.py", line 225, in __call__
    self._backend(*new_args, **uf_kargs)
  File "/home/crowlab/relax-4.0.2/pipe_control/minimise.py", line 172, in grid_search
    model_lower, model_upper, model_inc = grid_setup(lower, upper, inc, verbosity=verbosity, skip_preset=skip_preset)
  File "/home/crowlab/relax-4.0.2/pipe_control/minimise.py", line 341, in grid_setup
    elif values[i] in [None, {}, []]:
IndexError: index out of bounds

Which from my understanding basically means the co-ordinates of the
spins are out of the acceptable range for relax. I've checked all the
co-ordinates for both; nothing is extreme or outlandish (all within a range
of -20 to 20).
Is relax unable to process pdb files that are dimers (with 2 sets, A and
B)? Furthermore, is it unable to process trimers and tetramers?

Sincerely,
Sam

On Wed, Sep 28, 2016 at 1:44 PM, Mahdi, Sam <sam.mahdi.846@xxxxxxxxxxx>
wrote:

Hey Troels,

I ran relax -x and received this error at the GUI tests:
=============
= GUI tests =
=============

........................**
Gtk:ERROR:gtkfilesystemmodel.c:746:gtk_file_system_model_sort:
assertion failed: (r == n_visible_rows)
Abort (core dumped)
crowlab: [~/relax-4.0.2]>


On Wed, Sep 28, 2016 at 1:30 PM, Mahdi, Sam <sam.mahdi.846@xxxxxxxxxxx>
wrote:

Hi Troels,

An update on protein number 1: I have successfully resolved the
problem. Initially the pdb file had HN instead of just H for the backbone
hydrogens, so relax couldn't read it. I changed all the HN to H. Then I
received the error
RelaxError: Multiple alternate location indicators are present in the
PDB file, but the desired coordinate set has not been specified
By removing the extra N, all the text for the 3D location (the
co-ordinates) for the HN were shifted a space (no longer aligned). Once I
aligned them all, relax was able to read all the spins. So it's working
now.
I'm currently running the test suite as well.
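For anyone hitting the same thing: the PDB format is column based (the atom name sits in columns 13-16 of an ATOM or HETATM record), so a plain search-and-replace of "HN" with "H" shifts everything after it. A minimal sketch of a rename that keeps the columns in place could look like the following. The function name is hypothetical, and it assumes the usual convention that short names for one-letter elements start in column 14.

```python
def rename_atom(line, old="HN", new="H"):
    """Rename an atom in a PDB ATOM/HETATM record without shifting any columns."""
    if not line.startswith(("ATOM", "HETATM")):
        return line
    # Columns 13-16 (string indices 12 to 15) hold the atom name.
    if line[12:16].strip() != old:
        return line
    # Pad the new name back to exactly four characters.
    padded = new if len(new) >= 4 else (" " + new).ljust(4)
    return line[:12] + padded[:4] + line[16:]
```

Applied to every line of the file, this leaves the record length, the residue name and the co-ordinates where they were, and only the name field changes.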

Sincerely,
Sam

On Wed, Sep 28, 2016 at 11:45 AM, Troels Emtekær Linnet <
tlinnet@xxxxxxxxxxxxx> wrote:

To test the speed difference between script and GUI,
you could try to run the full test-suite through the terminal or
inside the GUI.

That should give you a clue about time difference.


2016-09-28 20:32 GMT+02:00 Troels Emtekær Linnet <
tlinnet@xxxxxxxxxxxxx>:

If you get different results, for the same setup, this is not good.
Not at all !

Have you run the full relax test suite after installation?

http://wiki.nmr-relax.com/Installation_test

run it with:
relax -x

This takes about 1 Hour to run, and should not be used with multiple
processors.

Relax will test itself with thousands of unit tests and system tests,
and confirm that all results are the same.

If the system tests do not pass on each system, something fishy is
going on.

This is the best line of defence against "systems" acting weird due
to software/packages etc. etc.

Best
Troels



2016-09-28 9:44 GMT+02:00 Mahdi, Sam <sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

I wanted to give a bit of feedback on the results I've obtained
throughout the few weeks I've been using model free on relax. First off,
thank you guys (both you and Edward) immensely for your patience and help as
I attempted to understand and work relax.

Secondly, I have noticed a difference between using the gui and the terminal
(using scripts to run relax). I've currently finished about 3 runs using the
gui, and 3 runs using the terminal (all the same data sets, same pdb files,
same settings, etc.). The gui takes about a week to finish, where the terminal
takes approximately 24 hours. I've tried this on 2 proteins, both had the same
results. The terminal is by far much faster than the gui.

Finally, I've run 1 protein on 2 different computers (one using the
multi-processor platform, and on another computer, single-processor). The data
sets were all the same, the same pdb file, etc., but the results I obtained
from the computers were slightly different. For the most part, the values
were similar: slightly different, but within the error. But there were about 7
or 8 data points that appeared in one run on one computer, and were absent in
another run on another computer. This happened in both the S^2 I analyzed and
the Rex.
I.e. on the fedora 20 (single processor), say I had S^2 values for amino acids
24, 25 and 26 in the sequence, but not for 28, 29, and 30. On the fedora 24
(multi-processor), I might be missing a value for amino acid 24, but I would
have S^2 values for 28, 29 and 30. Note the data sets are all the same, the pdb
files the same, settings the same, and I used the same script for both. The
only difference between these runs is they were run on different computers and
one was single processor while another was multi.
I don't know why I obtained different data from 2 different runs, when the
input was all the same, just on different computers.
However the S^2 values do make sense. The Rex values were incredibly small
(1x10^-20), but there are some similarities (in terms of big Rex values)
between the Rex I obtained from relax, and CPMG data analyzed by glove. So I
have been able to obtain some reasonable data and results from
model_free using relax.
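A comparison like the one described above can be made systematic with a few lines of Python. This is only a sketch under assumptions: it supposes the S^2 (or Rex) values of each run have already been read into a dictionary mapping residue number to a (value, error) pair, and the function name is made up.

```python
def compare_runs(run_a, run_b, n_sigma=1.0):
    """Compare two {residue: (value, error)} mappings from two runs.

    Returns the residues only present in run A, those only present in run B,
    and those present in both whose values differ by more than n_sigma times
    the combined error.
    """
    only_a = sorted(set(run_a) - set(run_b))
    only_b = sorted(set(run_b) - set(run_a))
    differing = []
    for res in sorted(set(run_a) & set(run_b)):
        value_a, err_a = run_a[res]
        value_b, err_b = run_b[res]
        # Combined uncertainty of the difference, assuming independent errors.
        tolerance = n_sigma * (err_a ** 2 + err_b ** 2) ** 0.5
        if abs(value_a - value_b) > tolerance:
            differing.append(res)
    return only_a, only_b, differing
```

The three lists separate the two effects mentioned in this mail: residues that have a value in one run but not the other, and residues where both runs have a value but disagree beyond the error.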

Sincerely,
Sam



On Mon, Sep 26, 2016 at 2:59 PM, Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx> wrote:

Hi Troels,


I have attempted the fix for running on a multi-processor platform
by creating the script you told me to, and I still got the same result. I
have uploaded a screenshot that shows again, relax is running in the
background, but there is no output for relax, nor can I input any commands.
The only output I receive is this:
Running relax with NP=$NPROC+8|bc in multi-processor mode

And any command I type in after that gets no response.

I've also checked the spins via script, for 2 scenarios. Scenario 1: all
hydrogens are kept as HN. Scenario 2: I change all the HN spins to H.
The output from Scenario 1 is that it read all the Nitrogen spins
accordingly:
Objects:
  element: 'N'
  isotope: '15N'
  name: 'N'
  num: 1304
  pos: array([ 13.196999999999999,  15.218            ,   3.192            ])
  select: True
 hRGS4 178 THR #hRGS4:178@1304
Class containing all the spin system specific data.


Objects:
  element: 'N'
  isotope: '15N'
  name: 'N'
  num: 2617
  pos: array([ 22.696000000000002,  10.683999999999999,  -4.15             ])
  select: True
 hRGS4 178 THR #hRGS4:178@2617

But no hydrogens.

Scenario 2: I still receive the same error.
RelaxError: Multiple alternate location indicators are present in
the PDB file, but the desired coordinate set has not been specified.

Sincerely,
Sam

On Mon, Sep 26, 2016 at 2:19 PM, Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx> wrote:

Hi Troels,

I have attempted the fix for running on a multi-processor platform
by creating the script you told me to, and I still got the same result. I
have uploaded a screenshot that shows again, relax is running in the
background, but there is no output for relax, nor can I input any commands.
The only output I receive is this:
Running relax with NP=$NPROC+8|bc in multi-processor mode

And any command I type in after that gets no response.

Sincerely,
Sam


On Sun, Sep 25, 2016 at 6:43 AM, Troels Emtekær Linnet <
tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam.

Try to load the pdb file and make a spin_loop over the information.
What does the information look like?
http://wiki.nmr-relax.com/Tutorial_for_model-free_analysis_sam_mahdi#Check_the_spin_containers_via_script

Regarding the multiprocessor on your Fedora 20 machine, try to
have a look at the bug.
https://gna.org/bugs/?25084

-----
I suspect there is a mismatch between two installations of relax.
One version of 2.x and one local of 4.x.
Try adding the full path to relax
-----

Try to make a run script like this, and copy it somewhere in your
PATH:
myrelax
------

#!/bin/tcsh -fe

# Set the relax version used for this script.
set RELAX=/sbinlab2/tlinnet/software/NMR-relax/relax_trunk/relax

# Set number of available CPUs.
set NPROC=`nproc`
set NP=`echo $NPROC + 0 | bc `
echo "Running relax with NP=$NP in multi-processor mode"

# Run relax in multi processor mode.
mpirun -np $NP $RELAX --multi='mpi4py' $argv
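If the tcsh quoting gives trouble, the same steps can be sketched in Python. The output quoted earlier on this page ("NP=$NPROC+8|bc") looks like the backquoted "echo ... | bc" substitution was not evaluated in that copy of the script, though that is only a guess from the printed line. The sketch below builds the same mpirun command as above; the function names and the relax path are placeholders, not part of relax.

```python
import os
import subprocess


def build_command(relax_path, script=None, n_proc=None):
    """Build the mpirun command line for relax in multi-processor mode."""
    if n_proc is None:
        # Use all CPUs reported by the system, falling back to 1.
        n_proc = os.cpu_count() or 1
    cmd = ["mpirun", "-np", str(n_proc), relax_path, "--multi=mpi4py"]
    if script:
        cmd.append(script)
    return cmd


def launch(relax_path, script=None, n_proc=None):
    """Print the chosen number of processes, then run relax through mpirun."""
    cmd = build_command(relax_path, script, n_proc)
    print("Running relax with NP=%s in multi-processor mode" % cmd[2])
    return subprocess.call(cmd)
```

For example, launch("/home/user/xxx/relax", "script.py") would use the full path to relax, as suggested above, so that a second installation on the PATH cannot be picked up by accident.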

2016-09-24 1:03 GMT+02:00 Mahdi, Sam <sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

Update on Protein number 1: So I was able to successfully run
model free with no problems on my protein (I don't know why it was giving
problems before). The reason it may have been giving issues is that the
protein I am working with forms a dimer at the concentrations we work with
(thus the results I have are for the dimer form of the protein), but the pdb
file only has a monomer structure. I have been able to obtain the dimer pdb
file using HADDOCK (docking program), but have come across a few problems
uploading the pdb file.
The initial problem was that all the hydrogens attached to the
nitrogen were labeled HN in the HADDOCK modified pdb file, and model free
could not understand what HN meant, so I would receive this warning:
RelaxWarning: Cannot determine the element associated with atom
'HN'.

I could however load up all the Nitrogen, but naturally, with no
hydrogens, it wouldn't be able to calculate any bond vectors between
nitrogen and hydrogen. So I would receive this error and the program would
close:
RelaxError: The spin ID '@H' matches no spins.

To fix this, I changed all the HN spins to just H, but then
received another error:
RelaxError: Multiple alternate location indicators are present
in the PDB file, but the desired coordinates set has not been specified.

I don't exactly understand what this error means. Is it saying
the program can't locate the 3D coordinates for the Hydrogen and Nitrogen?
If that is the case, why was it able to before, when it couldn't read any
of the Hydrogen spins? I'm just a bit confused as to what this
error means.
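As background to this error: in the PDB format, column 17 of each ATOM or HETATM record is the alternate location indicator. If an edit shortens the atom name and the rest of the line slides one column to the left, the first letter of the residue name can end up in column 17, which would then look like many different alternate locations. Whether that is what happened here is only a guess, but it can be checked with a short sketch like this one (the function name is made up):

```python
from collections import Counter


def altloc_counts(pdb_lines):
    """Count the characters found in column 17 of the ATOM/HETATM records."""
    counts = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 16:
            # Column 17 is string index 16.
            counts[line[16]] += 1
    return counts
```

For a file without alternate locations the only key should be a space. Letters that match the start of residue names (for example "T" for THR) would point to shifted columns rather than real alternate locations.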

Sincerely,
Sam

On Wed, Sep 21, 2016 at 3:18 PM, Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx> wrote:

Hi Troels,

Update on protein number 1. I ran it with only 5 simulations.
It took a while, but it ended up finishing. So I assume it's due to bad data
simply slowing down the process.
Update on protein number 2. I ran it with only 2 spins as well,
and I still received the same error. I suspect it's due to the pdb file. I'm
going to attempt to use another program to add the hydrogens to my pdb file
and try again.

Sincerely,
Sam

On Tue, Sep 20, 2016 at 1:31 PM, Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx> wrote:

That's weird, I can open it up directly from the link you sent
me. I'll reupload it.

On Tue, Sep 20, 2016 at 12:40 PM, Troels Emtekær Linnet <
tlinnet@xxxxxxxxxxxxx> wrote:

The file:
file #28673:  relax -i data for 4.0,2 a

https://gna.org/bugs/download.php?file_id=28673

Is it empty?

2016-09-20 20:05 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

I am a bit confused what you are talking about. There is no
file labeled .?

On Tue, Sep 20, 2016 at 9:15 AM, Troels Emtekær Linnet <
tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam.

On
https://gna.org/bugs/?25084

I cannot open the file.

In the meantime, try to specify the full path to relax. Not
just ./relax
but /home/user/xxx/relax

Best
Troels

2016-09-19 23:13 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

I just uploaded the 4.0.2 relax -i info. I already have
minfx 1.0.12 with 4.0.2. But I can't open relax on a multi-processor
platform for either version.

On Mon, Sep 19, 2016 at 10:47 AM, Troels Emtekær Linnet <
tlinnet@xxxxxxxxxxxxx> wrote:

Please upgrade!

Name               Installed    Version         Current version
minfx              True         1.0.4           1.0.12

relax information:
    Version:                 2.2.5              4.0.2


2016-09-19 19:41 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

I have uploaded the bug report for the issue with
running relax on multiple processors on my fedora 20 computer. I will
upload the mpirun report bindings on the fedora 24 computer later today
(that is not my lab so I don't have access to it, and the professor is not
in yet). If there is any more info that is needed please let me know.
Thanks again in advance.

Sincerely,
Sam

On Mon, Sep 19, 2016 at 10:24 AM, Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx> wrote:

Hi Troels,

Thanks for the quick response!

Protein 1: I will attempt to troubleshoot using the
advice you gave me. The problem occurs right after it indicates it's writing
a file for prolate round_3 (so it's about to start it). I will run it again
and post the output to give you a better idea. I'm pretty sure the output
was something like this:
Over-fit spin deselection:
No spins have been deselected.
Resetting the minimisation statistics.
But I will double check and send you another email with
the actual output.
Protein 2:
I am using the sample script for the dAuvergne protocol.
So the only thing I've changed since my previous run (the one that worked,
that you wrote a tutorial for) was the pdb file and the data set I used.
The thing I suspected was causing an issue was the pdb file, since I
slightly modified it, and that's really the only thing different from this
run versus the others.

Also, side note: if I were to deselect the spins that I
don't have data for or I have bad data for, that wouldn't change any of the
calculations, correct? I never have, since I assumed relax would just ignore
all the amino acids I don't have data for, but it may help increase the
speed of my calculations if I tell relax to ignore the spins from
the start.
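To make the side note concrete, finding the residues without any data is simple bookkeeping. The sketch below is only an illustration with made-up function names: it takes the residue numbers in the structure and the residue numbers present in each data file, and returns those with no data in any file, formatted in the ":residue@atom" spin ID style that relax uses. The resulting IDs could then be handed to relax's spin deselection (assumed here to be the deselect.spin user function; please check the manual).

```python
def residues_without_data(structure_residues, datasets):
    """Return the residue numbers that appear in no data set at all."""
    with_data = set()
    for residues in datasets:
        with_data.update(residues)
    return sorted(set(structure_residues) - with_data)


def spin_ids(residues, atom="N"):
    """Format residue numbers as relax-style spin IDs, e.g. ':24@N'."""
    return [":%d@%s" % (res, atom) for res in residues]
```

A residue that has data in some files but not in others is kept by this sketch; whether such spins should also be deselected is a separate decision.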

Sincerely,
Sam

On Mon, Sep 19, 2016 at 9:12 AM, Troels Emtekær Linnet
<tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam.

Happy to hear that you are making some progress.

Protein 1:
Can you help me to find out if you are minimizing or
running Monte-Carlo simulations?
This COULD be the problem:

How relax works (at least how it works for relaxation
dispersion):
Step 1: Minimize the error for the target function.
Find the parameters which best match the target function to the data, by
minimizing the error.
Here each individual spin minimization is handed out
to a processor for calculation.

Step 2: Determine the error of the minimization by
Monte-Carlo simulations.
Create (standard 500) additional datasets as copies of the original.
Modify each datapoint by an error, drawn from a Gaussian
distribution where the width is described by the error of measurements.
Now hand out each of the datasets to the processor.
Each processor should now calculate the minimization for all the spins. The
minimization should be quicker, as the starting position is chosen from
Step 1.
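The two steps above can be sketched in a few lines of Python. This is only an illustration of the idea as described in this mail, not relax's implementation: the "fit" is any function that turns a dataset into one fitted number, and the function name, the seed and the example fit are assumptions made for the sketch.

```python
import random
import statistics


def monte_carlo_error(fit, data, errors, n_sim=500, seed=0):
    """Estimate the error of a fitted parameter by Monte-Carlo simulation.

    Step 1: fit the original data.
    Step 2: fit n_sim noisy copies of the data and take the spread of the results.
    """
    rng = random.Random(seed)
    best = fit(data)
    simulated = []
    for _ in range(n_sim):
        # Each point is shifted by Gaussian noise with the measurement error as width.
        noisy = [rng.gauss(value, error) for value, error in zip(data, errors)]
        simulated.append(fit(noisy))
    return best, statistics.stdev(simulated)
```

For example, monte_carlo_error(statistics.mean, [1.0, 1.2, 0.8], [0.1, 0.1, 0.1]) fits a plain average. With a real minimization in place of the average, one difficult noisy dataset is enough to make a single simulation run for a very long time.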

Possible problem: One (or more) of the spins has
really bad data. So a little change of the data makes the minimization
space very different.
Think of a flat table. Where should the "minimization
ball" run to? Maybe you have created a small new bump in the table. This
is typical for "bad" data.

This could either be the measurement OR the error
estimation. Relax will keep on searching for a minimum.
If you are "unlucky", some of the created datasets
will make relax hang for a very long time.

Unfortunately, it is NOT possible to ask a processor
about its "current" work, when it is doing a minimization for a whole
dataset.
And if it was, it would create an output of 64 spins
being minimized at the same time, creating a big mess, since the processors
are working alone. When doing Monte-Carlo simulations, relax is quite
silent, only reporting when a whole dataset is done.

Is relax stuck in Monte-Carlo simulations?

Possible solution:
*) Set Monte-Carlo simulations to 3 (which is the
minimum), and know that you have found the right minimum, but the error
estimation of the parameters is wrong.
*) Carefully inspect your data, deselecting all spins
which have "bad data". Look at their graphs. Consider working with as few
spins as possible, and work your way up! Working this way will greatly
increase your productivity.

Protein 2:
Are you setting the bounds for the minimization
manually?
This looks like the upper/lower bounds are specified
wrong. This is not easy to do. How are you doing it?

Best
Troels




2016-09-19 17:11 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

I have successfully been able to run the model-free
analysis on 64 cores. The issue appears to have been that I simply did not
specify the spin number, so after looking at your tutorial and making the
proper modifications, it ran with no complications. The results are
somewhat reasonable. I decided to try to run 2 other proteins, however, and
I've come across problems for both again.
Protein 1:
I set this up just like the tutorial, and it runs
with no warnings or errors; however, the run never finishes. At round_3 for
the prolate model, when it starts to minimize, it just stops. I don't mean
relax is stopped or closed, I mean it stops doing any calculations. Relax
is still open, and if I run the top command, I can still see something is
going on with the other cores, but nothing is being calculated. The run
with 64 cores is incredibly fast (under 4 hours), so I don't think it's
loading calculations or writing them, and I've left it there for over 24
hours, and it's still just sorta stuck. There are no errors, no outputs, it
just says it's gonna start to minimize and then nothing happens after that.
Protein 2:
This protein was a little different since the pdb
structure was a crystal structure. I had to use WhatIf to add the protons
onto the pdb file. The structure appears to load up fine, all the spins
appear to be read, data is loaded, vectors are calculated and defined,
but when I came to run the protocol this error popped up:
  File "/home/sam2/relax-4.0.2/multi/processor.py", line 494, in run
    self.callback.init_master(self)
  File "/home/sam2/relax-4.0.2/multi/__init__.py", line 318, in default_init_master
    self.master.run()
  File "/home/sam2/relax-4.0.2/relax.py", line 199, in run
    self.interpreter.run(self.script_file)
  File "/home/sam2/relax-4.0.2/prompt/interpreter.py", line 279, in run
    return run_script(intro=self.__intro_string, local=locals(), script_file=script_file, show_script=self.__show_script, raise_relax_error=self.__raise_relax_error)
  File "/home/sam2/relax-4.0.2/prompt/interpreter.py", line 585, in run_script
    return console.interact(intro, local, script_file, show_script=show_script, raise_relax_error=raise_relax_error)
  File "/home/sam2/relax-4.0.2/prompt/interpreter.py", line 484, in interact_script
    exec_script(script_file, local)
  File "/home/sam2/relax-4.0.2/prompt/interpreter.py", line 363, in exec_script
    runpy.run_module(module, globals)
  File "/usr/lib64/python2.7/runpy.py", line 192, in run_module
    fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/sam2/relax-4.0.2/HdeA_script.py", line 30, in <module>
    dAuvergne_protocol(pipe_name=name,pipe_bundle=pipe_bundle,diff_model=DIFF_MODEL,mf_models=MF_MODELS,local_tm_models=LOCAL_TM_MODELS,grid_inc=GRID_INC,min_algor=MIN_ALGOR,mc_sim_num=MC_NUM,conv_loop=CONV_LOOP)
  File "/home/sam2/relax-4.0.2/auto_analyses/dauvergne_protocol.py", line 246, in __init__
    self.execute()
  File "/home/sam2/relax-4.0.2/auto_analyses/dauvergne_protocol.py", line 600, in execute
    self.multi_model(local_tm=True)
  File "/home/sam2/relax-4.0.2/auto_analyses/dauvergne_protocol.py", line 888, in multi_model
    self.interpreter.minimise.grid_search(inc=self.grid_inc)
  File "/home/sam2/relax-4.0.2/prompt/uf_objects.py", line 225, in __call__
    self._backend(*new_args, **uf_kargs)
  File "/home/sam2/relax-4.0.2/pipe_control/minimise.py", line 172, in grid_search
    model_lower, model_upper, model_inc = grid_setup(lower, upper, inc, verbosity=verbosity, skip_preset=skip_preset)
  File "/home/sam2/relax-4.0.2/pipe_control/minimise.py", line 341, in grid_setup
    elif values[i] in [None, {}, []]:
IndexError: index 0 is out of bounds for axis 0 with size 0
I should mention this error pops up when it decides
to calculate the first spin's upper and lower bounds. So this isn't at the
minimization portion of the calculation (like in the previous bug). Thanks
in advance.

Sincerely,
Sam

On Wed, Sep 14, 2016 at 6:34 AM, Troels Emtekær
Linnet <tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam.

To tackle this problem, I would advise creating
another bug report.
Creation and closing of a bug "leaves trails", which
maybe will help another person when googling for the same problem.

To help you, can you do a "relax -i" on both
computers?
That gives some indication about package versions and
computer setup.

The first thing we need to establish is that mpirun
is working.
We have to test the installation without relax.

Can you have a look at:
http://wiki.nmr-relax.com/OpenMPI

Try the different things like:
lscpu
mpirun --report-bindings -np 11 echo "hello world"
mpirun --report-bindings -np 4 relax --multi='mpi4py'

When we are confident about this, then we will try
to make a small test script for relax.

Please try these things on both computers, and
provide 2 files with commands and output.

Then attach it to the bug report.





2016-09-14 6:40 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

So I saw the tutorial you put up, and the main problem
was I had not specified my data was only for the Nitrogen spins. After
applying the spin column, my data loaded and relax ran model free with no
problem. I have a script that starts and runs relax and model free all
automatically; if you wish, I can send it via email to you and you can upload
it to the tutorial wiki page. So I can successfully run model-free in script
mode for a uni-processor.
The problem now with the multi-processor is that
the script won't load. In the bug page I uploaded a screenshot where I had
input the "mpirun -np 4 ../relax --multi='mpi4py'" command, however I had no
output. I checked processes running in the background, and saw that there
were indeed 4 processes running in the background (1 master and 3 slaves)
for relax; but there was no output, so I was unable to load any data, or
create a pipe or anything. This was only on the Fedora 24 computer, not the
Fedora 20. On the Fedora 20 computer, I was able to successfully open relax
on a multi-processor platform. I can send the screenshots and the relax -i
for both computers again. I don't know why it doesn't work on the Fedora 24.
Do you know what could be causing this?

Thanks again in advance

On Tue, Sep 13, 2016 at 9:32 PM, Troels Emtekær
Linnet <tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam

Can you send the mail again and include the
maillist?

Best Troels


Den tirsdag den 13. september 2016 skrev Mahdi,
Sam <sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

So I saw the tutorial you put up, and the main problem
was I had not specified my data was only for the Nitrogen spins. After
applying the spin column, my data loaded and relax ran model free with no
problem. I have a script that starts and runs relax and model free all
automatically; if you wish, I can send it via email to you and you can upload
it to the tutorial wiki page. So I can successfully run model-free in script
mode for a uni-processor.
The problem now with the multi-processor is that
the script won't load. In the bug page I uploaded a screenshot where I had
input the "mpirun -np 4 ../relax --multi='mpi4py'" command, however I had no
output. I checked processes running in the background, and saw that there
were indeed 4 processes running in the background (1 master and 3 slaves)
for relax; but there was no output, so I was unable to load any data, or
create a pipe or anything. This was only on the Fedora 24 computer, not the
Fedora 20. On the Fedora 20 computer, I was able to successfully open relax
on a multi-processor platform. I can send the screenshots and the relax -i
for both computers again. I don't know why it doesn't work on the Fedora 24.
Do you know what could be causing this?

Thanks again in advance
Sam

On Tue, Sep 13, 2016 at 1:01 PM, Troels Emtekær
Linnet <tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam.

I closed the 2 bug reports as invalid.

The data is not labelled correctly.
But this can be corrected.

Please see this tutorial I wrote:
http://wiki.nmr-relax.com/Tutorial_for_model-free_analysis_sam_mahdi

I hope this gives some guidance.

If you experience any new problems, please feel
free to ask!!

What you experience, will probably be the same
for many.
Your feedback is valuable for the development.

Please wait with using mpirun and multiple
processors until you are absolutely sure
that it will run on 1 processor.

Bugfixing when using multiple processors is a
nightmare....

Best
Troels

2016-09-12 17:36 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

I just created another bug report. I simply
copy pasted the email, and uploaded the script files there.

Sincerely,
Sam

On Mon, Sep 12, 2016 at 5:14 AM, Troels Emtekær
Linnet <tlinnet@xxxxxxxxxxxxx> wrote:

Hi Sam.

Can you produce another bug report?

Please don't attach files to these mails, as it
will strain the mailing lists.

Cheers
Troels


Den lørdag den 10. september 2016 skrev Mahdi,
Sam <sam.mahdi.846@xxxxxxxxxxx>:

Hi Troels,

Additional question that I had, if you could
also look into this on Tuesday please. I have decided to try to
write a script to automate this whole process (since I won't be using the
gui to do model free), and I've come across a problem. I can successfully
open up relax using openmpi, and can load the pdb file, and assign all the
spins and isotopes; however, it appears it will only load one data file
(the very first one I have input in the script). I don't know if there
is a problem with how I wrote my script. Not only will it not load the rest
of my data sets, it won't actually run dAuvergne's protocol either; it'll
just load the data set and exit out of the program. Attached is the script
I wrote for relax.

This is the output once relax has loaded

script = 'model_free_sample_script.py'
------------------------------
------------------------------
----------------------------------------
from time import asctime, localtime
from auto_analyses.dauvergne_protocol import dAuvergne_protocol
DIFF_MODEL=['local_tm','sphere','prolate','oblate','ellipsoid','final']
MF_MODELS=['m0','m1','m2','m3','m4','m5','m6','m7','m8','m9']
LOCAL_TM_MODELS=['tm0','tm1','tm2','tm3','tm4','tm5','tm6','tm7','tm7','tm8','tm9']
GRID_INC=11
MIN_ALGOR='newton'
MC_NUM=500
CONV_LOOP=True
pipe_bundle="mf(%s)"%asctime(localtime())
name="origin-"+pipe_bundle
pipe.create(name,'mf',bundle=pipe_bundle)
structure.read_pdb('2d9j.pdb',set_mol_name='hRGS7')
structure.load_spins('@N',ave_pos=True)
structure.load_spins('@NE1',ave_pos=True)
structure.load_spins('@H',ave_pos=True)
structure.load_spins('@HE1',ave_pos=True)
spin.isotope('15N',spin_id='@N*')
spin.isotope('1H',spin_id='@H*')
relax_data.read(ri_id='R1_Agnes',ri_type='R1',frq=599.642*1e6,file='R1_Agnes',res_num_col=1,data_col=2,error_col=3)
relax_data.read(ri_id='R2_Agnes',ri_type='R2',frq=599.642*1e6,file='R2_Agnes',res_num_col=1,data_col=2,error_col=3)
relax_data.read(ri_id='ssNOE_Agnes',ri_type='NOE',frq=599.642*1e6,file='ssNOE_Agnes',res_num_col=1,data_col=2,error_col=3)
relax_data.read(ri_id='R1_NMRFAM',ri_type='R1',frq=799.642*1e6,file='R1_NMRFAM',res_num_col=1,data_col=2,error_col=3)
relax_data.read(ri_id='R2_NMRFAM',ri_type='R2',frq=799.642*1e6,file='R2_NMRFAM',res_num_col=1,data_col=2,error_col=3)
relax_data.read(ri_id='ssNOE_NMRFAM',ri_type='NOE',frq=799.642*1e6,file='ssNOE_NMRFAM',res_num_col=1,data_col=2,error_col=3)
interatom.define(spin_id1='@N',spin_id2='@H',direct_bond=True)
interatom.define(spin_id1='@NE1',spin_id2='@HE1',direct_bond=True)
interatom.set_dist(spin_id1='@N*',spin_id2='@H*',ave_dist=1.02*1e-10)
interatom.unit_vectors()
value.set(-172*1e-6,'csa',spin_id='@N*')
dAuvergne_protocol(pipe_name=name,pipe_bundle=pipe_bundle,diff_model=DIFF_MODEL,mf_models=MF_MODELS,local_tm_models=LOCAL_TM_MODELS,grid_inc=GRID_INC,min_algor=MIN_ALGOR,mc_sim_num=MC_NUM,conv_loop=CONV_LOOP)
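
Note on the model lists: as posted, LOCAL_TM_MODELS contains 'tm7'
twice. The thread does not establish whether this is related to the
failure, so the following is only a minimal sketch of a sanity check
that could be run on the lists before calling dAuvergne_protocol. The
helper name find_duplicates is hypothetical and is not part of relax.

```python
from collections import Counter

def find_duplicates(models):
    """Return the entries that occur more than once, in first-seen order."""
    counts = Counter(models)
    duplicates = []
    for model in models:
        if counts[model] > 1 and model not in duplicates:
            duplicates.append(model)
    return duplicates

# The lists exactly as posted in the script (note the repeated 'tm7').
LOCAL_TM_MODELS = ['tm0', 'tm1', 'tm2', 'tm3', 'tm4', 'tm5', 'tm6',
                   'tm7', 'tm7', 'tm8', 'tm9']
MF_MODELS = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9']

print(find_duplicates(LOCAL_TM_MODELS))
print(find_duplicates(MF_MODELS))
```

An empty list means every model name is unique.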

So it indicates that my script has loaded. However, after it loads the
spins from the pdb file, this is what happens after my first data set
has been loaded:

relax> relax_data.read(ri_id='R1_Agnes', ri_type='R1', frq=599642000.0, file='R1_Agnes', dir=None, spin_id_col=None, mol_name_col=None, res_num_col=1, res_name_col=None, spin_num_col=None, spin_name_col=None, data_col=2, error_col=3, sep=None, spin_id=None)
Opening the file 'R1_Agnes' for reading.
RelaxWarning: The sequence data in the line ['Residue', 'R1', 'Error'] is invalid, the residue number data 'Residue' is invalid.
RelaxWarning: The sequence data in the line ['1'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['2'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['3'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['4'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['5'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['6'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['7'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['8'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['9'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['10'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['11'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['16'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['17'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['18'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['21'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['22'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['23'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['26'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['27'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['28'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['31'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['40'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['46'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['58'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['61'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['62'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['63'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['73'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['76'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['79'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['81'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['82'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['85'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['94'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['97'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['99'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['106'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['115'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['121'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['126'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['127'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['134'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['135'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['136'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['137'] is invalid, the data is missing.
RelaxWarning: The sequence data in the line ['139'] is invalid, the data is missing.

RelaxError: The spin ID '#hRGS7:12' corresponds to multiple spins, including '#hRGS7:12@N' and '#hRGS7:12@H'.
crowlab: [~/relax-4.0.2]>
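
Note on the RelaxError: the message says that '#hRGS7:12' matches both
the @N and the @H spin. The script loads N and H spins, while each data
row is identified by residue number only (res_num_col=1), and the
echoed call shows spin_id=None. One plausible remedy, which is an
assumption and not confirmed in this thread, is to pass spin_id='@N' to
relax_data.read() so that each row maps to a single spin. The sketch
below is a pure-Python illustration of the ambiguity, not relax's own
implementation; match_spins and the spin list are hypothetical.

```python
def match_spins(spins, res_num, atom_name=None):
    """Return the IDs of all spins in residue res_num.

    If atom_name is given, only spins with that atom name are returned.
    """
    hits = []
    for mol, res, atom in spins:
        if res == res_num and (atom_name is None or atom == atom_name):
            hits.append('#%s:%d@%s' % (mol, res, atom))
    return hits

# Residue 12 with both the N and the H spin loaded, as in the script.
spins = [('hRGS7', 12, 'N'), ('hRGS7', 12, 'H')]

print(match_spins(spins, 12))       # residue only: two spins match
print(match_spins(spins, 12, 'N'))  # residue plus atom name: one spin
```

With only a residue number, two spins match and a data value cannot be
assigned unambiguously; adding the atom name leaves a single spin.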

As you can see, I have all 6 data sets set to load, but only the very
first one appears to do so, and after it loads, relax just exits.
Again, I don't know if this is a problem with how I wrote the script.
Relax_script1 is the one that I load to run the whole thing. The model
free script.py is just the script it reads once relax has opened.
Again, I can see that all the spins are properly loaded and the
isotopes are set. It's just everything after the first data set that
doesn't load. Thanks again in advance.

Sincerely,
Sam

On Thu, Sep 8, 2016 at 10:15 AM, Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx> wrote:

Hi Troels,

Thank you so much. If there is any extra
info you need please let me know.

Sincerely,
Sam

On Thu, Sep 8, 2016 at 9:12 AM, Troels
Emtekær Linnet <tlinnet@xxxxxxxxxxxxx>
wrote:

Hi Sam.

I will have some time on Tuesday, and then
I will look at it.

Best
Troels


On Wednesday, 7 September 2016, Mahdi, Sam
<sam.mahdi.846@xxxxxxxxxxx> wrote:

Hello Troels,

I uploaded all the files, and also added the entire output that I
received using model free in script mode. I didn't know whether every
uploaded file needs to have that link, so only the files from the
initial upload have it.
Thank you in advance for your help!
Sincerely,
Sam

On Wed, Sep 7, 2016 at 12:41 AM, Troels
Emtekær Linnet <tlinnet@xxxxxxxxxxxxx>
wrote:

Hi Sam.

You should be able to upload more files
after the initial upload.
In the comment thread, please also make a
link to this discussion.

https://mail.gna.org/public/relax-users/2016-09/threads.html#00001

Best
Troels



2016-09-06 19:10 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

Thank you for your reply. When I go to upload my data, though, I see
there are only 4 slots available for uploads. However, I have a total
of 6 data files that need to be uploaded (3 for each frequency). I also
need to upload the relax -i output from 2 different computers and the
script file I've been using, for a total of 9 files. Is there a way to
increase the number of files I can upload, or can I upload more after
the initial submission?

On Mon, Sep 5, 2016 at 2:46 AM, Troels
Emtekær Linnet <tlinnet@xxxxxxxxxxxxx>
wrote:

Hi Sam.

To solve this problem, it would be easier to have access to some of
your data.
Can you upload to:
https://gna.org/bugs/?group=relax

Take each of your data files, and delete all data, except 2 spins.
Also provide your script file, or a description of which button you
press in the GUI.
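
Note: trimming each data file down to two spins can be done by hand or
with a few lines of Python. The following is a minimal sketch that
assumes whitespace-separated columns with a single header line (as the
['Residue', 'R1', 'Error'] line in the warnings elsewhere in this
thread suggests) and keeps the first two rows that have a residue
number, a value and an error. The function name and the sample values
are made up for illustration.

```python
def trim_to_spins(lines, n_spins=2, n_cols=3):
    """Keep the header line plus the first n_spins rows with all n_cols columns."""
    header, rows = lines[0], lines[1:]
    kept = [header]
    for row in rows:
        # Stop as soon as enough complete rows have been collected.
        if len(kept) - 1 >= n_spins:
            break
        # Rows with fewer columns (residue number only) carry no data.
        if len(row.split()) >= n_cols:
            kept.append(row)
    return kept

# Made-up example content: residues 1 and 2 have no data.
sample = [
    "Residue R1 Error",
    "1",
    "2",
    "12 1.50 0.05",
    "13 1.42 0.04",
    "14 1.47 0.05",
]

for line in trim_to_spins(sample):
    print(line)
```

The same two residues should be kept in every data file so that the
reduced set still has matching R1, R2 and NOE values.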

Please also provide information about
your system with:
relax -i

Then I will make a tutorial for you, to be added here:
http://wiki.nmr-relax.com/Category:Tutorials

If there is a problem in relax, I will write a systemtest which will
solve the problem, and the problem will never return.

If this is a user error, the tutorial should help to prevent it, and
would be the first step before adding to or modifying the manual.

Regarding using mpirun.
Have a look at this page. Maybe it
helps.
http://wiki.nmr-relax.com/OpenMPI


Cheers.


2016-09-03 4:13 GMT+02:00 Mahdi, Sam <
sam.mahdi.846@xxxxxxxxxxx>:

Hello everyone,

So I was able to set up and run the dauvergne_protocol successfully by
using the script in the wiki. The problem I have come across now is
that the program doesn't seem to read my data. Using the GUI I was able
to successfully load my data and run it. When I load my data using the
script command:
relax_data.read(ri_id='R1_Agnes',ri_type='R1',frq=599.642*1e6,file='R1_Agnes',res_num_col=1,data_col=2,error_col=3)

The output file simply gives warnings for amino acids I don't have
data for:
RelaxWarning: The sequence data in the line ['1'] is invalid, the data is missing.

This is fine, as relax just ignores these values and continues its
calculations. I only receive this warning for values I don't have data
for. This is the same thing I got when using the GUI (the GUI, however,
showed the values that I have data for and the residue each one
corresponds to; using the script I don't receive such output, and I
don't know whether this is normal or not). However, since I don't get
this warning for every amino acid, I assume this means it has read the
values for the other amino acids. The same holds for all of my data:
relax warnings only pop up for amino acids that I don't have data for.
The problem is, when I enter the dAuvergne protocol, the protocol
starts and begins running local_tm, but it appears none of my data has
been loaded:
RelaxWarning: The spin '#hRGS7:2@N' has been deselected because of missing relaxation data
RelaxWarning: The spin '#hRGS7:3@N' has been deselected because of missing relaxation data

And I get that warning for every single amino acid. From the output,
relax appears to have read the file, since it knows exactly which amino
acids I don't have data for, but I don't know why, when it comes to
running the protocol, it tells me I haven't input any data. I have
typed everything exactly according to the script from the wiki. From
running the protocol, it appears everything else has been set up
properly: the structure data, magnetic dipole interactions, CSA, the
data pipe, the analysis variables, the Python module imports, and the
spins from the PDB file. It appears the only error is in loading the
actual relaxation data.
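
Note: with output this long, it can help to summarise which residues
triggered which warning before comparing against the data files. The
following is a minimal sketch that parses saved log text using the two
warning messages quoted in this message. The function name
summarise_warnings is hypothetical, and the patterns only cover these
two messages.

```python
import re

# Patterns copied from the warning texts quoted in this thread.
MISSING_ROW = re.compile(
    r"The sequence data in the line \['(\d+)'\] is invalid, the data is missing")
DESELECTED = re.compile(
    r"The spin '([^']+)' has been deselected because of missing relaxation data")

def summarise_warnings(log_text):
    """Collect residue numbers with empty rows and the spin IDs that were deselected."""
    # Log lines may be wrapped (e.g. by a mail client), so collapse whitespace first.
    flat = " ".join(log_text.split())
    return {
        "missing_rows": [int(n) for n in MISSING_ROW.findall(flat)],
        "deselected": DESELECTED.findall(flat),
    }

log = """
RelaxWarning: The sequence data in the line ['1'] is invalid, the data is
missing.
RelaxWarning: The spin '#hRGS7:2@N' has been deselected because of missing
relaxation data
RelaxWarning: The spin '#hRGS7:3@N' has been deselected because of missing
relaxation data
"""

summary = summarise_warnings(log)
print(summary["missing_rows"])
print(summary["deselected"])
```

If the deselected list covers every residue while missing_rows covers
only a few, the data rows were parsed but never attached to the spins.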

On a completely unrelated side note, I have been attempting to run
relax on multiple processors. I have tried 2 different computers, both
running Fedora Linux. I have mpi4py and openmpi installed on both. On
one, I can get relax working on multiple cores (though I haven't been
able to complete a run, since I can't load any data properly). On the
other, however, when I type in the mpirun -np --multi='mpi4py' script,
I get no output. I can see that it's running in the background (top
command), but nothing pops up, no text, nothing. I typed the same
mpirun command with --gui, but that opened nothing. On a uni-processor
(typing in the exact same command without indicating how many cores,
i.e. no -np --multi='mpi4py') it works just fine, so I don't think my
openmpi is the issue. I don't know whether this is an issue with my
mpi4py or with that particular computer (since on the other computer
relax runs just fine on multiple cores).

Sincerely,
Sam

P.S. When I enter the top command to see what's running, my master
shows mpirun and the 3 slaves display python when I use -np 4, so I
know something is running in the background. I have 8 cores.

On Wed, Aug 31, 2016 at 6:49 PM,
Mahdi, Sam <sam.mahdi.846@xxxxxxxxxxx>
wrote:

Hello everyone,

I am attempting to run relax in multi-processor mode. I have been able
to successfully set up relax to operate in multi-processor mode by
using
mpirun -np #ofprocessors /location/of/relax --multi='mpi4py'

The problem I encounter is when using the --tee log
dauvergne_protocol.py command. I receive this error:
RelaxError: the script file 'dauvergne_protocol.py' does not exist

I located the script file and tried to point to its path:
mpirun /usr/local/Relax/relax-2.2.5/relax --multi='mpi4py' --tee log /usr/local/Relax/relax-2.2.5/auto-analyses/dauvergne_protocol.py

But I received this error:
RelaxError: the script file '/usr/local/Relax/relax-2.2.5/auto-analyses/dauvergne_protocol.py' does not exist.

Even though I have the script, relax doesn't seem to be able to locate
it.
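
Note: a "does not exist" error can be checked outside relax by testing
the exact path first. The path above spells the directory
"auto-analyses" with a hyphen, whereas the import statement quoted
elsewhere in this thread uses auto_analyses with an underscore; whether
that difference was on the command line or only in the email is not
known. The following is a minimal sketch that assembles the command in
the form quoted above and refuses to continue if the script file is
missing. The function name build_relax_command is hypothetical, and
nothing is launched.

```python
import os

def build_relax_command(relax_path, script_path, n_procs, log_file="log"):
    """Return the mpirun argument list, or raise if the script file is missing."""
    if not os.path.isfile(script_path):
        raise FileNotFoundError("script not found: %s" % script_path)
    # In an argument list no shell quoting is needed, so --multi='mpi4py'
    # as typed in a shell becomes the single argument --multi=mpi4py.
    return ["mpirun", "-np", str(n_procs), relax_path,
            "--multi=mpi4py", "--tee", log_file, script_path]

try:
    cmd = build_relax_command(
        "/usr/local/Relax/relax-2.2.5/relax",
        "/usr/local/Relax/relax-2.2.5/auto-analyses/dauvergne_protocol.py",
        4,
    )
    print(" ".join(cmd))
except FileNotFoundError as err:
    print(err)
```

On a machine where the path is wrong, this prints the "script not
found" message instead of a command line.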

On a side note, in the manual, one dash doesn't actually run the
command. I.e. the manual displays -multi='mpi4py'. What it should be is
--multi='mpi4py'. The same goes for -tee: it should be --tee.

Sincerely,
Sam

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-users mailing list
relax-users@xxxxxxx

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users

Powered by MHonArc, Updated Fri Sep 30 01:40:06 2016