Re: CST branch -- May 18, 2010

Hi,

Please see below.


On 18 May 2010 10:01, Pavel Kaderavek <pavel.kaderavek@xxxxxxxxx> wrote:

Hi,
if I understand your idea, this architecture of the code is possible. I
would rather make sure, I understand it correctly. Please, see my following
(slightly artificial) example:

I study relaxation of two carbon spins.
First spin interacts with three other carbons and one hydrogen. Second
studied spin interacts with one carbon one nitrogen and two hydrogens (I
neglect CSA now for the sake of simplicity). Then number of interactions
would be six (maximum of interactions to carbons = 3, plus maximum of
interactions to hydrogen=2, plus maximum of interactions to nitrogen=1)


I would say that there is 4 interactions with spin 1 and 4
interactions for spin 2, though they are both different.  See below
for the implementation.

Now I tried to describe, how does the data structure would look like for the
first spin:
The following statements would keep class Data, where the physical
quantities (for example unit vector) of the particular type of interaction
will be stored. If this interaction is not active for the spin it will
remain empty
self.data[0][0] - first C-C interaction
self.data[1][0] - second C-C interaction
self.data[2][0] - third C-C interaction
self.data[3][0] - first C-H interaction
self.data[4][0] - second C-H interaction - remain empty
self.data[5][0] - first C-N interaction - remain empty


The last 2 would not exist.

Just for the comparison, how the structure will be filled in the case of the
second spin:
self.data[0][1] - first C-C interaction
self.data[1][1] - second C-C interaction - remain empty
self.data[2][1] - third C-C interaction - remain empty


These 2 should also not exist.

self.data[3][1] - first C-H interaction
self.data[4][1] - second C-H interaction
self.data[5][1] - first C-N interaction

Did I get your idea?


I understand, but this is not necessary and adds future complications.
 The indices should also be swapped, as in my example in the post at
https://mail.gna.org/public/relax-devel/2010-05/msg00000.html.  The
first index is the spin and the second is the interaction - your
example here has this around the other way.  The reason is because
then the length of the second list, i.e. for your first spin
self.data[0] can be different to the second spin in self.data[1].

Generally, I think that even if we make the code like this, there will be
still necessary to introduce some changes into the files in directory
maths_fns. I see two reasons. First, it is necessary to sum up somewhere all
contributions from individual interactions to compare theoretical and
experimental relaxation rates. Second it is necessary to calculate cross
correlation term for asymmetric CST.


Currently the dipolar and CSA relaxation are not split into different
interactions within relax, but with your changes this will need to be
done.    I would suggest that this would be the perfect starting point
- splitting these apart - as the test suite will say when this code is
functional again.  Once these 2 are split, then the multi-dipole and 2
CSA tensors will be easy to handle.  The handling of the specific
interaction is performed in __init__() where the appropriate data is
placed into the container and in setup_equations() were the specific
equations are assigned to the data container.

As for changing other files, I think this can still all be done within
mf.py.  A few tiny changes will be required in maths_fns.ri_comps and
maths_fns.ri, but these will be obvious later.  The interactions can
all be summed together in the func_*(), dfunc_*() and d2func_*()
target functions.  This is because relaxation rates sum for each
interaction.  This includes the cross correlation terms, which may
require additional code depending on their spectral densities.

Rather than trying to make one huge patch that will make everything
work simultaneously, maybe it would be best to send many small ones
which change each part one by one.  Then I can check and make sure
it's not damaging other parts of relax.

Cheers,

Edward

Pavel
PS: Sorry I forgot to mark "Reply to all" in my previous mail.

On 5 May 2010 10:53, Edward d'Auvergne <edward@xxxxxxxxxxxxx> wrote:


On 30 April 2010 21:21, Pavel Kaderavek <pavel.kaderavek@xxxxxxxxx> wrote:

Hi,
I tried to do the changes in the __init__ function of Mf class in the
file
maths_fns/mf.py.
I am not sure about the way of initialization you described in the
previous
mail. I understand that it is necessary to initialize the lists between
the
loops, but I would do it in this way:
self.data[i].interactions = zeros(self.num_interactions[i], float64)


Hi,

It would be best to do this differently.  The interactions will be
part of self.data.  This structure can be turned into a 2D list of
lists - i.e. like a matrix where each row corresponds to one
interaction and each column corresponds to an isolated spin system.
You do this by replacing:

       # Create the data array used to store data.
       self.data = []
       for i in xrange(self.num_spins):
           # Total number of ri.
           self.total_num_ri = self.total_num_ri + num_ri[i]

           # The ratio of gyromagnetic ratios.
           g_ratio = gh[i] / gx[i]

           # Append the class instance Data to the data array.
           self.data.append(Data())
...

with:

       # Create the data list of lists used to store the interaction
and spin specific data.
       self.data = []
       for int_index in xrange(self.num_interactions):
           # Add an empty list for the interaction.
           self.data.append([])

           # Loop over the spins.
           for spin_index in xrange(self.num_spins):
               # Total number of ri (only sum for the first interaction).
               if int_index == 0:
                   self.total_num_ri = self.total_num_ri +
num_ri[spin_index]

               # The ratio of gyromagnetic ratios.
               g_ratio = gh[int_index][spin_index] /
gx[int_index][spin_index]

               # Append the class instance Data to the interaction
list for this spin.
               self.data[int_index].append(Data())


This will set up the correct data structure.

I think I can avoid procedure inside the second loop where you suggest
to a
new data
container initialise and append to the list. Now I can just fill the
list:
for j in xrange(self.num_interactions[i]):
    self.data[i].interactions[j]=interactions[i][j]

Am I wrong? Did I misunderstand something?


This would work, but the interactions structure is not necessary - it
can be directly part of self.data.  The interaction can be considered
higher level than the spin, so you can collapse this to:

self.data[interaction_index][spin_index]

This list of lists is much simpler and easier to maintain in the future.

Moreover, I found that in the __init__ function is necessary  to extend
the
dimension of frequencies per field strength. They will be different
between
(for example) CC and CH interactions. So, what do you think about this:

self.data[i].frq_list = zeros((self.num_interactions[i], num_frq[i], 5),
float64)

for j in xrange(num_frq[i]):
                frqH = 2.0 * pi * frq[i][j]
                frqX = frqH / g_ratio
                for k in xrange(self.num_interactions[i]):
                    frqY = frqH * self.data[i].gratio_ext[k] / gh[i]
                    self.data[i].frq_list[k, j, 1] = frqX
                    self.data[i].frq_list[k, j, 2] = frqY - frqX


The CC and CH interaction would be in different Data containers, so
this is not necessary.  They would be in:

self.data[CC_index][spin_index]
self.data[CH_index][spin_index]

Each interaction and each spin has it's own data container in
self.data[x][y], therefore almost all the rest of the code in math_fns
remains identical and untouched.

Small question at the end:
Can it be included in the same patch - both changes belongs to changes
in
one function or is it too large step?


It would be best to have separate patches.  That way they can be
individually checked and optimised.  In programing in a group
environment, it is recommended that each patch/commit should contain
only one concept.  If there are two unrelated changes in a patch, it
is standard practice that the patch creator is asked to split it up
and resend it.  If it was directly committed, that commit will be
reverted.  And small patches are much easier to have accepted.

One thing I would highly recommend for the start is to have a test RNA
data set.  This should be of 2 or 3 spins, and where you know what the
result is.  Normally synthetic data is best, but otherwise results
published from a reliable source can be used.  A script performing
model-free analysis is then written.  The aim is to have a data set
and script that will be used to check if the code is working.  We then
add it to the test suite as a system test.  This should be done before
anything else.  This test will then, from now until the death of
relax, enforce that this analysis always works for all users.  In the
test suite system we can exactly and easily check the optimisation
results.  This is a very useful way of coding because once the test
passes - then you know that the code is complete!

Cheers,

Edward

Re: CST branch

Header

Content

Related Messages