Hi Troels,

For this, you could look at optimising the code, as you are a relax developer. I can say that the R1rho dispersion model target functions used in optimisation are nowhere near as fast as they could be (the CPMG models could also be optimised a little more). This is because the relaxation dispersion code in relax comes from 8 different groups spread across 7 different countries. Though the relax test suite ensures that the code is always 100% functional and bug-free, it is a bit messy! And the test suite does not care about the speed of the code.

You would need to identify what is actually taking up all the time by turning on the profiling flag in the base 'relax' file. Then run a small analysis and store the log. The output will tell you where all the time is spent. If most of it is in optimisation functions, then you should look into the target_functions.relax_disp module and the lib.dispersion package. The profiling flag is essential for seeing whether any code optimisations you make actually do speed up the calculation.

For speeding up target functions, there are two useful tricks:

1) Shift calculations to higher loop levels. In the R1rho dispersion models there are 4 nested loops: over the spins, the spectrometer frequencies, the spin-lock offsets, and the spin-lock field strengths. For example, for the DPL94 model (http://wiki.nmr-relax.com/DPL94), if you have one calculation for the kex**2 parameter, this currently happens at the 3rd level of nesting. But this calculation only needs to be performed once, outside all of these loops. Shifting the calculation and sending kex2 into lib.dispersion.dpl94 can decrease the number of times this is calculated by a factor of 100-1000. I have done large amounts of this code-shifting optimisation myself, but there is still a lot that can be done.

2) Shift calculations outside of the target function. This is the absolute best code optimisation you can make, as the calculation then only happens once, when setting up the target function.
Then it is completely removed from the minimisation step. Again for the DPL94 model, I can see that the 'theta' parameter in lib.dispersion.dpl94 is a perfect example. This comes from the tilt_angles argument to the target_functions.relax_disp module. Therefore theta does not change during optimisation (which is different to the other R1rho models, where theta is dependent on the optimised parameters). It is used twice: once in sin(theta[i])**2 and once in R1*cos(theta[i])**2. So you could create two structures, sin_theta2 and r1_cos_theta2, in the __init__() method which contain all of these values pre-calculated. See the spin_lock_omega1_squared data structure setup for a good example of this concept. These types of optimisation for the DPL94 model could give you up to 2-4 orders of magnitude of speed-up, which is probably what you are after ;)

Another useful code optimisation target would be the pre-calculation of a spectrometer frequency squared structure from self.frq in __init__(). You could even pre-calculate the r20_index values if you are desperate. You may find other useful optimisations as well, phi_ex*kex for example.

Or you could try something more radical. For the missing data points, you will see that there is a second loop over the dispersion points (for DPL94, the spin-lock field strengths). The first loop is in lib.dispersion.dpl94. You could maybe set the back-calculated values to the measured values in the __init__() method. Then pass the rank-1 self.missing[0][si][mi][oi] data structure into the lib.dispersion modules, which then prevents the corresponding back-calculated values from being set, so they remain permanently set to the measured values. Then you could eliminate the second loop over the dispersion points in the target functions, saving a lot of time.

As you are performing many, many replicated operations, from the profiling you may also find that some relax user functions are taking up a lot of your time.
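To sketch trick 1 in code (all function and variable names below are illustrative stand-ins, not relax's actual target function code), hoisting the loop-invariant kex**2 out of the nested loops looks like this:

```python
# Illustrative sketch of trick 1 (shifting calculations to higher loop
# levels).  The names here are hypothetical, not relax's real code.

def chi2_unshifted(kex, spins, frqs, offsets, fields, rate):
    """kex**2 is needlessly recomputed deep inside the nesting."""
    chi2 = 0.0
    for si in spins:
        for mi in frqs:
            for oi in offsets:
                kex2 = kex**2  # evaluated len(spins)*len(frqs)*len(offsets) times
                for di in fields:
                    chi2 += rate(kex2, si, mi, oi, di)
    return chi2

def chi2_shifted(kex, spins, frqs, offsets, fields, rate):
    """kex**2 depends only on kex, so it is computed once, outside all loops."""
    kex2 = kex**2  # evaluated exactly once per target function call
    chi2 = 0.0
    for si in spins:
        for mi in frqs:
            for oi in offsets:
                for di in fields:
                    chi2 += rate(kex2, si, mi, oi, di)
    return chi2
```

The two functions return identical values; the second simply avoids repeating the kex**2 multiplication hundreds to thousands of times per target function call.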
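And a sketch of trick 2, pre-calculating the theta-dependent terms in __init__() and pinning the missing points to the measured values up front. The class and attribute names are again hypothetical, and the rate expression is only a simplified stand-in loosely modelled on the DPL94 equation, not the lib.dispersion.dpl94 implementation:

```python
import numpy as np

# Illustrative sketch of trick 2 (shifting calculations outside the target
# function).  Names and the rate expression are hypothetical stand-ins.

class Dpl94LikeTarget:
    def __init__(self, theta, r1, values, missing):
        # theta and R1 are fixed during optimisation for this model, so the
        # trigonometric terms are pre-calculated once here, not per call.
        self.sin_theta2 = np.sin(theta)**2
        self.r1_cos_theta2 = r1 * np.cos(theta)**2
        # Pin the back-calculated values at missing points to the measured
        # values, so no second loop over dispersion points is needed later.
        self.missing = np.asarray(missing, dtype=bool)
        self.back_calc = np.array(values, dtype=float)

    def func(self, r1rho_prime, phi_ex, kex, omega1_squared):
        # Per-call work is now just arithmetic on pre-built arrays.
        rex = phi_ex * kex / (kex**2 + omega1_squared)
        calc = self.r1_cos_theta2 + (r1rho_prime + rex) * self.sin_theta2
        # Missing points keep their pinned (measured) values.
        self.back_calc = np.where(self.missing, self.back_calc, calc)
        return self.back_calc
```

The point is that func() no longer calls sin() or cos() at all, and never needs a clean-up loop for missing points, since __init__() did both jobs once.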
Most relax user functions are not optimised for speed! Note that in the profiling results you will only see the user function backends. You should search for these and look at the cumulative time (cumtime). If the calculation takes 1 hour and the cumtime is a few minutes or less, it is not worth optimising that code. I have used this approach for my lactose dynamics and stereochemistry papers (http://dx.doi.org/10.1002/chem.201100854, http://dx.doi.org/10.1002/chem.201002520 and http://dx.doi.org/10.1021/ja205295q). This is why the PDB reading and writing code is now very fast in relax.

Regards,

Edward


On 27 March 2014 17:42, Troels Emtekær Linnet <tlinnet@xxxxxxxxxxxxx> wrote:
Dear Edward.

I am working on a systematic investigation of dynamic parameters for hundreds of datasets. For one example, a CPMG analysis is set up with:

- 17 variations of tau_cpmg
- 50 MC simulations
- 82 spins, which are all clustered

There is no grid search, and only TSMFK01 is used. I do one grid search at the start, minimise this, copy over the parameters and take the median, make a clustering analysis, and then repeat the last step 60 times. This would then need to be repeated 5-8 times for other datasets with variations. And then for other proteins. (Sigh..)

I have set up relax to use 20 processors on our server, and a dispersion analysis takes between 2-6 hours. That is a reasonable timeframe for a normal analysis of this type. But I have to squeeze hundreds of these analyses through relax to get the variation of the dynamic parameters. Our old Igor Pro scripts could do a global fit in 10 minutes, although that does not include MC simulations.

But I wonder if I could speed up relax by changing the function tolerance and maximum number of iterations:

minimise(min_algor='simplex', line_search=None, hessian_mod=None,
         hessian_type=None, func_tol=OPT_FUNC_TOL, grad_tol=None,
         max_iter=OPT_MAX_ITERATIONS, constraints=True, scaling=True,
         verbosity=1)

where the standard values are:

OPT_FUNC_TOL = 1e-25
OPT_MAX_ITERATIONS = 10000000

Could you advise if this strategy is possible? What I hope for is that an analysis comes down to 10-20 minutes. Maybe I could cut away the MC simulations, since I am mostly interested in the fitted dynamic parameters and not so much in their errors?

Thank you in advance!

Best
Troels

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-users mailing list
relax-users@xxxxxxx

To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-users
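As a rough illustration of the func_tol/max_iter trade-off asked about above, here is a small experiment with SciPy's Nelder-Mead simplex as a stand-in. This is an assumption-laden sketch: relax itself optimises through the minfx package, so the option names differ (SciPy uses xatol/fatol rather than func_tol), but the principle that loosening the tolerance cuts the number of function evaluations carries over:

```python
import numpy as np
from scipy.optimize import minimize, rosen

# Stand-in experiment: SciPy's Nelder-Mead simplex on the Rosenbrock
# function, NOT relax's minimise() user function (relax uses minfx).
x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])

# Very tight tolerances, analogous to OPT_FUNC_TOL = 1e-25.
tight = minimize(rosen, x0, method='Nelder-Mead',
                 options={'xatol': 1e-12, 'fatol': 1e-12, 'maxiter': 10000000})

# Loosened tolerances: converges with fewer function evaluations,
# at the cost of a less precise minimum.
loose = minimize(rosen, x0, method='Nelder-Mead',
                 options={'xatol': 1e-4, 'fatol': 1e-4, 'maxiter': 10000000})

print("tight: %d evaluations, f = %g" % (tight.nfev, tight.fun))
print("loose: %d evaluations, f = %g" % (loose.nfev, loose.fun))
```

Whether the looser fit is still good enough for the clustered dispersion parameters is the real question, and that can only be checked against the data.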