Hi Troels,

This sub-thread (which will appear at http://thread.gmane.org/gmane.science.nmr.relax.devel/3833) will hopefully be a mini-tutorial covering the development of the relax_disp branch.

Before you can be accepted as a relax developer with commit access to the source code repository, you should first submit changes as patches. This takes longer initially, but it allows the other relax developers to see how you code and whether you are following the coding conventions described in the development chapter of the relax manual (http://www.nmr-relax.com/manual/relax_development.html). I can give you feedback as you go on how to improve the code to fit into relax. After a few patches, we, the relax developers, will hold a private vote to accept you as a relax developer. This is standard practice in an open source project. The full procedure for becoming a developer is detailed in the 'Committers' section of the manual (http://www.nmr-relax.com/manual/Committers.html). The PDF version of the manual is easier to read (http://download.gna.org/relax/manual/relax.pdf). Patches can be posted to the patch tracker (https://gna.org/patch/?group=relax).

relax development begins and ends with the test suite. The idea is that, before any code is present, a relax system test must be created. This allows you to develop the ideas for how the UI should work with the analysis - i.e. which new user functions will need to be created and which ones will need to be expanded. A script is added to test_suite/system_tests/scripts/relax_disp/ and then a test is added to test_suite/system_tests/relax_disp.py which executes the script and then checks the data and results. For example, see the script 'test_suite/system_tests/scripts/relax_disp/hansen_data.py' and the function test_hansen_cpmg_data_fast_2site() in the file 'test_suite/system_tests/relax_disp.py'.
This is obviously not complete as only the script is executed - the results are not yet checked (as we do not yet know what the result for the optimised model should be). This individual test can be executed with the command:

$ relax -s Relax_disp.test_hansen_cpmg_data_fast_2site

This test, as well as the other Relax_disp tests, was created by Sebastien Morin when he started the development of the relax_disp branch. I have renamed everything since he added it, and will probably do so again soon.

It is best to develop for the script UI first - the GUI will later be modified around the graphical versions of the user functions, or will directly access the back end of the user function. Due to the advanced state of the relax_disp branch, you probably do not need to worry about new user functions. New user functions may be needed if you would like to expand the analysis to new types of data (for example off-resonance R1rho where R1 data need to be measured and used in the analysis, H/D exchange, etc.).

The test suite is one area which can be expanded to handle the different CPMG models. The testing is currently not very extensive. For example, before a new dispersion model is added to relax, it would be good if synthetic data were created in an external program (a Python script, Matlab, Mathematica, Maxima, etc.). It is very important that relax is not used to create the data. Synthetic data is very important for making sure that relax obtains the correct result, as you know what the result should be. With measured data you can never really know what the true result is - this is the entire point of the mathematical field of modelling (this field makes that of NMR look very, very small).
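As a rough illustration of what such an external script could look like, here is a minimal sketch in plain Python with no relax imports. Everything in it is an assumption made for the example rather than code from the relax_disp branch: the function name r2eff_lm63(), the spin IDs, the parameter values, the relaxation period and the output columns are all hypothetical. The R2eff expression is the fast-exchange two-site form usually attributed to Luz and Meiboom (1963), with phi_ex scaled quadratically with the field strength, and each R2eff value is turned into a peak intensity on a simple 2-parameter exponential with an arbitrary I0.

```python
import math
import random


def r2eff_lm63(r20, phi_ex, kex, nu_cpmg):
    """Fast-exchange two-site R2eff (form usually attributed to Luz & Meiboom, 1963).

    r20 in 1/s, phi_ex in rad^2/s^2, kex in rad/s, nu_cpmg in Hz.
    """
    x = kex / (4.0 * nu_cpmg)
    return r20 + (phi_ex / kex) * (1.0 - math.tanh(x) / x)


# All values below are hypothetical "true" parameters chosen for the example.
REF_FIELD_MHZ = 500.0                      # field at which phi_ex_ref is defined
FIELDS_MHZ = (500.0, 800.0)                # more than one field strength
NU_CPMG_HZ = (50.0, 100.0, 200.0, 400.0, 800.0, 1000.0)
T_RELAX = 0.04                             # fixed relaxation period in seconds
SPINS = {
    ":1@N": {"r20": 10.0, "phi_ex_ref": 20000.0, "kex": 2000.0},
    ":2@N": {"r20": 12.0, "phi_ex_ref": 50000.0, "kex": 5000.0},
}


def synthetic_rows(seed=0):
    """Return (spin_id, field, nu_cpmg, R2eff, I0, I(T)) tuples for all spins."""
    rng = random.Random(seed)              # fixed seed so the data is reproducible
    rows = []
    for spin_id, p in SPINS.items():
        i0 = rng.uniform(1e5, 1e6)         # arbitrary I0, one per spin for simplicity
        for field in FIELDS_MHZ:
            # phi_ex is proportional to the squared shift difference, hence to B0^2.
            phi_ex = p["phi_ex_ref"] * (field / REF_FIELD_MHZ) ** 2
            for nu in NU_CPMG_HZ:
                r2eff = r2eff_lm63(p["r20"], phi_ex, p["kex"], nu)
                # 2-parameter exponential: I(T) = I0 * exp(-R2eff * T).
                intensity = i0 * math.exp(-r2eff * T_RELAX)
                rows.append((spin_id, field, nu, r2eff, i0, intensity))
    return rows


if __name__ == "__main__":
    # Illustrative column layout only, not a relax file format.
    print("# spin_id field_MHz nu_cpmg_Hz R2eff I0 I(T)")
    for row in synthetic_rows():
        print("%-6s %7.1f %8.1f %10.5f %14.3f %14.3f" % row)
```

Because the parameters used to generate the data are known exactly, a system test built on this kind of output can check that the optimised values match them. The same R20 is used at every field purely to keep the sketch short.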
Synthetic data is also useful for double checking results against other relaxation dispersion software (for reference: NESSY - http://home.gna.org/nessy/; CPMGFit - http://www.palmer.hs.columbia.edu/software/cpmgfit.html; ShereKhan - http://sherekhan.bionmr.org/; CATIA - http://www.biochem.ucl.ac.uk/hansen/catia/). Data could also be taken from Art Palmer's CPMGFit manual (http://www.palmer.hs.columbia.edu/software/cpmgfit_manual.html). This would need to be converted into peak intensities in a peak list file, but that is easy enough by simply picking random I0 values for the exponential curves. The data could be passed quickly through each of the models of the CPMGFit program and the results noted. Then the results would be added to the checks of different relax system tests.

Each different data set used in the testing process should be located in its own directory in test_suite/shared_data/dispersion/. That directory can include the data and all scripts used to generate the data and, for reference, it can also contain subdirectories for holding the input and output for different programs (as long as the files are not too big).

The current state of the branch is that all of the user functions are pretty close to complete. Each user function consists of a front end definition in user_functions/ and a backend in either pipe_control/ or specific_analyses/. The relaxation dispersion target function setup for optimisation is close to complete. You can see this in the minimise() method of the specific_analyses/relax_disp/__init__.py file, and then the __init__() method of the class in target_functions/relax_disp.py. As you will see in the model_loop() method of the specific_analyses/relax_disp/__init__.py code, clustering of spin systems is already part of this design - everything handles a group of spins assuming the same parameter values.

One missing feature that I might work on soon is the handling of missing input data, as this affects my current work.
This is a problem currently caught by the test_suite/shared_data/dispersion/Hansen/relax_disp.py script, as residue :71 is missing data at one field strength. But once the dispersion tests have been expanded, this can be tested properly by deleting data for single points on the exponential curves, deleting entire exponential curves (or dispersion points for the two-point analysis type), or deleting all data from a single spectrometer field strength for a single spin.

So I would suggest that you pick one of the dispersion models you are interested in and try to implement that. I am working on the Luz and Meiboom (1963) model, but all of the other models are safe to work on. Just say which you are interested in so that we don't both change the same code.

The system test data would come first. The formula can be taken, a set of parameters for 2-3 spins chosen, and a simple script written to generate the R2eff data, importantly at multiple magnetic field strengths. That data can then be converted into a generic peak list for different time periods on a basic 2-parameter exponential curve. See the 'File formats' section of the spectrum.read_intensities user function docstring, for example by typing help(spectrum.read_intensities) in the prompt UI. In the same script, the creation of input files for other programs could be added, possibly at a later stage, and the data quickly run through CPMGFit, for example, as a sanity check.

If you do test the other programs, you may encounter a severe bug in one of their models. No software is bug free. In such a case, we should communicate with the authors in private and they can decide what to do. You can see that I did this with Art Palmer's Modelfree program at http://biochemistry.hs.columbia.edu/labs/palmer/software/modelfree.html.
Versions 4.16 and 4.20 consist of patches that I sent to Art to fix compilation issues and other bugs (I pointed out the grid search problem due to the singular matrix failure of the Levenberg-Marquardt algorithm and Art made that change himself).

Once some data has been created and the files attached to the patch tracker (https://gna.org/patch/?group=relax), the relax script can be written and added to test_suite/system_tests/scripts/relax_disp/. The best way would probably be for one of the current scripts to be copied (by me to start with) in the repository, and then for you to make small changes to it and send the patches created with:

$ svn diff > patch

Then the script execution and the data and parameter checking code can be added to test_suite/system_tests/relax_disp.py - again you can look at the other methods in that file and create a new one by copying how an old method operates. In that system test you would check that the original parameters have been found.

At this stage, the test should run fine up to the grid_search user function and then fail (or possibly fail at the relax_disp.select_model user function call in the script, depending on whether you use the auto-analysis code in auto_analyses.relax_disp or not). This is the point where the model can be implemented. Then you would take the following steps:

- Add a description of the new model with the equation and reference to the user_functions.relax_disp module.

- Add the model and its parameters to the _select_model() method of the specific_analyses/relax_disp/__init__.py file.

- Add any new parameter definitions to the top of the specific_analyses/relax_disp/__init__.py file in the __init__() method as needed. If new parameters are needed, then there are various places in the specific_analyses.relax_disp package where support will be needed, mainly in the specific_analyses.relax_disp.parameters module.

- Create a new module in the lib.dispersion package for the model function.
This module will eventually hold the model function, the gradient (the partial derivative with respect to each parameter would be in a separate function), and the Hessian (the matrix of second partial derivatives). Having the gradient and Hessian will allow the more powerful optimisation algorithms to be used.

- Add a new method to target_functions/relax_disp.py which uses the new code in lib.dispersion to calculate R2eff values, combines this with the chi2 function, and returns the chi-squared value (see the current func_LM63() method for how to do this).

- Finally, see if the system test passes. If not, then it is time to debug.

During these steps, the unit test part of the test suite can be used to make sure that individual functions and methods behave correctly. This is useful as users will always find a way to break your code. Once the system test passes, you will know that the implementation is complete and fully functional.

If your interest is in the numerical integration of the Bloch-McConnell equations, then the procedure might be slightly different. We would have to discuss this in more detail, with paper references and the necessary equations. But I think that all of this can be handled in a module of the lib.dispersion package, and the rest of the procedure detailed above would be the same.

I hope this post wasn't too long for you!

Regards,

Edward


On 6 May 2013 21:14, Troels Emtekær Linnet <tlinnet@xxxxxxxxx> wrote:
Hi Edward.

When you have completed your planned changes to the disp branch, could you send me a note? And maybe a script file showing how to launch the code? Then I could try to figure out where I should extend it with new code.

Best
Troels