The gradient

Most optimisation techniques also utilise the gradient at the current parameter position. Although its symbolic form can be complex, as in model-free analysis for example, the gradient is simply the vector of first partial derivatives of the chi-squared equation with respect to each parameter. The gradient operator is defined as

$\displaystyle \nabla = \begin{pmatrix}
\frac{\partial}{\partial \theta_1} \\
\frac{\partial}{\partial \theta_2} \\
\vdots \\
\frac{\partial}{\partial \theta_n} \\
\end{pmatrix}$ (14.3)

where n is the total number of parameters in the model.
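
As an illustration of applying this operator to a chi-squared function, the following minimal sketch uses a hypothetical two-parameter exponential decay model, which is an assumption chosen for brevity and not one of the models discussed in this manual. The function names `model`, `chi2`, `dchi2` and `numeric_gradient` are likewise invented for the example. The analytic gradient is compared against a central finite difference estimate, a common sanity check for hand-derived partial derivatives.

```python
import math


def model(theta, t):
    """Hypothetical model: I(t) = I0 * exp(-R * t), with theta = [I0, R]."""
    i0, rate = theta
    return i0 * math.exp(-rate * t)


def chi2(theta, times, values, errors):
    """Chi-squared value: sum of squared, error-weighted residuals."""
    return sum(((y - model(theta, t)) / s) ** 2
               for t, y, s in zip(times, values, errors))


def dchi2(theta, times, values, errors):
    """Analytic gradient: first partial derivatives of chi2 w.r.t. I0 and R."""
    i0, rate = theta
    grad = [0.0, 0.0]
    for t, y, s in zip(times, values, errors):
        decay = math.exp(-rate * t)
        weighted_resid = (y - i0 * decay) / s ** 2
        # d(model)/d(I0) = decay and d(model)/d(R) = -I0 * t * decay.
        grad[0] += -2.0 * weighted_resid * decay
        grad[1] += -2.0 * weighted_resid * (-i0 * t * decay)
    return grad


def numeric_gradient(func, theta, h=1e-6):
    """Central finite difference estimate of the gradient of func at theta."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += h
        down[j] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad


if __name__ == "__main__":
    # Made-up data, purely for demonstration.
    times = [0.0, 1.0, 2.0, 3.0]
    values = [10.0, 6.2, 3.6, 2.3]
    errors = [0.5, 0.5, 0.5, 0.5]
    theta = [9.0, 0.4]
    print(dchi2(theta, times, values, errors))
    print(numeric_gradient(lambda p: chi2(p, times, values, errors), theta))
```

The two printed vectors should agree to several significant figures. A large discrepancy usually points to an error in one of the partial derivatives.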

The gradient is supplied to the algorithm as a second function, which the different optimisation techniques then utilise in diverse ways. The function value and the gradient can be combined to construct a linear or planar description of the space at the current parameter position by a first-order Taylor series approximation

$\displaystyle f(\theta_k + x) \approx f_k + x^T \nabla f_k,$ (14.4)

where $f_k$ is the function value at the current parameter position $\theta_k$, $\nabla f_k$ is the gradient at the same position, and $x$ is an arbitrary vector. By accumulating information from previous parameter positions, the algorithm can build and exploit a more comprehensive geometric description of the curvature of the space for more efficient optimisation.
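
The first-order Taylor approximation can be checked numerically. The sketch below uses a hypothetical two-parameter test function `f` with a hand-derived gradient `grad_f`. Both are assumptions made for illustration and are unrelated to any relax target function. It shows that the error of the linear description shrinks roughly with the square of the step length, which is the expected behaviour for a first-order approximation.

```python
def f(theta):
    """Hypothetical smooth test function of two parameters."""
    a, b = theta
    return (a - 1.0) ** 2 + 10.0 * (b - a ** 2) ** 2


def grad_f(theta):
    """Gradient of f: the vector of first partial derivatives."""
    a, b = theta
    return [2.0 * (a - 1.0) - 40.0 * a * (b - a ** 2),
            20.0 * (b - a ** 2)]


def linear_model(theta_k, x):
    """First-order Taylor approximation: f_k + x^T grad(f_k)."""
    f_k = f(theta_k)
    g_k = grad_f(theta_k)
    return f_k + sum(xi * gi for xi, gi in zip(x, g_k))


def taylor_error(theta_k, x):
    """Absolute difference between f(theta_k + x) and its linear model."""
    shifted = [t + xi for t, xi in zip(theta_k, x)]
    return abs(f(shifted) - linear_model(theta_k, x))


if __name__ == "__main__":
    theta_k = [0.5, 0.5]
    direction = [1.0, -1.0]
    for scale in (1e-1, 1e-2, 1e-3):
        x = [scale * d for d in direction]
        print(scale, taylor_error(theta_k, x))
```

Each tenfold reduction of the step should reduce the error by a factor of about one hundred, since the neglected terms are second order and higher in $x$.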

An example of a powerful algorithm which requires both the value and gradient at current parameter values is the BFGS quasi-Newton minimisation. The gradient is also essential for the use of the Method of Multipliers constraints algorithm (also known as the Augmented Lagrangian algorithm).
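
To make the roles of the function value and the gradient concrete, the following is a deliberately simplified sketch of a BFGS-style quasi-Newton loop in pure Python. It is not the implementation used by relax. The name `bfgs`, the tolerances, and the plain backtracking line search are all assumptions chosen to keep the example short, and production implementations normally use a more careful line search.

```python
def bfgs(func, dfunc, theta0, tol=1e-8, max_iter=200):
    """Minimise func from theta0 using its gradient dfunc (simplified BFGS)."""
    n = len(theta0)
    theta = list(theta0)
    # The inverse Hessian approximation starts as the identity matrix.
    h_inv = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    f_k = func(theta)
    g_k = dfunc(theta)
    for _ in range(max_iter):
        if max(abs(g) for g in g_k) < tol:
            break
        # Search direction p = -H_inv * gradient.
        p = [-sum(h_inv[i][j] * g_k[j] for j in range(n)) for i in range(n)]
        slope = sum(pi * gi for pi, gi in zip(p, g_k))
        # Backtracking line search using the sufficient decrease condition.
        step = 1.0
        while True:
            theta_new = [t + step * pi for t, pi in zip(theta, p)]
            f_new = func(theta_new)
            if f_new <= f_k + 1e-4 * step * slope or step < 1e-12:
                break
            step *= 0.5
        g_new = dfunc(theta_new)
        s = [a - b for a, b in zip(theta_new, theta)]
        y = [a - b for a, b in zip(g_new, g_k)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:
            # Standard BFGS update of the inverse Hessian approximation,
            # built only from parameter and gradient differences.
            rho = 1.0 / sy
            hy = [sum(h_inv[i][j] * y[j] for j in range(n)) for i in range(n)]
            yhy = sum(yi * hyi for yi, hyi in zip(y, hy))
            for i in range(n):
                for j in range(n):
                    h_inv[i][j] += ((1.0 + rho * yhy) * rho * s[i] * s[j]
                                    - rho * (hy[i] * s[j] + s[i] * hy[j]))
        theta, f_k, g_k = theta_new, f_new, g_new
    return theta, f_k


if __name__ == "__main__":
    # Hypothetical convex test function with its minimum at (1.0, -0.5).
    def func(theta):
        return (theta[0] - 1.0) ** 2 + 2.0 * (theta[1] + 0.5) ** 2

    def dfunc(theta):
        return [2.0 * (theta[0] - 1.0), 4.0 * (theta[1] + 0.5)]

    print(bfgs(func, dfunc, [0.0, 0.0]))
```

Note that no second derivatives are ever evaluated: the curvature information is accumulated from the differences between successive gradients, which is how the history of previous parameter positions is exploited.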

The relax user manual (PDF), created 2020-08-26.