Calculus of variations as a basic tool for modelling of reaction paths and localisation of stationary points on potential energy surfaces

ABSTRACT The theory of calculus of variations is a mathematical tool which is widely used in different scientific areas in particular in physics and chemistry. This theory is strongly related with optimisation. In fact the former seeks to optimise an integral related with some physical magnitude over some space to an extremum by varying a function of the coordinates. On the other hand, reaction paths and potential energy surfaces, in particular their stationary points, are the basis of many chemical theories, in particular reactions rate theories. We present a review where it is gathered together the variational nature of many types of reaction paths: steepest descent, Newton trajectories, artificial force induced reaction (AFIR) paths, gradient extremals, and gentlest ascent dynamics (GAD) curves. The variational basis permits to select the best optimisation technique in order to locate important theoretical objects on a potential energy surface. GRAPHICAL ABSTRACT


Introduction
The applicability of the optimization theory is widespread reaching into almost every activity in which models and numerical information are necessary, see e.g., natural sciences, engineering, and mathematics. A short account of these applications are computational chemistry and branches of numerical analysis like, variational principles in partial differential equations, and nonlinear equations in ordinary differential equations within others. The importance of optimization in general is in this purpose, namely, the way in the selection of the best element under some criteria from some set of available alternatives.
An important subfield of optimization theory is the calculus of variations. The CONTACT J. M. Bofill Email: jmbofill@ub.edu, W. Quapp Email: quapp@uni-leipzig.de calculus of variations is concerned with the problem of determining maximums or minimums or, in general, stationary values of functionals by seeking that argument function in the given domain of admissible functions for which the functional assumes the stationary value in question. In short, one seeks to optimize a functional over some space to a stationary value by varying a function of the coordinates.
One of the main problems in theoretical chemistry is to study the mechanisms associated with the chemical reactions. An important achievement in the development of models to understand the chemical reaction mechanisms was the introduction of the following two concepts, namely, potential energy surface (PES) and reaction path (RP) as a way to describe the molecular system evolution from reactants to products in geometrical terms. It has been motivated by a continuous mathematical development on the grounds of the model and computational algorithms to compute an RP and to locate stationary points of the PES as well.
The basic definition of the RP is a curve located on a PES that monotonically increases from a stationary point (with the character of a minimum) to a first index saddle point and from that point it monotonically decreases to a new stationary point, a minimum. The first index saddle point according to the previous definition is the highest energy point of the RP. The first and the second minimum are labeled as reactants and products, respectively, while the first index saddle point is the transition state (TS). The parametrization of a curve, say by parameter t, satisfying the above RP requirements, is the reaction coordinate. More concisely, if q is a coordinate vector of dimension N , then the RP is represented by q(t). Normally, the parameter arc-length, t = s, of the curve is taken as the reaction coordinate; however, special values of the PES can also be taken as reaction coordinates.
There exist many types of curves on the PES that satisfy the RP conditions. The fact is the reason of the variety of RP curves. The curve most widely used as RP is the so-called intrinsic reaction coordinate (IRC); this curve is the steepest descent curve in mass weighted coordinates. The IRC is the steepest descent curve joining two minimums through a TS. Another curve used as RP is the distinguished or driven coordinate method, or a more recent version, the so-called reduced gradient following (RGF), also labeled as Newton path or Newton trajectory (NT). Additionally, we have the gradient extremals (GEs); and finally the gentlest ascent dynamics (GAD).
The RPs are static curves on the PES, which means that only geometric properties of the PES are taken into account and no dynamic information can be sought from these pathways. Despite many RPs commonly being geodesic curves on a surface, each type of the RP curve has different mathematical grounds. Due to this fact each RP has its own evolution on the PES to reach the first index TS from the minimum. Taking into account all the features of the RP we present a unification of a variational point of view of each type of RP above mentioned.
A problem that appears in the differential calculus is to find a point for which a given functional takes its maximum or minimum value. The theory of calculus of variations is not only a theory of maximums and minimums but also a theory of variables and functions which are much more complicated than those which appear in the standard differential calculus. Basically in this theory the problem consists in finding a curve, q(t) = (q 1 (t), . . . , q N (t)) T , within a set of curves joining two points on the N -dimensional space such that a function I(q), depending on these N independent variables, takes an extreme value, in principle, a maximum or a minimum. Usually, the function I(q) is an integral of the form where F is a given function which is twice continuous differentiable with respect to its three arguments, t, q(t) and q (t) = dq(t)/dt. As noted above, the functions q(t) will be restricted to the class of admissible functions satisfying the conditions, q(t 0 ) = q 0 , q(t f ) = q f , q(t) continuous, and q (t) piecewise continuous. The requirement that I(q) be an extremum is that I(q) is stationary with respect to the variation of the N functions q 1 (t), . . . , q N (t) considered independently. The necessary condition to be stationary is that the N Euler equations are satisfied where ∇ q = (∂/∂q 1 , . . . , ∂/∂q N ) T and the superscript T means transposed. The notation {F q } represents the Euler operator. The task now is the construction of F for each type of path.

The Hamilton-Jacobi Equation for the steepest descent/ascent curves
A steepest-descent/ascent curve is the curve that at each point follows the gradient of the PES [1]. This curve is variational because it extremalizes the integral functional where s is the arc-length and the vector g(q) = ∇ q V (q). The function F (q, q ) does not depend explicitly on t and is homogeneous of degree one with respect to the argument q (t). Thus the steepest-descent/ascent curve always extremalizes the function given in Eq. (3) which is the class of functions associated to a Fermat variational principle [2]. For this reason this curve propagates through the PES according to a speed law or continuous slowness model related to the inverse of the gradient norm of the PES, namely, 1/ g(q) T g(q). It travels through the surface by going from a point of an equipotential line to another point of the next equipotential line by the shortest path. Thus the steepest-descent/ascent curves extremalizes the function associated to this special type of Fermat principle. It is also possible to derive a Hamilton-Jacobi equation for this variational problem [3]. The integral of Eq.(3) evaluated along a steepest-descent/ascent curve joining two points, say q 0 and q, takes the value I q0→q (q) = ∆V (q). Using the language of the Hamilton-Jacobi theory, this function ∆V (q) is called the geodetic distance between the points q 0 and q. In the present case this geodetic distance is the potential energy variation between the points q 0 and q, ∆V (q) = V (q) − V (q 0 ). Now, the first order variation in ∆q of the integral functional I q0→q (q) evaluated through the steepestdescent/ascent curve joining the points q 0 and q is a differential equation evaluated at the point q(t) of the steepest-descent/ascent curve. From this total differential equation we have that Multiplying the last two equalities of Eq.(5) by itself we obtain Eq. (6) is a first order non-linear partial differential equation in the y = ∇ q V (q) taking the place of the Hamilton-Jacobi equation or eiconal equation in the present variational problem. This is the Hamilton-Jacobi equation or eiconal equation formulated for the steepest-descent/ascent path, and in particular for the IRC [3]. We now seek for the functions V (q) that are at least twice continuously differentiable and such that if we set y = ∇ q V (q) and ν = V (q) then the equation 1/2(∇ q V (q)) T (∇ q V (q)) − 1/2g(q) T g(q) = 0 is identically satisfied. We shall consider in the space of q an arbitrary differentiable curve q = q(t) and we substitute this into y(t) = ∇ q V (q(t)) and ν(t) = V (q(t)). It can be proved that the 2N + 1 functions, q(t), y(t), ν(t) are solutions of the system of ordinary differential equations which reads as follows: This system of ordinary differential equations defines a field of curves, namely, the steepest descent/ascent curves and in particular the IRC [3]. The analysis of the second variation, δ 2 I q0→q (q), indicates that the steepest descent/ascent curve that joints two minimums of the PES through a first index saddle point is the only curve of this type that minimizes the integral functional of Eq. (3). Thus the IRC is the steepest/descent curve that always satisfies these conditions. However, the second variation conditions associated to the integral functional of Eq.  The second variation of the function Eq. (3) does not have a relation with the condition of the Minimum Energy Path (MEP). The gradient curve of the type IRC is always a reaction path since it transverses in each point of the curve an equipotential curve thus increasing (decreasing) monotonically the potential energy but it can be or not be a MEP. The MEP condition depends on the shape of the PES.
2.2. The second variation as a basic tool to locate IRC curves: the utility of the Weierstrass E-function The basis of the second order analysis is the comparison between the value of the integral of Eq. (3) evaluated through a steepest descent/ascent curve and the same integral functional evaluated using another arbitrary curve joining the same initial and final points. This difference function is known as the integral over the Weierstrass E-function and plays a basic role in the analysis of the second order variation. It is where q (t) is the tangent of the arbitrary curve [3]. The minimization of the I(q) − I(q) function given in Eq. (8) is done iteratively with respect to the parameters that characterize a given arbitrary q(t) curve connecting the fixed end points q(t 0 ) and q(t f ) both being stationary points of minimum character of a PES. We will find the near steepest descent/ascent curve to this q(t) curve. The curve q(t) is assumed to satisfy the differential equation q (t) = f (q(t)), where t 0 ≤ t ≤ t f and f is a vector of a vector field. The curve q(t) can be represented as a polygonal line or a chain line defined in the region of the PES under consideration which connects the minimums, q(t 0 ) and q(t f ), and it is the vector f = ∆q, being ∆q the difference vector between two consecutive vertex points of the chain. The minimization of the I(q)−I(q) function has the effect of transforming the curve q(t) into another curve such that the field vector f at each point of this new curve coincides as much as possible with the field vector g(q), which is the field vector of the field of the extremals, the steepest descent/ascent curves, of the functional given in Eq. (3). This is the ground of the algorithm described in Ref. [6] to find the steepest descent/ascent curve connecting the minimums of the PES, q(t 0 ) and q(t f ). The resulting steepest descent/ascent curve is the IRC path. This assertion is easy to proof. First, at the convergence of the algorithm we have that the tangent of the converged curve satisfies the equality, q (t) = g(q), for t 0 ≤ t ≤ t f . Substituting this equality into the Eq. (8) we obtain I(q) − I(q) = 0.
Second, if we differentiate Eq.(8) with respect to q after some rearrangement we obtain g(q(t)) T g(q(t)) dt g(q(t)) T g(q(t)) .
As mentioned above, the tangent of the converged curve is Substituting it in Eq. (9) we have that ∇ q (I(q) − I(q)) = 0. In other words the converged curve satisfy the stationary conditions of a steepest descent/ascent curve given in Eqs. (7) and the proof ends. Eq. (9) gives us the basic idea to design an algorithm. If at each point of the current curve q(t) joining the minimums of the PES, q(t 0 ) and q(t f ), we project the integrand into the subspace spanned by the set of directions orthogonal to the tangent q (t) then we have a residuum that is zero when the current curve coincides in all the points with the steepest ascent/descent curve joining these minimums of the PES. In a more explicit way where r(t) is the residuum vector at the point q(t) and U is the unit matrix. This is the basis of many algorithms to locate the IRC curve. The difference between them is the way to parametrize and to modify the guess of the current curve to make the residuum vector to the zero vector within some threshold. Here can be cited all the chain-of-states methods [7] like the widely used Nudged Elastic Band (NEB) [8,9], and the minimization of the Weierstrass E-function [3,10,11] where the error is minimized using a Runge-Kutta technique, the path optimization by a variational reaction coordinate due to Birkholz and Schlegel [12,13] and more recently the geodesic interpolation method [6]. It is worth to note that all these methods locate the IRC path which is a curve representation of a static reaction path, however, the category of an MEP is not guaranteed except if the whole IRC path is located on a deep valley floor. The latter condition is not always satisfied, it depends of the topography of the PES [14].

The Potential Energy Surface expressed as Wave Equation
In the introduction to section 2 above, in subsection 2.1, was explained that the steepest descent/ascent curves satisfy the Hamilton-Jacobi equation or eikonal equation (6). The picture associated to this equation is that the descent/ascent curves and in particular the IRC path are in fact orthogonal trajectories that transverse the set of contour hypersurfaces, V (q) = ν = constant. Thus it is important to take into account that the Hamilton-Jacobi equation or the eikonal equation describe a relation between the contour of a hypersurfacesurface and its orthogonal curves. The steepest ascent curves emerging from a minimum of a PES in its travel transverse orthogonally at each point with a contour hypersurface. It should be noted that all the points that belong to a contour hypersurface, say V (q) = ν, possess the same energy difference with respect to all the points of another contour hypersurface. This picture is equivalent to the picture of Fermat-Huygens of the propagation of the cone rays. We note that both pictures, Fermat-Huygens of the propagation, and the Hamilton-Jacobi theory, are strongly related [2]. In other words the construction of a solution of the eikonal Eq. (6) as a contour hypersurface with constant potential energy is similar to the Fermat-Huygens principle for the construction of wave fronts [3]. The question about the equation that governs the propagation of the "wave" associated with the steepest ascent curves was treated in former Refs. [16] and [17]. The unique possibility, the solution to this question, is a second order partial differential equation such that its associated characteristic equation is the eikonal Eq. (6), which is related with the PES. We consider the wave equation . . , ∂ 2 /∂q 2 N , and the scalar product of the gradient G(q) = g(q) T g(q). In this manner the wave equation (11) is defined in the space (q, ν) of dimension N + 1 and ν is treated as independent variable. The function ψ(q, ν) can be seen as an abstract field in any medium with slowness 1/G(q) 1/2 , which also emerges in the eikonal Eq. (6). The factor 1/G(q) 1/2 plays the role of the "velocity" of the corresponding wave solution. Eq. (11) is an hyperbolic partial differential equation with non-constant coefficients because G(q) is positive and changes with respect to the q variables. The solution of an hyperbolic equation is "wave like". If a disturbance is done in the initial data of a hyperbolic differential equation, then not every point of the space feels the disturbance at once. Relative to the fixed "energy" coordinate, ν, the disturbances have a finite propagation speed. They travel along the so-called characteristics of the equation. To find here the characteristics, we have to treat the quadratic form, which is connected with Eq. (11) where ϕ(q, ν) is the characteristic function. The connection between Eqs. (11) and (12) is briefly exposed as follows [2]. We define a coordinate transformation such that the determinant of the Jacobian of this transformation is not zero in the domain of interest, det(∇ q,ν ϕ T ) = 0. In this way Ψ(ϕ 1 (q, ν), . . . , ϕ N +1 (q, ν)) = ψ(q, ν). Instead of q, ν we have introduced a new set of N + 1 independent variables, χ, one of which is χ N +1 = ϕ N +1 (q, ν) = 0 while the remainder of variables collected in χ N are "interior values" on the manifold defined by χ N +1 = 0. Thus given on this manifold the data or quantities, Ψ(χ T N , 0) and ∂Ψ(χ T N , 0)/∂χ N +1 , we construct a function ψ(q, ν) so that on the manifold χ N +1 = 0 this function and its derivatives with respect to χ N +1 coincide with the given functions Ψ(χ T N , 0) and ∂Ψ(χ T N , 0)/∂χ N +1 of χ N , and so that ψ(q, ν) satisfies Eq. (11) on the manifold χ N +1 = 0.
All the second derivatives of ψ(q, ν) except ∂ 2 Ψ/∂χ 2 N +1 are uniquely determined by the data or quantities on χ N +1 = 0 by differentiating the quantities Ψ(χ T N , 0) and ∂Ψ(χ T N , 0)/∂χ N +1 with respect to interior variables, χ N . Now we need to know whether Eq. (11) and the data on χ N +1 = 0 determine the quantity ∂ 2 Ψ/∂χ 2 N +1 along the manifold χ N +1 = 0. In terms of the variables χ where χ N +1 = ϕ N +1 (q, ν), Eq. (11) using the chain rule takes the form where the dots in Eq. (14) stand for expressions which are known form the data of the manifold, in other words, they only contain first derivatives of Ψ with respect to χ, the second derivatives with respect to χ N and the crossed second derivatives between χ N and χ N +1 . The coefficient Q(ϕ N +1 ) is the left hand side part of the equality Eq. (12) with ϕ = ϕ N +1 . If the quadratic form Q(ϕ N +1 ) does not vanish at every point of the manifold χ N +1 = 0 then the second derivative ∂ 2 Ψ/∂χ 2 N +1 is uniquely determined in all the points of this manifold. Otherwise Eq.(11) represents an additional restriction of the data of the manifold. In this case Q(ϕ N +1 ) = 0 is called characteristic condition being satisfied on the manifold χ N +1 = 0 with the values Ψ and ∇ χ Ψ given by the data. The function ϕ N +1 (q, ν) does not need to satisfy the equation Q(ϕ N +1 ) = 0 everywhere, however, it must satisfy the characteristic conditions, Q(ϕ N +1 ) = 0, on the manifold A solution of Eq. (12) or that is the same Q(ϕ N +1 ) = 0 is obtained by taking ϕ N +1 (q, ν) = ϕ(q, ν) = V (q) − ν = 0 recovering the eikonal equation (6). The wavefront of Eq. (11) develops along its characteristic manifold in the (N +1) space, which is described by the equipotential hypersurface with V (q) = ν. We can conclude that the progression of the solution of the wave Eq. (11) is regulated by the eikonal Eq. (6), a first order partial differential equation. It is the Hamilton-Jacobi equation of the steepest ascent/descent curves in particular the IRC path. Thus we conclude that the characteristic solution of the hyperbolic second-order partial differential equation (11) with initial value or Cauchy data defined on a manifold, where in terms of χ this can be seen as the basic ground of the model PES widely used in Chemistry. [16,17] As noted above the characteristic manifold which is described by the equipotential hypersurface plays a role as "wave front", that is, surfaces across which solutions of Eq. (11) suffer discontinuities of the second derivatives. Such "wave front" occur as frontier beyond which there is no excitation at ν. Now let us assume Eq. (11) and again ϕ N +1 (q, ν) = ϕ(q, ν) = V (q) − ν = 0. Let ψ(q, ν) a function in the N -dimensional q-space taking ν as a parameter. We deal with a solution ψ(q, ν) of Eq. (11) with a hypersurface of discontinuity V (q) = ν which depends on ν and moves through the qspace. According to that explained in Subsection 2.1 every equipotential hypersurface is generated by the family of curves or rays, of ordinary differential equations Eqs. (7). From Eq. (7c) we have that dν/dt = G(q) and the ray equation takes now the form [18] Thus, in the N -dimensional q-space these rays or curves traverse the wave fronts V (q) = ν and we have dq dν The vector dq/dν is that characterizes the ray or curve transversing the wave front V (q) = ν. On the other hand, since G(q) > 0 this fact ensures the hyperbolicity of Eq. (11). In this situation we can say that the curve or ray direction and the tangent plane to the wave front are conjugate to the ellipsoid [16] (∇ q V (q)) T (∇ q V (q)) G(q) = r T r = 1 (17) where r = ∇ q V (q)G(q) −1/2 . A final remark concerns Huyghens' construction of wavefronts. We consider a wavefront V (q) = ν satisfying the eikonal Eq. (6). The spherical waves about the point q 0 may be denoted by σ(q, q 0 ) = ν. If the front at ν = 0 coincides with a prescribed surface Σ 0 , then Huyghens' construction produces a wavefront at ν as follows: around each point q 0 of Σ 0 we consider the spherical wavefront σ(q, q 0 ) = ν, and for fixed positive ν we form the envelope of all these spheres in q-space, letting q 0 range over the surface Σ 0 . This leads to the surface V (q) = ν containing the desired wave front. In more detail: at ν the wavefront is given by the envelope of the spheres of radius 1/ G(q), whose centers are the points of the wavefront at ν = 0 [2,3,16,17]. It is a possible imagination how a PES evolves.

Gradient Extremals
The gradient extremal was proposed as a reaction path some time ago, studied and analyzed several times [15,[19][20][21][22][23][24][25][26][27][28][29][30][31][32]. Its definition can be described assuming that we are on a "valley ground" (or a ridge) of the PES by an extremalization of the gradient norm, g(q) T g(q), with respect to q within the equipotential hypersurface, V (q) = ν = const, compare Figure 1. The mathematical formulation of this model consists in the extremalization of the integral functional The curve that extremalizes the integral functional of Eq. (18) is the gradient extremal and satisfies at each point the equation, where H(q) = ∇ q ∇ T q V (q) is the Hessian matrix. We note that in Eq. (19) λ plays the rule of an eigenvalue whereas the gradient g(q) plays the role of an eigenvector. This eigenvalue equation can also be written as where U is the unit matrix. The gradient extremal satisfies in each point the condition f (q(t)) = 0, but not each point of a PES satisfies this condition. For this reason the gradient extremals do not cover the whole PES, in contrast to the steepest descent/ascent curves. Gradient extremals are the curves on which the steepest descent lines of a potential energy surface have zero curvature [28]. Thus f (q(t)) = 0 given in Eq. (20) determines the gradient external implicitly. In this case the initial and final point of the curve cannot be prescribed arbitrarily, if the problem should have a solution. The tangent of this curve is computed by solving the equation U − g(q(t))g(q(t)) T g(q(t)) T g(q(t)) C(q(t))dq(t)/dt = 0 where the square matrix C(q(t)) = F(q(t))g(q(t)) + H(q(t)) 2 − g(q(t)) T H(q(t))g(q(t)) g(q(t)) T g(q(t)) H(q(t)) and F(q(t)) is the third energy derivative tensor with respect to q and the F(q(t))g(q(t)) symbol is used to indicate a square matrix that is a contracted product of a three-index array with a vector yielding a two-index array. The Eq.(21) can be derived either from the implicit function theorem, df (q(t))/dt = [∇ q f (q(t)) T ]dq(t)/dt = 0, [28] or by standard perturbation theory applied on the eigenvalue Eq. (19) [15]. Eq. (21) is computationally expensive to evaluate, it needs thirds derivatives of the energy. In addition, sometimes the gradient extremal does not evolve through the reaction valley if it exists. For these reasons gradient extremals are generally not used to study reaction mechanisms. The sufficient conditions for a minimum can be rationalized. If the gradient extremal curve starts at a minimum of the PES and ends at a first-index stationary point satisfying at each point the condition, det(W(q(t)) T C(q(t))W(q(t))) > 0, where W(q(t)) is the matrix that contains the set of N − 1 orthonormalized vectors which are all orthogonals to the current gradient vector g(q(t)), then the extremal curve achieves the condition of a minimal curve. However, if the gradient extremal has a turning point or a bifurcation point, then from this point to the end point may be hold det(W(q(t)) T C(q(t))W(q(t))) < 0, and the curve looses its minimum character. If det(W(q(t)) T C(q(t))W(q(t))) < 0 from the turning point to the end point, then other curves exist, not necessarily a gradient extremal, joining the same initial and final point so that the integral of Eq.(18) takes a lower value.
An important behavior of the gradient extremal is the following. The adjoint matrix of the Hessian, A, satisfies the relation If we multiply Eq. (19) from the left first by A(q(t)) and second by g(q(t)) T , we get after applying Eq.(23), the expression µ(q(t)) = g(q(t)) T A(q(t))g(q(t)) g(q(t)) T g(q(t)) where µ(q(t)) is equal to det(H(q(t))) but eliminating λ(q(t)) since it is an eigenvalue of H(q(t)). Let us assume that a gradient extremal touches at its turning point an isopotential energy surface, V (q) − ν = 0, of a full PES. On the others points of this gradient extremal, it crosses transversally a family of isopotential energy surfaces. We can assume that the gradient extremal after the turning point fulfills µ(q(t)) > 0. This fact indicates that the curve is now crossing a family of isopotential energy surfaces that are pseudo-convex, and vice versa are pseudo-concave [5]. Finally, if along the gradient extremal µ(q(t)) changes the sign then there is a valley-ridge inflection (VRI) point.

Newton Trajectory: the new version of the Distinguished Coordinate Path
The Newton Trajectory (NT) or Reduced Gradient Following (RGF) is characterized by a curve on the PES such that at each point of the curve the gradient vector points into a constant direction [33,34]. This can be seen in another way, the NT or RGF curve crosses the steepest descent curve at each point so that at the same point the tangent of this curve has the same direction as the constant direction of the prescribed RGF direction. Some NTs are given in Figure 2.
The distinguished reaction coordinate, or its new reformulation, the NT, is a model curve which is often used to locate transition states. The curve can be used as representation of a reaction path, again if no turning point emerges. The variational nature of the curves was studied in Ref. [35]. It corresponds to a problem where the functional only depends on the coordinates and the parameter that characterizes the curve where the q v -vector is the coordinate vector q without the q rc component. This functional is of the type F (t, q(t)) where the tangent does not appear explicitly. It can be shown [35] that the curve which extremalizes the functional integral of Eq.(25) is the curve which satisfies the Branin equation where A(q(t)) is the adjoint matrix of the Hessian matrix, H(q(t)), and the parameter t plays the role of q rc . If det(A(q(t))) is positive definite along the whole NT curve joining two minimums of the PES, then this curve is a reaction path and it falls in the category of a minimum energy path (MEP), because for this model curve both reaction path and MEP formulation coincide. In addition, it is shown in Ref. [35] that the second variation of the integral functional given in Eq. (25) is positive definite if the NT curve satisfies the inequality µ(q(t)) > 0, where µ(q(t)) is that defined in Eq. (24), which is noting more than the MEP requirement. Thus, for the NT model coincides the minimum variational condition with the MEP condition. However, if the NT has a turning point or a valley-ridged inflection point (VRI), then the minimum variational character is lost and the reaction path and the MEP conditions are not satisfied [35]. Finally, we say that an NT can start at any point of the PES with the gradient direction there.

The Newton Trajectory is a tool for Mechanochemistry. Search of the optimal pulling force as a minimization problem
An important area under development is the mechanochemistry [36]. It studies the effect of an external force to a molecular system. According to the IUPAC terminology, one defines a mechanochemical reaction as "a chemical reaction that is induced by mechanical energy". The question about how an external force couples to various reaction mechanisms is normally answered by the most accepted and widely used model consisting in a first-order perturbation on the associated PES of the unperturbed molecular system due to a stress or pulling force, f , The potential V f (q) can be seen as an effective PES. Since the direction of the external force, l = f /||f ||, stays constant it does not depend on the position. Due to this external force, the stationary points are located at different positions on the effective potential with respect to the unperturbed potential, V (q), where it holds ∇ q V (q) = g(q) = 0. The stationary points on the effective potential have to satisfy the analogous condition, ∇ q V f (q) = 0, depicting new and displaced stationary points. Thus it follows for the new stationary points of Eq. (27), One searches a point where the gradient of the original PES, g(q), has to be equal to the mechanochemical force, f , being the force that induces the chemical process. If the mechanical stress in a defined direction is f = F l with a fixed unit vector, l, then it is l = g(q)/ g(q) and F = g(q) is the magnitude. We also use the alternate form for the solution of Eq. (28) by the projector equation Where U is the unit matrix. The equation has to hold unattached from the uncomfortable norm, g(q) , and it means nothing else than that the vectors g(q) and l are parallel. If the force, F = f , is so high that the transition state disappears in a shoulder, then the mechanochemical task is fulfilled: The pulling force enforced the chemical reaction, and the reaction barrier breaks down. It is the barrier breakdown point (BBP). This holds for the very simple model ansatz of Eq. (27). We treat the fixed direction, l, but different forces, F . For point-to-point changes of the amount of the force, F , we should get the path which follows the "force displaced stationary points" (FDSPs) [37]. Now if q is a point of the FDSPs curve then we can set q = q(t) being t the parameter that characterizes this curve for increasing forces, F . It is like the Branin equation, Eq. (26). Now, we differentiate the projector Eq. (29) with respect to t, and we obtain This is an expression for the tangent of the FDSPs curve. It is the basic expression of the reduced gradient following curve (RGF), or NT, derived many years ago [38] [39] from the projector equation. For higher dimensions of the PES, this equation is better to track numerically than the Branin equation (26). Walking on the path of FDSPs or NT one has to increase the norm of the force, F , beginning at the stationary points. There is a part of the pathway from the minimum uphill and a part from the transition state downhill. If the force increases further and further, the two parts meet.
Here the norm of the gradient has its maximum. Let us assume that the curve is on a straight valley path, then one eigenvector of the Hessian points along the path, and at the meeting point of the lower and the upper part of the FDSPs we find out that the corresponding eigenvalue of the Hessian is zero. There the barrier of the former original PES disappears. Regarding the effective PES, ∇ q V f (q), with the maximal force, the transition state and the reactant minimum coalesce, thus both disappear, and the pulling force realizes the reaction. The meeting point is a "catastrophe point" [37,40,41]; it is a word of the well-known theory of Thom [26,42]. We proposed to name it more chemical: barrier breakdown point (BBP). Its necessary mathematical formula is [43] Det(H(q)) = 0 .
There is an optimal pulling direction which is given by the NT which in a point, q, of the curve satisfies The point is known as optimal BBP [44]. Note that the point also belongs to a gradient extremal. Thus on an optimal BBP, a gradient extremal and an NT coincide. In Ref. [45] is reported an algorithm to locate optimal BBPs on a PES. The algorithm is based in the location of the point q where the square function, σ(q), takes the value zero. This is down using a non-linear least square minimization algorithm [46]. The sigma function is defined in the following way where s(q) = H(q)g(q)||g(q)|| −1 . The function σ(q) is a sum of squares of nonlinear functions, s(q); thus, it is a non-negative function. It is clear that a minimum q min of the function σ(q) for which σ(q min ) = 0 is a desired solution since this can only arises if q min satisfies s(q min ) = 0, which is the optimal BBP condition given in Eq. (32). As noted above the problem to find q min for which σ(q min ) = 0, defined in Eq. (33), falls in the class of non-linear least square minimization problems. In this type of problems, the derivatives of σ(q) with respect to q are given and where s i (q) is the ith-element of the s(q) vector and J(q) = ∇ q s(q) T is the N × N Jacobian matrix, thus an element of this matrix has the form, J ij (q) = ∂s j (q)/∂q i , for i, j = 1, . . . , N . By differentiation of s(q), with respect to q, the Jacobi matrix, J(q), has the form Here T(q)g(q) is the matrix of the contraction of the gradient vector and the tensor of the third energy derivatives with respect to the coordinates, q. With this definitions one can find the minimum using the formulas of Eqs. (34) and (35) in conjunction with the Newton method. Since we are interested in the zero of the s(q) vector in the least square sense, the set of elements {s i (q)} N i=1 are small. This suggests that the second term of Eq. (35) can be neglected. We get the approximate expression for the second derivatives of the σ-function In this way, the first and approximate second derivatives of the σ-function can be determined by using only s(q) and J(q), Eqs. (34) and (35), respectively. When the second derivative is approximated as given by Eq. (35), the basic Newton method becomes the Gauss-Newton method or the generalized least squares method [46]. To improve this, we use the restricted step algorithm. In the ith-iteration, the modified Gauss-Newton where J (i) = J(q (i) ), s (i) = s(q (i) ), and α (i) is a parameter. The parameter α (i) is selected in the way that if ∆q (i)T ∆q (i) ≤ R (i) then α (i) = 1, otherwise α (i) = R (i) / ∆q (i)T ∆q (i) . The radius R (i) is a given positive parameter for the current iteration and updated at each iteration. In practice the set of Eqs. (38) is solved using the rational function optimization algorithm, that through a diagonalization process one solves this system of equations [47,48].

A comparison of Newton Trajectories with AFIR
Maeda et al. proposed the artificial force induced reaction (AFIR) method [49][50][51][52]. They change the PES by an artificial, external force of the following kind α is a numeric factor which here plays the role of the amount F in Subsection 4.1, R i and R j are covalent radii of atoms i and j. The vector r with the components r ij contains the distances between the corresponding atoms i and j. The dimension of all r ij is maximally M = N (N −1) 2 for a molecule with N atoms. It is possible to include a lower number of distances only [53,54]. Of course, all r ij > 0. The force, f, itself has the components with numbers i, j and the effective PES is thus the force, f, acts on the current point, r. Eq. (40) means that the larger distances nearly disappear in the extra force for p > 2 but only the smallest distances make a contribution to the resulting direction of the force. So, the small distances of H-atom bonds should not be used in f [49]. Of course, if the extra force moves all stationary points of the PES out of their former places then can a minimum and an SP coalesce, and a former barrier can disappear (at a BBP for α=α max ), and so a new valley opens for a contact between former distant minimums. Thus, one can use the ansatz to detect reaction valleys [50,[53][54][55][56]. To that purpose, α has to be larger than α max .
One can use Eq. (39) with a continuous change of α for the FDSPs. Usually one starts at a known minimum with α = 0. Like for NTs, continuous increase of the strength of the force, α, will move the stationary points of the effective new PES. Meada et al. only look for an increase of the force [52][53][54][55]. We have to improve the method by two alternating pieces of the curve of new stationary points. An increase and a decrease of the parameter fully describe the curve between two original stationary points over a BBP [44]. The maximal α determines the BBP. It is not an approximation of the original SP of the PES. The BBP is usually anywhere between the initial minimum and the next SP, see some instructive discussions in Ref. [44] for NTs. At the stationary point which we search for the parameter α has again to converge to zero.
In this subsection we compare the ansatz with the NT theory. We will find a stationary point on a PES, V (r), by successive moved stationary points on the new effective surfaces. For illustration we draw these points on the original PES. The dimension of the PES, as well as of the search vector, f , is less or equal M . To simplify the study, we reduce the dimension to the shortfalling case M = 2, which is not possible if a real molecule is treated. We name r 12 = x, and r 13 = y and put R i = R j = 1/2. Usually under AFIR, the exponent is p = 6. Eq.(39) then reads With the force vector Eq. (42) becomes with a scalar product between the force and the current point. It looks like the analogous definition of the Newton trajectory [33,34] but with the difference that the force is not a constant direction. We will deeper study the connection to NTs.
First, we say that the denominator of Eqs. (40) or (43) is a kind of normalization, but the vector, f, is shorter than one. It is a linear interpolation direction using all M directions. It makes that the sum of all components of the f-vector is one. The value of the norm will be less than one. The length of f differs in comparison to NTs.
For a further simplification, we cancel this normalization. One can imagine that the different vector length can be compensated by a changing factor α of the ansatz. Thus we treat only the vector for the forcẽ and so we get Search for a stationary point needs derivations If we still scale −5α =α we get really the ansatz like the one of an NT that the gradient has to be equal to the currentαf . However we know thatf is not constant. Thus, the solution curve for the moving stationary points for different factors α will not be an NT. A corresponding case was discussed in Ref. [57]. The theory of NTs cannot fully be applied. But we can search for the stationary points of the effective PES with different amounts of the α parameter. It is done in Figure 3  The magenta points start at the minimum, and they go with negative values of α up to -10 at the BBP (green point). The start direction is strongly the y direction because the modified NFK surface has at the minimum two very different values for x and y. It is an artificial particularity of this surface which does not fit the intention of the ansatz Eq. (39). Nevertheless, the calculation of AFIR points in the given direction works well. With decreasing amounts of the parameter after the BBP, the curve at least correctly finds the SP. A special property of this curve is the existence of a turning point. Here the curve touches parallel a level line of the PES.
The pink points also start at the minimum, and the go into the contrary direction with positive values of α up to 5 at the BBP. First we have increasing values of the parameter up to the BBP, then decreasing values, and after a minimum of α at 1.5 the curve leaves the reactant minimum and it get lost in the PES mountains.
The black points start at the SP, and they go with negative values of α up to -4 at the BBP. With decreasing amounts of the parameter after the BBP, the curve finds the quasi shoulder of the PES near (2,-2).
At the minimum, two curves have the correct contrary directions with different sign of the parameter. However, at the SP two curves meet with the same sign: minus. It may come from the nonlinear character of the AFIR force. Clearly, the AFIR points do not describe an NT. However, they follow an analogous rule: they start at stationary points and follow their way up to a BBP. On this piece of the curve, the norm of the gradient of the surface increases. At the BBP, there one will reach a maximum. The effective PES has here a shoulder. If one further increases the parameter, α, then the calculation of the stationary point of the corresponding effective PES stagnates here, or it jumps across the PES to a strange other solution. A small jump over the BBP for the next initial guess, and the decrease of the parameter then opens the way to the second piece of the curve. This works equally like for NTs. (However the calculation of NTs is easier because one uses a predictor-corrector method along the tangent of the curve [25], compare Eq. (30) above.) Note that here the connection of the SP and the quasi-shoulder by an AFIR curve goes somewhat strange over the PES, in comparison of the other curves studied, for example the GE, or the steepest descent.
The first variational structure of Maeda's et al. model written with the distance coordinates r, is of the type where g(r) is the gradient of the PES, α is now the Lagrange multiplier and ϕ(r) is the derivation ∇ r (f (r) T · r). If we assume that ϕ(r) = 0 we can write where U is the unit matrix. Now the task is to derive the implicit tangent from this expression. However, the problem in Maeda's at al. model, ansatz Eq. (39), is the quite complicated expression of ϕ(r), being difficult to write in an explicit form. But in the simplified case of M = 2, we can use Eq.(49) for a numeric search of a solution of the AFIR curves. We employ a Mathematica contour plot in Figure 4 for the zero contour of the square of the norm of the left hand side. The point by point optimized AFIR points of Figure 3 are added. They fit well the resulting curves. The non-linearity of the AFIR force and its non-constancy lead to 10 curves starting at the SP, quite more than the two for an NT to a given search direction. But of course, the NTs have at least a quite greater variability because around a stationary point all (constant) search directions are possible. The NTs form a dense net of curves on the PES. (And the NTs are a linear ansatz, thus easier than the AFIR method.)

Gentlest Ascent Dynamics
In 2011, E and Zhou [58] proposed the so-called Gentlest Ascent Dynamics (GAD) method that reformulates the procedure of uphill walking from a minimum basin through a set of ordinary differential equations whose solutions converge to first index saddle points [14,[59][60][61][62][63]65]. The set of equations that governs the GAD is where U is the unit matrix, H(q) is the Hessian matrix, v is a normalized vector and t is the parameter that characterizes the curve. Eq. (50a) means that the gradient is used by two different components, one in the ascent direction of the v-vector subspace and the second in the descend direction of the set of directions perpendicular to the v-vector. Eq. (50b) defines the update of the ascent direction represented by the vvector. The right hand side of Eq.(50b) ensures that the v-vector converges to an eigenvector associated with the smallest eigenvalue of H(q), and we have to ensure that the v-vector should be normalized. We note that at the starting point the norm of the v(t 0 )-vector should be equal to 1. The GAD algorithm can be seen as a Zermelo like navigation model on the PES to reach TSs in some optimal way, see [14,63,65]. The Zermelo navigation model is an example of a Mayer-Bolza problem of calculus of variations [64]. For this reason GAD falls as an example of optimal control problem into a type of variational problem where the v-vector plays the role of the control vector.
In Figure 5 we show three different GAD curves to three different control vectors, v.  Figure 5. Three curves along the gentlest ascent dynamics on the modified NFK surface, see [5]. The GAD curves differ in the initial control vector, v, the gradient there, but they converge to the same saddle point.

SP
The GAD method, as mentioned above, can be formulated as a Mayer-Bolza problem of the calculus of variations related to the solution of a problem of optimal control theory [14,63,65]. In the present context this problem can be formulated as follows: determine the vector function q(t) of dimension N , satisfying the Eq.(50a) with initial condition q 0 = q(t 0 ) and determine the control vector function, v(t), of dimension N , restricted to the normalization relation, v(t) T v(t) = 1, in such a way that the functional assumes an extremal value with the boundary condition, g(q(t f )) = 0, being satisfied at t = t f . It can be proved that this variational problem defined in this way has an associated Hamiltonian [14,65]. The Hamiltonian has the following form, 2H(q, y) = (2v T g(q)) 2 (y T y) − (1 + y T g(q)) 2 = 0 (52) where y is the co-vector of q. The Hamilton equations of Eq. (52) are the Eqs. (50) taking into account that the y-vector is related with the v-vector through the relation, ωv = (2(g(q)) T v)y, where ω = 1 + (g(q)) T y. Note that from these two relations it holds (g(q)) T y = 1 and ω = 2 along the curve. The study of the second-variation of this variational problem shows that Eqs. (50) stationarize Eq.(51) but do not neither maximize nor minimize it [65].

Conclusion and future work
The mathematical analysis of different reaction path models, some of them widely used in computational chemistry, shows that in different manners these model curves have variational nature. The fact opens the possibility to propose new curves in order to solve new physical problems. The Calculus of Variations can be used to find the adequate formulation of the problem. An example is the AFIR method of Subsection 4.2 where the variational ansatz opens the possibility to construct solution curves. Another interesting example is the Gentlest Ascent Dynamics discussed in Section 5, where the optimal control theory, an extension of the Theory of Calculus of Variations, plays a central role in the derivation of the equations governing these curves. Thus many other curve models proposed using optimal control can be derived for particular subjects or physical problems [14].