Locating transition states on potential energy surfaces by the gentlest ascent dynamics

The system of ordinary differential equations for the method of the gentlest ascent dynamics (GAD) has been derived which was previously proposed [W. E and X. Zhou, Nonlinearity 24, 1831 (2011)]. For this purpose we use diverse projection operators to a given initial direction. Using simple examples we explain the two possibilities of a GAD curve: it can directly find the transition state by a gentlest ascent, or it can go the roundabout way over a turning point and then find the transition state going downhill along its ridge. An outlook to generalised formulas for higher order saddle-points is added. 2013 Elsevier B.V. All rights reserved.


Introduction
The concepts of the potential energy surface (PES) [1,2] and of the chemical reaction path are the basis for the theories of chemical dynamics. The PES is a continuous function with respect to the coordinates of the nuclei. It is an N-dimensional hypersurface if N = 3n and n is the number of atoms. It must have continuous derivatives up to a certain order.
The PES can be seen as formally divided in catchments associated with local minima [1,3]. The first order saddle points or transition states (TSs) are located at the deepest points of the boundary of the basins. Two neighbouring minima of the PES can be connected through a TS via a continuous curve in the N-dimensional coordinate space. The curve characterises a reaction path. One can define many types of curves satisfying the above requirement. The reaction path model widely used is the steepest descent (SD).
There exist a large number of proposed methods that in principle reach a TS when the minimums associated to the reactant and product are known. See Ref. [4] and references therein. There are also methods that find the TS when only one minimum is known. In this case, the problem is much more cumbersome because the initial data are just the geometry coordinates of the minimum, however, the direction of the search is open. As in the first case many algorithms have been developed for this type of problem [4]. A great number of these algorithms are based in a generalisa-tion of the Levenberg-Marquardt method [5][6][7] that basically consists of a modification of the Hessian matrix to achieve both, first the correct spectra of the desired Hessian at the stationary point, and second to control the length of the displacement during the location process. The first proposed algorithm within this philosophy is due to Scheraga [8] and from than up to now the list is very large [9][10][11][12][13][14][15][16][17][18][19][20]. None of the methods are foolproof, each of them has some problems. Recently E and Zhou [20] have proposed an approach called the 'gentlest ascent dynamics' method. This method can be seen as a new reformulation of the method proposed some time ago by Smith [12,13] under the name 'image function'. The method is based on the generation of an image function that is a function which has its minima at exactly the points where the original PES has its TSs and moreover by an application of a minimum search algorithm to this image function. The converged point should correspond to a TS in the actual PES. Helgaker [14] modified the algorithm by the trust radius technique. Sun and Ruedenberg [21] analysed the method concluding that image functions do not exist for general PES so that a plain minimum search is inappropriate for them. A nonconservative field gradient of the image function exists. The global structure of the image gradient fields is considerably more complex than that of gradient fields of the original function. However, the image gradient fields appear to have considerably larger catchment basins around TSs. Besalú and Bofill [18] showed that the Smith algorithm is a special type of the Levenberg-Marquardt method.
In this Letter we show the connection between the Smith method [12,13] and that described by E and Zhou [20]

Background of the method
Let us denote by V(q) the PES function and by q T = (q 1 , . . ., q N ) the coordinates. The dimension of the q vector is N. The superscript T means transposition. At every interesting point q the PES function admits a local gradient vector, g(q) = r q V(q), and a Hessian matrix, H(q) = r q r q T V(q). The family of image functions of V(q), labelled by W(q), is defined by the differential equation [21] where f(q) is the image gradient vector, U v is the Householder orthogonal matrix constructed by an arbitrary vector v(q) being in principle a function of q, and I is the unit matrix. The Householder orthogonal matrix is a reflection at v(q). It has the property that U v = U v T , and it is the result of the difference between the projectors (I À P v ) and P v , because it holds trivially U v = I À 2P v = (I À P v ) À P v , being P v the projector that projects into the subspace spanned by the v-vector [22]. If the derivatives of v(q) with respect to q are nonvanishing, the image Hessian matrix, F(q) = U v H(q), is not obtained by the differentiation of f(q). Taking into account Eq. (1), this differentiation results in The term in brackets is usually not zero and not symmetric, and this non-symmetry is due to the effect of the differentiation on the P v projector. In other words, The inequality of Eq. (3) implies that the image gradient field defined by Eq. (1) is not integrable to an 'image PES' W(q). More explicitly, where dq/dt is the tangent of an arbitrary curve joining the points q 0 = q(t 0 ) and q 1 = q(t 1 ). Due to Eq. (3), this gradient field vector should be considered as a nonconservative force field. From this fact it follows an image of the PES function does, in general, not exist [21,23]. From Eq. (3) it is easy to see that at the stationary points, where g(q) = 0, the inequality is transformed to an equality if the v-vector is an eigenvector of the Hessian matrix. Note As pointed out by Sun and Ruedenberg [21], the image functions do exist until the second order in the vicinity of its stationary points for any PES taking v as an eigenvector of the Hessian matrix. Due to this fact the SD curves of the quadratic image function are approximations to the gradient image curves of W(q) being the image potential of V(q).
With the previous analysis of the general nonexistence of an image PES, we can take the image gradient field given in Eq. (1) to define the field of SD curves as, where t is the parameter that characterises the SD curve, q(t). If Eq. (5) is multiplied consecutively from the left by the set of (N-1) linear independent orthogonal vectors to the v-vector, we see that it corresponds to a curve which is energy descending along this set of directions on the actual PES, whereas is ascending on the v-vector direction. This property makes the set of curves defined in Eq. (5) suitable for the location of a TS from a minimum. This observation is supported by the fact that Eq. (5) can be rewritten as where the definition of P v has been used. Eq. (6) is the basic equation of the string method proposed for the location of reaction paths and TSs [24]. The v-vector in this method is the current tangent of the path. Because we are interested to find TSs from minimums of the PES we can use the nonconservative property of the gradient image field to modify the v-vector, during the location process. For this purpose, we first consider that at the minimum, as well as at the TS, the last term of the right hand side part of Eq. (2) is equal zero due to g(q) = 0. Second, at the TS, the Hessian matrix, H(q), possesses only one eigenpair with negative eigenvalue. The associated Raygleigh-Ritz quotient of this eigenpair with a negative eigenvalue is the lowest that the Hessian matrix can achieve at this point being equal to the corresponding eigenvalue [25]. The Raygleigh-Ritz quotient for a given vector v and matrix H is defined as, The structure of this eigenvector is unknown. To find the TS one should, however, ensure that during the research process the path walks through the PES (given until second order) such that the character of the surface becomes closer to a first order saddle point. Taking into account these two considerations we transform Eq. (3) imposing that g(q) = 0 at each point of the search and multiplying the resulting equation from the left by (I -P v ) and from the right by P v , The effect of this multiplication by the projectors, (I-P v ) and P v , is that the resulting Eq. (7), multiplied from the right by the v-vector, is the gradient of the Rayleigh-Ritz quotient with respect to this vector. If the v-vector is an eigenvector of the H(q) matrix then the right hand side part of Eq. (7) is equal zero because every eigenvector extremises the corresponding Rayleigh-Ritz quotient. We will denote the Rayleigh-Ritz quotient by k q (v) to indicate its dependence on q through the Hessian matrix. If a v-vector makes the gradient of the Rayleigh-Ritz quotient equal zero then this vector is an eigenvector of the H(q) matrix and the value of the Rayleigh-Ritz quotient coincides with the corresponding eigenvalue of the H(q) matrix. These properties suggest that a v-vector can be changed following the SD direction of the Rayleigh-Ritz quotient gradient with respect to v, Eq. (8) is also a function of q through the Hessian matrix. The Rayleigh-Ritz quotient of the new v-vector obtained from, v ? v + dv/dt Dt, will be lower with respect to the previous one and, in addition, Eq. (5) will give us a new energy ascent direction and a set of orthogonal N-1 energy descent directions. Eq. (8) searches for either the lowest positive or the single negative Rayleigh-Ritz quotient if it exists, whereas Eq. (5) determines points on the PES along the action of an increase of the energy in the vvector, and a decrease along the set of orthogonal directions to this vector. The specific action of Eq. (5) defines the type of points of the PES so that its structure until the second order on q resembles a first order saddle point. In turn, Eq. (8) finds v-vectors with lowest or possibly negative values of the Rayleigh-Ritz quotients of the corresponding Hessian matrix. These two coupled actions on the Eq. (5) and Eq. (8) describe a gradually or gentlest form to find the TS located on the boundary of the basin which contains the start minimum. Eqs. (5) and (8) are coupled since both depend on the v-vector explicitly on q through the gradient vector and the Hessian matrix respectively. The integration of Eqs. (5) and (8) implies the solution of the differential equation where q 0 is a slightly distorted point near the minimum. The gradient at point q 0 is selected automatically as the initial v-vector. Eqs. (9) are the basic expressions of the algorithm proposed by E and Zhou [20] which are reviewed in this study. Note that the system (9) is a system of differential equations. Corresponding to the initial conditions (9b) there emerges a full family of solution curves passing every point of a given region. This is in contrast to the long known gradient extremals (GE). They also realize a shallowest ascent idea; however, they are special solution curves of the equation Hg = kg, where k is an eigenvalue of H. Thus the GEs pass only points where the gradient itself is an eigenvector of H [2, [26][27][28][29][30][31]. GEs do not cover a region.

Behaviour and analysis
The set of Eqs. (9) has been integrated using the explicit Runge-Kutta method of order 8(5, 3) [32]. We have used two-dimensional PES models to analyse the behaviour of the gentlest ascent path. The first PES model used for this purpose is the Wolfe-Quapp PES [33,34]. The equation of this surface model is We look for the gentlest ascent dynamics if we start from the minimum located at the point (1.124, À1.486) with energy in arbitrary units À6.369. We take the corresponding gradient vector there as a starting point (1.2, À1.5) and as the initial v-vector, according to Eq. (9.b). The curve line is depicted in Figure 1. It starts from the minimum. The curve ends at the TS located in (0.093, 0.174) with energy À0.644. It follows the valley joining these two stationary points. Figure 1 also shows the behaviour of the v-vector. Initially the v-vector has the same direction to the tangent of the curve but along the initial sub-arc of the curve the vector points to the direction where the energy increases very fast, and finally the vector corrects its direction towards the direction of the transition vector. The transition vector is the eigenvector of the Hessian matrix at the TS with the negative eigenvalue. At the end point, namely the TS, the v-vector is the vector which produces the lowest value of the Rayleigh-Ritz quotient of the Hessian matrix. It is the eigenvalue of the transition vector. This behaviour conforms to that one which is explained in the previous section. The curve is determined by the condition that the v-vector minimises the k q (v) within the limits imposed by the Mini-Max eigenvalue theorem [25]. A second curve is computed on the same PES to get the same TS revealing new features of this type of GAD curves. In this case the curve starts near the minimum located at the point (À1.174, 1.477), with energy À6.762. The behavior of this curve is drawn in Figure  2. The path follows the valley in the direction where k q (v) takes the possibly lowest value. The path arrives a highest energy value at a turning point and starts to descend along a ridge, and it ends again at the TS located at (0.093, 0.174) with energy À0.644. As shown in Figure 2, on a large sub-arc from the minimum the v-vector has the same direction as the tangent of the curve. Of course, the large arc with the turning point at its highest energy does not represent the 'gentlest ascent', seen from the TS and the valley nearby. So, the name of the method, GAD, is a suggestive name but it is not realised in every case. However, note that the TS is in a side valley, but the method finds the TS, in general. At the point (1.849, 0.635) the curve achieves the highest energy value and starts its descents. This point is a turning point (TP) of the curve, where the next equation is satisfied where Eq. (5) has been used. In general a TP appears on the gentlest ascent path when the square of the gradient norm in the subspace spanned by the v-vector is equal to the square of the gradient norm in the complementary subspace spanned by the set of linear independent vectors orthogonal to the v-vector. This result is identical to the equality g T g/2 = g T P v g, and with the definition of P v we can rewritten it as, 1/2 1/2 = g T v/(g T gv T v) 1/2 , concluding that the GAD curve has a TP at the point where the v-vector and the gradient of the actual PES form an angle of p/4 radians. The curve leaves the valley where the starting minimum is located, at the point TP (1.842, 0.768) and it enters to a ridge region. Because this point satisfies the next equations is the adjoint matrix of the Hessian H(q) at the q point, we conclude that this point is a valley-ridge transition point (VRT) rather than a valley-ridge inflexion point (VRI) [35][36][37]. In a VRI point both equations are equal zero. From the minimum to the VRT point the curve shows positive values of the Rayleigh quotient, g T Ag/(g T g), whereas they are negative from the VRT point to the TS. This quotient is taken as a convexity criterion of a PES region where a point is located. Positivity of the quotient means that the point is in a valley, otherwise it is on a ridge [38]. The result supports the above comment, namely, that one sub-arc of the curve is located in the deep valley from the minimum to the VRT point. At this point the curve turns into the direction of the ridge following the lowest value of k q (v). But this corresponds to the v-vector orthogonal to both, the ascent of the valley and the descending ridge directions. On the sub-arc located on the ridge at each point the gradient vector increases its orthogonality with respect to the current v-vector. The curve evolves along a direction of decreasing energy. We can write In general the curve evolves so that the energy decreases at each step if the gradient vector g and the v-vector form an angle within the open domains (p/4, 3p/4) or (5p/4, 7p/4) radians.
In general we can mark the following features of a GAD curve. When along the sub-arc of the gentlest ascent curve its gradient vectors define an angle with the corresponding v-vectors within the open domains (p/4, 3p/4) or (5p/4, 7p/4) radians, then the sub-arc shows an energy decrease. If the angle is within the open domains (3p/4, 5p/4) or (7p/4, p/4) radians it shows an energy growth. When the angle is equal to p/4, 3p/4, 5p/4, or 7p/4 radians the curve is stationary with respect to the variation of the energy, the TP situation.
The same type of calculations with the PES proposed by Neria et al. [39] and modified by Hirsch and Quapp [38] confirms the same facts. The equation of this surface model is Vðx; yÞ ¼ 0:06ðx 2 þ y 2 Þ 2 þ xy À 9expðÀðx À 3Þ 2 À y 2 Þ À 9expðÀðx þ 3Þ 2 À y 2 Þ: The central point is the TS located at (0, 0) with energy À0.002. The minima are located at (2.71, À0.15) and (À2.71, 0.15) with energy À5. 24. The surface has gorges around the points (2, À2) and (À2, 2) [38]. We start the gentlest ascent path at the point (2.6, À0.2) near to a minimum. Its evolution goes over the gorge, it leaves the valley where the minimum is located and it enters the ridge, but then it enters again to the valley region of the min-imum. The ascending energy behaviour ends at the point (0.871, À1.428) being a TP. On the sub-arc between the points TP and TS a VRT point exists at (0.002, À1.28) corresponding to the descending energy. At this point the GAD curve enters a ridge that touches the TS. Figure 3 depicts this behaviour, the evolution of the v-vector is also shown.
On the Müller-Brown PES [40], the GAD method exhausts its possibilities. The global minimum is located at a very deep valley. The coordinates are (À0.56, 1.44). We use two different starting points. The first starting point of the GAD method is at the point (À0.54, 1.4) near to the minimum. The GAD trajectory leads to the next TS, see Figure 4a where it is the dotted line. It goes over a TP but that is normal. In Figure 4a we show the behavior of the gradient extremal curve (GE) [2,[26][27][28][29][30][31] for comparison. The GEs are the bold dashed lines. A part of the GAD on the ridge coincides with the GE. However, the coincidence is for different reasons. For the GAD case, in the ridge region the v-vector is orthogonal to the gradient and due to this fact the trajectory shows a decrease in energy. In other words, dV/dt is reduced, it is, dV/dt = Àg T g < 0 since v T g = 0. Recall that the v-vector evolves in this process to reach the eigenvector of the Hessian matrix of lowest eigenvalue. Analyzing the GE in the region of coincidence, the gradient at each point of this curve is by construction an eigenvector of the Hessian corresponding to a positive eigenvalue. The eigenvalue is positive because its eigenvector comes from the evolution of the Hessian at the TS labeled as SP 1 , in direction orthogonal to the col. For the GE the variation energy is given by, dV/dt = (g T g) 1/2 e g , being e g a component of the GE tangent equation [28,31] and depending of its sign, the curve increases or decreases in energy. One can conclude that in the region of coincidence the v-vector is orthogonal to the gradient vector, an eigenvector of the Hessian matrix. Thus, the v-vector is in this region an eigenvector of the Hessian.
For a second trajectory we use the point (À0.58, 1.427) for the start, also near the minimum, however, we get a totally different result, see Figure 4b. At the beginning, the GAD trajectory, a continuous line, goes along the deep valley of the minimum, uphill like the GE, which in Figure 4b is again represented by dashed lines, see also Figure 2 of Ref. [28], or Figure 5 of Ref. [41]. The initial evolution of GAD goes in the direction where the gradient and the v-vector form an angle lower than p/4 radians. In fact it coincides with the GE, which in turn implies that the v-vector is parallel to the gradient. In the region where both curves coincide the increasing energy along the GAD curve is dV/dt = ( The behavior ends at a TP which emerges near (À2.8, 0) in a region which is far away from the border line of the searched valley-ridge transition. In Figure 14 of Ref. [38] the borders between valleys and ridges of the Müller-Brown PES are shown. After the TP the trajectory turns into the left large side valley of the Müller-Brown surface. There is no stationary point, so GAD cannot find any one. The region is a kind of a trap, a 'dead' valley. The GE passes the region and leaves it to the mountains at the left hand side, but the GAD trajectory goes on and back, from one TP to the next, and again back, and only by an accident, it later leaves the region. How-ever, then it immediately leads to the TS2 of the Müller-Brown surface. It seems to be accidentally which TS is found. After the 'chaotic' evolution, the GAD trajectory finally catches a TS.

A different view of the theory and generalisation
Mathematically the above method can be reformulated as where U(v, w) is the PES function, V(q), expressed as a function of (v, w) coordinates being the w vector of dimension N-1. Finally, the Z matrix is formed by a set of N-1 linear independent vectors orthogonal to the v-vector. A suggested solution of the conditions is formulated in Eq. (9). As noted, the method is appropriate for a first-order saddle point search, and can be easily generalised to the search of higher order saddle-points. The importance of second order saddle-points is known [42,43]. If I is the order of the saddlepoint, where I 6 N, then the set of Eqs. (9) takes the form The variation energy through this curve is given by where the first N components of Eq. (15.a) have been used. Eq. (17) tells us that when the sum of the square of the components of the projected gradient vector in the subspace spanned by the set of v ivectors is higher than 1/2 then the curve evolves in the direction of increasing the energy. If it is lower than 1/2 then evolves in the direction of a decrease of energy. When the sum is equal to 1/2 the curve is at a TP.

Conclusion
In this study is reviewed the GAD method for the location of the saddle points which surround a given minimum. It behaves like a growing string method. The integration of the curve can be done by any method suitable to integrate a system of first order ordinary differential equations. The GAD constructs a curve supported by a set of generated vectors.
The numerical examples reported also show that GAD has a large convergence region.