DFFR: A NEW METHOD FOR HIGH-THROUGHPUT RECALIBRATION OF AUTOMATIC FORCE-FIELDS FOR DRUGS

We present DFFR (drug force-field recalibration) a new method for the refining automatic force-fields used to represent small drugs in docking and molecular dynamics simulations. The method is based on a fine tuning of torsional terms to obtain ensembles that reproduce observables derived from reference data. DFFR is fast, flexible and can be easily automatized for a high-throughput regime, making it useful in drug design projects. We tested the performance of the method in a few model systems and also in a variety of drug-like molecules using reference data derived from: i) DFT/SCRF (density functional theory coupled to a self-consistent reaction field representation of solvent) calculations on highly populated conformers and ii) enhanced sampling QM/MM where the drug is reproduced at the QM level while solvent is represented by classical force-fields. Extension of the method to include other source of reference data is discussed


INTRODUCTION
Force-fields (FFs) are a set of simple mathematical expressions connecting the coordinates of a system with its potential energy [1]- [3]. They are the central element in any classical molecular simulation and their accuracy is crucial to obtain meaningful results. The first FFs for the representation of complex molecules were developed by Lifson's and Allinger's groups [4] [5] focusing in the first case on biomacromolecules and in the second in small organic molecules. While Allinger's MMx FF reproduced accurately hydrocarbon-based compounds the flexibility of Lifson's functional make them preferred by the community and in fact, most of current force-fields are based on it; eq. 1: where and stand for stretching and bending (typically represented by harmonic terms), is a torsional energy related to the rotation of chemical bonds (captured by a Fourier expansion), stands for electrostatic interactions (typically represented by an atom-centered Coulomb term) and is a van der Waals term (typically represented by a r -12 -r -6 term).
FF developers quickly addressed their efforts in two different directions: i) the development of potentials able to reproduce small solvents such as water and organic liquids [6], [7] which were needed to perform realistic simulations of any chemical or biochemical system; ii) the development of operational FFs for describing proteins and nucleic acids [8]- [11]. In the first case, FF calibration focused in the non-bonded part and was carried out manually in a trail-and-error process until the refinement of the FF makes possible to obtain Monte Carlo (MC) or molecular dynamics (MD) ensembles reproducing well known experimental observables of the system (ex. vaporization heat, density, compressibility, internal energy, radial distribution functions, dielectric constant,…). In the second case, parametrization was less obvious as the number of parameters to fit for a macromolecule is enormous and the amount of experimental information available is rather limited. To reduce the parametrization problem macromolecular FFs assumed group transferability, but even with this strong assumption, parametrization was difficult due mainly to the lack of reference data. Consequently, a large part of terms in force-fields has been parametrized using low level gas phase quantum mechanical (QM) calculations, which introduces a non-negligible source of error in the FF parameters.
FFs for small liquids have remained unaltered for years [12], [13]; they are reliable and their caveats emerge only when they are used in conditions [14] very different to those considered in their original refinement. Macromolecule FFs do not have the same resiliency to the pass of time, and none of them has survived a decade as the state of the art in the field. The reasons are multiple: simplicity of the functional, problems of transferability, lack of reference experimental data and use of too simplistic QM calculations as fitted observables. All these practical problems make the parametrization prone to uncertainties, and after their publication all FFs are scrutinized by the community to detect errors that should be corrected in further revisions. Despite this inefficient evolutionary model, after more than 4 decades of refinements, the last generation of FFs for proteins and nucleic acids, provide results of high quality, in many cases not far from experiments [15]- [19].
Small molecules with unusual functional groups and potential therapeutic action are the ultimate challenge for FF development. Pharmaceutical companies have physical libraries with millions of such molecules and virtual datasets containing a much larger number of synthetically accessible compounds. Functional groups in these molecules can be extremely diverse, making transferability of parameters difficult. Experimental information on these compounds (many of them existing only in silico) is null, precluding empirically based parametrization. Finally, systematic QM parametrization exploring all degrees of freedom is simply impossible, as some ligands have a large number of correlated degrees of freedom (for some of our ligands in our BCE database (see previous companion paper) QM calculations on 10 18 different conformers would be required to systematically explore their conformational space). This situation has forced the community to use of variety of approximated methods [20]- [29], which in general assign parameters based on risky group similarity criteria and very low level QM calculations.
We present here DFFR, a new FF refinement scheme aimed to improve automatic parameter assignment methods by recalibrating those torsions that have the largest impact in determining the conformational space of drug-like small molecule. The method is based on the maximum entropy principle [30] and introduces the minimum changes in a guess force-field needed to reproduce reference data, for example the relative population of representative conformations obtained from DFT/SCRF calculations of ensembles derived from QM/MM molecular dynamics simulation. The procedure is fast, flexible, expandable to incorporate other type of reference data and easy to automatize, i.e. suitable for the high throughput regime.

METHOD DESCRIPTION
DFFR starting point is a MD simulation (ideally an enhanced sampling one) obtained using a guess FF from which rough ensembles are obtained. The ensembles are analyzed to detect those torsions changing the most, i.e. represent the most important degrees of freedom in the molecule. Within the BCE procedure (see previous companion paper) this implies Hamiltonian Replica Exchange (HREX) [31] simulations performed with standard GAFF [21] FF and AM1-BCC charges [32], but the method can be coupled to any other sampling strategy and approximate FF. We present here examples of use of the method taking as reference DFT/SCRF and QM/MM data, but the method can be extended to any reference data including experimental one (details on how to incorporate NMR observables are provided).
Definition of the parametrization space. Even the approach outlined here can be extended to any FF term, we will focus here in the parametrization of the torsional term, as it has the largest impact in determining the conformational preferences of small molecules. In most FFs this implies the parametrization of an energy term defined as shown in eq. 2: where is the value of the dihedral , , are the amplitudes and , are phase angles and the sum extends for all the chemical bonds.
A pure force approach would imply fitting amplitudes (barriers) and phases for all the dihedrals in the molecule, which is impractical even for medium sized molecules. Thus, as noted above, we ignore rigid torsions, i.e. those that are confined in a narrow region of the conformational space and fit only one dihedral per torsion. With these assumptions, we can define an optimal torsional ( ) applying only to a subset (M) of the original dihedral space (eq. 3). Modification of the method to other Fourier expansions is straightforward. For a given dihedral, the best set of amplitudes and phases are those minimizing the error function with respect to a reference potential ′ ⃗ , as shown in eq 4: where ⃗ , = ⃗ , − + and ′ ⃗ = , . . , ; , . . , , , . . , refer to all coordinates different to the optimized dihedral. Notice that the sum extends over a certain 2 -periodic interval. As above, values marked with the super-index ˜ refers to op mized FF and those without it to the original set of parameters.
Minimizing eq. 4 for the entire torsional profile is the standard procedure used to refine forcefield, but this is impractical for ligands with many rotatable bonds. Thus, as a first approach, we assume here that the general shape of the approximated potential energy function is not too different to the real one, and that the main discrepancies appear mainly in the relative population of the most stables regions. With these assumptions, we can simplify the configurational space as a set of families obtained by clustering of the original trajectory as shown in eq. 5: where the brackets mean Boltzmann's ensembles, stands for configurational variables, k are the number of clusters in which the original ensemble is divided, is the representative structure of cluster k and is the associated probability of the cluster. As shown below, in the unlikely case that we can have QM/MM data as reference simplifications implicit to the use of eq.5 are relaxed and we can use the entire density profile as reference. Note that by extending the sum to a large number of states ( → ∞) the complete torsional density profile is obtained.
Maximum entropy principle and refinement strategy. The objective of the DFFR is to obtain a new set of parameters that make the associated state probabilities match the reference ones i.e.

{ } =
, perturbing as little as possible the general shape of the conformational space, i.e. introducing the minimum bias in the original FF. This refinement strategy follows the maximum entropy (ME) principle [30] which implies that the best fitting procedure is that reproducing the reference values introducing a minimum external information (i.e. reducing the loss of Shanon's entropy; [33]). As shown by others [34] ME solutions can be derived by imposing Lagrange constrains forcing the system to fulfill some reference values. Thus, it is straightforward to demonstrate that this can be done by minimizing the Lagrange function (eq. 6).
where − ( ) is a continuum function describing the variation of Lagrange constraints with the torsion , ( ) is the probability distribution (in the dihedral space) obtained by the approximate FF and ( ) is a continuum function used as reference.
Imposing minimum in the Lagrange variable we derive eq.7: where is a constant, and we can conclude that (eq.8): which assuming ( ) = ( ) is equivalent to Bussi's equation (eq 16 in reference [34]) for a set of reference observables. Rearranging eq. 8 and using Boltzmann's relation we obtain (eq.9): This means that the new FF, i.e. that reproducing the reference distribution ( ( )) generated by ( ( )) will be that obtained by adding to the original one a continuum biasing equation Biasing the dihedral distributions. In our case we define the continuum biasing function as a Gaussian Mixture (eq. 10) where Gaussian functions fit well population densities, but assume an energy function defined by the combination of harmonic potentials, which is not practical for MD codes. Thus, we should re-write the biasing potentials with other functions. To this we end, we first use again the Boltzmann's relationship (eq. 11).
Note that here ( ) would be exactly −ln , as outlined in eq. 8.
We can now fit the torsional profiles to Fourier expansions (eq. 3) and use eq. 12 to obtain the new set of parameters.
Iteration procedure for force-field refinement. The procedure described above provide a modified set of parameters { } , which yields to a different potential and a different ensemble, eventually even to different clusters and representatives, i.e. { } → , , , . This means that a further refinement might be required by computing new ensembles, new reference values and new biasing function, which in turn will lead to { } → , , , , the process being repeated until convergence. This procedure will be slow and computationally expensive, as it will require many MD simulations and eventually more reference calculations. Fortunately, in most cases we can guess the corrected distribution of dihedrals from the previous one as shown in eq. 13: where s is one iteration of the force-field refinement, and Definition of the reference probability ( ). The method outlined here is in principle general, but practical implementation requires the definition of the reference probability function, which depends on the type of reference data available. We have explored three different sources of reference distributions:  DFT/SCRF calculations. This is the standard BCE procedure as described in a previous companion paper. It implies to compute the energy of each cluster representative at the DFT level using a continuum description of the solvent. We start by fitting the distributions around free energy minima found in MD simulations to a mixture of Gaussian functions (eq. 10) as shown in eq. 14: where is the weight of the -th component and ( , ) is the -th Gaussian density function with mean and standard deviation , as outlined above.
We refine the geometry of the k-cluster representatives at the DFT/SCRF level [35] assuming that QM/SCRF optimization does not drive the molecule to a different cluster (which is a valid assumption in all tested systems). Thus, the weight of the different minima can be easily determined by using eq 15: where k refers to each cluster representative and energies are always referred to the most stable cluster representative. We then define for each cluster (eq. 16): where is the set of QM-conformers classified in the -th component (a structure is classified into the -th component if is the closest value to the dihedral value of the structure). Assuming all free energy minima have equal curvature, we can use these weights to define the biasing Gaussians as shown in eq. 17: i) Clustering QM/MM ensembles to obtain populations around minima [40], [41] which are fitted to Gaussian functions (as in eq. 14). We then reweight the Gaussian Mixture that represents ( ) using QM/MM cluster population, so ( ) can be written as where is the set of QM/MM-conformers classified in the -th component, is the cluster k population and is the total number of snapshots in our trajectory. This procedure is equivalent to the DFT/SCRF one except for the substitution of probability derived by Boltzmann´s inversion of DFT/SCRF free energies to real distributions derived from sampling.
ii) The entire dihedral distribution of the QM/MM trajectory is used as the reference. In this case the Expectation-Minimization algorithm is used to fit the QM/MM data to a Gaussian Mixture in the entire dihedral space, from which a continuum biasing potential is derived and fitted to Fourier potentials as shown above.
 Experimental data. In the daily practice of a drug design laboratory the availability of reference experimental data is rare. However, if available the method can easily incorporate such data into the refinement process. Particularly, 3 J-couplings and NOEs which can be translated into dihedral preferences and distance restraints which can be used to re-weight the states sampled in unbiassed simulations in a manner nearly equivalent to the refinement model used taking SCRF/DFT data as reference.
Technical details: Molecules considered here were prepared as described in a previous paper (see previous companion paper) and explored using HREX simulations [31] as implemented in Gromacs 4.6.7 [42] patched with Plumed 2.1 [43]. Simulations were performed in truncated octahedron boxes of TIP3P water molecules [44] (periodic boxes were selected to guarantee a minimum distance greater than 8 Å from the ligand to the closest face). The systems were neutralized by adding suitable ions, minimized, thermalized and equilibrated for 1 ns. We used 16 replicas with the scaling parameter λ going from 1 to 0.59 following a geometrical progression. Individual replicas were followed by 10 ns simulations, taking data every 1 ps. Replica exchanges were attempted every 500 steps and each of the individual trajectories was performed at the NVT ensemble. Particle Mesh Ewald (PME, [45]) and periodic boundary conditions were used to represent long-range electrostatic effects. All bonds linking hydrogens were frozen using SHAKE [46], which allowed us the use of 2 fs time scale for integration of Newton equations of motion. Trajectories were processed to define the most populated clusters using inhouse modified version of advanced clustering method based on dihedral matrix [40]. The geometries of cluster representatives were used as starting point for IEF-MST//B3LYP/6-31G(d) geometry refinement as implemented in Gaussian [47]. IEF-MST calculations were done considering water as solvent and the associated cavity and van der Waals parameters [35].
QM/MM-REMD (Quantum Mechanical/Molecular Mechanics-Replica-Exchange MD) simulations were done for five molecules using the DFTB3 Hamiltonian [48] (QM) and TIP3P water model [13] as implemented in the AMBER program [49]- [51]. 32 replicas were used for each of the molecules, with temperatures ranging from 298 to 498 K. Trajectories were extended for 10 ns for each replica. Results were subjected to the BCE procedure to define flexible bonds and the associated probability density maps.
Trajectories were stored for further analysis using MD database recommendations [52].

RESULTS AND DISCUSSION
Model systems: Small molecules are ideal systems to check for the ability of the method to correct errors in the guess force-field. We choose methanol, butane and alanine dipeptide as prototypic small molecules, for which torsions have been previously parametrized with high accuracy, allowing us to obtain reference distributions by running HREX simulations with "gold-standard" parameters, and in parallel "guess distributions" obtained by using different guess torsional parameters which deviates significantly from the real ones. We found that, in general, convergence to the real FF parameters is very fast (see Figure 1) even in cases where very incorrect parameters were considered. For example, for methanol we defined a guess set of parameters (iteration_1) that underestimates the stability of the trans 180º minima and moves the position of the -60 and +60 minima, yielding to a very unrealistic distribution of torsion probability, but just three virtual iterations (see eq. 13 above) are enough to converge to the reference parameters ( Figure 1). For butane our "guess" torsional parameters have only a V1 term which yields to a single maximum in the torsional probability distribution. As the +60 and -60 minima were not present in the original distribution convergence requires a few more virtual iterations to obtain parameters matching the reference ones ( Figure 3). Finally, for alanine dipeptide we define as "guess" torsional parameters a single (and high) V1 term for  and  torsions both centered at 0º, which yield to completely incorrect torsional distributions and a Ramachandran's plot where none of the secondary elements are present (Figure 1). The method requires only 4-5 iterations to converge to the reference values and provide correct mono and bidimensional probability maps. In summary, DFFR seems able to refine FF even in those cases where the guess values are inaccurate.
The DFT/SCRF refinement: We tested the power of the default refinement method using DFT/SCRF energy values as reference for 13 small drug-like molecules, all of them present in our BCE database and for which the bioactive conformation is known (see Suppl. Figure S1). Based on the default analysis of torsional space in BCE we refine from 1 to 3 torsions for each ligand. The refinement procedure leads to representation of the torsional space that agree much better with the reference QM data, as shown (for selected molecules) in Figure 2, even for those cases where original GAFF values leads to quite incorrect samplings (see for example dihedral 8 in panel B or dihedral 5 in panel F of Figure 2). A global analysis of the differences between QM torsional density (inferred from the minima free energy; see above) and the MM torsional density profiles demonstrates a massive improvement in the density profile errors (Figure 3) with respect to the results obtained using the default force-field GAFF. The difference in the stability of the best optimized geometry is however small (Figure 3), indicating that the QM optimization procedure is quite robust to the geometry of the cluster representative derived from classical simulations. In fact, inspection of individual profiles in Figure 2 strongly suggests that for most purposes DFT/SCRF and refined FF results are hardly distinguishable, allowing the user to perform classical calculations with a quality similar to that obtained at the DFT/SCRF -level of accuracy. Figure 2. Selected density distributions obtained by the default FF (GAFF in red), the reference DFT/SCRF data (obtained by Gaussian fitting of the relative population of the fitting; in blue) and the refined FF (in green). Note in all cases the much closer similarity of blue and green profiles compared with the red one. Values shown here correspond typically to 3 rd iteration refinement (in most cases functionals are already converged after 2 iterations). Representation of the ligands considered are shown in Figure S1. The QM/MM refinement: The development of fast QM/MM codes taking advantage of GPUs will make possible in a future to refine some crucial dihedrals taking directly QM/MM data [53] eliminating then the intrinsic shortcomings of the incomplete sampling and the continuum-related artifacts implicit to DFT/SCRF calculations. The QM-based parametrization has the additional advantage that it can use direct torsional population profiles rather than Gaussian-fitted estimates obtained from the analysis of the relative stability of DFT/SCRF free energy minima. The obvious shortcoming of this approach is the large computational cost which limits today (perhaps not in a near future) its general application in the context of drug-design pipelines.

Number of molecules
To test the applicability of QM/MM calculations within the DFT/SCRF framework we tested the method in six prototypical drugs for which QM/MM calculations were performed at the DFTB3 level of theory (see Methods). Results in Figures 4A and B were obtained for cases where GAFF default torsional parameters are reasonable and accordingly no dramatic changes in the torsional density profiles are found after recalibration of the FF. On the contrary, Figure 4F represent one of the cases where very significant changes in the torsional energy was required to reproduce the QM/MM torsional density profile. In all cases the improvement in the torsional profiles is evident, indicating the robustness of the process to the source of data used for recalibration of the FF. As computational power increases and QM/MM calculations can be done at higher levels of QM theory we can envision a full parametrization procedure relying on QM/MM simulations for hits appearing promising in drug design pipelines.

CONCLUSIONS
We present a fast, robust and automatized method for recalibration of FF based on reference distributions obtained from reference calculations obtained at the QM level. The method is general and can incorporate experimental data, for example relative state populations derived from J-coupling or NOEs, which are not difficult to obtain in a short time scale. The method is fast and can be implemented in the high throughput regime, which facilitates its application by pharma industry for the refinement of parameters assigned by means of low-level automatic FF annotators. DFFR is highly recommended for hits and lead compounds that can be used as starting point for an optimization process and that are likely be used in massive docking experiments or as a template in similarity-powered studies. The method will help to discard compounds unable to reach the bioactive conformation and will provide estimates of the distortion energy required to adopt the bioactive state, enriching then scoring functions used in most structure-based drug design protocols.