Predicting the limit of intramolecular H-Bonding with classical molecular dynamics

The energetics of intramolecular recognition processes are governed by the balance of pre-organization and flexibility that is often difficult to measure and hard to predict. Here, by using state-of-the-art classical molecular dynamics simulations, we predict and quantify the effective strength of intramolecular interactions between H-bond donor and acceptor sites separated by a variable alkyl linker—in a variety of solvents and including crowded solutions. The fine balance of entropic and enthalpic contributions posits a solvent-dependent limit to the occurrence of intramolecular H-bonding. Nevertheless, H-bond free energies are rigidly shifted among different solvents with, for example, a systematic ~13 kJ/mol gap between water and chloroform. Molecular crowding shows little effects on thermodynamic equilibrium but it induces pronounced variations on H-bond kinetics. The results are in quantitative agreement with available experimental measurements (in chloroform) and showcase a general strategy to interrogate molecular interactions in different environments, extending the limits of current experiments towards the prospective prediction of H-bond interactions in pharmaceutical, agrochemical, and technological contexts.

Hydrogen bonds are ubiquitous interactions that drive molecular recognition, 1,2 determine the properties of water and other polar solvents, 3,4 play critical roles in enzyme catalysis [5][6][7] and contribute to the structural stability 8 as well as to the specificity of drug-target complexes. 9,10 While H-bonds are typically conceived as interactions between pairs of molecules, intramolecular hydrogen bonds are widespread in biological molecules, 11,12 and are crucial in the design of new drugs and materials, [13][14][15][16][17] including supramolecular machines. [18][19][20][21][22][23] Unfortunately, characterization of intramolecular H-bonds (and exploitation thereof) is still partial, most likely as a consequence of complexities that cloud the interpretation of experimental data obtained in large and flexible entities. 11 Thus, it has long been known, for instance, that conformational flexibility posits a limit to the occurrence of intramolecular H-bonds, 24,25 but only recently Hubbard et al. 26 have quantified such a limit using competition experiments in CDCl3 within a controlled molecular context. As a matter of fact, these measurements in deuterated chloroform 26 set an unprecedented experimental reference in the field. Yet, in aqueous environments, the tiny population of Hbonded species still challenges the limit of their experimental quantification and it remains unclear to which extent H-bond measurements performed in noncompetitive solvents can be extended to polar solvents, such as water. 27,28 As additional layer of complexity, the behavior of intramolecular H-bonds in diluted water solutions could be altered by concentrated, or crowded, conditions that are likely to represent, better than diluted ones, the biological environment. 29,30 In this context, computer simulations could bring about a major productivity leap providing quantitative information on interacting processes escaping from spectroscopic detection. Whereas ab initio simulations can provide unparalleled details on H-bonds properties 31 Here, building on recent experimental data 26 we present a computational investigation on the influence of conformational flexibility on H-bonding in a strictly intramolecular context using a series of model compounds 26 ( Figure 1A) immersed in a polar solvent (water), polar aprotic (tetrahydrofuran), and an apolar one (chloroform, for which partial experimental data 26   a Beyond the low solubility of 1-10 (predicted logS in water < -1), the experimental detection of conformational ratios in water seems unfeasible for these compounds even using an indirect competitive binding approach. Indeed, if the △G of the competitive binding event (△G b ind ) is > 1 kJ/mol, no measurable binding is observed. 26 Using a strong acceptor (e.g.: phosphine oxide), we estimate a △G b ind of at least +4 kJ/mol even for compounds 7-10 that have a fairly accessible H-bond donor.
Thus, the free energy associated to the intramolecular H-bond in a given solvent, △Gintra, can be obtained as −kBTlnP(folded)/P(unfolded), where kB is the Boltzman constant, T is the temperature and P(folded) and P(unfolded) is the population of folded and unfolded conformers, respectively.
Chloroform-derived results for compounds 1-10 show a non-trivial dependence between △Gintra and the donor-acceptor separation ( Figure Figure 3C and Supporting Information, SI). We note that increments of the linker size progressively increase the entropy cost (-T△Sintra; Figure 3C, blue triangles) of circularization, with an average penalty of ~3 kJ/mol per rotor. As the number of rotors goes beyond 6, the entropic cost becomes larger than the enthalpic gain (△Hintra, blue points in Figure 3C) and the intramolecular H-bond in chloroform is disfavored. Overall, for 7 or more rotors, the linker length enables the folded state to retain a conformational freedom that makes the resulting -T△Sintra values changing regime.
Contrary to chloroform solutions, the prevalence of circular topologies in water solvent is tiny as only a little population of intramolecular H-bond is observed in the series of compounds studied here (Figure 3, red plots). As expected, water molecules competed avidly with the intramolecular H-bond. As a result, the entropic cost of cyclization is higher than the enthalpic gain arising from intramolecular H-bond formation (red plots in Figure 3C). By comparing in-water and in-chloroform △Gintra values, we note that, depending on the environment, compounds 2-5 will populate different conformations and present different polarities-thus acting as molecular chameleons 39,40 whose conformation can, for example, modulate membrane permeability. 13,41,b This behavior occurs in a size range relevant for small-molecule pharmaceuticals and should be considered for more accurate in silico prediction of ADME profiles. [14][15][16] Very interestingly, the △Gintra vs linker-length profiles in water (red plots) and chloroform (blue plots) are clearly correlated ( Figure 3A); the intramolecular interactions are systematically 12-15 kJ/mol weaker in water than in chloroform, which is commensurate with the values proposed by other investigations. 27 The thermodynamic quantities in Figure 3C show that the fine balance between enthalpy and entropy has a clear relationship with the solvent. However, the overall balance between enthalpy and entropy (i.e.: △Gintra) controlling the formation of intramolecular cycles is, except for a rigid shifting factor, solvent independent. This is the case also when the △Gintra vs linker-length profile is collected in tetrahydrofuran (THF, Figure 3A, orange plot), suggesting that our conclusions can be extended to other pure solvents. We further note that such constant offset behavior on moving to different solvents, resembles the experimental data obtained when the competitive external binder was varied from the strong phosphine oxide acceptor to the weaker sulfinyl one. 26 Finally, as intramolecular H-bonding is of paramount importance in biology, we tested whether the above results and correlations hold also in highly concentrated, crowded conditions that can be reached, for instance, in the cellular environment (with up to 450 g/l of macromolecules). 29 Figure 4A). On the other hand, also the folding process is slowed down as the metastable conformation observed at a distance of ~0.4 nm can be stabilized by a bridging water molecule ( Figure 4A) that becomes less available in crowded conditions. 29 We argue that beyond the unspecific role of viscosity in decreasing molecular diffusion, 29 the reduction of the number of water molecules able to diffuse into and away from the H-bond in crowded conditions provides a mechanistic interpretation for the observed behavior ( Figure 4B). This model agrees with, and provides a "single-interaction perspective" to, previous studies showing that alteration of water dynamics in crowded conditions slows down protein kinetics. 47,48 The model also generalizes the behavior observed in ligand-receptor kinetics where the water accessibility of hydrogen bonding moieties can modulate ligand binding and unbinding rates. [49][50][51] In summary, we have shown that the characterization, by carefully set up

Methods
The 3-D structure of compounds 1-10 was built using Maestro from Schrödinger LCC. Compounds were modeled with the Generalized Amber Force Field (GAFF) 53 and partial atomic charges were assigned to the extended conformation using the Restrained Electrostatic Potential (RESP) fit 54 at the HF/6-31G* level of theory using the RED server. 55 See SI for further discussion on the charge model and the transferability of the computational approach. Parameters and topology files were prepared with Acpype. 56 Compounds were solvated with a 0.8 nm thick box of TIP3P 57 water molecules with periodic boundary conditions; for simulations in chloroform 58,59 and tetrahydrofuran 58,59 a 1.5 nm-thick solvent box was used. Crowded environments were created by adding to the system box multiple copies of polyethylene glycol (PEG, H−(O−CH2−CH2)n−OH; with n=2); PEG was modeled with GAFF and RESP charges.
Each system was equilibrated at constant pressure and temperature (1 atm, 298 K); production runs were evolved at 298 K in the NVT ensemble with the velocity rescaling thermostat. 60 Hamiltonian Replica EXchange (H-REX) simulations 37 used 16 replicas with the scaling of all the atoms of the solute 61 with values of λ ranging from 1 to 0.59. Exchanges were attempted every 500 steps. In infinitely diluted conditions (one solute molecule in the box), each replica run for 50 ns with a cumulative sampling time of 800 ns per compound per solvent used. In crowded conditions (with PEG) each replica either run for 100 ns (at 200 g/l) or for 200 ns (at 450 g/l). The effective temperature of each replica relates to the scaling factor λ of the replica as Teff=298K/λ and the values of lnKintra were plotted against 1000/Teff to estimate variations of entropy and enthalpy. All the simulations were run using GROMACS 4.6.7 62 patched with PLUMED 2.1 63 and the H-REX implementation. 36 Structural figures were made with VMD 64 and chemical structures drawn with Marvin 18.1 (ChemAxon).

Supporting Information
Full methodological details on system set up, MD simulations and calculations.