Experimental test of ensemble inequivalence and the fluctuation theorem in the force ensemble in DNA pulling experiments

We experimentally test the validity of the Crooks fluctuation theorem (CFT) in the force ensemble by pulling DNA hairpins, first with magnetic tweezers, next with optical tweezers using force feedback. The CFT holds when using the definition of work Wf = − ∫ xdf , where x is the molecular extension and f is the force. In contrast, it does not hold when using the usual definition, appropriate for the constant extension ensemble, Wx = ∫ f dx, showing the importance of the contribution of boundary terms to the full entropy production in a clear example of statistical ensemble inequivalence in small systems. We also evaluate the differences in the average dissipated work in the force ensemble as compared to the extension ensemble, highlighting ensemble inequivalence also at the level of molecular kinetics.


I. INTRODUCTION
Fluctuation theorems are mathematical identities that allow the recovery of thermodynamic properties from nonequilibrium experiments on driven microsystems, and they find multiple applications in biophysics [1]. Experiments are done by changing a control parameter following a time-dependent protocol. In small systems, where fluctuations dominate the microscopic behavior, the choice of the control parameter defines the proper statistical ensemble [2]. Control parameters can be extensive or intensive depending on whether they scale with the system size. Examples of small systems controlled by extensive variables are single molecules pulled by laser optical tweezers (LOT) and atomic force microscopy (AFM) devices, where the optical trap-bead distance and the cantilever-surface distance scale proportionally to the polymer length. In contrast, in magnetic tweezers (MT) and acoustic force spectroscopy (AFS) the control parameter (magnetic force and acoustic pressure, respectively) is intensive and does not scale with the length of the polymer [3][4][5]. In bulk systems the equation of state does not depend on the statistical ensemble. For instance, controlling the applied pressure or the volume of a gas in a piston leads to the same equation of state. However, such ensemble equivalence is not true in general. In fact, a considerable body of literature addresses ensemble inequivalence from theoretical, numerical, and experimental perspectives in a wide variety of systems, from mesoscopic and macroscopic solid systems such as cracking fiber bundles (e.g., paper) [6] and magnetic materials exhibiting martensitic transitions [7], down to microscopic polymers [8][9][10] and gold nanofibers [11]. Thanks to developments in micromanipulation technologies we can address the problem of ensemble inequivalence using single polymers [12] as a model system. Mechanical systems (where force and extension are quantities that can be controlled) present clear advantages in comparison to other systems such as magnetic and electrical systems, where it is possible to control the magnetic field and the voltage (the analogs of the mechanical force), but not the magnetization or the electrical current (the analogs of the extension).
* fritort@gmail.com
The energy of a driven physical system coupled to a heat bath at a fixed temperature $T$ is given by a Hamiltonian or energy function, $H(\lambda, t)$, where $\lambda$ is the control parameter acting on the system and $t$ is the time. The mechanical work exerted on the system when varying the control parameter from $\lambda_0$ to $\lambda_1$ is given by
$$W = \int_{\lambda_0}^{\lambda_1} \frac{\partial H}{\partial \lambda}\, d\lambda. \tag{1}$$
To illustrate how the control parameter choice constrains the physical description of the system, consider a single polymer with controlled extension, $\lambda = x$. This situation corresponds to the extensional ensemble (hereafter referred to as ExtEns). If the polymer is stretched by increasing the extension from $x_0$ to $x_1$, the mechanical work is given by the classical work expression [13]:
$$W_x = \int_{x_0}^{x_1} f\, dx, \tag{2}$$
where $f = \partial_x H$ is the mechanical force acting on the ends of the polymer. However, if we control the mechanical force ($\lambda = f$), the mechanical work performed in a protocol where the force is changed from $f_0$ to $f_1$ is given by
$$W_f = -\int_{f_0}^{f_1} x\, df, \tag{3}$$
with $x = -\partial_f H$. This situation corresponds to the force ensemble (hereafter referred to as ForceEns). Both work definitions are related by boundary terms via a Legendre transformation using extension and force as conjugate pairs:
$$W_f = W_x - \Delta(xf), \qquad \Delta(xf) = x_1 f_1 - x_0 f_0.$$
The Crooks fluctuation theorem (CFT) connects irreversible work measurements with free energy differences [14]. The CFT states that the probability distribution $P_F(W)$ of the work done on a system that is driven out of equilibrium in a finite-time protocol (forward, F process) satisfies the relation
$$\frac{P_F(W)}{P_R(-W)} = \exp\left(\frac{W - \Delta G}{k_B T}\right), \tag{4}$$
where $P_R(-W)$ is the work distribution corresponding to the time-reversed (reversed, R process) protocol, $\Delta G$ is the equilibrium free energy difference between the initial and final state, $k_B$ is the Boltzmann constant, and $T$ is the bath temperature. The CFT holds for systems initially in equilibrium, independently of how far from equilibrium the system is driven [15]. In general, it has been shown that Eq. (4) holds when the full work is measured, while it does not hold when partial work measurements are done [16] or when the transferred rather than the accumulated work is measured in controlled-extension protocols using LOT experiments [17,18].
Despite its importance for applications, the CFT has not been tested in the case of force-controlled single-molecule experiments. MT and AFS are high-throughput techniques that manipulate multiple molecules in parallel and in which force is the natural control parameter (magnetic field in MT and acoustic pressure in AFS). Testing the validity of the CFT in these cases is essential to extend the applicability of free-energy recovery methods to high-throughput single-molecule techniques. Moreover, the CFT holds under general assumptions of microscopic reversibility and detailed balance, and might be used to test the validity of the work definition. The correctness of the theoretical work definition in the ForceEns, Eq. (3), for the CFT, Eq. (4), is by now widely accepted by the scientific community. However, there has recently been controversy in this regard [19][20][21][22]. We thought it might be useful and illustrative to carry out a definitive experimental verification of the correctness of Eq. (3) in the CFT. Moreover, by comparing the ForceEns and the ExtEns, we can show the importance of the often-neglected boundary terms in the CFT. These are essential as they strongly depend on the experimental conditions. Their study may also allow experimentalists to gather useful information to test large deviation theories, by mapping the large deviation functions between different ensembles.
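The relation between the two work definitions can be checked numerically on any sampled force-extension trajectory. The following is a minimal sketch, not the analysis code used in the experiments: the trajectory is a hypothetical toy curve (a smooth elastic branch plus a sigmoidal step of about 18 nm), the function names are illustrative, and the integrals $W_x = \int f\,dx$ and $W_f = -\int x\,df$ are approximated with the trapezoidal rule.

```python
import math

def trapezoid(y, x):
    """Trapezoidal estimate of the integral of y dx along a sampled path."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

def work_ext(x, f):
    """W_x = integral of f dx (extension-controlled definition)."""
    return trapezoid(f, x)

def work_force(x, f):
    """W_f = -integral of x df (force-controlled definition)."""
    return -trapezoid(x, f)

# Hypothetical force ramp (pN) and toy extension response (nm):
# elastic branch plus a sigmoidal unfolding step near 14 pN.
n = 400
f = [5.0 + 15.0 * i / (n - 1) for i in range(n)]
x = [10.0 * math.log(1.0 + fi)
     + 18.0 / (1.0 + math.exp(-(fi - 14.0) / 0.2)) for fi in f]

w_x = work_ext(x, f)
w_f = work_force(x, f)
boundary = x[-1] * f[-1] - x[0] * f[0]  # Delta(xf) = x1*f1 - x0*f0

print(f"W_x = {w_x:.1f} pN nm, W_f = {w_f:.1f} pN nm, "
      f"Delta(xf) = {boundary:.1f} pN nm")
```

With the trapezoidal rule the identity $W_x - W_f = \Delta(xf)$ holds to rounding error for any sampled path, so the two work values differ only through the end points $(x_0, f_0)$ and $(x_1, f_1)$.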
There is also a fundamental interest in characterizing irreversibility in the force ensemble. Besides thermodynamics, kinetics is also strongly dependent on the statistical ensemble. The different DNA folding-unfolding rates measured in LOT experiments with and without force feedback [23][24][25] suggest stronger irreversibility and dissipation in the ForceEns as compared to the ExtEns. Here we show that energy dissipation in pulling experiments in the ForceEns is always larger than in the ExtEns and derive a general phenomenological expression for such difference. This demonstrates the strong effect of thermal fluctuations on the kinetics of small systems, depending on whether intensive variables (e.g., force or pressure) rather than extensive ones (e.g., extension or volume) are controlled.

II. FLUCTUATION THEOREM
Controlled-force experiments were done using MT. We synthesized a 20-base-pair (bp) DNA hairpin [Fig. 1(a)] flanked by two 29-bp double-stranded DNA handles [23]. The molecular construct is tethered between a glass surface and a superparamagnetic 1-μm bead that is captured in a magnetic trap generated by a pair of permanent magnets [Fig. 1(b)]. The exerted mechanical force is modulated by the magnetic field gradient, which increases as the magnets approach the glass surface. Thus, by moving the magnets toward the glass surface at constant velocity, the DNA hairpin is stretched until it unfolds. This process is carried out from an initial force $f_0$, where the hairpin is folded, up to a final force $f_1$, where it is unfolded. Next, starting from $f_1$, the magnets are moved away from the glass surface following the time-reversed protocol until the force $f_0$ is reached. The extension of the DNA hairpin is obtained from 3D detection of the position of the bead [3]. Typical force-distance curves (FDCs) can be seen in Fig. 1(c), where dark (light) curves represent the F (R) process.
According to Eqs. (2) and (3), $W_f$ and $W_x$ are given by the shaded areas shown in Fig. 2(a). The $W_f$ probability distributions in the F and R processes [Fig. 2(b)] have been obtained using Eq. (3). The work value at the crossing point between both distributions ($= \Delta G$) does not change with the pulling rate, as expected [14]. The CFT is tested by taking logarithms on both sides of Eq. (4) and plotting $\ln[P_F(W)/P_R(-W)]$ as a function of $W/k_B T$. When the CFT holds, the data fall on a straight line of slope 1 and y-intercept equal to $-\Delta G/k_B T$, so that the line crosses zero at $W = \Delta G$ (i.e., the intersection between symbols and the horizontal black line). The top panel of Fig. 2(d) shows that the CFT is fulfilled for $W_f$. When the work is computed according to Eq. (2), notwithstanding the fact that force rather than the molecular extension is the control parameter, the distributions of $W_x$ also present intersecting points that are independent of the pulling rate [Fig. 2(c)]. However, the CFT is not fulfilled [bottom panel of Fig. 2(d); the slopes of the fits are 0.07 ± 0.01 and 0.33 ± 0.04 for the 9.0 pN/s and 22.5 pN/s pulling rates, respectively]. The breakdown of the CFT indicates that $W_x$ does not measure the correct thermodynamic work. In fact, the missing contribution in $W_x$ is the boundary term $\Delta(xf) = x_1 f_1 - x_0 f_0$. This term is not constant but fluctuates over pulling cycles, as the initial and final extensions $x_0$, $x_1$ are fluctuating variables (whereas $f_0$, $f_1$ are fixed). In other words, the boundary term $\Delta(xf)$ is a stochastic variable that contributes to the tails of the work distributions, which are crucial for testing the validity of the CFT in the work crossing region.
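The CFT test described above (logarithm of the ratio of work distributions plotted against the work, followed by a linear fit) can be sketched as follows. This is an illustrative stand-in rather than the procedure used for Fig. 2: it uses plain histograms instead of the kernel-density estimator, an unweighted least-squares fit, and synthetic Gaussian work values with hypothetical parameters. Gaussian work distributions satisfy the CFT exactly when their variance equals twice the mean dissipated work (in $k_B T$ units), so the fit is expected to return a slope close to 1 and a zero crossing close to $\Delta G$.

```python
import math
import random

def crooks_fit(w_forward, w_reverse_neg, lo, hi, width):
    """Fit a line to ln[P_F(W)/P_R(-W)] versus W (work in units of k_B T).

    w_forward     -- work values measured in the forward (F) process
    w_reverse_neg -- minus the work values measured in the reverse (R) process
    Returns (slope, intercept) of an unweighted least-squares fit.
    """
    nbins = int(round((hi - lo) / width))

    def density(samples):
        counts = [0] * nbins
        for w in samples:
            k = math.floor((w - lo) / width)
            if 0 <= k < nbins:
                counts[k] += 1
        return [c / (len(samples) * width) for c in counts]

    p_f, p_r = density(w_forward), density(w_reverse_neg)
    # Keep only bins populated in both histograms (the overlap region).
    pts = [(lo + (k + 0.5) * width, math.log(p_f[k] / p_r[k]))
           for k in range(nbins) if p_f[k] > 0 and p_r[k] > 0]
    n = len(pts)
    mx = sum(w for w, _ in pts) / n
    my = sum(y for _, y in pts) / n
    slope = (sum((w - mx) * (y - my) for w, y in pts)
             / sum((w - mx) ** 2 for w, _ in pts))
    return slope, my - slope * mx

# Synthetic Gaussian work values (hypothetical numbers, in k_B T).
rng = random.Random(0)
dG, w_dis = 10.0, 2.0
sigma = math.sqrt(2.0 * w_dis)  # variance = 2 <W_dis> makes the CFT exact
w_F = [rng.gauss(dG + w_dis, sigma) for _ in range(100_000)]
w_R_neg = [rng.gauss(dG - w_dis, sigma) for _ in range(100_000)]

slope, intercept = crooks_fit(w_F, w_R_neg, lo=7.0, hi=13.0, width=0.5)
print(f"slope = {slope:.2f}, zero crossing = {-intercept / slope:.2f} k_B T")
```

Applying the same fit to work values computed with the definition that does not match the controlled variable would be expected to give slopes well below 1, as reported for $W_x$ in the MT data.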
Like the mechanical work, the free energy difference in the ForceEns, $\Delta G_f$, is related to the free energy difference in the ExtEns, $\Delta G_x$, via
$$\Delta G_f = \Delta G_x - \langle \Delta(xf) \rangle.$$
Angular brackets denote the average over all experimental realizations. We obtain $\Delta G_f = -32 \pm 6\ k_B T$ and $\Delta G_x = 80 \pm 7\ k_B T$ in the ForceEns and ExtEns, respectively. After subtracting stretching contributions (the different energetic contributions are described in Ref. [27]), a value of $\Delta G_0 = 49 \pm 7\ k_B T$ is obtained. This result is in very good agreement with the theoretical prediction obtained using the nearest-neighbor model for DNA [28,29], which gives $\Delta G_0 = 51\ k_B T$.
In LOT the position of the trap is the default control parameter, whereas the molecular extension and the force are fluctuating quantities. However, using force feedback control [30] the position of the optical trap is actively adjusted while force is kept constant. We performed experiments in LOT in the standard passive mode (ExtEns) and in the active feedback mode (ForceEns). The latter are compared to MT measurements. Typical FDCs for LOT in the ExtEns (ForceEns) mode are shown in the top (bottom) panel of Fig. 3(a). Figure 3(b) shows work distributions calculated using Eqs. (2) and (3), respectively, and the test of the CFT is shown as insets. In all cases, the CFT prediction is fulfilled when the appropriate work definition is used, whereas it fails when an unsuitable work definition is used [27]. We stress that the obtained free energy value is compatible with the theoretical prediction if the work is appropriately calculated (values in caption of Fig. 3).

III. DISSIPATION AND KINETICS
Besides thermodynamics, irreversibility effects and dissipation are yet another sign of ensemble inequivalence. From thermodynamics, the average dissipated work per cycle is defined as $\langle W_{\rm dis} \rangle = \langle W \rangle - \Delta G$ (where $\langle W \rangle$ is the average mechanical work per cycle), and it can be estimated in bidirectional pulling experiments as $2\langle W_{\rm dis} \rangle = \langle W \rangle_F - \langle W \rangle_R$, where $\langle W \rangle_{F(R)}$ is the mean value of the F (R) work distribution.
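As a small illustration of the bidirectional estimate, the sketch below evaluates half the difference between the mean F and R work values. It assumes that the reverse work values are entered with the sign convention of $P_R(-W)$, so that both distributions sit around $\Delta G$; the function names and the work values (in $k_B T$) are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def dissipated_work(w_forward, w_reverse_neg):
    """Average dissipated work from 2 <W_dis> = <W>_F - <W>_R.

    w_reverse_neg holds the reverse-process work with the sign
    convention of P_R(-W) (an assumption of this sketch).
    """
    return 0.5 * (mean(w_forward) - mean(w_reverse_neg))

# Hypothetical work values in k_B T.
w_F = [13.1, 11.4, 12.6, 10.9]
w_R_neg = [8.3, 7.2, 9.1, 7.4]

w_dis = dissipated_work(w_F, w_R_neg)
print(f"<W_dis> = {w_dis:.2f} k_B T")
```

For these numbers the two means are 12.0 and 8.0 $k_B T$, so the estimate is 2.0 $k_B T$.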
Figure 3(c) shows $\langle W_{\rm dis} \rangle$ as a function of the pulling rate $r$ for all experiments. Note that, under equivalent pulling rate conditions, dissipation is always lower in the ExtEns than in the ForceEns. As shown in Fig. 3(c), the theoretical prediction agrees perfectly with the experimental results.
In Ref. [31] an expression for the average dissipated work in two-state systems, $\langle W_{\rm dis} \rangle$, has been derived for pulling experiments in the ForceEns [27]. For two-state systems (such as the folded or unfolded state of the DNA hairpin), the folding-unfolding kinetic rates $k_{F \to U}$, $k_{U \to F}$ can be written as [32][33][34][35]
$$k_{F \to U}(f) = k_m \exp\left(\frac{f x_F}{k_B T}\right), \tag{5}$$
$$k_{U \to F}(f) = k_m \exp\left(\frac{\Delta G_{FU} - f x_U}{k_B T}\right), \tag{6}$$
where $k_m$ is the unfolding kinetic rate at zero force, $\Delta G_{FU} = f_c x_m$ is the free energy difference between states F and U [27], $f_c$ is the force at which states F and U are equally populated (i.e., the coexistence force), and $x_m = x_F + x_U$ is the molecular extension. The difference found in the average dissipation between the ForceEns and the ExtEns relies on the molecular kinetics. In the ForceEns, the unfolding transition occurs at constant force, keeping folding kinetics unchanged. In contrast, in the ExtEns every unfolding event is followed by a force jump, speeding up folding kinetics as compared to the ForceEns. Figure 4 shows schematic depictions of an arbitrary unfolding event (solid lines) in the ForceEns and in the ExtEns when the kinetic rates are modeled according to Eqs. (5) and (6). Kinetic rates (and the overall relaxation rate) are always higher in the ExtEns.
FIG. 4. Illustration of ensemble dependence of coexistence kinetic rates. Hopping kinetics at coexistence in the ForceEns (fixed point) and ExtEns (two arrow line). Kinetic rates are always higher in the ExtEns. $\Delta f$ is the force jump when the molecule unfolds.
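The effect of the force jump on the folding rate can be illustrated with a short sketch. It assumes Bell-Evans rates of the form $k_{F \to U}(f) = k_m \exp(f x_F/k_B T)$ and $k_{U \to F}(f) = k_m \exp[(f_c x_m - f x_U)/k_B T]$, which is our reading of the definitions given in the text; all parameter values ($k_m$, $x_F$, $x_U$, $f_c$, $\Delta f$) are hypothetical and chosen only to give rates of a few events per second at coexistence.

```python
import math

KBT = 4.11  # pN nm, thermal energy near 298 K (assumed value)

def k_unfold(f, k_m, x_f):
    """Unfolding rate F -> U at force f (assumed Bell-Evans form)."""
    return k_m * math.exp(f * x_f / KBT)

def k_fold(f, k_m, x_u, f_c, x_m):
    """Folding rate U -> F at force f, with Delta G_FU = f_c * x_m."""
    return k_m * math.exp((f_c * x_m - f * x_u) / KBT)

# Hypothetical parameters (not taken from the experiments).
k_m = 1e-13          # 1/s, unfolding rate at zero force
x_f, x_u = 9.0, 9.0  # nm, distances from F and U to the transition state
x_m = x_f + x_u      # nm, molecular extension
f_c = 14.5           # pN, coexistence force
df = 1.0             # pN, force drop after unfolding in the ExtEns

# ForceEns: the force stays at f_c after unfolding, so both rates coincide.
k_c_unfold = k_unfold(f_c, k_m, x_f)
k_c_fold = k_fold(f_c, k_m, x_u, f_c, x_m)

# ExtEns: unfolding lowers the force by df, which speeds up refolding.
k_fold_after_jump = k_fold(f_c - df, k_m, x_u, f_c, x_m)

print(f"k_c = {k_c_unfold:.2f} 1/s, "
      f"folding rate after jump = {k_fold_after_jump:.2f} 1/s")
```

In this sketch the folding rate after the jump exceeds the coexistence rate by the factor $\exp(\Delta f\, x_U / k_B T)$, which is the qualitative mechanism described above for the lower dissipation in the ExtEns.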
The same expression for $\langle W_{\rm dis} \rangle$ has been shown to be applicable to the ExtEns by appropriately rescaling the kinetic rates at the coexistence transition between the folded and unfolded states [31]: $k_{\rm ForceEns}$ equals $k_{\rm ExtEns}$ multiplied by a rescaling factor, given by Eq. (7), that depends on $\mu$, $x_m$, and $\Delta f$, where $\mu$ is the fragility parameter [36], $x_m$ is the molecular extension of the DNA hairpin, and $\Delta f$ is the force jump at the coexistence transition in the ExtEns.

IV. DISCUSSION
In this paper we studied the problem of ensemble inequivalence at the single-molecule level. To this end, we performed nonequilibrium pulling experiments on small DNA hairpins by applying a mechanical force to the ends of the molecule, inducing their unfolding and folding. We carried out experiments in the ForceEns, both with MT and LOT with force feedback, and in the ExtEns with LOT. We found that the boundary terms in the definition of thermodynamic work have a pivotal role in the validity of the CFT and, hence, in free energy recovery methods. The presented results show that in the ForceEns Eq. (3) needs to be used for the computation of the correct thermodynamic work, resolving the recent controversy in the field [19][20][21][22].
Besides the effects on thermodynamics, we showed that the ensemble choice also affects the kinetic response of the system. In particular, the average dissipated work, which is governed by the molecular kinetics, strongly depends on the nature of the control parameter. Using a two-state model for the folding and unfolding of DNA hairpins we are able to reproduce the experimental results in both ensembles. We observed higher dissipation when an intensive variable is controlled (such as the mechanical force) than when an extensive variable is controlled (such as the molecular extension).
In general, fluctuations of intensive variables in the ExtEns lead to effectively higher kinetic rates in thermally activated processes. The characteristic Arrhenius dependence of kinetic rates, $k \sim \exp(-B/k_B T)$, and the fluctuating nature of the barrier, $B$, together with Jensen's inequality, give $k_{\rm ExtEns} \sim \langle \exp(-B/k_B T) \rangle > \exp(-\langle B \rangle/k_B T) \sim k_{\rm ForceEns}$. In turn, the average dissipated work is expected to scale like $\langle W_{\rm dis} \rangle \sim P/k$, with $P$ being a characteristic driving power ($\sim x_m r$ in our pulling experiments), giving $\langle W_{\rm dis} \rangle^{\rm ExtEns} < \langle W_{\rm dis} \rangle^{\rm ForceEns}$. Note that Eq. (7) can be written as an exponential of the form exp(−a|…).
The conclusions of our single-molecule study might be generalized to other physical contexts. For example, in the pressure-volume context of liquids the expression analogous to Eq. (7), referred to as Eq. (8), would involve a constant $b \sim O(1)$, the volume $V$, the root-mean-square deviation of pressure fluctuations $\Delta P$, and the isothermal compressibility $\kappa_T = -(1/V)(\partial V/\partial P)_T$. Equation (8) might be applicable to small liquid droplets in compartmentalized environments where volume rather than pressure defines the ensemble. For example, in cells, the scale of pressure fluctuations is given by the osmotic pressure due to solute concentration differences. Let us consider a cell of typical size 10 μm with fixed volume $V = 10^3\ \mu{\rm m}^3$, osmotic pressure fluctuations $\Delta P \approx 100$ Pa (osmotic pressure differences can be as large as 300 Pa [37]), and isothermal compressibility $\kappa_T$ of water as small as $4 \times 10^{-10}\ {\rm Pa}^{-1}$. Inserting these values in Eq. (8) with $k_B T = 4$ pN nm $= 4 \times 10^{-21}$ N m at $T = 298$ K we obtain a factor $\approx \exp(-b)$, which is of order 1 if $b$ is of order 1, as assumed. This result suggests that the kinetics of molecular reactions inside cellular compartments [38,39] might be sensitive to the ensemble.
The figures inserted in Eq. (8) should be taken only as a guide, since the expression is strongly sensitive to the three terms appearing in the exponent: $\Delta P$, $V$, and $\kappa_T$. Indeed, for molecular reactions in much smaller compartments $V$ can be a thousand times smaller, although the magnitude of the pressure fluctuations, $\Delta P$, can also be comparatively larger. Moreover, $\kappa_T$ need not be as small as for pure water ($4 \times 10^{-10}\ {\rm Pa}^{-1}$); the bulk modulus of the cellular solvent could be larger at finite frequencies under nonequilibrium conditions. We are still far from making a definite statement about the relevant effects of ensemble inequivalence in molecular reactions inside the cell. Further studies specifically addressing this question are needed. In this regard, single-molecule experiments of molecular folding in crowded environments offer an interesting research track to follow.
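The order-of-magnitude estimate for the cell can be reproduced with a few lines of arithmetic. The sketch below assumes that the exponent of Eq. (8) is $b$ times the dimensionless group $V (\Delta P)^2 \kappa_T / k_B T$; this is our reading of the three terms listed in the text and of dimensional consistency, not a form taken from the equation itself. The function name is illustrative.

```python
def exponent_group(volume_m3, delta_p_pa, kappa_t_per_pa, kbt_joule):
    """Dimensionless group V * (Delta P)^2 * kappa_T / (k_B T).

    Assumed here to set the scale of the exponent in Eq. (8).
    """
    return volume_m3 * delta_p_pa ** 2 * kappa_t_per_pa / kbt_joule

# Values quoted in the text for a cell of typical size 10 micrometers.
V = 1e3 * (1e-6) ** 3   # 10^3 cubic micrometers expressed in m^3
dP = 100.0              # Pa, osmotic pressure fluctuations
kappa_T = 4e-10         # 1/Pa, isothermal compressibility of water
kBT = 4e-21             # J (4 pN nm at 298 K)

group_cell = exponent_group(V, dP, kappa_T, kBT)

# Same pressure fluctuations in a compartment a thousand times smaller.
group_small = exponent_group(V / 1000.0, dP, kappa_T, kBT)

print(f"cell: {group_cell:.3g}, small compartment: {group_small:.3g}")
```

With the quoted values the group equals 1, consistent with the factor $\approx \exp(-b)$ stated above. Because the group is linear in $V$ and $\kappa_T$ and quadratic in $\Delta P$, a compartment a thousand times smaller needs pressure fluctuations roughly 30 times larger to give the same exponent.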

Folding free-energy recovery
The free-energy difference $\Delta G$ determined from the Crooks fluctuation theorem (CFT) contains contributions from the molecule and the handles in the case of magnetic tweezers (MT) and, additionally, from the bead in the optical trap in laser optical tweezers (LOT). Therefore, for MT experiments $\Delta G_x$ combines the folding free energy with the reversible work of stretching the molecule and the handles. Here $\Delta G_x$ is the free-energy difference between the unfolded (U) state and the folded (F) state calculated in the extensional ensemble (ExtEns), and $\Delta G_0$ is the folding free energy at zero force. $\Delta W^{\rm rev}_{\rm st} = W^{U}_{\rm st} - W^{F}_{\rm st}$ is the difference between the reversible work required to stretch the unfolded single-stranded DNA molecule from zero force up to a maximum force $f_{\max}$ (force applied to the molecular system at $x_1$) and the reversible work needed to align the folded DNA hairpin along the force axis from 0 to $f_{\min}$ (force applied to the system at $x_0$):
$$\Delta W^{\rm rev}_{\rm st} = \int_0^{x_U(f_{\max})} f_U(x)\, dx - \int_0^{x_F(f_{\min})} f_F(x)\, dx,$$
where $f_U(x)$ [$f_F(x)$] and the inverse function $x_U(f)$ [$x_F(f)$] are the equations of state of the unfolded (folded) DNA. The first integral is calculated using the worm-like chain (WLC) model [40]:
$$f_U(x_n) = \frac{k_B T}{P} \left[ \frac{1}{4\left(1 - x_n/L_c^n\right)^2} - \frac{1}{4} + \frac{x_n}{L_c^n} \right],$$
where $k_B$ is the Boltzmann constant, $T$ is the absolute temperature, $P$ is the persistence length, $x_n$ is the extension of the $n$ bases of released single-strand nucleic acid, $L_c^n = n d_b$ is the contour length, and $d_b$ is the average interphosphate distance. We have used $P = 1.35 \pm 0.05$ nm and $d_b = 0.59$ nm/base [41]. The second integral, in contrast, is computed according to the freely jointed chain (FJC) model, considering the hairpin as a single dipole with fixed diameter $d = 2$ nm and equal Kuhn length [42].
The term $W^{\rm rev}_{\rm handles}$ is the reversible work needed to stretch the handles from $f_{\min}$ to $f_{\max}$:
$$W^{\rm rev}_{\rm handles} = \int_{x_{\rm handles}(f_{\min})}^{x_{\rm handles}(f_{\max})} f_{\rm handles}(x)\, dx.$$
The handles are modeled according to the extensible WLC model using the Bouchiat interpolation formula [43]. The elastic parameters are taken from Ref. [23]: the persistence length is $P = 1.6 \pm 0.3$ nm, the stretching modulus is $Y = 16.9 \pm 1.3$ pN, and the interphosphate distance is $d_b = 0.34$ nm/base. Finally, $W^{\rm rev}_{\rm bead}$ is the reversible work needed to pull the optically trapped bead from $f_{\min}$ to $f_{\max}$:
$$W^{\rm rev}_{\rm bead} = \int_{f_{\min}}^{f_{\max}} \frac{f}{k_{\rm bead}(f)}\, df,$$
where $k_{\rm bead}(f)$ is the force-dependent stiffness of the optical trap determined for the mini-tweezers instrument [23]. Since MT has an extremely large stiffness, this last term does not contribute significantly to the total free energy of the system. Note that $\Delta G_x = \Delta G_f + \langle \Delta(xf) \rangle$, where $\Delta G_f$ is the free-energy difference measured in the force ensemble (ForceEns) and $\langle \Delta(xf) \rangle$ is the average over all experimental realizations of the force and extension boundary terms.
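The reversible stretching work of the released single strand can be evaluated numerically from the WLC equation of state. The following is a minimal sketch under stated assumptions: it uses the standard WLC interpolation formula $f = (k_B T/P)[1/(4(1 - x/L_c)^2) - 1/4 + x/L_c]$, the values $P = 1.35$ nm and $d_b = 0.59$ nm/base quoted above, and hypothetical choices for the number of released bases ($n = 44$), the maximum force (20 pN), and the thermal energy (4.11 pN nm). The inversion $x(f)$ is done by bisection and the integral by the trapezoidal rule; function names are illustrative.

```python
import math

KBT = 4.11  # pN nm, thermal energy near 298 K (assumed value)

def wlc_force(x, contour, persistence):
    """WLC interpolation formula: force (pN) at extension x (nm)."""
    z = x / contour
    return (KBT / persistence) * (0.25 / (1.0 - z) ** 2 - 0.25 + z)

def wlc_extension(f, contour, persistence):
    """Invert the WLC formula by bisection: extension (nm) at force f (pN)."""
    lo, hi = 0.0, contour * (1.0 - 1e-9)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, contour, persistence) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def stretching_work(f_max, contour, persistence, steps=2000):
    """Reversible work (pN nm): integral of f dx from 0 to x(f_max)."""
    x_max = wlc_extension(f_max, contour, persistence)
    dx = x_max / steps
    forces = [wlc_force(i * dx, contour, persistence) for i in range(steps + 1)]
    return dx * (sum(forces) - 0.5 * (forces[0] + forces[-1]))

P_ss = 1.35   # nm, persistence length quoted in the text
d_b = 0.59    # nm/base, interphosphate distance quoted in the text
n = 44        # released bases (hypothetical: 2 x 20 stem bases + 4 loop bases)
f_max = 20.0  # pN (hypothetical maximum force)

L_c = n * d_b
x_max = wlc_extension(f_max, L_c, P_ss)
w_st = stretching_work(f_max, L_c, P_ss)
print(f"x(f_max) = {x_max:.2f} nm, W_st = {w_st:.1f} pN nm "
      f"= {w_st / KBT:.1f} k_B T")
```

The same routine applies to the handles if the extensible WLC model replaces `wlc_force`; that substitution is not shown here.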

Fluctuation Theorem for ensemble-wrong work definitions in Laser Optical Tweezers experiments
For the case of MT we have shown in the main text how the CFT holds [top panel of Fig. 2(d) in main text] when using the ForceEns work definition for the ForceEns [i.e., Eq. (3) of main text]. However, the bottom panel of Fig. 2(d) in main text shows how the CFT does not hold when the work definition corresponding to the ExtEns is used [i.e., Eq. (2) of main text].
When using LOT instruments the results are similar. The top panel of Fig. 5(a) shows the work probability distributions obtained using the ForceEns work definition [Eq. (3) of main text] for experiments performed in the passive mode (ExtEns). Clearly, the CFT is not satisfied, due to the use of an unsuitable work definition for the ExtEns [bottom panel of Fig. 5(a)]. The slopes for the 6 and 16 pN/s pulling rates are, respectively, 0.11 ± 0.03 and 0.31 ± 0.02 (both in $k_B T$ units).
Similarly, the top panel of Fig. 5(b) shows the distributions obtained using the ExtEns work definition in the case of active mode (ForceEns) experiments. Again, the CFT is clearly not satisfied [bottom panel of Fig. 5(b)]. The slopes for the 7 and 21 pN/s pulling rates are, respectively, 0.19 ± 0.02 and 0.17 ± 0.01 (both in $k_B T$ units).

Free-energy landscape and average dissipated work in the Force Ensemble
The free-energy landscape (FEL) maps all the available configurations of a DNA hairpin to their corresponding free energies [44]. The configurations are labeled according to a reaction coordinate. In the studied hairpin [Fig. 6(a)], the number of released base pairs, $n$, serves as the reaction coordinate when a mechanical force is applied to the ends of the hairpin.
To calculate the FEL shown in Fig. 6(b) we have used a formula in which the term $\Delta G_n^0$ accounts for the free energy of formation at zero force of the configuration with $n$ released base pairs. This term is sequence dependent and its values are obtained from Mfold [29].
The elastic response of the released single-strand nucleic acid is modeled according to the worm-like chain (WLC) model [40]. Thus, its free energy at a fixed force $f$ is obtained from $x_n$, the extension of the $n$ bases of released single-strand nucleic acid (calculated as the inverse function of the WLC model). The energy cost of orienting the double-helix diameter ($d = 2$ nm) along the direction of the force $f$ is accounted for separately. Under the action of an external force, $f$, short DNA hairpins [Fig. 6(a)] are usually described as two-state systems according to Kramers Bell-Evans theory [32][33][34][35] with kinetic rates $k_U$ and $k_F$ [see Fig. 6(b)]. These rates correspond to the

FIG. 2. Work distributions and CFT test in the force ensemble in MT experiments. (a) Illustration of an unfolding trajectory and measurement of the work value in the ForceEns and in the ExtEns as the area at left and below the FDC (dashed areas), respectively. (b) Work probability distributions obtained using Eq. (3). (c) Work probability distributions obtained using Eq. (2). Distributions have been obtained using a kernel-density estimator [26] with 1.6 $k_B T$ bandwidth. In both graphs solid (dashed) lines correspond to F (R) distributions. Vertical lines correspond to the experimental uncertainty of the reversible work value. (d) CFT plot in the ForceEns (top panel) and in the ExtEns (bottom panel). Dashed straight black lines have slopes equal to 1 in $k_B T$ units. All error bars have been obtained using the bootstrap method. Error bars shown in probability distributions are a subset of the total number of points in which the densities have been evaluated.

FIG. 5. Breakdown of CFT symmetry. (a) Work probability distributions (top panel) and CFT plot in the ForceEns [Eq. (3) in main text] for ExtEns experiments. (b) Work probability distributions (top panel) and CFT plot in the ExtEns [Eq. (2) in main text] for ForceEns experiments. In all cases probability distributions are obtained as described in the main text. Solid lines in the CFT plots correspond to a straight line with slope equal to 1 (in $k_B T$ units).
FIG. 6. Theoretical free-energy landscape. (a) Hairpin sequence. (b) FEL evaluated at the force at which the folded and the unfolded state have the same free energy (coexistence force, $f_c$). The dashed line corresponds to the position of the transition state (TS), and $x_{F(U)}$ are the distances from the folded (unfolded) state to the TS.