Constraints on deviations from ${\Lambda}$CDM within Horndeski gravity

Recent anomalies found in cosmological datasets such as the low multipoles of the Cosmic Microwave Background or the low redshift amplitude and growth of clustering measured by e.g., abundance of galaxy clusters and redshift space distortions in galaxy surveys, have motivated explorations of models beyond standard $\Lambda$CDM. Of particular interest are models where general relativity (GR) is modified on large cosmological scales. Here we consider deviations from $\Lambda$CDM+GR within the context of Horndeski gravity, which is the most general theory of gravity with second derivatives in the equations of motion. We adopt a parametrization in which the four additional Horndeski functions of time $\alpha_i(t)$ are proportional to the cosmological density of dark energy $\Omega_{DE}(t)$. Constraints on this extended parameter space using a suite of state-of-the art cosmological observations are presented for the first time. Although the theory is able to accommodate the low multipoles of the Cosmic Microwave Background and the low amplitude of fluctuations from redshift space distortions, we find no significant tension with $\Lambda$CDM+GR when performing a global fit to recent cosmological data and thus there is no evidence against $\Lambda$CDM+GR from an analysis of the value of the Bayesian evidence ratio of the modified gravity models with respect to $\Lambda$CDM, despite introducing extra parameters. The posterior distribution of these extra parameters that we derive return strong constraints on any possible deviations from $\Lambda$CDM+GR in the context of Horndeski gravity. We illustrate how our results can be applied to a more general frameworks of modified gravity models.


Introduction
The latest observations of the cosmic microwave background (CMB) by the Planck satellite [1][2][3][4] and of large structure clustering by several surveys e.g., [5,6] have provided us with a better understanding of the Universe. Indeed, despite the large increase in data volume and accuracy, the ΛCDM model seems to be extremely successful at describing all these data. The ΛCDM model is based on six free parameters that are adjusted to the observations, and assumes homogeneity, isotropy and that the law of gravity is Einstein's general relativity (GR). However, as data have improved, some "inconsistencies", "tensions", "anomalies", (renamed "curiosities" in [1]) with the ΛCDM model have appeared and while they seem to be not statistical significant, they are "persistent". These "anomalies" are: i) the low multipoles in the observed CMB temperature anisotropy power spectrum are lower than predicted by the standard model, ii) CMB (Planck) direct measurements of the lensing potential are mildly in tension with the prediction of the angular power spectrum best-fitting ΛCDM model , iii) the redshift space distortion of galaxy clustering data, the BOSS Lyman−α power spectrum and the iv) CFHTLens constraint on the present-day amplitude of fluctuations at linear scales of 8 Mpc/h, σ 8 , are also lower than predicted by the CMB-calibrated ΛCDM model. v) The observed cluster abundance seems also to be lower than predicted by the model, which could also be explained by a lower σ 8 . However the level of tension with ΛCDM depends somewhat on how the cluster sample is selected, with Planck SZ-selected clusters giving the stronger tension [7] . Finally vi) the value of the Hubble constant inferred from the CMB within a ΛCDM model is lower than the value directly measured locally [8] and e.g., Ref. [9] and refs therein. 1 These anomalies are each at the ≈ 2σ level and therefore taken individually are not of enough statistical significance as to claim any "new physics" beyond the standard ΛCDM model. However, if all these anomalies could be fit by a single model, they would become more than "curiosities" and the possible discovery window for new physics opens up. Indeed, there has been already significant activity in the literature to try to explain these anomalies 2 .
One basic assumption of the ΛCDM model is its reliance on Einstein's general relativity theory. One obvious step is then to investigate if changing this assumption could provide a better fit to the current observational data. The models proposed in the literature that deviate from Einstein's gravity are abundant, and in principle infinite. Then, in the absence of strong theoretical priors that can identify a particular model as the "real" theory of gravity, a generic framework is highly preferred. However, increasing the generality of this framework increases also its complexity. It is then crucial to balance our theoretical beliefs in order to optimize the freedom that is left to fit the observations. To achieve this description many attempts have been done in the literature, such as the parametrized post-Friedmann [13] or an Effective Field Theory approach [14,15]. In the present paper we assume that modifications of gravity on cosmological scales are well described by an additional scalar degree 1 The σ 8 tension is what drives claims in the literature for massive neutrinos e.g., [7,10] or phantom dark energy [11], the H 0 tension drives claims for sterile neutrinos [12]. 2 A search the ADS database reports in excess of 100 articles addressing this issue.
of freedom with at most second-order derivatives in the covariant equations of motion. An additional requirement is that the theory should satisfy the weak equivalence principle, i.e. all matter species are coupled minimally and universally to the metric g µν . This leads to the Horndeski lagrangian [16][17][18]. It encompasses many of the classical dark energy (DE) and Modified Gravity (MG) models studied to explain the late-time cosmic acceleration: quintessence, kinetic gravity braiding, galileons, f (R). 3 An efficient description of the linear perturbation theory in the Horndeski lagrangian has been investigated in two independent ways in [14,15,[26][27][28] and in [29] (see also [30] for an equivalent approach at second-order). We will make use of the notation of the latter, but the two results are equivalent. The maximal freedom one can have is described by four functions of time besides the Hubble parameter H(t), which is responsible for the expansion history of the universe, and one constant, i.e., the amount of matter density today. One of the advantages of this approach is that we can separate the background evolution from the perturbations. Indeed, these four functions appear only at the perturbative level, they are independent from each other and with respect to H(t). Here we choose a particular parametrization for these functions proposed by [29] and used in [30], where they are proportional to the cosmological density of dark energy Ω DE (t). This reduces the effective freedom of the theory to four free parameters.
The aim of this paper is to constrain the free parameters of our description using the most recent CMB and large scale structure data.
Horndeski gravity includes GR as a special case, thus if Horndeski gravity provides a significantly better statistical fit to the data away from the GR limit this will signal possible new physics. Conversely if not such signal is found the results can be used to place limits of possible deviations from GR on cosmological scales. Within the ΛCDM model and its popular extensions it is quite widespread to perform global fits considering compilations of state-of-the-art cosmological datasets. Specific models, such as f (R) or galileons, have been considered or specific data sets e.g. [31][32][33][34][35][36][37][38][39][40]. This however has not been attempted systematically for generic deviations from GR such as Horndeski models. Recently, the Planck collaboration [41] studied the implications of some DE/MG models with the latest Planck data. In particular, they first parametrized the background evolution only, letting the perturbations evolve with standard equations on top of a non-ΛCDM background. Then, they parametrized directly the perturbations introducing two independent functions of the curvature perturbations [42], using the public available code MGCamb [43,44]. Besides the analysis of some "real" theory of gravity, such as massive gravity, f (R) or coupled quintessence, they also studied DE/MG with the EFT approach described in the previous paragraph. They used the public code EFTCamb [34,45], to get constraints on a sub-class of the Horndeski class of models, i.e. in our notation the alphas different from zero are α K and α B = −α M (see [29] and Sec. 2 for details). Their conclusion is that there are no significant deviations from ΛCDM for all the models investigated. This can be ascribed to the lack of more precise/accurate data, or to the theoretical priors they are imposing, i.e. not enough freedom in the parameter space. While for the first limit we have to wait for the next generation of surveys, the latter can be relaxed to cover more generic frameworks. This is the scope of this work. In addition here for the first time we present a global fit for Horndeski gravity considering a compilation of state-of-the-art cosmological data including the latest large-scale structure data.
The rest of this paper is organized as follows. In §2 we review the Horndeski parametrization and outline the methodology adopted and the data sets used. Results and their interpretation are reported in §3. Finally we conclude in §4.

Methodology
We begin by reviewing the Horndeski model and the role of the free functions of the model. Since with cosmological data it is not possible to constrain non-parametrically these functions we parametrize them and motivate our choice. We also present the data-sets we use here. We consider CMB data and large-scale structure data obtained from galaxy surveys. In this paper we do not consider clusters data, weak lensing data or local measurements of the Hubble parameter. The amount of tension between the ΛCDM model and cluster abundance depends on the cluster data set chosen and more specifically on the mass-observable relation, indicating that this probe is not as mature and robust as other probes of σ 8 . We therefore leave the cluster abundance for future work. Traditional -2Dweak lensing analysis uses non-linear scales which could be affected by poorly understood physical processes (e.g., baryonic effects). Restricting the analysis to linear scales increases significantly the error-bars making all "tension" disappear [46][47][48]. Moreover the minimal description of the Horndeski class of models as it has been studied in the literature is strictly valid for linear and second-order perturbations [30]. Even if the goal of modeling non-linear scales in generic theories beyond GR is in principle achievable, i.e., it is possible to find a finite set of alphas at every order in perturbation theory, it is not developed enough and therefore using non-linear scales is well beyond the scope of this paper. Finally the tension with the direct H 0 measurement turns out not to be too significant after recent reanalysis see e.g., [49] and refs therein for discussion.

The Horndeski model and adopted parametrization
The Horndeski action is the most general action for a single scalar field that has second-order equations of motion on any background and satisfies the weak equivalence principle. By using an Effective Field Theory approach, which has been developed for Horndeski models in [14,15,24,[26][27][28][29]50] it is possible to identify the minimum number of operators that fully specify the linear evolution of cosmological perturbations. The result [29] is that the maximal freedom of the Horndeski lagrangian at linearorder in perturbation theory can be described efficiently by five functions of time plus one constant, i.e., the amount of matter density today Ω m (z = 0). One of the functions is the Hubble parameter H(t), which is responsible for the expansion history of the universe. The other four functions we use appear only at the perturbative level, and they were introduced first in [29]. In general Horndeski theories Ω m (z = 0) can be computed only through measurements of the large scale structure [51,52]. This is due to the fact that at the background level DE/MG can have a component that mimics some matter species. As an example, there can be a model that predicts that the DE/MG density scales as ρ DE (t) = ρ DE0 + ρ DE1 a(t) −3 , where ρ DE0 and ρ DE1 are two free parameters. Then, there would be no clear separation between matter and DE just observing the expansion history of the universe, and fixing H(t) would not be enough to determine the matter content today. However, this degeneracy could be removed at the perturbative level by using large scale structure measurements. In this paper, for simplicity we assume that both H(t) and the matter content today are specified by a fiducial ΛCDM model, i.e., the time dependence of Ω DE (z) has no components ∝ Ω m (z).
Here, it is useful to summarize the physical properties of these functions, in order to understand how to parametrize them and to interpret the results of this paper: • kineticity: α K . It is the most standard kinetic term present in perfect fluid structures (it is the only function different from zero in simple DE models as quintessence). It is possible to demonstrate that, if the quasi-static (QS) limit is valid, it does not enter the equations of motion. This indicates that it can not be constrained for models with speed of sound close to the speed of light [53]. Indeed, in this case the sound horizon of the theory coincides with the cosmological horizon and the QS approximation would be validated on the scales of interest.
• braiding: α B . 4 It comes from a mixing between the kinetic terms of the metric and the scalar. The more classical models that have a braiding different from zero are f (R)/Brans-Dicke and cubic galileons. It modifies the growth of perturbations, i.e. f and G eff . As shown in [29], it is responsible for all the new scale dependence of the theory beyond the usual Jeans length. If α B = 0, it is still possible to have anisotropic stress between the curvature perturbations, but G eff 1. For this reason, this property is crucial in order to modify the shape of the power spectrum, and the low multipoles in the observed CMB spectrum.
• Effective Planck-mass and Planck-mass run rate: M 2 * , α M . A fixed redefinition of the Planck mass can not have a detectable effect. Indeed, it can be hidden in the field equations by appropriately redefining the densities. For this reason in this work we will not pay too much attention to M 2 * . What we can observe is a variation of the Planck-mass, i.e. the Planck-mass run-rate. An evidence of α M = 0, indicates that the theory of gravity is non-minimally coupled. As shown in [55], α M generates anisotropic stress in the curvature perturbations and modifies the evolution of the gravitational waves. In the presence of non-zero braiding it contributes to modify the growth of perturbations through a modification in the effective Newton's constant.
• tensor speed excess: α T . It is defined as the excess in the speed of tensors with respect to the speed of light. Even if this is its main property, which in principle could be observed [56][57][58], on the scalar sector it introduces anisotropic stress and it modifies the growth of perturbations if α B = 0.
Hence, α B is responsible for all the new scale dependence of the theory beyond the Jeans length. If α B = 0, the effective Newton's constant, which regulates the relation between the curvature perturbation (the trace of the spatial metric) and the matter overdensities, assumes the standard value on sub-horizon scales (k 2 a 2 H 2 ), see Eqs. (4.9) and (4.11) of [29]. Then, the evolution of linear perturbations on sub-horizon scales in the extreme QS limit (which is a usual approximation assumed when semi-analytical calculations are performed) can be considered as the evolution of the perturbations in standard perfect fluid structures, with the possibility to have a non-zero anisotropic stress in the curvature perturbations. As noticed in [29], it is then possible to define the braiding scale k B (t) as Note that the formulation of the braiding scale here is different from the formula given in [29], since we are assuming a ΛCDM background. The ratio between α K and α 2 B regulates the magnitude of k 2 B (t). In the absence of the second term of Eq. (2.1), the braiding scale lies at the cosmological horizon. But in the presence of non-minimal coupling (i.e. α M = α T = 0), it is sufficient to tune α 2 B α K in order to push the braiding scales to very small scales (smaller than the scales we are observing). At this point one could ask what the magnitude of k 2 B (t) is in a pure ΛCDM universe. Indeed, if we consider a DE model like quintessence (α B = 0), as the closest model to ΛCDM, then k 2 B (t) → ∞. This would indicate that standard gravity is recovered on super-braiding scales, where the equations of motion have a perfect fluid structure. On the contrary, we could reduce the Horndeski class of models to a simple MG theory, such as f (R) (α K = 0), and consider it as the closest model to ΛCDM. In the limit where the scalar field associated acquires a large mass it would be decoupled from gravity (f (R) → R), leading to standard results. In this example we could conclude that standard gravity is recovered on sub-braiding scales. Then, the correct answer to the previous question is simply that the braiding scale defines the transition between two different physics. The exact ΛCDM limit has an indeterminate braiding scale.
When considering standard results in the literature, some confusion arises because, in general, non-minimally coupled models (i.e., α M = 0 or α T = 0) have also a non-zero braiding. Then, it becomes natural to associate α K with DE, and α B with MG. However, in our framework, all these operators are independent from each other, which ultimately means that MG can lie equivalently on super-braiding or on sub-braiding scales. We conclude that it is easier to have standard gravity on super-braiding scales because it is sufficient to choose α M = α T = 0. On sub-braiding scales the Planck-mass run-rate and the tensor speed excess have to be tuned to suppress the effect of the braiding on the growth of perturbations.
The description we use has five functions of time. One of them, i.e. H(t), fully specifies the expansion history of the universe, while the other four modify the evolution of the linear perturbations.
However, current data are not sufficient to constrain non-parametrically all these functions of time. Therefore we assume ΛCDM as our fiducial model for the evolution of the background. At the perturbative level for deviations from ΛCDM, we follow the parametrization proposed in [29,30], Table 1. We show the general behavior of the Horndeski class of models varying the functions αi or a combination of them. In the top table we assume each αi to be "→ 0" or "O (1)", we compute the estimated magnitude of kB and we give the expected type of gravity in the deep sub-horizon limit. In the central table we show the magnitude of kB and the expected gravity type as a function of particular combination of the αi functions; this particular combination is closely related to the definition of the braiding scale, Eq. (2.1). In the bottom table we give the general behavior and the list of the important parameters on sub/super-braiding scales. This clearly indicates that we expect two different physics in these two regimes.
which reads Here, all the α functions are fixed to be proportional to the dark energy density thus reducing the extra freedom of the theory to four parameters, i.e. {c K , c B , c M , c T }. With this choice we ensure the standard evolution of the universe during the eras dominated by radiation and matter, while we allow for deviations from standard gravity at late-times. This assumption, even if it is not unique, is motivated as follows: The α-functions defined in terms of the Horndeski functions (see Appendix A of [29]) are driven by the same functions of the scalar field as the energy density and its derivatives. Thus, one would naively expect the parameters c i to be O(1). In addition, this parametrization describes exactly the behavior of simple shift-symmetric models as the imperfect fluid on its attractor [59], and approximately more complicated models as the best fit of covariant galileons [39,60].
In Table 1 we list the various limits of the Horndeski class of models varying the alpha functions. In addition, we give a prediction for the magnitude of the braiding scale when it can be estimated. The ΛCDM limit is recovered exactly by setting all c i = 0. However, c K represents the contribution given by perfect fluid structures, i.e., what it is usually classified as DE. Then, even if we set c B = c M = c T = 0 and leave c K free to vary, the theory of gravity would still be GR.

Analysis method
We use a Bayesian approach to infer the constraints on our model given the datasets described in Sec. 2.3. The standard fitting of models defined in a multi-dimensional parameter space is done through Monte Carlo sampling techniques. These allow us to infer the posterior of the model's parameters, which we obtain by sampling the likelihood of the parameters values given the data and assuming a uniform prior in those parameters.
In order to do this, we use the code HiClass 5 [61] to solve the Boltzmann equations. This code, used for the first time in [60], is an extension of the Cosmic Linear Anisotropy Solver Software (CLASS) 6 by [62] to include Horndeski class of modified gravity theories. Since as outlined above we choose the parametrization in which the time evolution of the four free functions of this class of theories is given by the evolution of Ω DE (z), the constraints quoted in this paper are actually on the coefficients that fix the proportionality between α and Ω DE .
This Boltzmann code is then interfaced by the cosmology sampler MontePython 7 by [63] which is a powerful engine to run Monte Carlo Markov chains (MCMC). We run this code using the Metropolis-Hastings sampler to obtain our chains until we consider them converged. In this paper our convergence criterion is given by a value of R − 1 0.03 of the Gelman and Rubin parameter [64]. In order to avoid instabilities, we also assume a hard prior on the coefficient c T > −0.9 8 and a initial Planck mass of M 2 * ≥ 1. We also assume in our runs two massless and one massive neutrino with m ν = 0.06eV. Finally, we remove a small fraction of the points which correspond to a negative braiding scale squared k 2 B (z = 0).

Datasets
We use a compilation of recent cosmological data including CMB, large scale structure, redshift space distortions and baryon acoustic oscillations. To test the robustness of the results, different data sets combinations are analyzed, before proceeding to the global fit.
CMB: We include the Cosmic Microwave Background (CMB) temperature power spectrum from Planck. We compare three cases: we consider 1) Planck 2013 temperature power spectrum data [1] with the low-multipoles of the polarization power spectrum from WMAP [67] (hereafter WP), 2) Planck 2015 temperature and polarization data [3,4], and 3) Planck 2015 temperature and polarization data with WMAP polarization replacing Planck polarization data at low multipoles (Planck2015WP). At the end we present our main results for our "gold" data set Planck2015WP and refer the reader to the Appendix. We choose the latter as our default case because we find WP to be more constraining than Planck low-polarization for certain non-standard models (see Section 3). We emphasize that we use the full shape of the power spectrum rather than the shift parameters that determine the position of the acoustic peaks. This dataset includes about 7 e-folds out of the ∼ 60 that we expect from inflation. This is the only probe we use from the high-z Universe, much earlier than when late-time acceleration of the Universe took place. On the other hand, the multipoles below 90 probe super-horizon scales, which are relevant to this work, especially due to the anomalies reported in these scales (e.g., low value for the quadrupole [68,69], low value around ∼ 20 [70], etc.). P (k): We use the shape of the power spectrum of galaxies from the WiggleZ survey [6] including scales in the range 10 −4 h/Mpc < k < 0.2h/Mpc. These probe roughly another 7 e-folds. This constrains the shape of the power spectrum in the late Universe 0.2 < z < 1.0 so this becomes sensitive to late-time effects. There are also claims of anomalies with these kind of datasets (e.g., [10]), mostly related with the value of the normalization of matter fluctuations σ 8 which is slightly different from the one derived from CMB data. Effects from massive neutrinos have also been invoked to bring these two datasets into better agreement. Uncertainties from modeling of the halo occupation distribution that maps the relation between the galaxy and matter fluctuations have also been claimed [71]. BAO: Another cosmological probe we use is Baryon Acoustic Oscillations (BAO). BAO have their origin in the sound waves of the baryon-photon plasma in the early Universe, which were later imprinted in the distribution of galaxies, which we have observed with galaxy surveys such as 6dFGS [72], BOSS [73,74], or WiggleZ [6]. These show a feature at a characteristic scale of ∼ 150 Mpc that is used to constrain the distance-redshift relation, effectively being used as a standard ruler. We note that these measurements tend to marginalize over the shape of the power spectrum (or correlation function) and focus on the position of the BAO peak, so they cannot constrain by themselves the matter power spectrum. We note that some anomalies have been reported regarding the BAO in the distribution of the Lyman-α forest, which seems to be in tension with the distance derived from the best-fit ΛCDM model derived from Planck data. Since this BAO measurement corresponds to a redshift of z = 2.4 in which we should not expect any influence from late-time acceleration effects, it is interesting to see if they affect our constraints on MG. BAO measurements help fix the expansion history thus reducing possible parameter degeneracies. The BAO measurements used are summarized in Tab. 2 and shown in the left panel of Figure 1.
RSD: Finally we also include data from Redshift Space Distortions (RSD) in the galaxy distribution. These are also derived from Large Scale Structure surveys such as 6dFGS [75], SDSS [76,77], BOSS [78], WiggleZ [79], or VIPERS [80]. Assuming linear theory, it is possible to use the anisotropies in the galaxy clustering at intermediate scales (30h −1 − 150h −1 Mpc) to infer the line-of-sight contamination in the galaxy redshifts due to peculiar velocities in bulk motion [81]. Similarly to (isotropic) galaxy clustering being able to constrain the combination bσ 8 , where b is the bias of the galaxy distribution with respect to the dark matter and σ 8 the amplitude of (linear) perturbations on 8Mpc/h scales, RSD measures the combination f σ 8 , where f = d log D/d log a is known as the growth rate and is the logarithmic derivative of the growth factor. It is therefore a probe of the growth of structures and hence of MG. Since these measurements have been obtained with galaxy surveys below redshift z 1, these should be affected by the same effects that caused the late-time acceleration of the Universe. Moreover, there have been claims that f σ 8 values from RSD are systematically low compared with those obtained from the best-fit ΛCDM model derived from Planck data [4], which makes this dataset very relevant for this study. The redshift space distortions data compilation we adopt is summarized in Tab. 3 and shown in the right panel of Figure 1.

Results
We now present our constraints on the Horndeski parameters derived from Monte Carlo sampling of this parameter space using the cosmological datasets described in the previous Section. For CMB data we use the compilation Planck2015WP. We have also produced constraints using Planck 2013+WP and using Planck 2015 data [3,4], which uses Planck's own polarization measurements instead of the low-polarization data from WMAP. Because of the longer integration time and the additional    information from the high polarization offered by Planck 2015, the naïve expectation is that Planck 2015 should be more constraining than Planck 2013. If we enforce a hard prior on the integrated optical depth due to reionization τ reio > 0.04 that is indeed the case. However, our results depend on the choice of the prior. We conclude that for the non-standard models considered here the low polarization measurements from WMAP are more constraining that Planck's. This is why we choose to report our updated constraints using as the CMB dataset a combination of the high multipoles of Planck temperature and polarization power spectrum, the low multipoles of Planck temperature power spectrum, and the low multipoles of WMAP polarization power spectrum. A detailed comparison between constraints from these different CMB datasets is presented in the Appendix. First of all, we note that the coefficient c K is largely unconstrained by the data, even when the sampling is done in logarithmic space. This was expected, since α K does not introduce new scale dependence and it does not modify the linear growth. In the quasi-static approximation where time derivatives are considered to be sub-dominant with respect to space derivatives, α K does not enter the equations of motion. In this regime we expect no effect of α K on the observables and thus no sensitivity of our analysis to this parameter. Therefore, we choose a few arbitrary fixed values of c K and report constraints for these cases. In Table 4  c M c T CMB, c K = 0 +0.04 < c B < +1.91 −0.86 < c M < +2.00 −0.90 < c T < +1.20 CMB, c K = 1 −0.01 < c B < +1.84 −0.77 < c M < +2.00 −0.90 < c T < +1.08 CMB, c K = 10 +0.14 < c B < +1.92 −0.73 < c M < +2.00 −0.90 < c T < +0.77 Table 4. Constraints on the coefficients cB, cM, and cT from different cosmological dataset combinations and for different values of cK. Quoted limits are 95% CL. A hard prior on cT > −0.9 is applied as well as a prior on −2 < cM < +2 that has become relevant in some cases.
c K . In all cases (all data set combinations and all values of c K ) the parameter c M is consistent with zero. For this parameter the c K dependence is very mild.
When RSD is not included, the c T parameter is consistent with zero and shows no c K dependence for the data set combinations CMB, CMB+BAO and CMB+PK. For the CMB+RSD and CMB+BAO+PK+RSD data sets this parameter is negative and 0 is formally outside the 95% confidence. The c T lower limit is imposed by the prior c T > −0.9. It is then clear that RSD pushes c T towards negative values. Indeed, of the data we consider RSD is the only one that probes (with good statistical significance) the growth of perturbations (in fact it measures f σ 8 ). There is some small dependence of that in CMB lensing and ISW but at lower significance. It is well known that within the LCDM paradigm, RSD wants an amplitude of fluctuations slightly lower than that predicted when fitting the model to CMB data. Negative values of the parameter c T achieve just that: a lower f σ 8 at the redshift of interest for RSD without major alterations to other observables.
The confidence region for the parameter c B also excludes 0 in few cases and positive values seems to be formally favoured. The upper limit of c B does not depend on the adopted value of c K , but it is sensitive to the choice of data set. The lower limit on the other hand is also sensitive to the adopted c K value.
Our tightest constraints, which include all the datasets considered, formally appear to favour a theory of gravity that is minimally coupled with tensors propagating slower than the speed of light. The results are robust with respect to changes in the low-CMB polarization data (from Planck to WMAP, see Appendix 4). We have also tested that the model-dependent RSD measurement in [78] is not driving our results, by checking the consistency of the constraints when that data point is replaced with the measurement in [84]. These tests confirm the robustness of our results.
For the other data sets the ΛCDM model (c K = c B = c M = c T = 0) is not always included in the 95% posterior confidence interval.
We will study below the Bayesian evidence ratio of the ΛCDM to MG models. But before we do so, in order to interpret the significance of this result we examine carefully the values of the effective χ 2 statistic -given by the log of the likelihood-measured at the best-fit point in our Monte Carlo chains. This is shown in Table 5, where we also include the standard ΛCDM case (in which the MG equations are not used) for direct comparison. We can see that CMB data actually prefers a larger value of the coefficient c K , and with a lower significance this is also true for RSD data. On the other hand, PK data slightly compensate this trend by selecting lower values of c K at the best-fit model.  Table 5. Absolute value of the log likelihoods (i.e. χ 2 /2) at the best fit point from the individual data that comprises each dataset combination explored in our analysis. The column labelled Total displays the maximum likelihood value in the chain. The last column shows the difference in Log likelihood with respect to the ΛCDM model. Red (negative) numbers represent worst fit, positive (black) numbers better fit. Given the intrinsic uncertainty of the MCMC in determining the best likelihood value, the improvement in χ 2 offered by the more complex model is in most cases not significant.
The MCMC procedure is not optimized to find the best fit model which maximizes the likelihood, therefore there is an intrinsic error associated to these numbers which have been estimated to be ∼ 0.7 [85].
Compared to the ΛCDM model, we find that the improvement on the fitting of cosmological data due to the extra degrees of freedom provided by the Horndeski parameters is not significant in most of the cases. Possible exception are the CMB (all c K values) and the CMB+BAO for c K = 10 cases where the improvement is ∆ log Likelihood ∼ 2 at the "cost" of three extra degrees of freedom. This suggests that the deviations found in our datasets are still consistent with a fluctuation within the ΛCDM scenario, even though (remarkably) the posterior distributions of MG coefficients presented in Table 4 are not always consistent with zero.
The best-fit models that correspond to the Log Likelihood values of Tab. 5 are shown in Figure 2 where we present the cases in which we fit only CMB data, CMB and one of the Large Scale Structure datasets (BAO, RSD, or PK), and the full combination CMB+BAO+RSD+PK. We show how the observables fitted in our analysis are reproduced by the MG models described in Section 2 where we fix the coefficient c K to 0 (blue lines), 1 (red lines), and 10 (green lines). We also include for direct comparison the best-fit model in a ΛCDM cosmology with standard gravity (black lines). We find that MG has more flexibility to fit the observables, but, given current error bars, the difference cannot be regarded as more than a mere statistical fluctuation. It remains to be checked with future data whether the significance of these differences increases or is still consistent with noise.
Given the above apparent contradiction, we investigate whether the posterior confidence intervals might be driven by strong correlations between the parameters varied in our MCMC chains. Strong correlations are known to introduce prior volume effects in multidimensional analysis when the posterior distribution is non-Gaussian.
Interestingly we find no significant correlations between the ΛCDM and the MG sectors of our parameter space, which greatly simplifies this analysis. We will therefore focus on possible cor- parameters are nevertheless found to be uncorrelated, at least to the extent driven by the precision of the datasets used here. The issue of whether the data favours a MG model is an issue of model selection or model comparison rather than parameter estimation. Within the Bayesian framework, which is the one adopted here, model comparison is done by considering the model-averaged likelihood, referred to here as the Bayesian Evidence, E. Under the assumption of equal a priori model probabilities, the ratio of Evidence values for two models, given the same data, quantifies the relative odds of these models being the correct description of the observations. Thus the key model comparison quantity is the Bayes factor, which is the ratio of the Evidence values for the two models. In general, the Evidence is the result of a multi-dimensional integral over the model parameters, the evaluation of which can be computationally expensive. However here the two models are nested: the simpler model (the ΛCDM+GR one) is a specific case of the MG models when the four coefficients c B,M,T,K are zero. In this case it is possible to perform rigorous model comparison between nested models without the need for evaluating numerically-intensive multi-dimensional integrals using the Savage-Dickey density ratio (SDDR; [86,87]).
In Table 6 we report the Bayes factor of the ΛCDM to MG models computed following [85]; we  use a slightly modified version of the Jeffrey's scale to interpret the evidence ratios. The Bayes factor favours the simpler, ΛCDM, model in the cases CMB and CMB+PK with odds ∼ 3 : 1 ("substantial" evidence).
As with all Bayesian methods, the Bayes factor between the two models depends on the prior on the model parameters. The comparison between the Bayesian Evidence ratio and the Log Likelihood analysis presented above, which is prior-independent, serves to quantify a possible prior-dependence of the Bayes factor.
Next we briefly discuss the implications of our bounds on the c i parameters on the existence of a (possible) new scale associated to gravity (eq. 2.1). While formally one can compute a value for k B (z = 0) given a set of parameters c i , the interpretation of confidence levels on this parameter requires some caution. As shown in Table 1, the ΛCDM limit (which is recovered exactly by setting all our parameters c i = 0) has an indeterminate k B . In our analysis the ΛCDM model offers a very good fit to the data. For the collection of models in parameter space close to ΛCDM, the k B distribution is unconstrained. When the ΛCDM model is not disfavoured by the data the braiding scale is not a good (derived) parameter to use since small variations of our c i can produce huge changes on this scale. For this reason we do not dwell much on this quantity here. The braiding scale can still be a good quantity to use when investigating models beyond the simple ΛCDM model: it is meaningful to evaluate it only in a MG scenario.
At this point it is useful to give a prescription on how to interpret our results for "real" theories. Indeed, strictly speaking the constraints we obtain are only valid for sub-classes of Horndeski that have the time evolution of their α's proportional to the time evolution of the DE/MG density on a ΛCDM background. In the literature, it is possible to find many models that are sub-classes of the Horndeski lagrangian, but in general each of them has a different dynamics. We claim that our results can be applied to more general frameworks. The reason for this is that most of the DE/MG models are constructed in such a way that they explain the late-time acceleration of the universe, with standard evolution at early-times (radiation/matter dominated eras) and modifications at late-times. Then, one can have in a reasonable amount of time (without the need of running MCMC chains) an idea of what is the allowed/not allowed region in the parameter space of a given theory. As a toy model we will make use of a minimal generalization of the imperfect fluid introduced in [59,88]. In this model we consider K = −X n , G 3 = µX m , G 4 = M 2 Pl /2 and G 5 = 0. 9 This is a shift-symmetric model that has an attractor for the background evolution. As usual, on this attractor the equations can be integrated analytically, which is an advantage we use in order to illustrate some results. The steps one has to follow are: 1. Solve the background evolution (i.e. H(t)) and check that it is close to ΛCDM. Within our example it is possible to demonstrate that in the infinite future, where Ω DE = 1, the equation of state for the scalar field is w DE = −1; 2. Use the definition of the alphas in terms of the Horndeski functions (in [29]) in order to express them as a function of background quantities. In our case we get: 14 CMB+BAO+RSD+PK, c K = 10 +0.00 < c B < +2.86 −1.59 < c M < +0.30 −0.90 < c T < −0.15 Table 7. Constraints on the coefficients cB, cM, and cT from different cosmological dataset combinations and for different values of cK. Quoted limits are 99.73% CL. A hard prior on cT > −0.9 is applied.  [59,88] as discussed in the text. The three points in the left two panels correspond to (from left to right) n = 1/3, n = 2/3, n = 1. In the right panel the three models overlap. Clearly values n 2/3 are disfavoured.
3. Check that the evolution of these α functions is indeed ∝ Ω DE (or in which regime this applies).
In our case this condition is satisfied exactly; 10 4. Invert the relations between our c i and the parameters of the theory (usually this step has to be done numerically). In our the result is: n = c B /2 and m = (c K /c B + c B − 1)/2; 5. Compare with constraints in Fig. 4 to define the allowed parameter region of the original theory.
In Fig. 4 we illustrate this last step in the procedure. The symbols correspond to different values for the parameters of the toy model considered here. Values of n ∼ 1/3 or 2/3 are allowed but values n ≥ 3/4 are excluded. This is interesting as n = 1, which is the standard kinetic term and standard case proposed in the original reference, is ruled out.

Discussion and conclusions
We have considered a general version of Horndeski theories of gravity, where modifications to general relativity on cosmological scales are well described by an additional scalar degree of freedom with at most second-order derivatives in the equations of motion and where the weak equivalence principle is satisfied. This description encompasses many of the classical DE/MG models studied to explain the late-time cosmic acceleration. We chose a parametrization in which the time evolution of the four arbitrary functions in the Horndeski Lagrangian is given by the time evolution of Ω DE . The ΛCDM limit is recovered when all the extra parameters are zero.
We have performed a global fit for the parameters of the model (including the cosmological parameters), considering the state-of-the art cosmological data: Cosmic Microwave Background data from the Planck mission and the matter power spectrum, redshift-space distortions and Baryon Acoustic Oscillation data from galaxy surveys of Large Scale Structure.
We used the standard Bayesian approach to parameter inference and obtained posterior confidence intervals on the model's parameters. The main result of this work is that we find no significant statistical evidence for deviations from Einstein's gravity in the current data. While formally for some data-sets combinations the ΛCDM+GR limit is (just) outside the 95% posterior confidence interval, the improvement in the fit at the expense of adding extra parameters, quantifies in terms of difference of log likelihood is not significant. The Bayes factor also confirms this by not favouring the more complex model (Horndeski gravity) over the simpler (ΛCDM+GR) one. These findings are robust to to choice of specific data-set used and deviations from the ΛCDM+GR limit are mostly driven by the low multipoles of the Cosmic Microwave Background anisotropies.
These results are the best constraints so far on this parametrization of the Horndeski Lagrangian, in which α i ∝ Ω DE , which limit any possible deviations from ΛCDM+GR.
It is interesting to note that in constraining this class of models beyond ΛCDM, the measurement of large-scale Cosmic Microwave Background polarization plays an important role. In fact we find that the low polarization data from the Planck satellite are less constraining than the WMAP ones. Forthcoming improvements on large-scale polarization measurements might help improve our constraints.
While strictly speaking the constraints we obtain are only valid for sub-classes of Horndeski considered, our results can be applied to a more general framework. In fact most of the DE/MG models reported in the literature are constructed in such a way to explain the late-time acceleration of the universe, with standard evolution at early-times (radiation/matter dominated eras) and modifications at late-times, making therefore possible to "map" them to our class of model by interpreting our parameters as effective parameters. This mapping is of course model-dependent and might not always be analytic, but this procedure enables one to rule in or out specific models or parameter ranges for models without having to perform any data analysis or expensive exploration of multidimensional parameter-space and inference. We have illustrated this procedure for a specific MG model.   Table 8. Constraints on the coefficients cB, cM, and cT from different Cosmic Microwave Background datasets in combination with BAO+RSD+PK and for different values of cK. Quoted limits are 95% CL. A hard prior on cT > −0.9 is imposed.