Improved constraints on the expansion rate of the Universe up to z~1.1 from the spectroscopic evolution of cosmic chronometers

We present new improved constraints on the Hubble parameter H(z) in the redshift range 0.15<z<1.1, obtained from the differential spectroscopic evolution of early-type galaxies as a function of redshift. We extract a large sample of early-type galaxies (\sim11000) from several spectroscopic surveys, spanning almost 8 billion years of cosmic lookback time (0.15<z<1.42). We select the most massive, red elliptical galaxies, passively evolving and without signatures of ongoing star formation. These galaxies can be used as standard cosmic chronometers, as first proposed by Jimenez & Loeb (2002): their differential age evolution as a function of cosmic time directly probes H(z). We analyze the 4000 {\AA} break (D4000) as a function of redshift, use stellar population synthesis models to theoretically calibrate the dependence of the differential age evolution on the differential D4000, and estimate the Hubble parameter taking into account both statistical and systematic errors. We provide 8 new measurements of H(z) (see Tab. 4), and determine its change with redshift to a precision of 5-12%, mapping homogeneously the redshift range up to z \sim 1.1; for the first time, we place a constraint on H(z) at z \neq 0 with a precision comparable to the one achieved for the Hubble constant (about 5-6% at z \sim 0.2), and cover a redshift range (0.5<z<0.8) which is crucial to distinguish many different quintessence cosmologies. These measurements are found to best match a \Lambda CDM model, providing a statistically robust indication that the Universe is undergoing an accelerated expansion. This method shows the potential to open a new avenue in constraining a variety of alternative cosmologies, especially when future surveys (e.g. Euclid) make it possible to extend it up to z \sim 2.


Introduction
The expansion rate of the Universe changes with time, initially slowing because of the mutual gravitational attraction of all the matter in it, and more recently accelerating, which is referred to generically as arising from "dark energy" [1][2][3][4].
The most generic metric describing a flat, homogeneous and isotropic Universe is the Friedmann-Lemaître-Robertson-Walker (FLRW) one: $ds^2 = -c^2 dt^2 + a^2(t)\,\delta_{ij}\,dx^i dx^j$, which relates the line element in space-time ($ds^2$) to the time element ($c^2 dt^2$) and to the space element ($dx^2$) using only the expansion factor a(t), which characterizes how space is expanding as a function of time. For a given model that specifies the equation of state of all components in the Universe, a(t) is fully determined.
However, we do not know what constitutes most of the energy budget in the Universe, and thus a(t) needs to be determined observationally. The function a(t) is related to the Hubble parameter by $H(t) = \dot{a}(t)/a(t)$. This parameter has been measured with high accuracy (∼3%) only in the present-day Universe, i.e. the Hubble constant $H_0$ [5][6][7]. One of the key goals of modern cosmology is therefore to constrain H as a function of cosmic time. To determine H, several observational tools have been proposed, from standard "candles" (e.g. Type Ia Supernovae) to standard "rulers" (e.g. Baryonic Acoustic Oscillations), but none of them has achieved high-accuracy results over a significant fraction of the lifetime of the Universe [8][9][10].
An independent approach is provided by the differential dating of "cosmic chronometers", first suggested by Jimenez & Loeb (2002) [11], because it gives a measurement of the expansion rate without relying on the nature of the metric between the chronometer and us, which is not the case for methods that depend on integrated quantities along the line of sight. The cosmic chronometers formalism is very straightforward.
The expansion rate is defined as:

$$H(z) = \frac{\dot{a}}{a} = -\frac{1}{1+z}\,\frac{dz}{dt} \qquad (1.1)$$

and since the redshift z of the chronometers can be known with high accuracy (e.g. spectroscopic redshifts of galaxies have typical uncertainties $\sigma_z \leq 0.001$), a differential measurement of time (dt) over a given redshift interval automatically provides a direct and clean measurement of H(z).
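As a minimal sketch of how the relation H(z) = −(1/(1+z)) dz/dt can be applied with finite differences, the snippet below converts an age difference between two nearby redshifts into an expansion rate. The function name and the input ages are illustrative assumptions (made-up numbers, not measurements from this work).

```python
# Unit conversion: 1 Gyr^-1 expressed in km s^-1 Mpc^-1.
KM_PER_MPC = 3.0856775814913673e19
SECONDS_PER_GYR = 3.15576e16  # Julian years of 365.25 days
GYR_INV_TO_KMS_MPC = KM_PER_MPC / SECONDS_PER_GYR  # about 977.8


def hubble_from_differential_age(z_low, z_high, age_low_z_gyr, age_high_z_gyr):
    """Finite-difference estimate of H at the mean redshift of the pair.

    age_low_z_gyr is the chronometer age at z_low, age_high_z_gyr the age
    at z_high; an ageing population has dt < 0 for dz > 0.
    """
    dz = z_high - z_low
    dt = age_high_z_gyr - age_low_z_gyr
    z_mean = 0.5 * (z_low + z_high)
    h_gyr_inv = -dz / ((1.0 + z_mean) * dt)
    return z_mean, h_gyr_inv * GYR_INV_TO_KMS_MPC


# Hypothetical input: galaxies 0.42 Gyr younger at z = 0.22 than at z = 0.18.
z_mean, h = hubble_from_differential_age(0.18, 0.22, 10.00, 9.58)
print(f"H(z={z_mean:.2f}) ~ {h:.1f} km/s/Mpc")
```

Only the age difference enters the estimate, so a constant offset in the absolute ages cancels out.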
The main strength of this method, as already underlined in Refs. [7,11,12], is that it is based on a differential approach. This not only helps to cancel out the systematics that would arise when evaluating absolute ages, but also minimizes the potential effects of galaxy evolution: the integrated evolution across the whole redshift range is not relevant when differential quantities are estimated; all that matters is the evolution that takes place between the redshifts where the differences are taken (for a more detailed discussion, see Sect. 3.1).
If we want to move beyond the local Universe, the best cosmic chronometers are galaxies which are evolving passively on a timescale much longer than their age difference. Based on a plethora of observational results, there is general agreement that these are typically massive ($M_{\rm stars} \sim 10^{11}\,M_\odot$) early-type galaxies (ETGs hereafter) which formed the vast majority (>90%) of their stellar mass at high redshifts (z > 2−3) very rapidly (∼0.1-0.3 Gyr) and have experienced only minor subsequent episodes of star formation, therefore being the oldest objects at all redshifts (e.g. [13][14][15][16]). Thus, a differential dating of their stellar populations provides dt in Eq. 1.1. It is worth recalling that differential dating of stellar populations is not only possible, but can be very accurate when targeting single stellar populations. As an example, we note that differential ages can be obtained for globular clusters in the Milky Way with a precision of 2-7% (including systematic errors) (e.g. Ref. [17]).
Compared to other approaches based on the global spectral or photometric analysis [11,12,18-21], it has been found that one of the most direct and solid ways of doing this is to use the 4000 Å break (hereafter D4000) in ETG spectra, thanks to its linear dependence on age for old stellar populations [7]. This break is a discontinuity of the spectral continuum around $\lambda_{\rm rest}$ = 4000 Å due to metal absorption lines; its amplitude correlates linearly with the age and metal abundance (metallicity, Z) of the stellar population (in some age and metallicity ranges), is weakly dependent (for old passive stellar populations) on the star formation history (SFH), and is basically not affected by dust reddening [7,22-24] (see also Sect. 3.3, and figures therein). If the metallicity Z is known, it is then possible to measure the difference between the ages of two galaxies as proportional to the difference of their D4000n amplitudes: ∆t = A(Z) ∆D4000n, where A(Z) is a slope which depends on metallicity.
The differential aging of cosmic chronometers has been used to measure the observed Hubble parameter [18,19], to set constraints on the nature of dark energy [18][19][20], and most recently to provide two new estimates of the Hubble parameter (even if with large error bars), H(z ∼ 0.5) = 97 ± 62 km s$^{-1}$ Mpc$^{-1}$ and H(z ∼ 0.9) = 90 ± 40 km s$^{-1}$ Mpc$^{-1}$ [12], and to recover the local Hubble constant [7].
In this paper we present improved constraints on the Hubble parameter up to redshift z ∼ 1.1, obtained using the technique described by Moresco et al. 2011 ([7], hereafter M11). In order to fully exploit passive ETGs as reliable cosmic chronometers, two main challenges must be faced: the appropriate sample selection and the reliable differential dating of their stellar ages. The paper is organized as follows. The selection criteria and the properties of the different samples are presented in Sect. 2. In Sect. 3 we introduce the theoretical basis used to estimate the Hubble parameter from the D4000 − z relation, describing how the observed D4000 − z relation has been obtained and how stellar population synthesis models have been used to calibrate the relation between D4000 and the age of a galaxy. In Sect. 4 we discuss the detailed procedure to estimate H(z), and how statistical and systematic errors have been taken into account in the global error budget. In Sect. 5 and 6 we present our H(z) estimates, compare them with other H(z) measurements available in the literature and show the constraints our data impose on different cosmological scenarios.

Sample selection
For a reliable application of the cosmic chronometers approach, it is essential to select an appropriate sample of passively evolving ETGs over the widest possible redshift range. The optimal choice to homogeneously trace the redshift evolution of cosmic chronometers would have been a dedicated survey, mapping with the same characteristics and properties the D4000 − z relation in the entire redshift range. However, a single survey of ETGs covering a wide redshift range with spectroscopic information does not exist. To circumvent this limitation, we exploited both archival and still to be released surveys, and the total sample used in this work is therefore the combination of several different subsamples.
The general selection criteria adopted to extract the final sample of ETGs were based on the following main steps: (i) extraction of the reddest galaxies with multi-band photometric spectral energy distributions (SEDs) compatible with the template SEDs of ETGs at z ∼ 0 or with old passive stellar populations [25]; (ii) high-quality optical spectra with reliable redshifts and suitable to provide D4000n amplitudes up to z ∼ 1.5; (iii) absence of emission lines (Hα and/or [OII]λ3727 depending on the redshift) in order to exclude ongoing star formation or AGN activity; it is worth noting that emission lines (and in particular the [OII] and Hα lines) are not detectable even if we stack together the spectra of different ETGs in order to increase the signal-to-noise ratio (see Fig. 1), hence excluding the possibility of low-level star formation or AGN activity not detected in individual spectra because of the higher noise; (iv) stellar masses (M) estimated from photometric SED fitting to be above $10^{11}\,M_\odot$ (above $10^{10.6}\,M_\odot$ at z > 0.4) in order to select the most massive ETGs; (v) spheroidal morphology typical of elliptical galaxies (when this information was available).
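The selection steps above can be sketched as a simple filter. This is only an illustration under stated assumptions: the record keys are hypothetical, criterion (ii) on spectral quality is omitted, and the 5 Å equivalent-width threshold is borrowed from the cuts quoted for individual subsamples rather than being a single global value.

```python
def is_chronometer_candidate(galaxy, max_emission_ew=5.0):
    """Simplified versions of criteria (i), (iii), (iv) and (v).

    `galaxy` is a dict with hypothetical keys: 'z', 'sed_type', 'log_mass',
    and optionally 'ew_oii', 'ew_halpha' (Angstrom) and 'morphology'.
    A missing or None value means the quantity is not available.
    """
    # (i) SED compatible with a local early-type template.
    red_passive_sed = galaxy["sed_type"] == "E/S0"
    # (iii) no significant emission in whichever lines are covered.
    no_emission = all(
        ew is None or ew < max_emission_ew
        for ew in (galaxy.get("ew_oii"), galaxy.get("ew_halpha"))
    )
    # (iv) redshift-dependent stellar mass threshold.
    min_log_mass = 11.0 if galaxy["z"] <= 0.4 else 10.6
    massive = galaxy["log_mass"] > min_log_mass
    # (v) spheroidal morphology, applied only when available.
    morphology = galaxy.get("morphology")
    spheroidal = morphology is None or morphology == "spheroidal"
    return red_passive_sed and no_emission and massive and spheroidal


example = {"z": 0.7, "sed_type": "E/S0", "log_mass": 10.9, "ew_oii": 1.2}
print(is_chronometer_candidate(example))
```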
There now exists overwhelming evidence confirming a "downsizing scenario" for ETGs, with more massive ETGs having completed their star formation and mass assembly at higher redshifts than less massive ones (e.g. see [13][14][15][16][25][26][27][28]). With the described selection criteria, thus, we have considered the reddest, oldest, passive envelope of ETGs in the entire redshift range, i.e. the best possible to trace the differential age evolution of the Universe.
Stellar masses were all evaluated assuming a standard cosmology, and rescaled (when necessary) to a Chabrier initial mass function (IMF) [29]. Given the non-uniform photometric and spectral coverage of the various surveys, a totally homogeneous mass estimate was not obtainable, and different models and techniques have been used. However, this fact does not pose a major concern for the analysis, since different techniques recover very similar stellar masses for passively evolving galaxies (e.g. see [16,42]), and the primary parameter which may significantly bias the estimate is the IMF, which has been corrected for. Moreover, it is worth emphasizing that the estimated masses do not directly affect the scientific results, but are only used to select the most massive galaxies in all the surveys, independently of their absolute value. Three high-redshift early-type galaxies (with 1.8 < z < 2.2) have also been considered, to study the possibility of extending this approach to much higher redshifts.
Each spectroscopic survey used for this purpose has its own characteristics. In the following, the relevant details of each subsample are presented (see also Table 1).
Figure 1. The ETG spectral evolution. In order to increase the signal-to-noise ratio and the visibility of spectral features, mean stacked spectra were obtained by co-adding individual spectra of ETGs in each redshift bin. For each stacked spectrum, the bin central redshift is indicated on the top-right. The spectra are typical of passively evolving stellar populations and do not show significant [O II]λ3727 emission. The spectra are normalized in the blue region of D4000n (3850-3950 Å), where the average flux is indicated by a segment in the hatched region on the left. The hatched region on the right indicates the red D4000n range (4000-4100 Å), where the solid segments represent the average fluxes and the dashed one indicates the average flux of the lowest redshift spectrum. A trend of decreasing red flux (i.e. D4000n, which is defined as the ratio between the average fluxes in the red and blue ranges defined above) with increasing redshift is clearly visible. As a reference, a BC03 spectrum with delayed τ SFH (τ = 0.1 Gyr), solar metallicity and age of 2.5 Gyr is overplotted in red on a high-z stacked spectrum. The model spectrum has been convolved to a velocity dispersion of 250 km s$^{-1}$, typical of the ETGs considered.

SDSS-DR6 MG Sample. This sample has been taken from the analysis of M11. ETGs have been extracted from the SDSS-DR6 Main Galaxy Sample (MGS, [30]), matching SDSS photometry (u, g, r, i, and z) to 2MASS photometry (J, H, and K), in order to obtain a wider photometric coverage and extract robust mass estimates from the fitting of their photometric SEDs. For each galaxy, the 4000 Å break amplitudes have been taken from the MPA-JHU DR7 release of spectral measurements. Passive ETGs have been selected combining a photometric criterion, i.e. selecting those galaxies whose best-fit to the SED matched a local E/S0 template, and a spectroscopic criterion by excluding those galaxies showing emission lines (rest-frame equivalent width EW > 5 Å).
The stellar masses of these galaxies have been estimated with SED fitting, using a wide library built with BC03 models [24], exponentially delayed Star Formation Histories (hereafter SFH), with a Star Formation Rate (hereafter SFR) $SFR(t) \propto (t/\tau^2)\,\exp(-t/\tau)$ with 0.05 < τ < 1 Gyr, ages with 0 < t < 20 Gyr, dust reddening $0 < A_V < 1$ modeled with a Calzetti extinction law [31] ($0 < A_V < 0.6$ in the case of values of age/τ > 4), solar metallicity, and a Chabrier IMF. Despite the wide range of extinction allowed, the best-fit provided a distribution of $A_V$ peaked at 0, with a median value of 0.2, compatible with the selection of passive ETGs. A mass cut has been applied, selecting galaxies with stellar masses 11 < log(M/M⊙) < 11.5. Stellar metallicities were obtained from the estimates of Ref. [32], who performed simultaneous fits of five spectral absorption features which depend negligibly on the α/Fe ratio, i.e. D4000, Hβ and Hδ$_a$+Hγ$_a$ as age-sensitive indices and [Mg$_2$Fe] and [MgFe]' as metallicity-sensitive indices; some works (e.g. see [33,34]) have found, in particular for Hδ$_a$+Hγ$_a$, a dependence on the α/Fe ratio, but in Ref. [32] it is also shown that the metallicities and ages obtained including or excluding those features do not present any discrepancy. The original redshift range (0.15 < z < 0.3, see M11 for further details) has been reduced to z < 0.23 to limit the effect of the mass incompleteness due to the magnitude limit of the sample. In conclusion, the SDSS MGS ETGs sample contains 7943 ETGs in the redshift range 0.15 < z < 0.23.
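To make the delayed exponential star formation history concrete, the following sketch evaluates SFR(t) ∝ (t/τ²) exp(−t/τ) and its analytic time integral, 1 − (1 + t/τ) exp(−t/τ). The function names are illustrative, and the integral is the fraction of all stars ever formed, ignoring mass returned by stellar evolution (an assumption of this sketch, not a statement about the SED-fitting library).

```python
import math


def delayed_sfr(t_gyr, tau_gyr):
    """Delayed exponential SFR, normalized so that its integral over t is 1."""
    return (t_gyr / tau_gyr**2) * math.exp(-t_gyr / tau_gyr)


def formed_mass_fraction(t_gyr, tau_gyr):
    """Fraction of the total stellar mass formed by time t (analytic integral)."""
    x = t_gyr / tau_gyr
    return 1.0 - (1.0 + x) * math.exp(-x)


# For a short timescale (tau = 0.1 Gyr) the SFR peaks at t = tau and
# most of the stars are in place within a few tau.
tau = 0.1
for t in (0.1, 0.3, 0.5, 1.0):
    print(f"t = {t:.1f} Gyr: SFR = {delayed_sfr(t, tau):.3f}, "
          f"formed fraction = {formed_mass_fraction(t, tau):.3f}")
```

With short τ values the population quickly becomes passive, which is the regime relevant for cosmic chronometers.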
SDSS-DR7 LRGs sample. The Luminous Red Galaxies (LRGs) [35] represent a spectroscopic sample of galaxies based on color and magnitude selection criteria, defined to yield a sample of luminous intrinsically red galaxies that extends fainter and farther than the main flux-limited portion of the SDSS main galaxy spectroscopic sample [36]. They are selected by imposing a luminosity and rest-frame color cut intended to follow passive evolution. Two different cuts have been designed to select LRGs at z ≲ 0.4 and z ≳ 0.4 (for further details, see [35]). Given the information available for this sample, it was not possible to apply the same spectro-photometric selection criterion used for the SDSS MGS; therefore, to reduce the possible contamination by star-forming galaxies, we impose a threshold in the signal-to-noise ratio per pixel for the spectra of these galaxies, rejecting galaxies with S/N < 3, and apply more severe spectroscopic cuts, excluding galaxies with measured equivalent widths of the [OII] and Hα emission lines; as for the SDSS MG sample, the estimates of the equivalent widths and the S/N ratio have been taken from the MPA-JHU DR7 release of spectral measurements. Stellar mass measurements of this sample have been obtained from VESPA [37], a code developed to recover robust estimates of masses and star formation histories from a fit to the full spectral range of a galaxy with theoretical models; BC03 models and a Chabrier IMF have been adopted. Due to their targeting criteria, LRGs sample a range of stellar masses higher than the previous SDSS ETGs, with 11 < log(M/M⊙) < 13. Taking into account both the effect of the mass incompleteness and the mass distribution of the sample, we decided to consider only the LRGs sample at 0.3 < z < 0.4, and we selected galaxies with stellar masses 11.65 < log(M/M⊙) < 12.15. In this way we obtained 2459 ETGs in the redshift range 0.3 < z < 0.4.

Stern et al. (2010) sample. This sample has been obtained from the analysis of Ref. [12]. Within this work, optical spectra of bright cluster elliptical galaxies have been obtained with the Keck LRIS instrument. Rich galaxy clusters were targeted in order to obtain as large a sample as possible of red ETGs over the redshift range 0.2 < z < 1. Nine high S/N stacked spectra in the redshift range 0.38 < z < 0.75 have been selected and analyzed (see Fig. 7 of Ref. [12]) to study the D4000n − z relation. All of these spectra clearly show features and continuum characteristic of old passive stellar populations.
zCOSMOS 20k bright sample. This sample has been extracted from the zCOSMOS 20k bright sample [38]. The observed magnitudes in 12 photometric bands (CFHT $u^*$, K and H, Subaru $B_J$, $V_J$, $g^+$, $r^+$, $i^+$, and $z^+$, UKIRT J and Spitzer IRAC at 3.6 µm and 4.5 µm) have been used in order to derive reliable estimates of galaxy parameters from the photometric SED fitting. The spectra have been obtained using the VIMOS spectrograph mounted at the Melipal Unit Telescope of the VLT at ESO's Cerro Paranal Observatory. The 4000 Å break amplitudes have been obtained using the spectral measurements of Platefit [39]. Passive ETGs have been selected by combining photometric, morphological and optical spectroscopic criteria, following the approach of Ref. [25]. Galaxies have been chosen with a reliable redshift measurement, a best-fit to the SED matching a local E-S0 template, weak/no emission lines (EW < 5 Å), spheroidal morphology, and a K − 24 µm color typical of E/S0 local galaxies (i.e. K − 24 µm < −0.5); for further details about the sample selection, see Ref. [25]. The stellar mass has been estimated from SED fitting of those galaxies, using a wide library built with BC03 models, exponentially delayed SFHs with $SFR(t) \propto (t/\tau^2)\,\exp(-t/\tau)$ with 0.05 < τ < 1 Gyr, ages with 0 < t < 20 Gyr, dust reddening $0 < A_V < 1$ modeled with a Calzetti extinction law ($0 < A_V < 0.6$ in the case of values of age/τ > 4), solar metallicity, and a Chabrier IMF. As in the SDSS MGS sample, also in this case the best-fits to the data presented a distribution of $A_V$ peaked at 0, with a median value of 0.2, compatible with the selection of passive ETGs. A mass cut log(M/M⊙) > 10.6 has been applied to select the most massive population. Because of the wavelength coverage of the zCOSMOS spectra, the D4000n break is available only in the range 0.43 ≲ z ≲ 1.2. In conclusion, the zCOSMOS 20k ETGs sample contains 746 ETGs in the redshift range 0.43 < z < 1.2.
K20 sample. The starting sample consists of about 500 galaxies selected in the K-band from a sub-area of the Chandra Deep Field South (CDFS)/GOODS-South and from a field around the quasar 0055-2659 [40]. Optical spectra were obtained with the ESO VLT UT1 and UT2 equipped respectively with FORS1 and FORS2. Passive ETGs have been selected using the optical spectroscopic classification of Ref. [41], using a parameter cls=1, characteristic of red galaxies with no emission lines and elliptical morphology. Mass estimates have been taken from the SED fitting of Ref. [28], who used a wide library of BC03 models (with exponentially decaying SFHs, $SFR(t) \propto (1/\tau)\,\exp(-t/\tau)$, with 0.1 < τ < 15 Gyr, ages in the range $10^{7} < t < 10^{10.2}$ yr, dust reddening $0 < E_{B-V} < 1$ modeled with an SMC law, metallicities in the range 0.02 < Z/Z⊙ < 2.5, and a Salpeter IMF). The stellar masses were rescaled to a Chabrier IMF by subtracting 0.23 dex from log M (see [16,42]), and selected to have log(M/M⊙) > 10.6. In conclusion, the K20 ETGs sample contains 50 galaxies in the redshift range 0.26 < z < 1.16.
GOODS-S sample. Old passive ETGs were extracted from the GOODS-S field [43] combining morphological and photometric criteria based on optical color cuts as a function of redshift (for more details on the sample selection, see Ref. [44]). The spectra have been obtained from VVDS [45], VIMOS [46], and FORS2 [43,47]. Mass estimates have been taken from Ref. [48], where the adopted Salpeter IMF was rescaled to a Chabrier IMF as previously discussed. A mass cut of log(M/M ⊙ ) > 10.6 has been used, and all galaxies with emission lines have been excluded as in the other samples. With this approach, 46 galaxies were selected in the redshift range 0.67 < z < 1.35.
Cluster BCGs sample. This sample consists of ETGs of the X-ray selected clusters RX J0152.7-1357 at z = 0.83 [49], RDCS J1252.9-2927 at z = 1.24 [50], and XMMU J2235.3-2557 at z = 1.39 [51], which include their BCGs and other galaxies within 250 kpc radius from the center, with no detectable [OII]λ3727 emission line in their spectra. Stellar masses have been evaluated from the SED fitting assuming BC03 models, solar metallicity, delayed exponential SFHs, and a Salpeter IMF; the masses have been therefore rescaled to a Chabrier IMF as previously discussed. By selecting spectra with high signal-to-noise, we are left with 5 galaxies in the range 0.83 < z < 1.24, all with masses log(M/M ⊙ ) > 11.
GDDS sample. Within the GDDS [52], Ref. [53] analyzed the spectra (obtained with the GMOS multi-slit spectrograph) of 25 galaxies with measured D4000n and Hδ, in the range 0.6 < z < 1.2 and with masses $M > 10^{10.2}\,M_\odot$. Stellar masses have been derived from template fits to the multicolor photometry [54], assuming a Baldry et al. (2003) IMF [55]. Masses have been scaled by −0.03 dex to convert them to a Chabrier IMF [42,53]. ETGs have been selected to have a negligible specific Star Formation Rate (sSFR = SFR/M, $sSFR < 10^{-1}$ Gyr$^{-1}$), additionally applying a mass cut log(M/M⊙) > 10.6. In this way, 16 galaxies were selected in the redshift range 0.91 < z < 1.13.
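The IMF rescaling used for the subsamples can be sketched as a fixed offset in log mass applied before the mass cut. The offsets (−0.23 dex for Salpeter, −0.03 dex for Baldry et al. 2003) are the ones quoted in the text; the dictionary keys and function names are hypothetical.

```python
# Offsets (in dex) added to log10(M) to convert to a Chabrier IMF.
IMF_OFFSET_TO_CHABRIER_DEX = {
    "chabrier": 0.0,
    "salpeter": -0.23,
    "baldry2003": -0.03,
}


def to_chabrier_log_mass(log_mass, imf):
    """Rescale log10(M/Msun) from the given IMF to a Chabrier IMF."""
    return log_mass + IMF_OFFSET_TO_CHABRIER_DEX[imf]


def passes_mass_cut(log_mass, imf, min_log_mass=10.6):
    """Apply the mass cut on the Chabrier-rescaled stellar mass."""
    return to_chabrier_log_mass(log_mass, imf) > min_log_mass


# A galaxy with log M = 10.8 under a Salpeter IMF falls below the cut
# once rescaled (10.57), while the same value under Baldry et al. passes.
print(passes_mass_cut(10.8, "salpeter"), passes_mass_cut(10.8, "baldry2003"))
```

Applying the offset before the cut keeps the threshold comparable across surveys that assumed different IMFs.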
UDS sample. Based on selection from the UKIDSS Ultra Deep Survey (UDS, [56]), a spectroscopic survey was undertaken using the VIMOS and FORS2 spectrographs at the VLT (UDSz; [57]). From a spectroscopic sample of over 2000 galaxies, passive ETGs were identified using the criterion described in Ref. [58], with SFRs estimated using the rest-frame UV flux / [OII]λ3727 emission / 24 µm detections. The stellar mass estimates were made by fitting double-burst models. Using Charlot & Bruzual (2007), models were constructed as two sequential bursts, giving the galaxy an older and younger stellar subpopulation, assuming a Chabrier IMF. The models used for this analysis are the most dissimilar with respect to the ones used in the other surveys; however, recently Ref. [59] estimated the stellar masses of a sample of massive galaxies in the UDS survey with the same models, and comparing them with masses obtained with standard τ BC03 model found a mean difference of only ∼ 0.04 dex (see Fig. 3 of Ref. [59]). The ETGs have been selected applying a mass cut log(M/M ⊙ ) > 10.6. A redshift cut z < 1.4 has been applied, since at z ≥ 1.4 the D4000 n break is near the red edge of the optical spectra, ∼ 1µm, where the sky noise is too high and CCDs become transparent. In conclusion, this UDS sample contains 50 galaxies in the redshift range 1 < z < 1.4.
High-z sample. We decided to expand our sample of ETGs with three high redshift (z > 1.8) galaxies. Onodera

With this approach, we selected a final sample of 11324 old, passively evolving ETGs with no signatures of star formation or AGN activity at 0.15 < z < 1.42, and stellar masses in the range 10.6 < log(M/M⊙) < 12.15. The key feature of our sample is the combined selection of massive and passive galaxies. This sample selection has been chosen to provide ETGs that formed most of their stars at comparable epochs (i.e. with similar redshift of formation [15]), which means that they represent a homogeneous sample in terms of ages of formation, and can be used as "cosmic chronometers". Moreover, based on the stellar mass function [16] and clustering [63], it has been found that this population of galaxies experienced negligible major merging events in a large fraction of the redshift range considered in our study [64], i.e. they did not increase their mass significantly in the time elapsed between the redshifts adopted for the differential approach, meaning that the contamination of the initial population (with a given age of formation) is minimal.
We decided to treat separately the SDSS MG sample, the LRGs sample, and the sample at z > 0.4, where, due to the lower statistical power provided by the existing surveys, we chose to merge together all of the high-redshift samples. In Ref. [7] it has been shown that there is an evident dependence of the D4000 break on the stellar mass; therefore, to avoid spurious mass-dependent effects, we divided each subsample into two mass bins, as follows:
• for the SDSS MG sample, we defined a low mass range for 11 < log(M/M⊙) < 11.25 (5210 galaxies) and a high mass range for 11.25 < log(M/M⊙) < 11.5 (2733 galaxies, using the same mass ranges adopted in Ref. [7]);
• for the LRGs sample, we defined a low mass range for 11.65 < log(M/M⊙) < 11.9 (1410 galaxies) and a high mass range for 11.9 < log(M/M⊙) < 12.15 (1049 galaxies);
• for the z > 0.4 sample, we defined a low mass range for 10.6 < log(M/M⊙) < 11 (566 galaxies) and a high mass range for log(M/M⊙) > 11 (365 galaxies).
In this way, we ensure in each subsample a nearly constant mass as a function of redshift and we avoid mixing the redshift evolution characteristic of different mass regimes. In the sample at z > 0.4, we decided to consider also the mass range 10.6 < log(M/M ⊙ ) < 11 to extend the H(z) analysis with another mass bin, since the low statistics of this sample does not allow us to further divide the mass bin with log(M/M ⊙ ) > 11. Figure 2 shows the redshift and mass distributions of the overall sample; the redshift distribution in particular shows clearly the presence of structures in the redshift range 0.45 < z < 1. The D4000 n − z plots for all the samples separately are shown in Fig. 3.
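The mass binning described above can be written as a small lookup. The bin limits are those quoted in the text; the sample labels, the function name and the treatment of galaxies lying exactly on a boundary (assigned to the upper bin) are assumptions of this sketch.

```python
# log10(M/Msun) limits of the low and high mass bins for each subsample.
MASS_BINS = {
    "SDSS_MGS": {"low": (11.0, 11.25), "high": (11.25, 11.5)},
    "LRG": {"low": (11.65, 11.9), "high": (11.9, 12.15)},
    "HIGH_Z": {"low": (10.6, 11.0), "high": (11.0, float("inf"))},
}


def mass_bin(sample, log_mass):
    """Return 'low', 'high', or None if the galaxy is outside both bins."""
    for label, (lower, upper) in MASS_BINS[sample].items():
        if lower <= log_mass < upper:
            return label
    return None


print(mass_bin("SDSS_MGS", 11.1), mass_bin("LRG", 12.0), mass_bin("HIGH_Z", 11.4))
```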

The method: from spectra to ages
The aim of this section is to illustrate the method with which we used the 4000Å break as a proxy of the stellar age to estimate the expansion rate of the Universe, H(z).
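The 4000 Å break is a feature in galaxy spectra that was first introduced by Ref. [65] as the ratio between the continuum flux densities in a red band (4050-4250 Å) and a blue band (3750-3950 Å) around 4000 Å restframe:

$$D4000 = \frac{(\lambda_2^{\rm blue}-\lambda_1^{\rm blue})\int_{\lambda_1^{\rm red}}^{\lambda_2^{\rm red}} F_\nu\,d\lambda}{(\lambda_2^{\rm red}-\lambda_1^{\rm red})\int_{\lambda_1^{\rm blue}}^{\lambda_2^{\rm blue}} F_\nu\,d\lambda}$$

We decided to adopt the slightly different definition introduced by Ref. [23] (hereafter D4000n), where narrower bands (3850-3950 Å and 4000-4100 Å) have been used in order to be less sensitive to dust reddening.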
The amplitude of this feature, due to metal absorption lines, depends on the age and metallicity of the stellar population, as well as on the star formation history. However, within specific ranges of D4000 n , it has been demonstrated that its amplitude correlates linearly with the age of the galaxy and is weakly dependent on the star formation history if the stellar population is old and passively evolving [7].
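As an illustration of the narrow-band definition, the sketch below computes D4000n as the ratio of the band-averaged flux density in 4000-4100 Å to that in 3850-3950 Å. It assumes a rest-frame spectrum given as flux density per unit frequency, sampled on an ascending wavelength grid; the function names and the toy spectrum are hypothetical.

```python
def band_average(wavelengths, fnu, lam_min, lam_max):
    """Average of F_nu over [lam_min, lam_max] by trapezoidal integration.

    Only samples that fall inside the band are used (no interpolation at
    the band edges), which is adequate for finely sampled spectra.
    """
    pts = [(w, f) for w, f in zip(wavelengths, fnu) if lam_min <= w <= lam_max]
    if len(pts) < 2:
        raise ValueError("band is not covered by at least two samples")
    integral = sum(0.5 * (f0 + f1) * (w1 - w0)
                   for (w0, f0), (w1, f1) in zip(pts, pts[1:]))
    return integral / (pts[-1][0] - pts[0][0])


def d4000n(wavelengths, fnu):
    """Narrow-band 4000 A break: red (4000-4100 A) over blue (3850-3950 A)."""
    red = band_average(wavelengths, fnu, 4000.0, 4100.0)
    blue = band_average(wavelengths, fnu, 3850.0, 3950.0)
    return red / blue


# Toy step-function spectrum sampled every 1 A (not real data).
wave = [3800.0 + i for i in range(401)]
flux = [1.0 if w < 3975.0 else 1.8 for w in wave]
print(round(d4000n(wave, flux), 3))
```

A spectrum tabulated per unit wavelength would first need converting to per unit frequency (F_ν ∝ λ² F_λ) before applying this ratio.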

The expansion rate is

$$H(z) = \frac{\dot{a}}{a} = -\frac{1}{1+z}\,\frac{dz}{dt},$$

where a(z) = 1/(1 + z). M11 introduced an approximate linear relation between D4000n and galaxy age (at fixed metallicity):

$$D4000_n(Z, SFH) = A(Z, SFH)\cdot{\rm age} + B,$$

where A(Z, SFH) (in units of Gyr$^{-1}$) is the conversion factor between age and D4000n. This approximation has the considerable advantage that the relative D4000n evolution directly traces the age evolution of a population of galaxies:

$$\Delta D4000_n = A(Z, SFH)\,\Delta{\rm age}.$$

The Hubble parameter H(z) can then be rewritten as a function of the differential evolution of D4000n:

$$H(z) = -\frac{A(Z, SFH)}{1+z}\,\frac{dz}{dD4000_n}.$$

To estimate H(z) using the approach just discussed, it is therefore necessary to:
1. derive an observed D4000n − z mean relation, that will provide the quantity dz/dD4000n;
2. calibrate the D4000n-age relation with stellar population synthesis models and therefore quantify the A(Z, SFH) parameter;
3. estimate H(z), verifying the robustness of the results against the adopted choice of binning, stellar population synthesis model, SFH, and metallicity.
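The three steps above combine into H(z) = −A/(1+z) · dz/dD4000n, with A in Gyr⁻¹. A minimal finite-difference sketch follows; the slope A = 0.05 Gyr⁻¹ and the D4000n values are made-up illustrative inputs, not calibrations or measurements from this work.

```python
# 1 Gyr^-1 expressed in km s^-1 Mpc^-1 (km per Mpc / seconds per Gyr).
GYR_INV_TO_KMS_MPC = 3.0856775814913673e19 / 3.15576e16


def hubble_from_d4000n(z_low, z_high, d4000n_low_z, d4000n_high_z,
                       slope_a_gyr_inv):
    """Estimate H at the mean redshift from the D4000n difference of two bins.

    slope_a_gyr_inv is the conversion factor A (Gyr^-1) between age and
    D4000n, assumed here to be known from a model calibration.
    """
    dz = z_high - z_low
    dd4000n = d4000n_high_z - d4000n_low_z  # negative: weaker break at higher z
    z_mean = 0.5 * (z_low + z_high)
    h_gyr_inv = -slope_a_gyr_inv / (1.0 + z_mean) * dz / dd4000n
    return z_mean, h_gyr_inv * GYR_INV_TO_KMS_MPC


# Hypothetical median D4000n values in two adjacent redshift bins.
z_mean, h = hubble_from_d4000n(0.18, 0.22, 1.950, 1.929, slope_a_gyr_inv=0.05)
print(f"H(z={z_mean:.2f}) ~ {h:.1f} km/s/Mpc")
```

Because H scales linearly with A, any uncertainty in the calibrated slope (from metallicity, SFH or model choice) propagates directly into the H(z) estimate.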

The importance of the differential approach
One of the main challenges for the study of galaxies as a function of redshift, and even more in the analysis we are performing, is to ensure and validate the assumption that we are looking at the same population as a function of redshift, so that their properties may be meaningfully compared at different redshifts. Several issues may falsify this assumption. One of the most significant is "progenitor-bias", which refers to the issue that samples of ETGs at high redshift might be biased towards the oldest progenitors of present-day early-type galaxies [66,67], therefore not sampling the same population studied at intermediate redshifts. Another effect to be taken into account is the mass evolution of ETGs as a function of redshift, which, when comparing galaxies of the same mass, may alter the shape of the age−z relation. Finally, one also has to take into account the fact that, lacking a unique spectroscopic survey suitable to study ETGs evolution over a wide redshift range, we are forced to merge together information obtained from surveys with different selection criteria, photometry, mass estimates, systematics and so on. However, it is fundamental to emphasize that our method relies only on a differential measurement, since the estimate of H(z) is based only on the measurement of the quantity dD4000/dz. Therefore, each H(z) point is obtained not by comparing ETGs at z ∼ 0 with ETGs at z ∼ 1, so that the effects described above may play a significant role, but instead comparing points close in redshift, with ∆z ∼ 0.04 at z < 0.4 and ∆z ∼ 0.3 at z > 0.4, as is discussed in the next section. 
In terms of cosmic time, the previously quoted ∆z correspond to differences of ∼ 500 Myr for z < 0.4 and of ∼ 1.5 Gyr at z > 0.4, which are short times for potential effects due to merging or mass evolution: this helps in mitigating the problem of the mass evolution and also of the progenitor-bias, which typically can affect studies where the properties of distant ETGs are directly compared to those of nearby ones. Appendix A.1 discusses how we checked the reliability of our analysis against this effect and gives a quantitative estimate based on observational constraints; it also discusses how, given the present errors, these estimates have to be considered as upper limits, since our data are still compatible with not being biased by such an effect.
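The conversion from a redshift interval to a cosmic time interval can be checked numerically by integrating dt = dz / [(1+z) H(z)]. The sketch below assumes a flat ΛCDM model with H0 = 70 km s⁻¹ Mpc⁻¹ and Ωm = 0.3; these parameters and the example redshift intervals are illustrative choices, not the values adopted in the paper.

```python
import math

# (km/s/Mpc)^-1 expressed in Gyr (km per Mpc / seconds per Gyr).
INV_KMS_MPC_TO_GYR = 3.0856775814913673e19 / 3.15576e16


def hubble_flat_lcdm(z, h0=70.0, omega_m=0.3):
    """H(z) in km/s/Mpc for a flat LCDM model (assumed parameters)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def cosmic_time_interval_gyr(z_low, z_high, steps=1000):
    """Integrate dt = dz / ((1+z) H(z)) with the midpoint rule."""
    dz = (z_high - z_low) / steps
    total = 0.0
    for i in range(steps):
        z = z_low + (i + 0.5) * dz
        total += dz / ((1.0 + z) * hubble_flat_lcdm(z))
    return total * INV_KMS_MPC_TO_GYR


# Delta z ~ 0.04 around z ~ 0.2 and Delta z ~ 0.3 around z ~ 0.7.
print(f"{cosmic_time_interval_gyr(0.18, 0.22):.2f} Gyr")
print(f"{cosmic_time_interval_gyr(0.55, 0.85):.2f} Gyr")
```

Under these assumptions the two intervals come out at roughly 0.4 Gyr and 1.7 Gyr, of the same order as the values quoted in the text.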
We also emphasize that for massive and passive ETGs, the measured evolution in terms of mass and number density is observed to be less significant as compared to less massive galaxies. Ref. [16] shows that ETGs with log(M/M ⊙ ) > 11 are compatible with no evolution in number density from z ∼ 1 to z ∼ 0, while for less massive ones (10.7 < log(M/M ⊙ ) < 11) one observes an evolution roughly of a factor three over the same redshift range. This redshift range corresponds to a range of cosmic times five times longer than the range of cosmic time covered by our differential estimates.
Ref. [68] shows that the average mass in individual quiescent galaxies grows by a factor of ∼ 2 from z = 2 to z = 0, spanning again a range of cosmic time ∼ 7 times longer than the range used in our analysis.
Moreover, we are also treating the different ETG samples (SDSS MGS, LRGs and "z > 0.4" ETG samples) separately, estimating H(z) only within each sample. In this way, the differences between the various samples are less of an issue, as long as the homogeneity of selection criteria, mass measurement and systematics treatment is guaranteed within each sample and the redshift evolution is estimated independently within each subsample. What is important is simply to ensure a uniform sampling where the differences are taken; a globally uniform sample (e.g. in terms of mass ranges and absolute ages) is not required, as these differences would just produce constant systematic offsets that drop out when evaluating a differential quantity.

The observed D4000 n − z relation
The observed D4000 n − z relation for the entire ETG sample is shown in the upper panel of Fig. 6. Each orange and green point at z < 1.5 indicates the median value of D4000 n in a given redshift bin for a given mass range, green for the lower mass range and orange for the higher mass range as described in section 2. The points in the gray shaded area represent the single D4000 n measurements of Ref. [60][61][62]. The choice of the redshift bin width has been made taking into account the difference between the SDSS MGS subsample (z < 0.3), LRG subsample (0.3 < z < 0.4), and the other subsamples (z > 0.4).
Due to its high statistics, the adopted bin width for the SDSS MGS subsample is ∆z = 0.02, with each bin including 1000 galaxies for the low mass bin and ∼ 500 for the high mass bin. For the LRG subsample, which has much lower statistics, we have used a wider binning to keep the number of galaxies per bin sufficiently high, with ∆z = 0.05 and each bin having 500 galaxies both for the high and the low mass bins. For the other subsamples at z > 0.4, an adaptive redshift binning centered around the structures present in the redshift distribution has been used (see Fig. 2), with wider bins where there were no structures. In this way, it was also possible to obtain an almost constant number of galaxies per bin, with N gal ∼ 70 for the low mass bin and N gal ∼ 50 for the high mass bin. The size of the redshift bins is fundamental, since too wide a redshift binning will produce a poorly sampled H(z), and too narrow a redshift binning will produce oscillations in the D4000 n − z relation.
In each redshift bin, the median D4000 n was then derived, separately for each mass regime. The associated errors are standard deviations on the median, defined as the "median absolute deviation" (MAD, MAD = 1.482 · median(|D4000 n − median(D4000 n )|)) divided by √N, i.e. σ med (D4000 n ) = MAD/√N (see Ref. [69]). As discussed above, the D4000 n is a feature that depends both on age and metallicity, and less significantly on other parameters such as the assumed SFH, IMF or α−enhancement (< 10%, see Sect. 3.3 and Appendix A). It is therefore important to obtain information about the metallicity of our ETGs. For the SDSS MGS ETG sample, where the signal-to-noise ratio and the wavelength coverage of the spectra allow us to estimate the metallicity, the median metallicities Z/Z ⊙ and their errors σ med (Z/Z ⊙ ) have been derived. On average, we find a slightly super-solar metallicity, with a mean value close to Z/Z ⊙ = 1.1 that is almost constant, only slightly decreasing with redshift. For the higher-redshift samples, where metallicity estimates are not available, we made the conservative choice of assuming a metallicity Z/Z ⊙ = 1.1 ± 0.1, because this range largely contains all the possible median values of the metallicity found in our SDSS MGS sample, also considering larger mass cuts.
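The error on the median defined above can be sketched in a few lines. The function name and the input values are illustrative placeholders, not measurements from the sample.

```python
import math
import statistics

def median_and_error(values):
    """Return (median, sigma_med) with sigma_med = MAD / sqrt(N),
    where MAD = 1.482 * median(|x - median(x)|), as defined in the text."""
    values = list(values)
    med = statistics.median(values)
    mad = 1.482 * statistics.median([abs(v - med) for v in values])
    return med, mad / math.sqrt(len(values))

# Illustrative D4000_n values for one redshift bin (not real data).
d4000_bin = [1.70, 1.75, 1.80, 1.85, 1.90]
d4000_med, d4000_err = median_and_error(d4000_bin)
print(d4000_med, d4000_err)
```

The factor 1.482 makes the MAD comparable to a Gaussian standard deviation, so σ med plays the role of a robust standard error.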
The assumption of assigning to ETGs at z > 0.3 the same range of metallicity observed in ETGs at z < 0.3 is well justified by the evolutionary scenario of the most massive ETGs; ETGs should not show a significant metallicity evolution because the vast majority of their stars were already formed at higher redshifts (z > 2 − 3) and most of their gas was consumed, hence not allowing significant changes of the metallicity with respect to z ∼ 0 (e.g. see [70]). This picture is also supported by direct measurements of solar to slightly super-solar metallicities in massive ETGs up to z ∼ 1 (e.g. see [71,72]) and also at higher redshifts (z ∼ 2, [73]). As a consistency check, in Fig. 1 we overplotted on a stacked spectrum of high-z ETGs a BC03 spectrum with delayed τ SFH (τ = 0.1 Gyr), solar metallicity and an age of 2.5 Gyr; the model spectrum has been convolved to a velocity dispersion of 250 km s −1 , typical of the ETGs considered. The figure shows the good agreement between the model and the observed spectra. Table 2 shows, for each redshift bin, the median D4000 n values and their errors, the median metallicity Z/Z ⊙ values and their errors, the median masses and the number of galaxies analyzed.

Table 2. The median D4000n, metallicity Z/Z⊙ and log(M/M ⊙ ) as a function of redshift, and relative uncertainties (σ med (D4000n, Z/Z⊙) = MAD/√N, see text). Columns: z, D4000 n , σ med (D4000 n ), Z/Z ⊙ *, σ med (Z/Z ⊙ ), log(M/M ⊙ ), # gal. * The metallicity has been estimated only for the SDSS MGS ETG sample, where the signal-to-noise ratio of the spectra was high enough; in the z > 0.3 sample, a metallicity Z/Z⊙ = 1.1 ± 0.1 has been assumed.

The calibration of the D4000 n -age relation
This section describes how the relations between D4000 n and age were derived. First, in order to mitigate uncertainties due to the choice of the stellar population synthesis model, we adopted two independent libraries of synthetic spectra: the BC03 models [24] and the new MaStro models [74]. These two models differ substantially in the stellar evolutionary models used to construct the isochrones, in the treatment of the thermally pulsing asymptotic giant branch (TP-AGB) phase, and in the procedure used for computing the integrated spectra (e.g. see [75,76] for more detailed reviews). The two models are also based on independent libraries of stellar spectra, with MaStro using the latest MILES models [77] and BC03 using STELIB [78]. The metallicities provided by the two models are Z/Z ⊙ = [0.4, 1, 2.5] for BC03 and Z/Z ⊙ = [0.5, 1, 2] for MaStro. The resolution of the two models is similar, 3Å across the wavelength range from 3200Å to 9500Å for BC03, and 2.54Å across the wavelength range 3525Å to 7500Å for MaStro. For both models, the D4000 n -age relations have been derived for four different SFHs, using an exponentially delayed SFR(t) ∝ t/τ^2 exp(−t/τ) with τ = [0.05, 0.1, 0.2, 0.3] Gyr. The grid of τ has been chosen considering that the SFHs adopted must be compatible with the observed SEDs and spectra of passive ETGs. From analysis of the SDSS MGS ETG sample, Ref. [7] found that the τ distribution presents a median value below 0.2 Gyr for all of the mass subsamples, and that τ ≤ 0.3 Gyr is required for the majority of the observed SEDs. The same applies to galaxies at z > 0.3; e.g. the SED-fit analysis of zCOSMOS 20k ETGs finds, allowing τ to vary freely up to 1 Gyr, a median value of τ = 0.3 Gyr. This result is consistent with several other studies of ETGs [79][80][81], for which it has been found that the majority of massive field and cluster ETGs formed the bulk of their stellar mass at z ≳ 2 over short (i.e. τ < 0.1 − 0.3 Gyr) star formation time-scales.
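To give a feeling for how short these star formation time-scales are, the sketch below evaluates the delayed SFH and the fraction of the final stellar mass formed after 1 Gyr for the adopted grid of τ. It is an illustration only: mass loss and recycling are ignored, the 1 Gyr reference time is an arbitrary choice, and the function names are hypothetical.

```python
import math

def delayed_sfr(t, tau):
    """Delayed SFR(t) = t / tau^2 * exp(-t / tau), normalised to unit total mass."""
    return t / tau ** 2 * math.exp(-t / tau)

def mass_fraction_formed(t, tau):
    """Analytic integral of delayed_sfr from 0 to t: 1 - (1 + t/tau) exp(-t/tau)."""
    x = t / tau
    return 1.0 - (1.0 + x) * math.exp(-x)

# Fraction of stars formed within the first 1 Gyr, for the grid used in the text.
fractions = {tau: mass_fraction_formed(1.0, tau) for tau in (0.05, 0.1, 0.2, 0.3)}
for tau, frac in fractions.items():
    print(f"tau = {tau:.2f} Gyr: {100 * frac:.1f}% of the mass formed by t = 1 Gyr")
```

Even for the longest time-scale in the grid, most of the stellar mass is in place within the first Gyr, which is why these SFHs describe essentially passive evolution afterwards.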
Since we want to ensure the validity of the linear approximation in the relations between D4000 n and the stellar age, we found that it is convenient to divide them into two regimes of D4000 n values which span the D4000 n range observed in the ETG spectra of our sample:
• the low D4000 n is defined for 1.65 < D4000 n < 1.8
• the high D4000 n is defined for 1.8 < D4000 n < 1.95
The D4000 n -age relations from BC03 and MaStro models derived independently for these two regimes are shown in Fig. 4. These figures also show the dependence on metallicity and SFH (for the adopted range of τ ).
A best-fit slope A(Z, SFH) has been estimated for each D4000 n regime. The linear approximation is valid and accurate for all the regimes and metallicities, having in the case of BC03 models linear correlation coefficients > 0.996 with a mean value of 0.9987 ± 0.0004 for the "high D4000 n " regime and > 0.966 with a mean value of 0.990 ± 0.003 for the "low D4000 n " regime. In the case of MaStro models the linear correlation coefficients are always > 0.997 with a mean value of 0.9984 ± 0.0003 for the "high D4000 n " regime and > 0.986 with a mean value of 0.995 ± 0.001 for the "low D4000 n " regime.
In the two D4000 n ranges, a mean slope for each metallicity A(Z) has then been obtained by averaging the slopes A(Z, SF H) obtained for the four SFHs, and considering the dispersion between these measurements as the associated error. Those values are listed in Tab. 3, and shown in Fig. 5.
For a fixed metallicity, the similarity of the individual and mean slopes, as well as the small errors associated with the mean slopes A(Z), confirm that, for τ ≤ 0.3 Gyr, the dependence of the D4000 n -age relation on the SFH is negligible with respect to the dependence on metallicity. We also checked the effects of extending the grid up to τ = 0.6 Gyr, and we found that the results on the slopes A(Z) are still consistent within 1σ. If we adopt a different functional shape of the SFH (e.g. a declining exponential SFR(t) ∝ 1/τ exp(−t/τ)) the results on A(Z) are compatible within 0.5σ with the ones obtained with the exponentially delayed SFHs.

Table 3. Mean slopes A(Z) of the D4000 n -age relation in the two D4000 n regimes, for BC03 and MaStro models.

                              high D4000 n          low D4000 n
BC03    A(Z/Z ⊙ = 0.4)        0.02893 ± 0.00004     0.032 ± 0.001
BC03    A(Z/Z ⊙ = 1)          0.060 ± 0.001         0.145 ± 0.003
BC03    A(Z/Z ⊙ = 2.5)        0.193 ± 0.002         0.27 ± 0.07
MaStro  A(Z/Z ⊙ = 0.5)        0.0299 ± 0.0002       0.0321 ± 0.0001
MaStro  A(Z/Z ⊙ = 1)          0.065 ± 0.001         0.106 ± 0.002
MaStro  A(Z/Z ⊙ = 2)          0.138 ± 0.02          0.21 ± 0.02

Following the approach of M11, for each D4000 n regime the obtained values of the slopes (quoted in Tab. 3) have been interpolated with a quadratic function, shown in Fig. 5. When the metallicity is known, these relations allow one to associate the correct A(Z) parameter to each given metallicity.
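The quadratic interpolation can be illustrated with the central values of Tab. 3. The sketch below builds the unique parabola through the three (Z/Z ⊙ , A) points of the BC03 "high D4000 n " regime and evaluates it at Z/Z ⊙ = 1.1. It ignores the quoted uncertainties, the function name is hypothetical, and it is not claimed to reproduce the exact fitting procedure used for Fig. 5.

```python
def quadratic_through(points):
    """Return the parabola passing through three (x, y) points (Lagrange form)."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def f(x):
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

    return f

# Central values from Tab. 3: BC03, high D4000_n regime, as (Z/Zsun, A).
BC03_HIGH = [(0.4, 0.02893), (1.0, 0.060), (2.5, 0.193)]

a_bc03_high = quadratic_through(BC03_HIGH)
a_at_1p1 = a_bc03_high(1.1)  # slope at the metallicity assumed for z > 0.3
print(f"A(Z/Zsun = 1.1) = {a_at_1p1:.4f}")
```

With only three metallicities per model, a quadratic passes exactly through the tabulated slopes, so the interpolation adds no freedom beyond the table itself.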

The estimate of H(z)
The estimate of H(z) is based on two quantities (eq. 3.5):
• the relative D4000 n evolution as a function of redshift
• the parameter A(Z) for a given metallicity
The quantity dz/dD4000 n has been calculated directly from the median D4000 n − z relation shown in the upper panel of Fig. 6, estimating the ∆D4000 n between the i-th and the (i + N )-th point for each mass bin. The choice of N was the result of a trade-off analysis between two competing effects: on the one hand, we want N as small as possible to maximize the number of H(z) measurements; on the other hand, N has to be large enough to have a D4000 n evolution larger than the statistical scatter present in the data, in order to ensure an unbiased estimate of H(z). For the SDSS MGS sample, where we have four median D4000 n points, we chose N = 2, to have two mutually independent estimates of H(z); for the LRG sample, having only two D4000 n points, N has to be equal to one. For the z > 0.4 sample, the analysis indicated N = 3 as the best compromise, being the smallest value ensuring a redshift evolution larger than the scatter in the data. This choice provides a redshift leverage between the i-th and the (i + N )-th point of ∆z ≈ 0.04 for the SDSS MGS sample and the LRG sample, and of ∆z ≈ 0.3 for the z ≥ 0.4 samples, which corresponds to a difference in cosmic time of ∼ 500 Myr and of ∼ 1.5 Gyr respectively.
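A minimal sketch of the differential estimate follows. It assumes that eq. 3.5 has the form H(z) = −A(Z)/(1 + z) · dz/dD4000 n , which follows from H(z) = −(1 + z)^−1 dz/dt combined with a linear relation D4000 n = A · age + B; it also assumes that A is expressed in Gyr −1 and that the measurement is assigned to the midpoint redshift. The input numbers are placeholders, not values from Tab. 2.

```python
# 1 Gyr^-1 corresponds to about 977.8 km/s/Mpc (unit conversion).
GYR_INV_TO_KM_S_MPC = 977.8

def hubble_from_d4000(z_i, d_i, z_j, d_j, slope_a):
    """Differential H(z) estimate between two median points of the D4000_n - z relation.

    z_i, d_i : redshift and median D4000_n of the i-th point
    z_j, d_j : redshift and median D4000_n of the (i+N)-th point
    slope_a  : A(Z) in Gyr^-1 (assumed units)
    Returns (z_eff, H) with H in km/s/Mpc.
    """
    dz_dd = (z_j - z_i) / (d_j - d_i)   # finite-difference dz/dD4000_n
    z_eff = 0.5 * (z_i + z_j)           # midpoint redshift (an assumption)
    h = -slope_a / (1.0 + z_eff) * dz_dd * GYR_INV_TO_KM_S_MPC
    return z_eff, h

# Placeholder inputs: D4000_n decreasing with redshift, dz = 0.04.
z_eff, h_est = hubble_from_d4000(0.18, 1.900, 0.22, 1.870, slope_a=0.066)
print(f"H(z = {z_eff:.2f}) ~ {h_est:.1f} km/s/Mpc")
```

Because D4000 n decreases with redshift, dz/dD4000 n is negative and the minus sign yields a positive H(z).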
For the high-z sample, the differential D4000 n evolution has been estimated between a median value of D4000 n and the D4000 n value of the individual galaxies, i.e. respectively the third-to-last point with the galaxy of Ref. [60], the second-to-last point with the galaxy of Ref. [61], and the last point with the galaxy of Ref. [62].
For the choice of A(Z) needed in eq. 3.5, as described in the previous section, the D4000 n − z relation has been divided into two parts: if the median D4000 n value is greater than 1.8, then the A(Z) values are taken from the high D4000 n regime, while if the median D4000 n value is smaller than 1.8, then the A(Z) values obtained from the low D4000 n regime are used. In the case in which the two points used to estimate ∆D4000 n lie one in the first regime and one in the other, a median relation between the high D4000 n and the low D4000 n regimes has been adopted.
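The regime selection can be written as a small helper. This is a sketch: the function name is hypothetical, and the "median relation" for the mixed case is interpreted here as the average of the two slopes (the median of two values), which is an assumption about the authors' procedure.

```python
def slope_for_pair(d_first, d_second, a_high, a_low, boundary=1.8):
    """Pick the A(Z) value to use for a pair of median D4000_n points.

    Both points above the boundary -> high-regime slope.
    Both points below the boundary -> low-regime slope.
    One on each side               -> average of the two (assumed meaning
                                      of the "median relation").
    """
    first_high = d_first > boundary
    second_high = d_second > boundary
    if first_high and second_high:
        return a_high
    if not first_high and not second_high:
        return a_low
    return 0.5 * (a_high + a_low)

# Central MaStro slopes at Z/Zsun = 1 from Tab. 3.
A_HIGH, A_LOW = 0.065, 0.106
a_both_high = slope_for_pair(1.92, 1.85, A_HIGH, A_LOW)
a_both_low = slope_for_pair(1.78, 1.70, A_HIGH, A_LOW)
a_mixed = slope_for_pair(1.83, 1.77, A_HIGH, A_LOW)
print(a_both_high, a_both_low, a_mixed)
```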
For the metallicity, we distinguish two cases: the SDSS MGS subsample at z < 0.3, where Z is known for each ETG, and the rest of the sample at z > 0.3 where the metallicity is unknown due to the limited signal-to-noise ratio and/or the limited wavelength coverage of the spectra. For the SDSS MGS ETGs, we adopted the observed median metallicity for each redshift bin, as quoted in Tab. 2. For the ETGs at z > 0.3, as discussed in section 3.2, we assumed a median metallicity Z/Z ⊙ = 1.1 ± 0.1. Thus, this metallicity range ∆Z enters as an additional uncertainty in the H(z) error budget (see next section).
Given these metallicities, A(Z) is obtained from the interpolated A = f (Z) relation described in section 3.3 and shown in Fig. 5. The detailed procedure is described in the following section, where we also discuss how the metallicity uncertainties (as well as the SFH uncertainties) are treated in our error estimate.
The measurements of H(z) have proven to be extremely robust even when changing between completely different stellar population synthesis models: performing the analysis separately with the MaStro and the BC03 models, the values obtained are in agreement with a mean difference of 0.5 ± 0.4σ, except for the last point where there is a difference of 1.6σ. The results will be discussed in section 5, and the lower panel of Fig. 6 shows the comparison of the H(z) measurements for the two models.

H(z) error budget
There are two main sources of error in the H(z) estimate: a statistical error related to the computation of dz/dD4000 n , which depends on the median D4000 n errors, and a systematic error related to the estimate of A(Z), which depends on the metallicity range spanned by the data, on the SFH assumptions, as well as on the adopted stellar population synthesis model.

Statistical error
In the previous section we described the method used to estimate the relative D4000 n evolution, and the statistical error σ stat has been obtained with standard error propagation.
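The explicit expression is not written out in the text; one plausible first-order form of this propagation is sketched below. It assumes that the errors on the two median D4000 n points are independent, that the redshift uncertainty is negligible, and that A(Z) is treated as exact at this stage (its uncertainty enters the systematic error instead).

```latex
\[
\Delta D = D4000_{n}^{(i+N)} - D4000_{n}^{(i)}, \qquad
\sigma_{\Delta D} = \sqrt{\sigma_{\mathrm{med},\,i}^{2} + \sigma_{\mathrm{med},\,i+N}^{2}}, \qquad
\frac{\sigma_{\mathrm{stat}}}{H(z)} \simeq \frac{\sigma_{\Delta D}}{|\Delta D|}
\]
```

Since H(z) scales as 1/∆D, the relative error on H(z) equals the relative error on ∆D, which is why N must be large enough for ∆D to exceed the scatter.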
Systematic error
To estimate the systematic uncertainty, we consider two effects.

1. SFH contribution
The spread of the A = f (Z) relation due to the SFH assumption causes, at the typical metallicity of our ETGs (Z/Z ⊙ ∼ 1.1), an error on the estimate of A(Z) of about 13% in the high and low D4000 n regimes for the MaStro models, and of ∼2% and ∼13% respectively in the high D4000 n and low D4000 n regimes for the BC03 models. This uncertainty is shown, as an example, as the red shaded region in Fig. 5.

2. Metallicity contribution
To consider also the uncertainty due to metallicity, we have associated to each H(z) measurement three possible values of A(Z), representing the minimum, the median, and the maximum value of A(Z) allowed by the metallicity range. As shown in Fig. 5, the minimum value of A(Z) has been obtained by considering the lowest metallicity in a given redshift bin (i.e. Z/Z ⊙ − σ med (Z/Z ⊙ ), see Tab. 2) and using the lowest possible A = f (Z) interpolated relation; the median A(Z) value has been obtained by considering the median metallicity (i.e. Z/Z ⊙ ) and the best-fit interpolated relation; the maximum A(Z) value has been obtained by considering the highest metallicity in a given redshift bin (i.e. Z/Z ⊙ + σ med (Z/Z ⊙ )) and using the highest possible A = f (Z) interpolated relation. In this way, to the previous uncertainty due to the SFHs assumption, we add an uncertainty on A(Z) that depends specifically on the range of metallicity considered (e.g. for the range of metallicity of the z > 0.3 samples, it is ∼20% and ∼22% for the high and low D4000 n regimes with MaStro, and ∼20% and ∼19% for the high and low D4000 n regimes with BC03). The total uncertainty due to the SFHs and metallicity is represented by the black shaded region of Fig. 5.
For each of the three values of A(Z) described in point (2.) we estimate a value of H(z), and the dispersion between these measurements quantifies the systematic error σ syst .
Each H(z) measurement is, then, estimated as the weighted mean of the H(z) values obtained with the three values of A(Z), and the total error on H(z) has been obtained by summing in quadrature the statistical error σ stat and the systematic error σ syst .
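The combination step can be sketched as follows. The weights of the mean are not specified in the text, so equal weights are assumed here, and the "dispersion" is taken to be the sample standard deviation; both are assumptions, and the numbers are placeholders.

```python
import math
import statistics

def combine_h_estimates(h_min, h_med, h_max, sigma_stat):
    """Combine the H(z) values obtained with the minimum, median and maximum A(Z).

    Returns (H, sigma_syst, sigma_tot), where sigma_syst is the dispersion of
    the three values and sigma_tot adds sigma_stat and sigma_syst in quadrature.
    """
    values = [h_min, h_med, h_max]
    h_central = statistics.mean(values)       # equal weights (assumption)
    sigma_syst = statistics.stdev(values)     # sample standard deviation (assumption)
    sigma_tot = math.hypot(sigma_stat, sigma_syst)
    return h_central, sigma_syst, sigma_tot

# Placeholder values in km/s/Mpc, not taken from Tab. 4.
h_comb, s_syst, s_tot = combine_h_estimates(60.0, 70.0, 80.0, sigma_stat=5.0)
print(h_comb, s_syst, s_tot)
```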

Results and cosmological implications
The Hubble parameter H(z) has been estimated as described in section 4 separately for the SDSS MGS sample, the LRG sample, and the samples at z > 0.4; for both BC03 and MaStro models, the estimates have been obtained for each of the mass ranges described in section 2. We find that the H(z) estimates obtained for the two different mass ranges show good agreement, therefore providing strong evidence that this approach is not dependent on the chosen mass range. Since these estimates are statistically independent, the H(z) points at the same redshift have been averaged with a weighted mean, using as weights the corresponding error of each measurement.
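A common way to implement such an average is the inverse-variance weighted mean, sketched below. The text does not state the exact weighting scheme, so the 1/σ^2 weights are an assumption, and the input values are placeholders.

```python
def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its uncertainty.

    Weights are w_k = 1 / sigma_k^2 (assumed weighting scheme).
    """
    weights = [1.0 / e ** 2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, wsum ** -0.5

# Placeholder H(z) estimates from the low and high mass bins at one redshift.
h_avg, h_avg_err = weighted_mean([70.0, 80.0], [5.0, 10.0])
print(h_avg, h_avg_err)
```

With these weights the more precise mass bin dominates the average, and the combined error is smaller than either input error.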
The results, obtained using BC03 and MaStro stellar population synthesis models, are compared in the bottom panel of Fig. 6, separately shown in the upper and lower panels of Fig. 7, and reported in Tab. 4. In these tables we report for each measurement the statistical error σ stat and the systematic error σ syst ; we also report the total error, estimated by summing in quadrature the statistical and the systematic error, with its absolute and percentage value. The local value of H 0 shown refers to the recent measurement of Ref. [6], with H 0 = 73.8 ± 2.4 km s −1 Mpc −1 obtained including both statistical and systematic errors. After averaging the H(z) estimates obtained from the two mass ranges, we obtained a fractional error on H(z) at z < 0.3 of the order of ∼ 5%, which is close to the value estimated locally in M11; we recall that Ref. [12] achieved an accuracy of only ∼ 15% at z < 0.3.

Figure 6. The D4000n-redshift relation and H(z) measurements. The D4000n feature measured from different surveys is shown in the upper panel as a function of redshift. Grey symbols indicate the D4000n values for individual galaxies; green and orange symbols indicate the median value in each redshift bin, respectively for the low and the high mass subsamples (as described in Sect. 2). The H(z) values have been estimated between the i-th and the (i + N )-th points as described in Sect. 4, so as not to be biased by the statistical scatter of the data; the lower panel shows the results for the BC03 (in cyan) and MaStro (in violet) stellar population synthesis models and their 1σ uncertainties; H(z) estimates obtained with MaStro models have been slightly offset in redshift for the sake of clarity. The solid point at z = 0 represents the measurement of Ref. [6]. As a comparison, we also show the H(z) relation for the ΛCDM model, assuming a flat WMAP 7-years Universe [56], with Ωm = 0.27, ΩΛ = 0.73 and H0 = 73.8 km s −1 Mpc −1 .
Furthermore, our derived H(z) values are fully compatible with ΛCDM, constraining the expansion rate very tightly.
At higher redshifts, the error increases because of the smaller number of observed galaxies in the samples, but we still obtain a typical accuracy of 12-13% on H(z) up to z ∼ 1, which improves on current measurements in this redshift range by a factor of 2 − 3. The redshift range 0.5 < z < 1 is critical to disentangle many different cosmologies, as can be seen in the upper and lower panels of Fig. 7.
The comparison between the measurements obtained with BC03 and MaStro models is shown in the lower panel of Fig. 6. Even though the models are based on completely different stellar libraries and evolutionary synthesis codes, the H(z) measurements show very good agreement: we conclude that our results are basically independent of the assumed stellar population synthesis model. In particular, the present work goes beyond the present state of the art [12] in the following aspects: the use of different samples which give consistent results, ensuring that the sample selection does not introduce any systematic error; the analysis performed with two different stellar models, which provide compatible results; the use of the D4000 feature, which is less model dependent and more robust than the approach of e.g. Ref. [12]; the control of the mass dependence systematics; the homogeneous coverage of the full redshift interval, and particularly the cosmic time between 5 and 8 Gyr ago, which is the crucial cosmic time when the Universe changes from deceleration to acceleration; the higher accuracy across the entire redshift range (5-12%, including systematic errors).
In the upper and lower panels of Fig. 7, our H(z) measurements are compared with different theoretical models; we explored six different scenarios:
• Ω Λ,0 = 0, Ω m,0 = 1 (i.e. Einstein-de Sitter, EdS hereafter)
The first two models have been chosen to see how our measurements compare to the standard EdS and ΛCDM models. The third model corresponds to the one discriminating between an accelerating and a non-accelerating Universe. We then considered three quintessence models (for a detailed discussion about theoretical models, see also Refs. [84,85]): the first one assumes a constant w q = −1.5, and the other two are designed in such a way that they provide the same luminosity distance as the ΛCDM model at the 1% level. Thus it would be nearly impossible to distinguish them from ΛCDM with integrated measurements from standard candles, angular BAO or gravitational lensing measurements. The linear and quadratic models are somewhat ad hoc, but they can be accommodated in more physically motivated models of quintessence. The models assume a flat Universe, Ω m,0 = 0.27, and Ω Λ,0 = 0.73 as pointed out by the latest WMAP 7-years results [86]; the Hubble constant H 0 has been chosen as the value that minimizes the chi squared with respect to the data points, assuming H 0 in the range allowed by Ref. [6], H 0 = 73.8 ± 2.4 km s −1 Mpc −1 .

Table 4. H(z) measurements (in units of [km s −1 Mpc −1 ]) and their errors; the columns in the middle report the relative contribution of statistical and systematic errors, and the last ones the total error (estimated by summing in quadrature σstat and σsyst). These values have been estimated respectively with BC03 and MaStro stellar population synthesis models. This dataset can be downloaded at the address http://www.astronomia.unibo.it/Astronomia/Ricerca/Progetti+e+attivita/cosmic chronometers.htm (alternatively http://start.at/cosmicchronometers).
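For reference, the theoretical H(z) of flat models with an arbitrary equation of state w q (z) can be evaluated numerically as sketched below. The functional forms of the linear and quadratic w q (z) are the ones quoted in the text; the dark-energy density evolution is the standard expression ρ(z)/ρ(0) = exp(3 ∫ (1 + w)/(1 + z') dz'), and the function names, integration scheme and default H 0 are illustrative choices.

```python
import math

def dark_energy_factor(z, w_of_z, steps=2000):
    """rho_DE(z) / rho_DE(0) = exp(3 * int_0^z (1 + w(z')) / (1 + z') dz')."""
    if z == 0:
        return 1.0
    dz = z / steps
    integral = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz  # midpoint rule
        integral += (1.0 + w_of_z(zm)) / (1.0 + zm) * dz
    return math.exp(3.0 * integral)

def hubble(z, w_of_z, h0=73.8, omega_m=0.27):
    """H(z) in km/s/Mpc for a flat Universe with matter plus dark energy w(z)."""
    omega_de = 1.0 - omega_m
    e2 = omega_m * (1.0 + z) ** 3 + omega_de * dark_energy_factor(z, w_of_z)
    return h0 * math.sqrt(e2)

def hubble_eds(z, h0=73.8):
    """Einstein-de Sitter: Omega_m = 1, no dark energy."""
    return h0 * (1.0 + z) ** 1.5

models = {
    "LCDM": lambda z: -1.0,
    "w = -1.5": lambda z: -1.5,
    "linear": lambda z: -1.0 + 0.8 * z,
    "quadratic": lambda z: -1.3 + (z - 1.0) ** 2,
}

for name, w in models.items():
    print(f"{name:10s} H(z = 1) = {hubble(1.0, w):.1f} km/s/Mpc")
print(f"{'EdS':10s} H(z = 1) = {hubble_eds(1.0):.1f} km/s/Mpc")
```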
This represents a direct measurement of H(z) without assuming any cosmological model. The observed Hubble parameter H(z) has been compared with the theoretical relation with a standard χ 2 formalism. These results allow us to discard an EdS model at more than 7σ, independent of the assumed stellar population synthesis model. From a purely observational point of view, it gives a direct 6σ evidence of an accelerated expansion with both BC03 and MaStro models, confirming the results obtained with other probes (e.g. [1,2]). Concerning the other models, the one that best fits the data is ΛCDM for both BC03 and MaStro, while BC03 measurements show some tension with the model with w q (z) = −1.3 + (z − 1) 2 (at ∼ 2σ), and MaStro measurements show some tension with the model with w q (z) = −1 + 0.8 · z (at ∼ 1.5σ).
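The χ 2 comparison, with H 0 restricted to the range allowed by Ref. [6], can be sketched as follows. Since the values of Tab. 4 are not reproduced here, the example uses synthetic points generated from the ΛCDM model itself with 10% errors; they are not the measured H(z) values, and the resulting χ 2 values carry no physical meaning. Function names and the H 0 grid are illustrative.

```python
import math

def h_lcdm(z, h0, omega_m=0.27):
    """Flat LCDM H(z) in km/s/Mpc."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def h_eds(z, h0):
    """Einstein-de Sitter H(z) in km/s/Mpc."""
    return h0 * (1.0 + z) ** 1.5

def chi_square(z, h_obs, sigma, model):
    """Standard chi^2 between observed H(z) and a model callable model(z)."""
    return sum(((h - model(zi)) / s) ** 2 for zi, h, s in zip(z, h_obs, sigma))

def best_fit_h0(z, h_obs, sigma, model, h0_lo=71.4, h0_hi=76.2, n_grid=481):
    """Grid search for the H0 in [h0_lo, h0_hi] that minimises chi^2.

    The default range is 73.8 +/- 2.4 km/s/Mpc, as quoted from Ref. [6].
    """
    grid = [h0_lo + i * (h0_hi - h0_lo) / (n_grid - 1) for i in range(n_grid)]
    chi2 = [chi_square(z, h_obs, sigma, lambda zz, h0=h0: model(zz, h0))
            for h0 in grid]
    i_best = min(range(n_grid), key=chi2.__getitem__)
    return grid[i_best], chi2[i_best]

# SYNTHETIC data drawn exactly from LCDM (H0 = 73.8), for illustration only.
z_syn = [0.2, 0.4, 0.6, 0.8, 1.0]
h_syn = [h_lcdm(zi, 73.8) for zi in z_syn]
s_syn = [0.1 * h for h in h_syn]

h0_lcdm, chi2_lcdm = best_fit_h0(z_syn, h_syn, s_syn, h_lcdm)
h0_eds, chi2_eds = best_fit_h0(z_syn, h_syn, s_syn, h_eds)
print(f"LCDM: H0 = {h0_lcdm:.2f}, chi2 = {chi2_lcdm:.3g}")
print(f"EdS : H0 = {h0_eds:.2f}, chi2 = {chi2_eds:.3g}")
```

Bounding H 0 by the local measurement prevents a model such as EdS from compensating its steeper H(z) with an arbitrarily low normalization.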
The comparison with evolving w q shows that in principle this method has the possibility to discriminate models that produce the same luminosity distance, which would not be possible using SNe, even with a future SN survey. We emphasize that in order to obtain these measurements, we used no single dedicated survey. This means that if a survey provided at z > 0.3 statistics comparable to those obtained in the SDSS MGS, it would in theory be possible to constrain H(z) at the ∼ 5% level up to z ∼ 1, making it possible to distinguish dark-energy models with evolving w q (z) from ΛCDM.
The points plotted in the gray region of Fig. 8 represent the H(z) estimates obtained with the "high-z" sample. As explained in Sect. 4, lacking enough statistics at those redshifts, these H(z) measurements have been obtained by estimating the differential evolution between the last median points of the D4000 n − z relation up to z ∼ 1.4 and single D4000 n measurements: therefore they have not been used in the following for the comparison with the theoretical model. Here we just want to show how it is possible to extend this method up to much higher redshifts, and that future measurements of D4000 n at high redshifts for ETGs will open the possibility to extend this approach up to z ∼ 2. For example, Euclid will identify spectroscopically the rarest and most massive quiescent galaxies (M/M ⊙ > 4 · 10^11 ) at z > 1.8 [87], providing a large spectroscopic sample of high redshift ETGs, which will allow a much more detailed investigation of quintessence models. Pushing the "cosmic-chronometers" approach to higher redshifts will, however, require a more detailed study of the systematics. Moving to z > 1.5 will mean selecting galaxies which are closer to their redshift of formation, and therefore it will become more important to establish their detailed SFH; the treatment of the progenitor-bias will also be a more delicate matter. A good way to deal with this issue, provided that enough statistics are available, would be to select the oldest galaxies at each redshift, to probe exactly the upper and redder envelope of the age-z relation. Moreover, at higher redshifts (hence at younger ages) the theoretical stellar population synthesis models start to differ more, as shown by the larger differences found from the test with the ETGs at z > 1.8 (see the H(z) values at z > 1.2 in Fig. 8); therefore a better understanding of which model is best suited will become important too.

Figure 8. The D4000n-redshift relation and H(z) measurements including the "high-z" sample. In the upper panel, the three starred points in the gray shaded area at z > 1.5 correspond to the three individual galaxies at 1.8 < z < 2.2 [60][61][62]. In the middle and bottom panels are shown the H(z) estimates, respectively for BC03 and MaStro stellar population synthesis models. The three starred symbols in the gray area and the corresponding H(z) estimates are not used in the χ 2 comparison with the models, because they are based only on three individual galaxies. The curve represents the theoretical H(z) relation for ΛCDM model, assuming a flat WMAP 7-years Universe [56], with Ωm = 0.27, ΩΛ = 0.73, and H0 = 73.8 ± 2.4 km s −1 Mpc −1 .
Given the issues just described, a priority in extending this method should be to ensure enough statistics, both to minimize the statistical errors and to establish reliably and robustly that the oldest galaxies at each redshift, on which the H(z) measurements depend, have really been selected. In this respect, analyzing the Baryon Oscillation Spectroscopic Survey (BOSS, [88]) would be a good step forward at intermediate redshifts, providing spectra and redshifts of ∼ 1.5 million luminous galaxies to z = 0.7.

Comparison with other H(z) measurements
We compared our H(z) measurements with data available in the literature [12,82,83] obtained with other methods. The first two references use the "cosmic chronometers" approach to estimate H(z) up to z ∼ 1.8, and are therefore useful to check the agreement with the H(z) evolution expected from our data; it is worthwhile to remember that these measurements estimate the differential ages of the oldest ETGs at each redshift in various surveys, and therefore by definition do not suffer from the progenitor-bias effect. Ref. [83] uses instead a combination of measurements of the baryon acoustic peak and the Alcock-Paczynski distortion from galaxy clustering in the WiggleZ Dark Energy Survey, along with other baryon acoustic oscillation and distant supernovae datasets to determine the evolution of the Hubble parameter with a Monte Carlo Markov Chain technique up to z ∼ 0.9; in this case, having a precision better than 7% in most redshift bins, these measurements are important to check the correctness of our data. The comparison between our dataset and the other Hubble parameter estimates is shown in Fig. 9, and we find a remarkable agreement between the different measurements: the H(z) extrapolated at higher redshift from our data is in good agreement with the measurements of Refs. [12,82], and the comparison with the measurements of Ref. [83] (in the case of overlapping redshift range) is always well within the 1σ errorbars.

Conclusions
We have analyzed the D4000 n − z relation for a large sample of ETGs extracted from different spectroscopic surveys; several selection criteria have been combined in order to obtain a reliable sample comprising the most massive and passive ETGs in the redshift range 0.15 < z < 1.42. Due to the differences between the various surveys, SDSS MGS ETGs, LRG ETGs and "z > 0.4" ETGs have been studied separately, and each subsample has been further divided into two mass subsamples, since a dependence of the D4000 n −z relation on mass (i.e. downsizing) has been shown in Ref. [7]. In this way we obtained 7943 SDSS MGS ETGs with masses 10^11 < M/M ⊙ < 10^11.5 , 2459 LRG ETGs with masses 10^11.65 < M/M ⊙ < 10^12.15 , and 922 "z > 0.4" ETGs with masses 10^10.6 < M/M ⊙ < 10^11.5 . For each subsample, a median D4000 n − z relation has been estimated, and used to evaluate the differential D4000 n evolution as a function of redshift. In each subsample, the pair of points used to evaluate the difference has been chosen as the best compromise between a small redshift leverage, to provide the maximum number of H(z) measurements, and a long redshift leverage, so as not to be biased in the ∆D4000 n estimate by the statistical scatter present in the data. In this way, we compare points separated by ∼ 500 Myr in cosmic time for SDSS MGS and LRG ETGs, and ∼ 1.5 Gyr for "z > 0.4" ETGs, considerably mitigating the possible effects of mass evolution of our sample. Such effects (e.g. progenitor-bias) are most severe for analyses comparing z ∼ 0 to z ∼ 1 ETGs.

Figure 9. Comparison between our H(z) measurements (see Tab. 4) and literature data: the crosses are from Ref. [12], the open triangles from Ref. [82], and the open dots from Ref. [83].
We have studied the theoretical D4000 n -age relation for two different stellar population synthesis models, BC03 [24] and MaStro [74], to estimate the conversion factor A(Z, SFH) between ∆D4000 n and ∆age. This parameter, calibrated on the measured stellar metallicity for SDSS MGS ETGs and on an extrapolated metallicity for LRG ETGs and "z > 0.4" ETGs, has been estimated for different choices of SFHs, and an averaged A(Z) has thus been obtained. The effects of metallicity and SFHs, which may affect the H(z) estimate, have been studied and included as systematic sources of error in the global error budget of H(z).
Finally, we estimated H(z) separately for each mass subsample of SDSS MGS, LRGs and "z > 0.4" ETGs. Since the results show a good agreement between the two mass subsamples, we averaged those estimates, to provide a single H(z) estimate at each redshift which is mass independent. This analysis has been performed with both BC03 and MaStro models, and the results are in agreement with a mean difference of 0.5 ± 0.4σ, except for the last point where there is a difference of 1.6σ, demonstrating the robustness of the results against changes of stellar population synthesis model.
We provide 8 new measurements of the observational Hubble parameter, mapping homogeneously the redshift range 0.15 < z < 1.1. At low redshift, we obtain for the first time a precision comparable to recent estimates of the Hubble constant (5-6% at z ∼ 0.2); at higher redshift, the precision decreases because of the smaller statistics, but the errorbars are still < 13% up to z ∼ 1.1, considering both statistical and systematic errors. We also show the possibility to extend our analysis up to much higher redshift by analyzing the D4000_n of three high-redshift ETGs at z ∼ 1.7 − 2.2; however, since in this case the H(z) measurement is based only on a single D4000_n measurement, we do not use these points to constrain cosmological models. Comparing these measurements with theoretical H(z) relations, we obtain strong (6σ) evidence of the accelerated expansion of the Universe, confirming the results obtained with other probes [1,2]. An EdS Universe is ruled out at 7σ, and the model that best fits our data is ΛCDM. The comparison with three quintessence models, built to be indistinguishable from the ΛCDM model at the 1% level, shows the capability of this approach to constrain different cosmologies.
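The qualitative difference between the two reference models mentioned here can be made explicit with the deceleration parameter q(z) = −1 + (1 + z) H'(z) / H(z), which is negative when the expansion accelerates. The sketch below assumes a flat model with matter and a cosmological constant only; it uses Ω_m = 0.27 as an illustrative value and Ω_m = 1 for the EdS case. It is a textbook relation, not a reproduction of the statistical test performed in the analysis.

```python
def deceleration_flat(z, omega_m):
    """q(z) for a flat model with matter density omega_m and a cosmological constant."""
    matter = omega_m * (1.0 + z) ** 3
    e_squared = matter + (1.0 - omega_m)  # [H(z)/H0]^2
    return 1.5 * matter / e_squared - 1.0


def transition_redshift(omega_m):
    """Redshift at which q changes sign (requires 0 < omega_m < 2/3)."""
    return (2.0 * (1.0 - omega_m) / omega_m) ** (1.0 / 3.0) - 1.0


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0):
        print(z, deceleration_flat(z, 0.27), deceleration_flat(z, 1.0))
    print(transition_redshift(0.27))
```

In the EdS case q is constant and positive, so H(z) measurements at several redshifts below z ∼ 1 are well suited to separating the two behaviours.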
By studying the D4000_n of three high-z ETGs (z > 1.8), we show that this approach can be extended to constrain H(z) up to z ∼ 2. Given the upcoming spectroscopic surveys of ETGs (e.g. Euclid [87] and BOSS [88]), the "cosmic chronometer" approach may represent a new complementary cosmological probe to place stringent constraints on the Dark Energy Equation of State parameter w and its potential evolution with cosmic time.
studied, and the effect of the α−enhancement, for which massive ETGs have been found to have higher ratios of α elements to iron than Milky Way-like galaxies. All these issues are addressed separately in the following sections.

A.1 Estimating the effect of the progenitor-bias on H(z)
The "progenitor-bias" is an effect to be taken into account when two samples of ETGs at low and high redshift are compared. It was first introduced by Refs. [66,67], who pointed out that such a comparison is biased if the progenitors of the youngest ETGs at low redshift drop out of the sample at high redshift.
As discussed in the text, both the selection criteria and the method of analysis were designed to minimize this effect, which is expected to be more important for less massive ETGs, in which residual star formation is still ongoing (e.g. see [16]). To test the reliability of our measurements, we tackled this issue from two sides.
1. We re-analyzed our data, estimating the Hubble parameter from the upper envelope of the D4000_n-z relation. This approach has already been applied in the literature [12,82]; since it probes only the oldest galaxy population at each redshift, it should provide an estimate of H(z) as close as possible to the unbiased one. We considered the following mass ranges: log(M/M⊙) > 11.25 for SDSS MGS, 11.65 < log(M/M⊙) < 11.9 for LRGs, and log(M/M⊙) > 10.6 for z > 0.4; these mass ranges are defined so that there is no strong evolution of the median mass in each subsample along the spanned redshift range. For each sample, we considered only the D4000_n values above the median D4000_n in each redshift bin, and then estimated the median D4000_n-z relation (with its error) of this subset. In this way we evaluated the upper 75th percentile of the D4000_n-z relation, and used it as a proxy of the upper envelope. The Hubble parameter was then estimated on this new sample as described previously, the only change being the conversion parameter A, chosen according to the D4000_n range spanned. The H(z) estimates obtained in the main analysis and from the upper envelope are compatible within ∼ 0.3σ on average, and for the majority of the values the differences are smaller than 6%, showing therefore a remarkable agreement. These measurements are shown in Fig. 10 and reported in Tab. 5.
2. We estimated this effect quantitatively, using conservative assumptions based on observational constraints. However, since these estimates are model-dependent and rely on constraints coming from other surveys, and since, as just discussed, our data are compatible with not being affected by the progenitor-bias, we preferred not to add them to the total error budget; instead, we adopted pessimistic priors to provide an upper limit to the error due to this effect.
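The upper-envelope proxy described in the first item can be sketched in a few lines: within a redshift bin, the median of the values lying above the bin median approximates the 75th percentile. The function name is hypothetical, and the error estimate on the median mentioned in the text is not reproduced here.

```python
import statistics


def upper_envelope_proxy(d4000_values):
    """Median of the D4000_n values above the bin median (about the 75th percentile).

    Raises statistics.StatisticsError if no value lies above the median,
    e.g. when all values in the bin are identical.
    """
    bin_median = statistics.median(d4000_values)
    upper_half = [value for value in d4000_values if value > bin_median]
    return statistics.median(upper_half)
```

Applying this function bin by bin yields an upper-envelope D4000_n-z relation that can then be differenced in the same way as the median relation of the main analysis.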
The effect that the progenitor-bias has on the age-redshift relation is illustrated qualitatively in Fig. 11. Since the low-z ETG samples are biased towards lower ages with respect to the high-z samples, it basically acts by flattening the age(z) relation: in this way the "observed" H(z) will be larger than the "true" one. This effect can be studied by estimating the percentage shift Δ_prog bias on the Hubble parameter:
$$\Delta_{\rm prog\,bias} = \frac{H(z)_{\rm observed} - H(z)_{\rm true}}{H(z)_{\rm true}} \qquad (A.1)$$
The progenitor-bias can be simply considered as an evolution of the mean redshift of formation of the ETG samples as a function of redshift. If the redshift of formation of the galaxies is homogeneous, i.e. ∆z_form/∆z = 0, there will be no progenitor-bias effect; otherwise, if ∆z_form/∆z = B ≠ 0, in a given redshift bin one will measure dt − δ instead of the real age difference dt due to passive evolution, where δ is due to the variation of the redshift of formation. The "observed" H(z) relation will be H(z)_observed = −1/(1 + z) · dz/(dt − δ), and the percentage shift with respect to the "true" Hubble parameter will therefore be equal to:
$$\Delta_{\rm prog\,bias} = \frac{\delta/dt}{1 - \delta/dt} \qquad (A.2)$$
The age of a galaxy t(z) is defined as:
$$t(z) = \int_{z}^{z_{\rm form}} \frac{dz'}{(1+z')\,H(z')} \qquad (A.3)$$
and hence the age difference δ between two galaxies formed at redshift z_f,1 and z_f,2 is:
$$\delta = \int_{z_{f,1}}^{z_{f,2}} \frac{dz'}{(1+z')\,H(z')} \approx \frac{\Delta z_{\rm form}}{(1+z_{\rm form})\,H(z_{\rm form})} = \frac{B\,\Delta z}{(1+z_{\rm form})\,H(z_{\rm form})} \qquad (A.4)$$
where in the last equation we considered z_form as the mean redshift of formation and, as stated before, ∆z_form/∆z = B. Remembering that, in absolute value, dz/dt = (1 + z)H(z), it is possible to write:
$$\frac{\delta}{dt} = B\,\frac{(1+z)\,H(z)}{(1+z_{\rm form})\,H(z_{\rm form})} \qquad (A.5)$$
and the "true" H(z) relation can be recovered from the observed one from:
$$H(z)_{\rm true} = H(z)_{\rm observed}\left(1 - \frac{\delta}{dt}\right) \qquad (A.6)$$
To obtain an estimate of δ/dt, we considered a simple model: we assumed that a given distribution of redshifts of formation for ETGs is observed at z = 0, and that at increasing redshift the youngest galaxies drop out of the sample due to selection effects, so that the left side of the distribution is cut. This produces a variation of the mean redshift of formation as a function of redshift, ∆z_form/∆z (see Fig. 11).
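The chain of relations in this paragraph can be followed numerically with a minimal sketch. It assumes a flat ΛCDM expansion history with Ω_m = 0.27 (H_0 cancels in the ratio) and takes δ/dt = B (1 + z) H(z) / [(1 + z_form) H(z_form)], with H_observed = H_true / (1 − δ/dt) as implied by the definition of the observed relation given in the text. The function names and the values of B and z_form in the usage lines are hypothetical.

```python
import math


def e_flat_lcdm(z, omega_m=0.27):
    """Dimensionless expansion rate H(z)/H0 for flat LCDM."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def delta_over_dt(z, z_form, b, omega_m=0.27):
    """Ratio delta/dt for a drift b = d(z_form)/dz of the mean formation redshift."""
    numerator = (1.0 + z) * e_flat_lcdm(z, omega_m)
    denominator = (1.0 + z_form) * e_flat_lcdm(z_form, omega_m)
    return b * numerator / denominator


def percentage_shift(ratio):
    """Fractional excess of the observed H(z) over the true one."""
    return ratio / (1.0 - ratio)


def true_hubble(h_observed, ratio):
    """Recover the unbiased H(z) from the observed value."""
    return h_observed * (1.0 - ratio)


if __name__ == "__main__":
    ratio = delta_over_dt(z=0.5, z_form=2.5, b=0.2)  # illustrative inputs only
    print(ratio, percentage_shift(ratio))
```

Since (1 + z) H(z) grows with redshift, the ratio is always smaller than B for z < z_form, so a high mean redshift of formation suppresses the bias.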
Many works have found that, at least up to z ∼ 1, the number density of the most massive ETGs is almost constant, while it shows a more appreciable decrease with redshift at smaller masses. Ref. [16], studying the zCOSMOS survey, reported an evolution of less than 15% for galaxies with log(M/M⊙) > 11, and of ∼ 50% for masses log(M/M⊙) ∼ 10.5. Ref. [89] analyzed ETGs in the COSMOS survey, finding no evidence for a decrease in the number density of the most massive ETGs out to z ∼ 0.7; relaxing the assumptions about star formation histories and other properties, they estimated a maximum decrease in the number density of massive galaxies at that redshift of ∼ 30%. The recent analysis of Ref. [68] found a noticeable evolution in the number density of quiescent galaxies in the NEWFIRM Medium-Band Survey from z ∼ 0 to z ∼ 2, but this evolution is remarkably smaller up to z ∼ 1, and also compatible with no evolution within the errorbars. In order to give an estimate for our two mass subsamples, we conservatively assume an evolution of ∼ 30% for ETGs with log(M/M⊙) > 11 from z ∼ 0 to z ∼ 1, and of ∼ 50% for ETGs with 10.6 < log(M/M⊙) < 11. Not having any prior knowledge about the shape of the distribution of the redshifts of formation, we considered the case of a flat distribution (as an extreme case) between z_f,min and z_f,max and the case of a gaussian distribution (as suggested e.g. by Ref. [15]) with the same mean and dispersion as the flat one.
We proceeded as follows:
• we built several distributions of redshifts of formation at z = 0, considering different combinations of z_f,min and z_f,max compatible with the observations of relatively high redshifts of formation for these massive and passive ETGs, i.e. (z_f,min, z_f,max) = (1, 3), (1.5, 3), (2, 3);
• for each of these models, we considered both the case of a flat and of a gaussian initial distribution;
• for each of these distributions, we estimated the relation z_form(z) between z = 0 and z = 1, assuming that at z = 1 there are respectively 30% and 50% fewer ETGs than at z = 0, with the youngest ones missing, i.e. two cases representative of the evolution of the high-mass bin (log(M/M⊙) > 11) and of the low-mass bin (10.6 < log(M/M⊙) < 11);
• for all the models, we estimated the quantity ∆z_form/∆z, and we averaged them to obtain a mean <∆z_form/∆z> for the high- and low-mass bin respectively;
• we used the estimated <∆z_form/∆z> to evaluate the quantity δ/dt at the redshifts used in our analysis, considering a flat ΛCDM model (Ω_m = 0.27, H_0 = 72 km s^−1 Mpc^−1). To estimate this quantity we took into account that the high-mass sample of our analysis has log(M/M⊙) > 11 at all redshifts, while the low-mass sample has log(M/M⊙) > 11 at z < 0.4 and 10.6 < log(M/M⊙) < 11 at z > 0.4 (see Tab. 2);
• we estimated a mean error due to the progenitor-bias using Eq. A.5 and A.6 for the low- and high-mass subsamples of ETGs, and averaged them to estimate the mean effect on H(z).
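The first four steps of this procedure can be sketched analytically. The sketch assumes that removing a fraction of the ETGs at z = 1 means cutting that fraction from the low-z_form side of the z = 0 distribution, and that ∆z_form/∆z is the resulting shift of the mean divided by ∆z = 1; the gaussian case uses the same mean and dispersion as the flat one and ignores its negligible tail at negative redshift. The function names are hypothetical, and the averaging scheme is one plausible reading of the text, not necessarily the exact one used to produce Tab. 6.

```python
import math
from statistics import NormalDist


def mean_shift_flat(zf_min, zf_max, cut_fraction):
    """Shift of the mean z_form after cutting the lowest cut_fraction of a flat distribution."""
    return 0.5 * cut_fraction * (zf_max - zf_min)


def mean_shift_gaussian(zf_min, zf_max, cut_fraction):
    """Same shift for a gaussian with the dispersion of the flat distribution."""
    sigma = (zf_max - zf_min) / math.sqrt(12.0)  # standard deviation of a flat distribution
    standard = NormalDist()
    alpha = standard.inv_cdf(cut_fraction)  # truncation point in units of sigma
    # Mean of a lower-truncated gaussian: mu + sigma * pdf(alpha) / (1 - cdf(alpha)).
    return sigma * standard.pdf(alpha) / (1.0 - cut_fraction)


def mean_dzform_dz(cut_fraction, ranges=((1.0, 3.0), (1.5, 3.0), (2.0, 3.0)), delta_z=1.0):
    """Average of dz_form/dz over the flat and gaussian models for all ranges."""
    shifts = []
    for zf_min, zf_max in ranges:
        shifts.append(mean_shift_flat(zf_min, zf_max, cut_fraction))
        shifts.append(mean_shift_gaussian(zf_min, zf_max, cut_fraction))
    return sum(shifts) / len(shifts) / delta_z


if __name__ == "__main__":
    print(mean_dzform_dz(0.30))  # high-mass bin, 30% fewer ETGs at z = 1
    print(mean_dzform_dz(0.50))  # low-mass bin, 50% fewer ETGs at z = 1
```

The resulting mean drift plays the role of B in the estimate of δ/dt, so a narrower range of formation redshifts directly translates into a smaller progenitor-bias correction.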
The values of σ_prog bias obtained are reported in Tab. 6. The different models studied were built to account for the uncertainty in our knowledge of the distribution of redshifts of formation, estimating in this way an averaged effect. We notice that the estimated errors are smaller than the statistical errors, on average ∼ 0.6σ_stat. In Tab. 6 we also show the increase in the total error when this uncertainty is also added in quadrature; the variation in the total error is very small, adding only ∼ 1% on average.
To cross-check the results obtained with the model introduced above, we also studied the theoretical models of van Dokkum & Franx (2001) (hereafter vDF, [90]), which predict the effect of the progenitor-bias; we refer to this reference for a detailed discussion of the parameters used to set the models. We varied the available parameters to build different models, fixing only the fraction of ETGs already in place at z = 1 (∼ 70% for the high masses and ∼ 50% for the low masses) and the redshift of formation (2 < z_form < 3); using these models, we estimated an error σ^vDF_prog bias as described above for our model. These values are reported in Tab. 6. The results obtained with the vDF models are compatible with those found with our model, showing also in this case that, given our selection criteria, the progenitor-bias adds a small contribution to the total errorbar, ∼ 2% on average. We also studied the default "strong evolution" option provided, finding in this case even smaller errors because of the high value of the redshift of formation set.
Table 6. Upper limit theoretical estimates of the error due to progenitor-bias (in units of [km s^−1 Mpc^−1]) in the various redshift bins and the percentage increase in the total error when this contribution is also summed in quadrature. These values have been estimated using both the simple model described here (σ_prog bias) and the models of Ref. [90] (σ^vDF_prog bias).
As a further check, it is worth recalling that the values of H(z) obtained with this analysis are in very good agreement with the measurements obtained in the literature with other approaches and methods (see Sec. 6 and Fig. 10). From these tests we therefore conclude that, given the present-day errorbars, our data are compatible with not being affected by the progenitor-bias.

A.2 Impact of the adopted Initial Mass Function
The impact of the IMF on the D4000_n is insignificant, and does not affect our analysis: the difference between D4000_n values estimated in a single stellar population model with a Chabrier or a Salpeter IMF is less than 0.3% for all the metallicities considered in this analysis, and less than 0.2% for the solar metallicity (which is the one that best fits our ETGs).

A.3 The α−enhancement
The effect of the α−enhancement is slightly more difficult to analyze. Many studies have pointed out that massive ETGs are enhanced in α elements (e.g. see [91][92][93][94][95][96]) with respect to the solar neighborhood. Even though some works [97,98] have been developed to take this effect into account in modeling stellar populations, the impact of the α−enhancement on the D4000_n has not yet been modeled and studied in detail. The models of Ref. [97] are available for 6 chemical mixtures: at each of 3 fixed values of [Fe/H], one solar-scaled model and one α−enhanced model, in which the abundances of all classical α elements are increased by 0.4 dex relative to the solar-scaled model; the corresponding stellar metallicities are in the range 0.3 < Z/Z⊙ < 3. For these models, all the different spectral indices are available, including the D4000_n. We decided to compare the slope of the α−enhanced models with that of the solar-scaled models to check the impact of this effect; however, it has to be underlined that, due to the availability of the models, this comparison differs from the study performed in our analysis, since it is done at fixed [Fe/H] instead of at fixed metallicity. We found that, in the range of D4000_n relevant for our analysis, while the absolute values of D4000_n are different, the slopes are not strongly dependent on the α−enhancement, with a percentage difference between 2% and 8%. This result is also confirmed by single stellar population models based on MARCS theoretical libraries [99], using solar metallicity and [α/Fe] = 0, 0.4 (Claudia Maraston, private communication); in this case the difference in D4000_n is extremely small, on average only ∼ 0.5%, with a percentage variation in the slope of ∼ 8%.
As a final remark, since our analysis considers only the most massive ETGs, these galaxies will have quite similar enhancements, so that accounting for this effect would not introduce an additional systematic error, but just a shift in the slope. In this way the relative effect is further mitigated.