The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: mock galaxy catalogues for the BOSS Final Data Release

We reproduce the galaxy clustering catalogue from the SDSS-III Baryon Oscillation Spectroscopic Survey Final Data Release (BOSS DR11&DR12) with high fidelity on all relevant scales in order to allow a robust analysis of baryon acoustic oscillations and redshift space distortions. We have generated (6,000) 12,288 MultiDark PATCHY BOSS (DR11) DR12 light-cones corresponding to an effective volume of $\sim192,000\,[h^{-1}\,{\rm Gpc}]^3$ (the largest ever simulated volume), including cosmic evolution in the redshift range from 0.15 to 0.75. The mocks have been calibrated using a reference galaxy catalogue based on the halo abundance matching modelling of the BOSS DR11&DR12 galaxy clustering data and on the data themselves. The production follows three steps. First, we apply the PATCHY code to generate a dark matter field and an object distribution including nonlinear stochastic galaxy bias. Secondly, we run the halo/stellar distribution reconstruction HADRON code to assign masses to the various objects. This step uses the mass distribution as a function of local density and non-local indicators (i.e., tidal field tensor eigenvalues and relative halo exclusion separation for massive objects) from the reference simulation applied to the corresponding PATCHY dark matter and galaxy distribution. Finally, we apply the SUGAR code to build the light cones. The resulting MultiDark PATCHY mock light cones reproduce the number density, selection function, survey geometry, and in general within 1 $\sigma$, for arbitrary stellar mass bins, the power spectrum up to $k=0.3\,h\,{\rm Mpc}^{-1}$, the two-point correlation functions down to a few Mpc scales, and the three-point statistics of the BOSS DR11&DR12 galaxy samples.


INTRODUCTION
The observable Universe represents a unique realization of an underlying physical cosmological process. Large galaxy redshift surveys like the Baryon Oscillation Spectroscopic Survey (BOSS; e.g., Bolton et al. 2012;Dawson et al. 2013; email:kitaura@aip.de, Karl-Schwarzschild-fellow † Campus de Excelencia Internacional UAM/CSIC Fellow ‡ MultiDark Fellow Alam et al. 2015), a branch of the ongoing Sloan Digital Sky Survey (SDSS-III Eisenstein et al. 2011) scan the sky with unprecedented accuracy trying to unveil structure formation in an expanding Universe. One important question arises in the analysis of the data provided by such surveys: if the Universe is comparable to a huge unique experiment, how can we determine the uncertainties in the measurement of quantities derived from observing it? One strategy consists of dividing the observations into subvolumes, treating each of the subsamples as independent measurements, and computing the errors with jackknife or bootstrap estimates. While this approach continues being relevant as a way to obtain error estimates directly from the data (see e.g. Norberg et al. 2009), it also implies several disadvantages. First, it does not include systematic errors present in all subvolumes, secondly it does not lead to a physical understanding of the data by itself, and thirdly it introduces variance beyond the one already present in the data on scales larger than the subvolumes. The last point is especially critical when the signal sought has a large characteristic scale and its detection significance crucially depends on the volume, as is the case for baryon acoustic oscillations (BAOs; see e.g. Seo & Eisenstein 2005;White et al. 2009). During the past decades, there has been a huge effort to encode our physical knowledge of structure formation in computational algorithms, and compare the theoretical models to the actual observations. Pioneering works started with qualitative comparisons (see e.g. Klypin & Shandarin 1983;Blumenthal et al. 1984;Davis et al. 1985). Since then simulations have grown and such comparisons have turned increasingly more quantitative (see e.g. Klypin et al. 2003;Springel et al. 2005;Boylan-Kolchin et al. 2009;Klypin et al. 2011). These efforts are essential to understand structure formation and yet they suffer from a strong limitation: as simulations always push the computational limits they are not suited for massive production. In fact the number of current state-ofthe-art large volume N -body simulations is of order 10 (Kim et al. 2009;Angulo et al. 2012;Prada et al. 2012;Alimi et al. 2012;Watson et al. 2014;Fosalba et al. 2015;Skillman et al. 2014;Klypin et al. 2014;Ishiyama et al. 2015). However, an ideal approach to determine the uncertainties from current and upcoming surveys scanning large sky areas, and hence covering huge volumes, such as BOSS 1 , DESI 2 /BigBOSS (Schlegel et al. 2011 (Cimatti et al. 2009;Laureijs 2009), requires thousands of such simulations if the simplest error determination methods are used (Dodelson & Schneider 2013;Taylor et al. 2013;Percival et al. 2014). Alternative more efficient methods need to be considered to face this challenge. A few pioneering works explored a viable strategy more than a decade ago relying on simplified fast gravity solvers using perturbation theory: pinocchio (Monaco et al. 2002(Monaco et al. , 2013 and pthalos (Scoccimarro & Sheth 2002). Nevertheless, these methods are not trivial, need calibration with N -body simulations, and still demand high computational efforts. For this reason, some of the first analysis of large surveys (Percival et al. 2001;Cole et al. 2005) was done based on lognormal realizations (see also Percival et al. 2004;Beutler et al. 2011), which match the two-point statistics by construction research-groups-and-projects/4most 7 http://www.euclid-ec.org (Coles & Jones 1991), although their three-point statistics is very different from the true one (see e.g. White et al. 2014;Chuang et al. 2015b). It is also not clear that their fourpoint statistics will be accurate (Cooray & Hu 2001;Takada & Hu 2013).
The analysis of past data releases of the BOSS collaboration utilized 1,000 mocks, created based on an improved version of pthalos (Manera et al. 2013. The use of approximate gravity solvers in these methods came at the expense of only matching clustering statistics on a wide range of scales to ∼ 10% precision (and strongly deviating towards small scales < ∼ 20 h −1 Mpc). This sets the agenda for the current BOSS data release DR11&DR12 and the requirements for a new generation of mock galaxy catalogues. Ideally one would like to base these on efficient solvers that are trained on exact solutions and deliver a comparable precision. A new generation of methods that can meet these high requirements have been developed during the past two years, in particular, patchy (Kitaura et al. 2014), qpm (White et al. 2014), and ezmocks (Chuang et al. 2015a). The key concept exploited by these methods is to rely only on the large-scale density field obtained from approximate gravity solvers and use biasing prescriptions to populate it with mock galaxies, in a similar way to the methods proposed to augment the resolution of N -body simulations (de la Torre & Peacock 2013;de la Torre et al. 2013;Angulo et al. 2014;Ahn et al. 2015). One should however be careful, as computing an accurate dark matter field is a necessary, but not sufficient condition to reproduce the correct halo/galaxy three-point statistics. The bias parameters are degenerate in the two-point statistics and need to be additionally constrained to reproduce higher order statistics . We will rely in this work on the patchy method due to its verified accuracy in the two and three-point statistics for different populations of objects (see application of the hadron code to patchy and ezmock; Zhao et al. 2015). An additional set of galaxy mocks fitting the BOSS DR11&DR12 (CMASS and LOWZ) data at two mean redshifts (respectively) based on qpm have been produced in an unprecedented effort. These are constructed with a different structure formation model based on low resolution particle mesh solvers, and a different galaxy bias, based on a rank-ordering scheme assigning most massive objects to the highest density peaks, (for a comparison of both sets of catalogues see §3 and Gil-Marín et al. 2015a).
Another approach uses approximate PT based solutions to speed up N -body solvers (see Cola method, Tassev et al. 2013;Howlett et al. 2015;Koda et al. 2015). This method is very promising to generate ensembles of reference mock catalogues; however, it has the drawback of requiring large computational memory for the force calculation and large number of particles to resolve the haloes (see Chuang et al. 2015b), and is therefore not suitable for the massive production aimed in this work. The speed of the method over N -body simulations comes at the expense of not resolving the sub-structures required to produce a realistic galaxy catalogue. This problem can be circumvented by, e.g., augmenting the missing objects with the halo occupation distribution model, hereby losing some of the advantage of having a higher precise description of the nonlinear clustering over the above mentioned methods which rely only on the large scale dark matter field, as shown in a comparison study (see Chuang et al. 2015b, and references therein). One may need an approach like cola, to model the large-scale structure, combined with the galaxy bias presented in this work for future emission line galaxy-based surveys. We will, however, demonstrate here that this is not necessary to model the distribution of luminous red galaxies (LRGs) aimed in this work.
One could argue whether mock catalogues are required at all, as analytical models may deliver an almost direct computation of error bars and covariance matrices (Hartlap et al. 2007;Hamaus et al. 2010;Dodelson & Schneider 2013;Taylor et al. 2013;Kalus et al. 2016). It still remains to be shown that these methods making simple assumptions, such as that the density field is Gaussian distributed, yield the same accuracy as covariance matrices based on large sets of mock catalogues.
Nevertheless, the purpose of mock catalogues is manifold, as they not only serve to provide error estimates, but also to provide an understanding of the systematics of the survey and of the methodology. Any analytical prediction or data analysis method should be cross-checked with large ensembles of mock galaxy catalogues for which the products of this work could be useful. One clear example is the case of BAO reconstruction techniques (see e.g. Eisenstein et al. 2007;Padmanabhan et al. 2012;Anderson et al. 2014;Ross et al. 2015).
We exploit the efficiency and accuracy of the patchy code to produce 12,288 galaxy mock catalogues 8 including the lightcone evolution of galaxy bias based on the halo abundance matching technique applied to the reference BigMultiDark N -body simulation (see Rodríguez-Torres et al. 2015, companion paper), and to the peculiar motions based on the observational data, matching the two, three-point statistics, in real and redshift space of the BOSS DR11&DR12 galaxy clustering data at different redshifts and for arbitrary stellar mass bins. Special care has been taken to include all relevant observational effects including selection functions and masking. The MultiDark patchy BOSS DR11 mock catalogues presented in this work are publicly available 9 . This paper is structured as follows: in section §2 we describe the methodology. This section starts with the generation of the reference catalogue using N -body simulations and the halo abundance matching technique. Subsequently the scheme to massively generate mock catalogues is described. Then we show in §3 the statistical comparison between the mock catalogues and the BOSS DR12 data. Subsequently we discuss future work ( §4). Finally, in section §5 we present the conclusions. The reader interested only in the results may skip §2 and directly go to §3. 8 This corresponds to an effective volume of ∼192, 000 [h −1 Gpc] 3 , a factor of ∼ 20 times larger than the volume of the DEUS FUR simulation (Alimi et al. 2012), and a factor of ∼ 375 times larger than the DarkSky ds14 simulation (Skillman et al. 2014). 9 http://data.sdss3.org/sas/dr11/boss/lss/dr11_patchy_ mocks/ The BOSS DR12 mock catalogues will be made publicly available together with the data catalogue: http: //data.sdss3.org/sas/dr12/boss/lss/dr12_patchy_mocks/.

METHODOLOGY
To construct high-fidelity mock light cones for interpreting the BOSS DR11&DR12 galaxy clustering, we adopt an iterative training procedure in which a reference catalogue is statistically reproduced with approximate gravity solvers and analytical-statistical biasing models. The whole algorithm involves several steps and is summarized in the flow chart in Fig. 1. (i) The first step consists of the generation of an accurate reference catalogue. Here we rely on a large N -body simulation capable of resolving distinct haloes and the corresponding substructures. This permits us to apply the HAM technique to reproduce the clustering of the observations with only one parameter: the scatter in the stellar mass-to-halo mass relation (see Rodríguez-Torres et al. 2015, companion paper;and §2.1). This technique is applied at different redshift bins to obtain a detailed galaxy bias evolution spanning the redshift range covered by BOSS DR11&DR12 galaxies. In this way we obtain mock galaxy catalogues in full cubical volumes of 2.5 h −1 Gpc side at different redshifts.
(ii) In the second step we train the patchy code (Kitaura et al. 2014 to match the two-and three-point clustering of the full mock galaxy catalogues for each redshift bin.
Here we consider all the mock galaxies together in a single bin irrespectively of their stellar mass.
(iii) In the third step we apply the hadron code (Zhao et al. 2015) to assign stellar masses to the individual objects.
(iv) In the fourth step we apply the sugar code (see Rodríguez-Torres et al. 2015, companion paper) which includes selection effects, masking, and combines different boxes at different redshifts into a light-cone.
(v) In the fifth step the resulting MultiDark patchy mock catalogues are compared to the observations. The process is iterated until the desired accuracy for different statistical measures is reached.
In the next sections we will describe in detail these steps described above for the massive generation of accurate mock galaxy catalogues. The reader interested only in the results may directly go to §3.

Cosmological parameters
MultiDark patchy Mocks for BOSS DR11&DR12: training mock catalogs from observed and simulated data sets  Flowchart of the methodology applied in this work for the generation of high fidelity BOSS DR11&DR12 mock galaxy catalogues: i) starting from a reference mock catalogue calibrated with the observations, ii) followed by the reproduction of the whole catalogue, iii) with the subsequent mass assignment, iv) and survey generation. v) The final catalogues are compared with the observations and the simulation, and the previous steps are repeated until the mock catalogues are compatible with the observations within 1-σ for the monopole and quadrupole up to k ∼ 0.3 h Mpc −1 .
We note that there are alternative methods connecting haloes to galaxies like the HOD model, which we are not going to consider here (e.g., Berlind & Weinberg 2002;Kravtsov et al. 2004;Zentner et al. 2005;Zehavi et al. 2005;Zheng et al. 2007;Skibba & Sheth 2009;Ross & Brunner 2009;Zheng et al. 2009;White et al. 2011). These methods are based on a statistical relation describing the probability that a halo of virial mass M hosts N galaxies with some specified properties. In general, theoretical HODs require the fitting of a function with several parameters, which we want to avoid here.
At first order HAM assumes a one-to-one correspondence between the luminosity and stellar or dynamical masses: galaxies with more stars are assigned to more massive haloes or subhaloes. The luminosity in a red-band is sometimes used instead of stellar mass. There should be some degree of stochasticity in the relation between stel-lar and dynamical masses due to deviations in the merger history, angular momentum, halo concentration, and even observational errors (Tasitsiomi et al. 2004;Behroozi et al. 2010;Leauthaud et al. 2011;Trujillo-Gomez et al. 2011). Therefore, we include a scatter in that relation necessary to accurately fit the clustering of the BOSS data (see Rodríguez-Torres et al. 2015, companion paper). To do this, we modify the maximum circular velocity (Vmax) of each object adding a Gaussian noise: V scat max = Vmax(1 + N (0, σ)), where N (0, σ) is a Gaussian random number with mean 0, and standard deviation σ. Then, we sort all objects by V scat max and then, we selected objects starting from the one with larger V scat max and we continue until we get the proper number density at different redshifts bins.
By construction, the method reproduces the observed luminosity function (or stellar mass function). It also reproduces the scale dependence of galaxy clustering over a large range of epochs (Conroy et al. 2006;Guo et al. 2010). When abundance matching is used for the observed stellar mass function (Li & White 2009), it gives also a reasonable fit to lensing results (Mandelbaum et al. 2006) and to the relation between stellar and virial mass (Guo et al. 2010).

Generation of mock galaxy catalogues
All covariance matrix estimates based on a finite number of mock catalogues, Ns, are affected by noise, which must be propagated into the final constraints. The impact of the uncertainties in the covariance matrix on the derived cosmological constraints has been subject of several recent analyses (Dodelson & Schneider 2013;Taylor et al. 2013;Percival et al. 2014). In particular, Dodelson & Schneider (2013) showed that this additional uncertainty can be described by a rescaling of the parameter covariances derived from the distribution of measurements from a set of mocks with a factor given by where N b is the number of bins in the corresponding clustering measurements and Np is the number of parameters measured. This implies that a large number of mock catalogues are necessary for a robust analysis of the galaxy clustering data. For the anisotropic BAO measurements of Cuesta et al. (2015) the estimation of the full covariance matrix of the monopole and quadrupole of the two-dimensional correlation function from the ensemble of 1,000 qpm corresponds to an additional uncertainty of 2% on the constraints on H(z)r d and DA(z)/r d . Using the 2,048 MultiDark patchy mock catalogues, the effect is reduced to the order of 1%. Large sets of catalogues are even more important for fullshape fits of anisotropic clustering measurements, where the inclusion of information from smaller scales can significantly improve the constraints based on redshift space distortions (RSD; requiring a larger number of bins). For example, in the analysis of Sánchez et al. (in prep.), based on measurements of the clustering wedges statistic , the use of mock catalogues corresponds to a rescaling of the parameter covariances by m = 1.04 and 1.085 when using 1,000 or 2,048 catalogues, respectively. This additional uncertainty corresponds to a degradation of the true constraining power of the clustering measurements, which should be minimized by using a larger number of mock catalogues. For this reason we have made the effort in the BOSS collaboration of producing at least 1,000 mocks for each BOSS DR11&DR12 sub-sample.
The strategy for the massive production of mock galaxy catalogues relies on generating dark matter fields with approximate gravity solvers on a mesh. We use grids of 960 3 cells with volumes of (2.5 h −1 Gpc) 3 and resolutions of 2.6 h −1 Mpc for which the structure formation model can be considered to be accurate §2.2.1. Then the galaxies are populated on the mesh according to a combined nonlinear deterministic §2.2.2 and stochastic bias model §2.2.3. In a post-processing step we assign halo/stellar masses to each object §2.2.5. Finally we apply the survey geometry and selection functions §2.2.6.
Let us start describing the patchy code (PerturbAtion Theory Catalog generator of Halo and galaxY distributions).

Approximate fast structure formation model
We rely on augmented Lagrangian Perturbation Theory (ALPT) to simulate structure formation. Let us recap the basics of this method and refer for details to . In this approximation the displacement field Ψ(q, z), mapping a distribution of dark matter particles at initial Lagrangian positions q to the final Eulerian positions We rely on second order Lagrangian Perturbation Theory (2LPT) for the long-range component Ψ2LPT(for details on 2LPT see Buchert 1994;Bouchet et al. 1995;Catelan 1995).
The resulting displacement field is filtered with a kernel K: ΨL(q, z) = K(q, rS) • Ψ2LPT(q, z). We apply a Gaussian filter K(q, rS)= exp (−|q| 2 /(2r 2 S )), with rS being the smoothing radius. We use the spherical collapse approximation to model the short-range component ΨSC(q, z) (see Bernardeau 1994;Mohayaee et al. 2006;Neyrinck 2013). The combined ALPT displacement field is used to move a set of homogenously distributed particles from Lagrangian initial conditions to the Eulerian final ones. We then grid the particles following a clouds-in-cell scheme to produce a smooth density field δ ALPT . One may get some improvements preventing voids within larger collapsing regions, which essentially extends the collapsing region towards moderate underdensities (see muscle method in Neyrinck 2016). This approach requires about eight additional convolutions being about twice as expensive, as the approached used here. Moreover, we have checked that the improvement provided by including muscle is not perceptible when using grids with cell sizes of 2.6 h −1 Mpc.

Deterministic bias relations
In this section we describe the deterministic part of our bias model. This is combined with a stochastic element, described in §2.2.3, and a nonlocal element, described in §2.2.5, to produced the full model. The deterministic bias relates the expected number counts of haloes or galaxies ρg ≡ Ng ∂V at a given finite volume to the underlying dark matter field ρM, with [· · · ] ∂V being the ensemble average over the differential volume element ∂V (in our case the cell of a regular mesh). This relation is known to be nonlinear, nonlocal and stochastic (Press & Schechter 1974;Peacock & Heavens 1985;Bardeen et al. 1986 Baldauf et al. 2012Baldauf et al. , 2013Ahn et al. 2015).
In general this bias relation will be arbitrarily complex: with B(ρM) being a general bias function, fg = ρg V B(ρ M ) V , ρg V being the number densityNg, and [· · · ] V being the ensemble average over the whole considered volume V (in our case the volume of the mesh). The deterministic bias model we consider in this work has the following form: with and {ρ th , α, , ρ , τ } the parameters of the model. We have modelled threshold bias (Kaiser 1984;Bardeen et al. 1986;Cole & Kaiser 1989;Sheth et al. 2001;Mo & White 2002) as a combination of a step function θ(ρM − ρ th ) (Kitaura et al. 2014) and an exponential cut-off exp − ρ M ρ (Neyrinck et al. 2014). The local bias expansion (Cen & Ostriker 1993;Fry & Gaztanaga 1993) is summarized by a power-law (de la Torre & Peacock 2013; Kitaura et al. 2014). In addition we consider a bias (ρM − ρ th ) τ which compensates for the missing power of PT based methods. Nonlocal bias has been recently found to be relevant (McDonald & Roy 2009;Baldauf et al. 2012;Chan et al. 2012;Sheth et al. 2013;Saito et al. 2014). A non-local bias introduces a scatter in the local deterministic bias relations described above. In this work, the scatter is first described by a stochastic bias relation (see §2.2.3). We have investigated second order nonlocal bias with patchy without finding that this can have a relevant effect on the mock catalogues considering stochastic bias and the full (one single mass bin) catalogue (see Autefage et al. in prep.). In fact once one considers different populations of halo or stellar mass objects, then nonlocal bias plays an important role. We solve this in a post-processing step when assigning the masses to each galaxy (see §2.2.5 and Zhao et al. 2015).

Stochastic biasing
The halo distribution is a discrete sample Ng,i of the continuous underlying dark matter distribution ρg,i: for each cell i and {pSB} being the set of stochastic bias parameters. To account for the shot noise one could do Poissonian realizations of the halo density field as given by the deterministic bias and the dark matter field (see e.g. de la Torre & Peacock 2013). However, it is known that the excess probability of finding haloes in high density regions generates over-dispersion (Somerville et al. 2001;Casas-Miranda et al. 2002).
The strategy up to now has been to generate a mock catalogue which reproduces the clustering of the whole population of galaxies for a given redshift. This has the advantage that by mixing massive and low mass galaxies we will always be dominated by overdispersion, which is much easier to model than underdispersion. In particular we consider the negative binomial (NB) probability distribution function (for non-Poissonian distributions see Saslaw & Hamilton 1984;Sheth 1995) including an additional parameter β to model over-dispersion (tends towards the Poisson probability distribution function for β → ∞ and for low λ values).
We note that a proper treatment of the deviation from Poissonity is also crucial to get accurate density reconstructions (see Ata et al. 2015, and Ata et al in prep.).
We will need, however, to take care of the different statistical nature of each population of galaxies when we assign masses to each object (see §2.2.5).

Redshift space distortions
Let us recap here the way in which RSDs are treated in the patchy-code (see Kitaura et al. 2014).
The mapping between Eulerian real space x(z) and redshift space s(z) is given by: s(z) = x(z) + vr(z), with vr ≡ (v ·r)r/(Ha); wherer is the unit sight line vector, H the Hubble constant, a the scale factor, and v = v(x) the 3-d velocity field interpolated at the position of each halo in Eulerian-space x using the displacement field ΨALPT(q, z). We split the peculiar velocity field into a coherent v coh and a (quasi) virialized component vσ: v = v coh + v σ . The coherent peculiar velocity field is computed in Lagrangian-space from the linear Gaussian field δ (1) (q) using the ALPT formulation consistently with the displacement field (see Eq . 2): with v2LPT(q, z) being the second order and vSC(q, z) being the spherical collapse component (for details see Kitaura et al. 2014). We use the high correlation between the local density field and the velocity dispersion to model the displacement due to (quasi) virialized motions. Effectively, we sample a Gaussian distribution function (G) with a dispersion given by σv For the Gaussian streaming model see Reid & White (2011), for non-Gaussian models see e.g. Tinker (2007). In closely virialized systems the kinetic energy approximately equals the gravitational energy and a Keplerian law predicts γ close to 0.5, leaving only the proportionality constant g as a free parameter in the model (see also Heß et al. 2013). We assign larger dispersion velocities to low mass objects considered to be satellites. The parameters g and γ have been adjusted to fit the damping effect in the monopole and quadrupole as found in the BigMultiDark N -body simulation first and later further constrained with the BOSS DR12 data for different redshift bins (see discussion in §3).

Halo/stellar mass distribution reconstruction
Once we have a spatial distribution of objects {rg} which accurately reproduce the clustering of the whole galaxy sample at a given redshift, we assign the halo/stellar masses M l g to each object l according to the statistical information extracted from the BigMultiDark simulation using the BOSS DR12 Halo mAss Distribution ReconstructiON (hadron) code (for technical details see Zhao et al. 2015). In particular we sample the following conditional probability distribution function

MULTIDARK PATCHY MOCKS DR12
where ρM is the local density, T the tidal field tensor (in particular the eigenvalues), ∆r M min a minimum separation between massive objects due to exclusion effects, {pc} a set of cosmological parameters, and z the redshift at which we want to apply the mass reconstruction. We note that at this stage we consider nonlocal biasing through the tidal field tensor and the minimum separation of objects. Using all this information it has been proven that one can recover compatible clustering for arbitrary halo mass cuts with the N -body simulation up to scales of about k = 0.3 h −1 Mpc (Zhao et al. 2015). We extend the algorithm to stellar masses including the rank ordering relation and scatter described in §2.1.

Survey generator
The SUrvey GenerAtoR (sugar) code is an openMP code which constructs light-cones from mock galaxy catalogues (see Rodríguez-Torres et al. 2015, companion paper). This code applies geometrical features of the survey, including the geometry (using the publicly available mangle mask; Swanson et al. (2008)), sector completeness, veto masks and radial selection functions.
The sugar code can construct light-cones using a single box or multiples boxes at different redshifts, in order to include the redshift evolution in the final catalogue. The first step in the construction of the lightcone is to locate the observer (z = 0) and to transform from comoving Cartesian coordinates to equatorial coordinates (RA,Dec) and redshift. To compute the observed redshift (redshift space) of an object, first we compute the comoving distance from the observer to the object, and then we transform it to redshift space following s = rc + (v ·r)/aH(z real ) (see §2.2.4), where Once we compute the redshift of each galaxy, we consider two options to select objects in the radial direction: (i) downsampling: this option preserves the clustering of the input box selecting objects randomly to have the desired number density.
(ii) selecting by halo property: this consists of rank ordering objects by a halo property and selecting them sequentially until the correct number density is obtained. We used 10 redshift bins to construct the light-cones. This permits us to obtain the galaxy bias, the growth, and the peculiar motion evolution as a function of redshift. A visualization of the BOSS DR12 and one md patchy mock realization is shown in Fig. 2. We can clearly see from this plot that both the data and the mocks follow the same selection criteria including the survey mask (the colour code stands for the stellar mass), and there are no obvious visual differences beyond cosmic variance. The empty regions seem to be similarly distributed for both cases, indicating that the three-point statistics should be close, and the statistical comparison between the md patchy mock galaxy catalogues and the observations of BOSS DR12 yield good agreement. The number densities for LOWZ and CMASS galaxy samples are recovered by construction (see Fig. 3). We investigate the performance of the mock galaxy catalogues in detail in the following subsections.
To avoid redundancy we show only the results for BOSS DR12, as the only difference with respect to the BOSS DR11 mocks is the applied mask and selection function. 11 We have produced half the amount of mock catalogues for DR11, i.e., 1,024 for each LOWZ, CMASS, combined, southern and northern galactic cap.

Two-point and three-point correlation functions
We perform first an analysis in configuration space computing the two-and three-point correlation functions. To compute the clustering signal in the correlation function for the md patchy mock lightcones and the observed data we rely on the Landy & Szalay (1993) estimator. We will follow their notation referring to the data sample (either simulation or observed data) as D and to the random catalogue as R. The correlation function is then constructed in the following way: as a function of separation between pairs of galaxies in redshift space s. The three-point correlation function gives a description of the probability of finding three objects in three different volumes, and can be computed following Szapudi & Szalay (1998) as a function of separation between the vertices of triangles spanned by triplets of galaxies in redshift space s12, s23, s13. Fig. 4 shows that we accurately recover the clustering (monopole) for arbitrary stellar mass bins showing an almost perfect agreement with observations. Only for the two largest stellar mass bin, we find deviations larger than 1-σ. This is mainly due to the "halo exclusion effect", which is only approximately modelled, assuming a minimum separation for massive galaxies, and not the full separation distribution function (Zhao et al. 2015). We find, however, that these differences are not critical, as they are restricted to small scales ( < ∼ 20 h −1 Mpc) and only a low number of objects are affected. We further compute the monopole and quadrupole for LOWZ and CMASS (see Fig. 5 and  §3.3). The monopole agrees towards small scales down to a few Mpc within 1-σ.
There is a deviation of the monopole around the BAO peak and towards larger scales. While the galaxy mock catalogues cross zero right after the BAO peak, the observations do not. In this study, we have applied all of the systematic weights, such as the stellar density contamination, detailed in Reid et al. (2016) and Ross et al. (in prep.). The correlation function measurements are quite covariant between s bins at these scales, making the deviations less significant than one would expect by the visual impression. The significance and potential causes of the large-scale excess are studied in Ross et al. (in prep.), where it is also shown that it has no significant impact on BAO measurements. This is even more so, as the overall shape of ξ(s) in BAO measurements is marginalized over with a polynomial (see e.g. Anderson et al. 2014). See also Ross et al. (2012); Chuang et al. (2013) for similar studies on an earlier BOSS data set and Huterer et al. (2013) for potential photometric calibration systematics, which have not been accounted for in this analysis.
In the case of RSD measurements one has to make sure that the analysis is performed on scales which are not affected by systematics (Gil-Marín et al. 2015a, companion paper). The quadrupole is in very good agreement on all    scales, further supporting that RSD analysis should be safe, even in case there are some remnant systematics in the data.
An investigation of the three-point function demonstrates that the md patchy mocks have a quality very similar to those based on N -body simulations after calibration (see left-hand and central panels in Fig. 6). We have constrained the galaxy bias parameters (see §2.2.2, 2.2.3, and 2.2.5) based on the reference catalogues from the BigMul-tiDark simulation on cubical full volumes at each of the 10 redshift bins, matching the two-and the three-point statistics. To fit the latter we focused on matching the higher order correlation functions through the probability distribution function of galaxies in the reference catalogues following the approach presented in Kitaura et al. (2015). Using the observations to constrain the three point statistics is not trivial, due to incompleteness effects. This explains why the md patchy mock catalogues better fit the reference catalogue than the data, especially for the CMASS galaxies. The three-point statistics performs worse for the qpm mocks, possibly because they do not include an iterative validation step fitting higher order statistics (beyond the two-point correlation function). The nonlinear RSD parameter (see §2.2.4) was iteratively constrained based on the observations, as we explain in the next section.

Monopole and quadrupole in Fourier space
The galaxy power spectrum P and the galaxy bispectrum B are the two-and three-point correlation functions in Fourier space. Given the Fourier transform of the galaxy overdensity, δg(x) ≡ ρg(x)/ρg − 1, where ρg(x) is the number density of objects andρg its mean value, and the galaxy power spectrum and galaxy bispectrum are defined as, with δ D being the Dirac delta function. Note that the bispectrum is only well defined when the set of k-vectors, k1, k2 and k3 close to form a triangle, k1 + k2 + k3 = 0. It is common to define the reduced bispectrum Q as, Q(α12|k1, k2) ≡ B(k1, k2) P (k1)P (k2) + P (k2)P (k3) + P (k1)P (k3) .  where α12 is the angle between k1 and k2. This quantity is independent of the overall scale k and redshift at large scales and for a power spectrum that follows a power law. Moreover, it presents a characteristic "U-shape" predicted by gravitational instability. Mode coupling and power law deviations in the actual power spectrum induce a slight scale-and time-dependency in this quantity. However, in practice it has been observed that at scales of the order of k ∼ 0.1 h Mpc −1 the reduced bispectrum does not present a high variation in its amplitude.
The measurement of the bispectrum is performed in the same way as the approach described in Gil-Marín et al. (2015c). This method consists of generating k-triangles and randomly orientating them in k-space. When the number of random triangles is sufficiently large, the mean value of their bispectra tends to the fiducial bispectrum (for details see Gil-Marín et al. 2015c). Discreteness adds a shot noise contribution to the measured power spectrum and bispectrum. In this paper we assume that these contributions are of Poisson type and therefore are given by, Bsn(k1, k2) = 1 n [P (k1) + P (k2) + P (k3)] + 1 n 2 (17) where k3 = |k1 + k2| andn is the number density of haloes. For both power spectrum and bispectrum we present the BOSS DR12 data error-bars computed from the dispersion among 2,048 and 100 realizations of md patchy mock catalogues, respectively.
The Fourier space analysis has been used to improve the modelling of the RSD in the galaxy mock catalogues. We have assigned higher peculiar random motions to about 10% of the galaxies to fit the quadrupole of the data with a specific value for each of the 10 redshift bins. The resulting monopoles and the quadrupoles show a good agreement with the observations over the range relevant to BAOs and RSDs up to at least k 0.3 h −1 Mpc for both LOWZ and CMASS (see Fig. 7). This agreement is further supported after BAO reconstruction, as can be seen in Fig. 8. Only towards the very large scales (k < ∼ 0.02 h −1 Mpc) we can find that the observed monopole tends to be larger than the mock catalogues (both md patchy and qpm). This hints towards the discrepancy in the monopole found in configuration space (see the previous section). Although the patchy method can potentially yield accurate two-point statistics up to k ∼ 1 h −1 Mpc (see Kitaura et al. 2014;Chuang et al. 2015b), we have restricted the study to lower ks, as the analysis of BAOs and RSDs will not be done beyond k = 0.3 h −1 Mpc, and . Bispectra and reduced bispectra for LOWZ mocks and observed galaxies for different configurations indicated above each panel. The red solid line corresponds to the mean and the red shaded region to the 1-σ contour of 100 md patchy mocks. The black dots correspond to the BOSS DR12 data with the error bars taken from the md patchy mocks.
the computation of power spectra for thousands of mocks with large grids becomes very expensive.
This fitting procedure had, however, as a consequence that the three-point correlation function is slightly less precise at angles close to θ ∼ 0 and θ ∼ π, as can be seen in Fig 6, which prior to this operation was fully compatible with the reference catalogue. In fact the reference BigMul-tiDark catalogue used in this study showed a highly discrepant quadrupole, as compared to the observations. This has been deeply analysed and a better agreement has been found based on an improved HAM procedure applied to the BigMultiDark simulation (see Rodríguez-Torres et al. 2015, companion paper), which however was not available at the moment of the generation of the md patchy mocks. The HOD model adopted in the qpm mock catalogues assumed about 10% satellite galaxies. This yields a compatible quadrupole for the CMASS galaxies. However, as these catalogues were not iteratively calibrated for different redshift slices, their agreement with the LOWZ galaxies is less accurate.
A detailed analysis of the bispectra is presented in Figs. 9 and 10 demonstrating a reasonable agreement between the mocks and the observations for different configurations of triangles across a wide range of scales, given the high uncertainties introduced by the mask, selection function, and cosmic variance.

Cosmic evolution
The cosmic evolution modelled in the md patchy mocks was achieved by fitting the clustering of 10 redshift bins for the full redshift range spanning about 5 Gyr. This implied running structure formation with ALPT for each redshift, i.e., modelling the growth of structures and the growth rate, and additionally fitting the galaxy bias evolution and the nonlinear RSDs. The evolution of clustering for both sets of mocks in the full redshift range is shown in Fig. 11. While the correlation function for CMASS galaxies does not show strong differences along the CMASS redshift range, this evolution is very apparent for the LOWZ sample. Fig. 12 shows the comparison between the mocks and the observations for different LOWZ in more detail. The qpm mocks do not include a detailed cosmic evolution within LOWZ or CMASS being based on mean redshifts for each case. This explains why these mocks lose accuracy in the two-point statistics towards low redshifts.
We investigate now the cosmic evolution of the covariance matrices derived from the md patchy mocks 12 com- with bins i and j, mock sample l, and Ns being the number of simulations. The correlation matrices for different redshift bins CMASS, and combined sample) will be made publicly available with the publication of the galaxy catalogue.
shown in Fig. 13 were constructed upon the covariance matrices following We find that the correlation matrices vary in subsequent redshift bins. First, the correlation matrices are increasingly correlated close to the diagonal for both the monopole and the quadrupole towards lower redshifts, as expected from gravitational evolution coupling different scales. This is seen in Fig. 13 as the diagonal red band becomes broader es-    MultiDark PATCHY BOSS DR12 combined sample P 0 (k) P 2 (k) P 4 (k) Figure 15. Multipole moments based on the combined sample: monopole P 0 , quadrupole P 2 and hexadecapole P 4 for different redshift bins (see key for redshift ranges), where colour bands are the mean of md patchy and symbols correspond to the measurements on the DR12 combined sample. These measurements are used in the wedges analysis of galaxy clustering in Grieb et al. (in prep.).
pecially comparing the highest redshift bin with the lower ones. Secondly, we find that moderate off-diagonal correlations present at higher redshifts disappear towards lower redshifts. And thirdly, we can see that the correlation between the monopole and the quadrupole at large scales becomes maximal in the redshift bin 0.43 < z < 0.55, as can be seen in the white region in the lower right and upper left blocks. This "triangular" correlation is expected from linear theory (see Eqs. 7 and 9 in Chuang & Wang 2013).
Further calculations of the correlation functions including qpm mocks are shown in companion publications (Gil-Marín et al. 2015a,b, companion papers).
Additionally we show in Fig. 14 the angular correlation function and in Fig. 15 the multipole moments (including the hexadecapole) for different redshift bins based on the combined sample showing good agreement between the md patchy mocks and the data.

FUTURE WORK
We have taken advantage in this survey of the characteristic bias of LRGs, being massive objects residing in high density regions. This work confirms that threshold bias is an essential ingredient to explain the clustering of LRGs. This facilitates our analysis, since the low density filamentary network did not need to be accurately described, and it has permitted us to rely on low resolution (augmented Lagrangian) PT based methods. This will no longer apply for upcoming surveys based on emission line galaxies residing in the whole cosmic web. One could improve the methodology presented in this work by substituting the structure formation model based on PT with a more accurate one (e.g. cola). Whether this is necessary, or whether more efficient alternative approaches are sufficient (e.g. ALPT with muscle corrections), will be investigated in future works.
Nonlocal bias was only considered in the mass assign-ment step, but neglected in the generation of the full galaxy population. This may become important to model for emission line galaxies, and needs a deeper analysis. The approximate "halo exclusion" modelling is mainly responsible for the deviation in the clustering of the most massive objects, and could be improved by taking their full distribution of relative distances, instead of taking a sharp minimum separation for each mass bin, as is done here.
Another aspect which still needs to be improved in the catalogues is the clustering on sub-Mpc scales. We have randomly assigned positions of dark matter particles to the mock galaxies without considering that some of them are satellites of central galaxies. This implies that these mocks are not appropriate for fibre-collision analysis. For the time being we will leave the mock catalogues as they are, since most of the studies are not affected by this. Nevertheless, we would like to stress that this aspect can easily be corrected by assigning to a fraction of the mock galaxies close positions to the major most massive ones in the neighbourhood, without the need of redoing the catalogues. The qpm mocks better model fibre collisions, as the HOD adopted in this work successfully reproduced the fraction of close satellites and central galaxies (Gil-Marín et al. 2015a, companion paper).
Also the photometric calibration systematics, presumably responsible for the excess of power in the data towards large scales, require further investigation.
We have considered one fiducial cosmology. It would be, however, interesting to provide sets of mock catalogues running over different combinations of cosmological parameters.
Let us finally mention that we have ignored in this study super-survey modes, which may be especially relevant for the analysis of the power spectrum at very large scales (Takada & Hu 2013;Li et al. 2014a,b;Carron & Szapudi 2015).
We aim at addressing all these issues in future works.

SUMMARY AND CONCLUSIONS
We have presented 12,288 mock galaxy catalogues for the BOSS DR12, including all relevant physical and observational effects, to enable a robust analysis of BAOs and RSDs. The main features of these mock catalogues are as follows: • large number of catalogues: 2,048 for each LOWZ, CMASS, and combined LOWZ+CMASS and northern and southern galactic cap, • accurate structure formation model on scales of a few Mpc, • accurate galaxy bias model including nonlinear, stochastic, threshold bias, and a nonlocal bias dependence on the tidal field tensor and the exclusion effect separation of massive objects, • modelling redshift evolution of galaxy bias, growth of structures, growth rate, and nonlinear RSDs, • and additional survey features, such as geometry, sector completeness, veto masks and radial selection functions.
The same degree of accuracy is achieved for the BOSS DR11 md patchy mocks, for which only 6,000 lightcone mock catalogues were produced (1,000 for each LOWZ, CMASS, and combined LOWZ+CMASS and northern and southern galactic cap).
The md patchy mocks have shown a better match to the data than the qpm mocks in terms of two-and threepoint statistics. Investigating the origin for these differences can be interesting as the physical models, and in particular the galaxy bias, adopted in each method are quite different.
We note that neglecting the stochastic bias considered in the md patchy mocks, modelling the deviation from Poisson shot noise (predominantly over-dispersion), could underestimate the clustering uncertainties.
The mock catalogues have enabled a robust analysis of the BOSS data yielding the necessary error estimates and the validation of the analysis methods. In particular the studies include the following: • a full clustering analysis (Sánchez et al. in prep., Grieb et al. in prep.: see Fig. 15), • a tomographic analysis of the large-scale angular galaxy clustering, where full light-cone effects (e.g. growth, bias and velocity field evolution) are essential (Salazar-Albornoz et al. in prep.: see Fig. 14), • a study of the BAOs reconstructions (see and Fig. 8 showing the performance on the md patchy mocks), • and a RSD analysis (Gil-Marín et al. 2015a, companion paper;Beutler et al in prep.).
We have demonstrated that the md patchy BOSS DR12 mock galaxies match, in general within 1-σ, the clustering properties of the BOSS LRGs for the monopole, quadrupole, and hexadecapole of the two-point correlation function both in configuration and Fourier space. In particular we achieve a high accuracy in the modelling of the monopole up to k ∼ 0.3 h Mpc −1 . We have furthermore shown that we also obtain three-point statistics with the same level of accuracy as N -body based catalogues at scales larger than a few Mpc, which are close to the observations. The good agreement between the models and the observations demonstrates the level of accuracy reached in cosmology, our understanding of structure formation, galaxy bias, and observational systematics.
All the mock galaxy catalogues and the corresponding covariance matrices will be made publicly available together with the release of the BOSS DR12 galaxy catalogue. edges the support of the 100 Talents Programme of the Chinese Academy of Sciences. GY wishes to thank MINECO (Spain) for financial support under project grants AYA2012-31101 and FPA2012-34694. He also thanks the Red Española de Supercomputación for granting computing time in the Marenostrum supercomputer, in which part of this work has been done. AGS, SSA and JNG acknowledge support from the Transregional Collaborative Research Centre TR33 The Dark Universe of the German Research Foundation (DFG). AJC is supported by supported by the European Research Council under the European Community's Seventh Framework Programme FP7-IDEAS-Phys.LSS 240117. Funding for this work was partially provided by the Spanish MINECO under project MDM-2014-0369 of ICCUB (Unidad de Excelencia 'María de Maeztu').
The massive production of all MultiDark patchy BOSS DR12 mocks has been performed at the BSC Marenostrum supercomputer, the Hydra cluster at the Instituto de Física Teórica UAM/CSIC and NERSC at the Lawrence Berkeley National Laboratory.
The BigMultiDark simulations have been performed on the SuperMUC supercomputer at the Leibniz-Rechenzentrum (LRZ) in Munich, using the computing resources awarded to the PRACE project number 2012060963. We want to thank V. Springel for providing us with the optimized version of GADGET-2.
Numerical computations for the power spectrum multipoles and bispectrum were performed on the Sciama High Performance Compute (HPC) cluster which is supported by the ICG, SEPNet and the University of Portsmouth.