The Chandra COSMOS-Legacy survey: Source X-ray spectral properties

We present the X-ray spectral analysis of the 1855 extragalactic sources in the Chandra COSMOS-Legacy survey catalog having more than 30 net counts in the 0.5-7 keV band. 38% of the sources are optically classified Type 1 active galactic nuclei (AGN), 60% are Type 2 AGN and 2% are passive, low-redshift galaxies. We study the distribution of the AGN photon index and of the intrinsic absorption N(H,z) based on the sources' optical classification: Type 1 AGN have a slightly steeper mean photon index than Type 2 AGN, which on the other hand have average intrinsic absorption ~3 times higher than Type 1 AGN. We find that ~15% of Type 1 AGN have N(H,z)>1E22 cm^(-2), i.e., are obscured according to the X-ray spectral fitting; the vast majority of these sources have L(2-10keV)>1E44 erg/s. The existence of these objects suggests that optical and X-ray obscuration can be caused by different phenomena, the X-ray obscuration being for example caused by dust-free material surrounding the inner part of the nuclei. ~18% of Type 2 AGN have N(H,z)<1E22 cm^(-2), and most of these sources have low X-ray luminosities (L(2-10keV)<1E43 erg/s). We expect a part of these sources to be low-accretion, unobscured AGN lacking broad emission lines. Finally, we also find a directly proportional trend between N(H,z) and host galaxy mass and star formation rate, although part of this trend is due to a redshift selection effect.


INTRODUCTION
A proper understanding of the properties of the supermassive black holes (SMBHs) in the center of galaxies, and of their evolution across cosmic time, requires unbiased samples of active galactic nuclei (AGN), both unobscured and obscured (i.e., sources with hydrogen column density N_H,z below and above the 10^22 cm^-2 threshold conventionally adopted to separate unobscured and obscured sources, respectively), over a wide range of redshifts and luminosities. Multiwavelength data are also required to avoid selection effects. Mapping the typical AGN population, i.e., those moderate-luminosity sources that produce a significant fraction of the X-ray background emission (see, e.g., Gilli et al. 2007; Treister et al. 2009), is possible only with surveys that combine depth, to detect AGN up to z∼6, and area, to find statistically significant numbers of sources at any redshift.
X-ray data are strategic in the AGN selection process, for several reasons. First, at X-ray energies the contamination from non-nuclear emission, mainly due to star-formation processes, is far less significant than at optical and infrared wavelengths (Donley et al. 2008; Lehmer et al. 2012; Stern et al. 2012). Moreover, Chandra and XMM-Newton can select both unobscured and obscured AGN, and can also detect a fraction of Compton thick AGN, i.e., sources with hydrogen column densities N_H ≥ 10^24 cm^-2, up to redshift z∼2-3 (Comastri et al. 2011; Iwasawa et al. 2012; Georgantopoulos et al. 2013; Buchner et al. 2015; Lanzuisi et al. 2015). Therefore, combining X-ray and optical/NIR observations of AGN allows one to study simultaneously the properties of the accreting SMBHs and their host galaxies.
The proper characterization of AGN X-ray spectra requires observations with high signal-to-noise ratio (S/N) to properly model many spectral features, such as the so-called "soft excess", warm absorbers, emission and absorption lines other than the iron Kα line at 6.4 keV, and a reflection component (see, e.g., Risaliti & Elvis 2004, for a review of these features). However, AGN spectra with low S/N can be modelled in the 0.5-10 keV band with an absorbed power-law, where the intrinsic absorption is caused by the gas and dust surrounding the SMBH, or by the host galaxy itself. The iron Kα line at 6.4 keV can also be properly modelled in low S/N AGN spectra. Therefore, the X-ray spectroscopy of large numbers of AGN, combined with extended multiwavelength coverage, makes it possible to study the distribution of parameters such as the intrinsic absorption (N_H,z), the iron Kα equivalent width and the power-law photon index (Γ), and to look for trends between these quantities and redshift or intrinsic X-ray luminosity.
The primary power-law component of AGN X-ray spectra is produced by inverse Compton scattering of UV photons. These photons are first emitted by the disc and then up-scattered by the hot corona electrons that surround the disc (Haardt & Maraschi 1991; Siemiginowska et al. 2007).
A proper analysis of the different X-ray spectral parameters requires large datasets, with both good X-ray statistics and complete multiwavelength characterization. Since developing these types of datasets is not trivial, even the dependence of the photon index and of the intrinsic absorption on other quantities (e.g., redshift, X-ray or bolometric luminosity) is still debated, as are the X-ray spectral properties of optically classified Type 1 and Type 2 AGN (see, e.g., Mateos et al. 2005; Page et al. 2005; Shemmer et al. 2006; Young et al. 2009; Sobolewska & Papadakis 2009; Lusso et al. 2012; Lanzuisi et al. 2013; Fotopoulou et al. 2016).
The Chandra COSMOS-Legacy survey (Civano et al. 2016), with its relatively deep average coverage of ∼160 ks over 2.15 deg^2, for a total of 4.6 Ms, provides an unprecedented dataset to study the X-ray properties of AGN over a wide range of redshifts and luminosities.
Moreover, the COSMOS field has been covered with extended multiwavelength photometric (Capak et al. 2007; Koekemoer et al. 2007; Sanders et al. 2007; Schinnerer et al. 2007; Taniguchi et al. 2007; Zamojski et al. 2007; Ilbert et al. 2009; McCracken et al. 2010; Laigle et al. 2016) and spectroscopic (Lilly et al. 2007, 2009; Trump et al. 2007) observations, making it possible to identify and characterize ∼97% of the X-ray sources (Marchesi et al. 2016a). Therefore, a complete analysis of the X-ray spectral parameters for different classes of optical sources is possible in Chandra COSMOS-Legacy. In this work, we present the X-ray spectral analysis of the 1855 extragalactic sources with more than 30 net counts in the 0.5-7 keV band in Chandra COSMOS-Legacy. In Section 2 we describe the Chandra COSMOS-Legacy survey, the X-ray catalog and the optical/IR properties of the X-ray sources. In Section 3 we present the spectral extraction procedure, while in Section 4 we describe the different fitting models we used, and the results of the fitting. In Section 5 we discuss the distribution of the fit parameters. Finally, in Section 6 we discuss the properties of the z>3 subsample, and in Section 7 we summarize the results of this work. Throughout the paper, we assume a cosmology with H_0 = 71 km s^-1 Mpc^-1, Ω_M = 0.3 and Ω_Λ = 0.7. Errors are at 90% confidence if not otherwise stated.
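As an illustration of how the adopted cosmology enters the luminosities quoted in this work, the following is a minimal sketch that converts an observed flux into a rest-frame luminosity. It assumes a flat cosmology with the parameters stated above and a pure power-law K-correction, (1+z)^(Γ-2); the function names, the default Γ=1.4 and the integration scheme are illustrative choices, not the procedure used for the catalog.

```python
import math

C_KM_S = 299792.458      # speed of light in km/s
CM_PER_MPC = 3.0857e24   # centimetres in one megaparsec

def luminosity_distance_mpc(z, h0=71.0, omega_m=0.3, omega_l=0.7, steps=2000):
    """Luminosity distance in Mpc for a flat LCDM cosmology (midpoint rule)."""
    dz = z / steps
    integral = 0.0
    for i in range(steps):
        zi = (i + 0.5) * dz
        integral += dz / math.sqrt(omega_m * (1.0 + zi) ** 3 + omega_l)
    return (1.0 + z) * (C_KM_S / h0) * integral

def rest_frame_luminosity(flux_cgs, z, gamma=1.4):
    """Rest-frame luminosity (erg/s) from an observed flux (erg/s/cm^2).

    Assumes an unabsorbed power-law spectrum with photon index gamma,
    so that the K-correction is (1 + z)**(gamma - 2). This is an
    assumption of the sketch, not a statement about the catalog.
    """
    d_cm = luminosity_distance_mpc(z) * CM_PER_MPC
    k_corr = (1.0 + z) ** (gamma - 2.0)
    return 4.0 * math.pi * d_cm ** 2 * flux_cgs * k_corr
```

For example, `rest_frame_luminosity(1.9e-15, 1.0)` gives the luminosity that a source at z=1 would have if observed at a flux of 1.9 × 10^-15 erg s^-1 cm^-2, under the assumptions above.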
2. THE Chandra COSMOS-Legacy SURVEY
2.1. The X-ray catalog
The Chandra COSMOS-Legacy X-ray catalog is described in Civano et al. (2016). The catalog is the result of 4.6 Ms of observations with Chandra on the 2.2 deg^2 of the COSMOS field. The final catalog includes 4016 sources, detected in at least one of the following three bands: full (F; 0.5-7 keV), soft (S; 0.5-2 keV) and hard (H; 2-7 keV). Each source was detected in at least one band with a maximum likelihood detection value DET_ML>10.8, i.e., with probability of being a spurious detection P<2 × 10^-5. The survey flux limits are f=1.2 × 10^-15 erg s^-1 cm^-2 in the 0.5-10 keV band, f=2.8 × 10^-16 erg s^-1 cm^-2 in the 0.5-2 keV band and f=1.9 × 10^-15 erg s^-1 cm^-2 in the 2-10 keV band. Fluxes have been computed assuming as a model a power-law with no intrinsic absorption and photon index Γ=1.4. Fluxes in the full (hard) band are computed over the 0.5-10 keV (2-10 keV) energy range, rather than over the 0.5-7 keV (2-7 keV) one, for an easy comparison with other works in the literature. In Figure 1 we show the 0.5-7 keV net counts distribution for the 4016 Chandra COSMOS-Legacy sources. 1949 sources have more than 30 net counts and 923 sources have more than 70 net counts. These two counts thresholds are based on previous X-ray spectral analyses (see, e.g., Lanzuisi et al. 2013, 2015): in spectra with more than ∼70 net counts, we can perform a fit leaving the two main fit parameters (the power-law photon index Γ and the intrinsic absorption N_H,z; see Section 4.1 for further details) free to vary, recovering uncertainties <30% for the vast majority of the sources (see also Figure 5, right panel). We instead choose the ∼30 net counts threshold as the limit above which only one of the two parameters can be constrained, with the other one fixed.
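The extrapolation of a flux from the 0.5-7 keV (2-7 keV) range to the 0.5-10 keV (2-10 keV) range can be illustrated with the analytic integral of a power-law energy spectrum. The sketch below is idealised: it assumes an unabsorbed power-law with photon index Γ and ignores the instrument response, so it is not the conversion actually applied in the catalog; the function name is hypothetical.

```python
import math

def powerlaw_band_flux_ratio(gamma, band_from, band_to):
    """Ratio of energy fluxes in two bands for a photon spectrum N(E) ~ E**-gamma.

    The energy flux in a band (e1, e2) is proportional to the integral of
    E**(1 - gamma) dE. Bands are (low, high) tuples in keV.
    """
    def band_integral(e1, e2):
        if abs(gamma - 2.0) < 1e-12:
            # Special case: the integrand is 1/E.
            return math.log(e2 / e1)
        p = 2.0 - gamma
        return (e2 ** p - e1 ** p) / p

    return band_integral(*band_to) / band_integral(*band_from)

# Extrapolation factors for the photon index assumed in the catalog (1.4).
full_band_factor = powerlaw_band_flux_ratio(1.4, (0.5, 7.0), (0.5, 10.0))
hard_band_factor = powerlaw_band_flux_ratio(1.4, (2.0, 7.0), (2.0, 10.0))
```

With Γ=1.4 both factors are larger than one, since a flat spectrum carries a sizeable fraction of its energy flux above 7 keV.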
In the following sections, we will describe the X-ray spectral properties of the 1855 extragalactic sources (i.e., excluding the 64 stars and the 30 sources with no redshift available) with more than 30 net counts in the 0.5-7 keV band: from now on, we will refer to this sample as the CCLS30 sample. We will also refer to the sample of 887 extragalactic sources with more than 70 net counts in the 0.5-7 keV band as the CCLS70 sample.
Figure 1. Distribution of 0.5-7 keV net counts for the whole Chandra COSMOS-Legacy survey. Red dashed lines mark the two different thresholds adopted in the X-ray spectral analysis, i.e., 30 and 70 net counts.

Chandra COSMOS-Legacy optical/IR counterparts
The optical/IR counterparts of the whole Chandra COSMOS-Legacy sample are described in Marchesi et al. (2016a). 1273 sources out of 1855 in CCLS30 (68.6%) have a spectroscopic redshift, z_spec; the remaining 582 have a photometric redshift, z_phot, obtained using the SED fitting procedure described in Salvato et al. (2011) and based on χ^2 minimization, using the publicly available code LePhare (Arnouts et al. 1999; Ilbert et al. 2006). The spectroscopic completeness is significantly higher in CCLS70, where 732 out of 887 sources (82.5%) have a z_spec. In our analysis, we classify the sources in CCLS30 on the basis of the optical spectroscopic classification, when available; otherwise, we use the best-fitting template derived in the SED fitting procedure adopted to estimate the photometric redshifts. The sources are divided as follows:
1. 696 Type 1, unobscured AGN. Broad-line AGN (BLAGN) on the basis of their spectral classification, i.e., sources with lines having FWHM≥2000 km s^-1, or sources with no spectral information and SED best-fitted by an unobscured AGN template.
3. 37 Galaxies. Objects with rest-frame, absorption-corrected 2-10 keV luminosity L_2-10keV <10^42 erg s^-1 and no broad lines in their spectra (non-BLAGN), or sources with no spectral type and SED best-fitted by a galaxy template. Nine of these sources are part of a sample of 50 dwarf galaxies (i.e., having mass 10^7 < M_* < 10^9), candidate Type 2 AGN that are being analyzed in a separate paper (Mezcua et al. in preparation), while another ten are part of a sample of 69 early-type galaxies analyzed in Civano et al. (2014). In the remaining part of our analysis, we do not show the properties of these 37 sources.
4. 11 sources have a low-quality spectroscopic redshift, from which it was possible to estimate only the z value, with no information on the spectral type, and lack SED template best-fitting information. Therefore, for these 11 sources no type information is provided.
We point out that there is an excellent agreement between the spectral and the SED template best-fitting classification, when both are available. 86% of the BLAGN also have SED best fitted with an unobscured AGN template, and 96% of the non-BLAGN have SED best fitted with an obscured AGN or a galaxy template. The lower agreement for BLAGN is not surprising, given that BLAGN SEDs, especially those of low-luminosity AGN, can be contaminated by stellar light (Luo et al. 2010; Elvis et al. 2012; Hao et al. 2014). A summary of the average redshift and X-ray properties of the three subsamples is shown in Table 1.

SPECTRA EXTRACTION
We first extract a spectrum in each of the fields where a source was observed, using the CIAO (Fruscione et al. 2006) tool specextract. We used CIAO 4.7 and CALDB 4.6.9. The specextract tool creates a source and background spectrum for each input position, together with the respective response matrices, ARF and RMF. For the source spectral extraction we use a circular region with radius r_90, i.e., the radius which contains 90% of the PSF in the 0.5-7 keV band; r_90 was computed using the CIAO tool psfsize_srcs, for each source in each observation where the source has been observed (1 to 16 observations; footnote 18). The most common number of observations per source is 4 (463 sources, 25%), and 331 sources (18%) have been observed in 8 or more fields.
To extract the background spectrum, we use event files where the detected sources have been previously removed, to avoid source contamination of the background. The background spectra have been extracted from an annular region centered on the source position, with inner radius r_90+2.5'' and outer radius r_90+20''. These radii were chosen to avoid contamination from the source emission, and to have enough counts to obtain a reliable background spectrum. As a result, the mean (median) number of background counts in the 0.5-7 keV band is 149.1 (154.7), and only 78 sources (i.e., ∼4% of the CCLS30 sample) have fewer than 50 background counts in the 0.5-7 keV band. These 78 sources are mainly located in low-exposure pointings.
Footnote 18: 23 Chandra COSMOS-Legacy fields have been observed in two or three separate observations, due to instrumental constraints.
All the spectra obtained for a single source have finally been combined in a single spectrum, using the CIAO tool combine_spectra. We set the bscale_method parameter, i.e., the parameter which determines how the background counts are combined, to "counts", because this is the suggested algorithm to have background counts and backscale values properly weighted when the background is going to be modeled rather than subtracted (footnote 19).

SPECTRAL FITTING
The spectral fitting was performed using the CIAO modelling and fitting package SHERPA (Freeman et al. 2001). All the fits were performed using the Cstat statistics, which is based on the Cash statistics (Cash 1979) and is usually adopted for low-counts spectral fitting, since in principle it does not require counts binning to work. The main difference between Cstat and the original Cash statistics is that the change in Cstat when a parameter is added to or removed from a model (∆C) is distributed similarly to ∆χ^2. Therefore, it is possible to use the reduced Cstat, Cstat_ν=Cstat/DOF, where DOF is the number of degrees of freedom of the fit, as an estimator of the fit goodness; a good fit should have Cstat_ν∼1. It is also worth mentioning that Cstat_ν is a good estimator of the fit goodness only if the fitted spectra have more than 1 count per bin, to avoid empty channels. We binned our spectra with 3 counts per bin, also for an easier visual inspection of the fits. Cstat does not work with background-subtracted data, being a maximum likelihood function and assuming a purely Poissonian count error. For this reason, our analysis requires a proper modelling of the background, to find the best-fit background model, which is then included in the final model of the source+background spectrum. We describe the model we adopted to fit the Chandra ACIS-I background in Appendix A.
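The statistic described above can be written down explicitly. The following is a minimal sketch of the Cstat form commonly documented for XSPEC and Sherpa, C = 2 Σ [M_i - D_i + D_i (ln D_i - ln M_i)], where D_i are the observed counts and M_i the model-predicted counts per bin. The function names are illustrative, and this is not the Sherpa implementation itself.

```python
import math

def cstat(observed_counts, model_counts):
    """Cstat for Poisson data: 2 * sum(M - D + D * (ln D - ln M)).

    Bins with zero observed counts contribute only the model term.
    Model counts must be strictly positive.
    """
    total = 0.0
    for d, m in zip(observed_counts, model_counts):
        if m <= 0.0:
            raise ValueError("model counts must be positive in every bin")
        total += m - d
        if d > 0:
            total += d * (math.log(d) - math.log(m))
    return 2.0 * total

def reduced_cstat(observed_counts, model_counts, n_free_params):
    """Cstat divided by the degrees of freedom (bins minus free parameters)."""
    dof = len(observed_counts) - n_free_params
    if dof <= 0:
        raise ValueError("need more bins than free parameters")
    return cstat(observed_counts, model_counts) / dof
```

The statistic is zero when the model reproduces the data exactly and grows as the model departs from the observed counts, which is why a reduced value close to 1 is taken as indicative of an acceptable fit.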

Source modelling
We now describe the procedure we adopted to find the best-fitting model for each source in our sample. We started from a basic model, an absorbed power-law, then we added further components, looking for a statistically significant improvement in the fit, i.e., ∆Cstat=Cstat_old-Cstat_new > 2.71 (see, e.g., Tozzi et al. 2006; Brightman et al. 2014, who validated this value with extended simulations), where Cstat_old is the Cstat value of the best fit of the original model, while Cstat_new is the Cstat value of the best fit of the model with the additional component. It is worth noticing that ∆Cstat=2.71 corresponds to a fit improvement with 90% confidence only if Cstat_ν∼1 (Brightman et al. 2014), which is true for the majority of our fits (Figure 3).
Footnote 19: http://cxc.harvard.edu/ciao/ahelp/combine_spectra.html
For all fits, we fixed the Galactic absorption to the average value observed in the direction of the COSMOS field (N_H,gal=2.5 × 10^20 cm^-2; Kalberla et al. 2005).
1. For the 968 sources with 30<cts<70, we fitted an absorbed power-law with fixed photon index Γ_1=1.9, keeping the rest-frame absorbing column density, N_H,z, free to vary.

2. We then fitted all the 1855 sources with an absorbed power-law with Γ_1 and N_H,z free to vary. 296 of the 967 sources with <70 net counts have a best-fit significantly improved with respect to the fit with Γ_1=1.9.
3. For a subsample of sources, all of which are obscured AGN with N_H,z > 10^22 cm^-2, we found an improvement to the fit adding to the model a second power-law, with Γ_2=Γ_1, no intrinsic obscuration and normalization free to vary. This second power-law models the AGN emission unabsorbed by the torus and/or a scattered component, i.e., light deflected without being absorbed by the dust and gas. This second normalization, norm_2, is always significantly smaller than the first one, norm_1, with the ratio norm_2/norm_1 ranging between 3 × 10^-2 and 0.15. 57 sources have best-fit significantly improved with respect to the model with a single power-law, 29 of which have fixed Γ=1.9, while the other 28 have Γ free to vary.
4. A fraction of spectra are expected to have an excess in the single power-law fit residuals around 6-7 keV (rest-frame), this excess being due to the iron Kα emission line at 6.4 keV. For this reason, we added to our absorbed power-law fit an emission line at 6.4 keV, modelled with a Gaussian having line width σ=0.1 keV, and we refitted the spectra of all the 1855 sources in CCLS30. We froze the redshift value of the line for those sources with a spectroscopic redshift, while we left the redshift free to vary within z + ∆z, with ∆z=0.5, for the photometric redshifts. More than 90% of the sources with only a photo-z in CCLS30 have ∆z <0.5, so we chose this value as a conservative threshold. We find that 130 (82) sources have best-fit significantly improved with respect to the single power-law fit in CCLS30 (CCLS70). Moreover, 10 (6) of the sources in the >30 (>70) counts sample are best-fitted by a double power-law with emission line model. We discuss how the iron Kα line equivalent width (EW) correlates with the best-fit photon index Γ and N_H,z in Section 5.6.
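The selection criterion used in the steps above can be sketched as follows. This is a simplified illustration, assuming that each candidate model adds one component to the same baseline and that the Cstat values of the best fits are already available; the names and the data layout are hypothetical and do not reproduce the actual fitting pipeline.

```python
DELTA_CSTAT_MIN = 2.71  # improvement threshold quoted in the text

def is_significant_improvement(cstat_old, cstat_new, threshold=DELTA_CSTAT_MIN):
    """True if adding a component lowers Cstat by more than the threshold."""
    return (cstat_old - cstat_new) > threshold

def pick_model(baseline, candidates, threshold=DELTA_CSTAT_MIN):
    """Choose between a baseline fit and fits with one extra component.

    baseline   : (name, cstat) of the simpler model
    candidates : list of (name, cstat), each extending the baseline
    Returns the name of the candidate with the largest significant
    improvement, or the baseline name if none passes the threshold.
    """
    best_name, best_cstat = baseline
    for name, value in candidates:
        # A candidate must beat the baseline by the threshold and must
        # also be better than any candidate accepted so far.
        if is_significant_improvement(baseline[1], value, threshold) and value < best_cstat:
            best_name, best_cstat = name, value
    return best_name
```

For instance, `pick_model(("powerlaw", 100.0), [("double_pl", 98.0), ("pl_plus_line", 95.0)])` keeps only the candidate whose ∆Cstat exceeds 2.71.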
In Figure 2 we show an example of each type of best-fit. In Table 2 we report the number of sources for each class of best-fit, for the CCLS30 and the CCLS70 samples, and for the sample of sources with more than 30 and less than 70 net counts in the 0.5-7 keV band. In Table 3 we show the same best-fit division, for Type 1 and Type 2 sources, and for galaxies.
Table 2. Number of sources for each class of best-fit, for sources with more than 30, more than 30 and less than 70, and more than 70 net counts in the 0.5-7 keV band.
In Figure 3 (right panel) we show the distribution of Cstat versus DOF for the sources in CCLS30. The red solid line indicates the case Cstat ν =1, i.e., Cstat=DOF, while the red dashed lines show the Cstat value, at a given DOF, above (below) which there is 1% probability to find such a high (low) value if the model is correct. More than 98% of the sources in our sample lie within the two dashed lines, therefore suggesting that most of the fits are acceptable. Indeed there are 16 sources (0.9% of the whole sample) below and 21 sources (1.1%) above the dashed lines, a fraction consistent with random noise fluctuations.
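The text does not specify how the 1% boundaries were computed. As an illustration only, the sketch below assumes that, for an acceptable model, Cstat at a given DOF is approximately χ^2-distributed with DOF degrees of freedom, and evaluates the quantiles with the Wilson-Hilferty approximation; both the distributional assumption and the function names are choices made for this example.

```python
import math
from statistics import NormalDist

def chi2_quantile_wh(p, dof):
    """Approximate chi-square quantile (Wilson-Hilferty cube-root formula)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * dof)
    return dof * (1.0 - a + z * math.sqrt(a)) ** 3

def cstat_acceptance_band(dof, tail=0.01):
    """Lower and upper Cstat values leaving `tail` probability on each side."""
    return chi2_quantile_wh(tail, dof), chi2_quantile_wh(1.0 - tail, dof)

def is_acceptable(cstat_value, dof, tail=0.01):
    """True if the fit statistic lies inside the two-sided acceptance band."""
    low, high = cstat_acceptance_band(dof, tail)
    return low <= cstat_value <= high
```

Under these assumptions about 2% of correct fits would fall outside the band by chance, which is the expectation against which the 16 and 21 outliers quoted above are compared.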

Modelling results
In Figure 4 we show the distribution of the two main spectral parameters, Γ and N_H,z, for both the CCLS30 (left) and the CCLS70 (right) samples, for all the sources for which we left both parameters free to vary. It is worth noticing that the observed dispersion on Γ is significantly smaller for those sources with N_H,z <10^22 cm^-2, i.e., classified as unobscured (σ_>30=0.47, σ_>70=0.31), with respect to those sources with nominal value or upper limit at N_H,z >10^22 cm^-2 (σ_>30=0.83, σ_>70=0.47). Moreover, at N_H,z >10^22 cm^-2 the errors on Γ are larger, since constraining Γ becomes more difficult for sources with larger column density. This discrepancy is mainly due to the fact that obscured sources have on average fewer net counts than unobscured ones, and therefore their best-fit parameters are less constrained. In CCLS30, sources with N_H,z >10^22 cm^-2 have mean (median) net counts in the 0.5-7 keV band cts = 124.7 (90.2), while sources with N_H,z <10^22 cm^-2 have cts = 293.4 (153.4).
In Figure 5 (left) we show the distribution of the photon index Γ as a function of the number of 0.5-7 keV net counts for sources in CCLS70. The black solid line shows the mean Γ value for the 877 sources in the CCLS70 sample, Γ=1.68. The mean Γ of the whole CCLS30 population, i.e., taking into account also the 609 sources with Γ=1.9 and the 345 sources with less than 70 net counts and Γ free to vary which we do not plot in Figure 5, is Γ=1.66. Sources with peculiar Γ values (i.e., Γ >3 or Γ <1), which mainly contribute to the dispersion of the Γ distribution, have for the most part fewer than 70 net counts and therefore their best-fit estimates are expected to be affected by larger uncertainties than those of brighter sources. A significant fraction of objects with fewer than 70 net counts (162 out of 967, ∼17%) have flat spectra (i.e., Γ <1). Sources with such a photon index are candidate reflection-dominated, Compton thick (CT) AGN. However, these objects also have, on average, larger uncertainties on Γ. Therefore an extensive analysis of these objects, including a fit with more complex models than those used in this work, is required to determine how many of them are actual CT AGN, and will be performed in Lanzuisi et al. (in prep.).
The relation between number of counts and error on Γ can be seen in Figure 5 (right). The fraction of sources with relative error, ∆err=err Γ /Γ, larger than 30% is ∼22% for sources with >30 counts, but significantly drops to ∼8% (∼4%) for sources with >70 (>100) counts.

FITTED PARAMETERS DISTRIBUTION
5.1. Intrinsic absorption column density, N_H,z
In Figure 6 we show the N_H,z distribution, for both Type 1 (blue) and Type 2 (red) AGN. Nominal values are shown with solid lines, while the 90% confidence upper limits distributions are plotted with dashed lines.
We computed the Kaplan-Meier estimators of the mean values of N H,z for Type 1 and Type 2 sources in CCLS30 and CCLS70, using the ASURV tool, Rev 1.2 (Isobe & Feigelson 1990;Lavalley et al. 1992), which implements the methods presented in Feigelson & Nelson (1985), to properly take into account 90% confidence upper limits. We report these mean values in Table 4.
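To illustrate how upper limits can enter a mean value, the following is a minimal sketch of a Kaplan-Meier mean for data containing upper limits (left-censored values). It follows the usual trick of flipping the data so that upper limits become right-censored points; it is not the ASURV code, and the function name and input layout are assumptions made for this example. In practice the values would typically be log N_H,z.

```python
def km_mean_with_upper_limits(values, is_upper_limit):
    """Kaplan-Meier estimate of the mean when some values are upper limits.

    values         : measured values or upper limits
    is_upper_limit : booleans, True where the value is an upper limit
    The data are flipped (t = max - value) so that upper limits become
    right-censored, and the mean is the area under the survival curve.
    If the smallest value is an upper limit the survival curve never
    reaches zero and the mean is restricted to the observed range.
    """
    pivot = max(values)
    # (flipped value, detected?) sorted so detections precede limits at ties
    flipped = sorted(
        ((pivot - v, not ul) for v, ul in zip(values, is_upper_limit)),
        key=lambda item: (item[0], not item[1]),
    )
    at_risk = len(flipped)
    survival = 1.0
    area = 0.0
    previous_t = 0.0
    i = 0
    while i < len(flipped):
        t = flipped[i][0]
        detections = sum(1 for tt, det in flipped if tt == t and det)
        limits = sum(1 for tt, det in flipped if tt == t and not det)
        area += survival * (t - previous_t)
        if detections > 0:
            survival *= 1.0 - detections / at_risk
        at_risk -= detections + limits
        previous_t = t
        i += detections + limits
    return pivot - area
```

With no upper limits the estimate reduces to the arithmetic mean, while each upper limit pulls the estimated mean towards lower values.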
Type 1 AGN are significantly less obscured than Type 2 sources. In CCLS70, 45 Type 1 AGN (10.3% of the whole Type 1 AGN population) have a 90% confidence intrinsic absorption value N_H,z >10^22 cm^-2 (i.e., above the threshold usually adopted to distinguish between obscured and unobscured sources), while 165 Type 2 AGN (38.3%) have N_H,z >10^22 cm^-2 at a 90% confidence level, and another 35 have N_H,z >10^22 cm^-2 within 1σ. In CCLS30 the fraction of obscured sources slightly increases in both Type 1 (106 sources, 15.2%) and Type 2 sources (460 sources, 41.4%). The fractions do not change significantly if we take into account only sources with spectroscopic classification, therefore ruling out a pure SED-fitting template misclassification. We summarize the number of obscured sources per AGN type in Table 5.
Finally, in Table 6 we report the fraction of sources with a 90% confidence upper limit on N_H,z, for Type 1 and Type 2 AGN.

Table 3. Number of sources for each class of best-fit, for sources with more than 30, more than 30 and less than 70, and more than 70 net counts in the 0.5-7 keV band.

            Type 1                          Type 2                          Galaxies
Fit         n_CCLS30  n_30-70  n_CCLS70     n_CCLS30  n_30-70  n_CCLS70     n_CCLS30  n_30-70  n_CCLS70
Γ=1.9       177       177      0            412       412      0            20        20       0
Γ free      460       66       394          566       199      367          12        5        7
Double PL   11        6        5            45        28       17           1         1        0
Fe Kα       46        9        37           80        37       43           4         2        2
2PL+Fe Kα   2         1        1            8         3        5            0         0        0

Figure 3. The red solid line represents the Cstat=DOF trend (i.e., the ideal Cstat_ν=1), while the dashed lines indicate, at any DOF, the Cstat above (below) which a 1 per cent probability to find such high (low) Cstat values is expected.

Table 5. Number and fraction of objects with N_H,z >10^22 cm^-2, for Type 1 and Type 2 AGN, for sources in CCLS30 and in CCLS70. We also computed numbers and fractions for sources with reliable spectral type only. The fraction is computed on the total number of sources of the same type.

Table 6. Ratio between sources with an upper limit on N_H,z and total number of sources, and fraction of sources with a 90% confidence upper limit on N_H,z, for Type 1 and Type 2 AGN, and for all sources with optical classification.

Photon index, Γ
In Figure 7 we show the distribution of Γ for the CCLS70 sample, for Type 1 (blue dashed line) and Type 2 (red solid line) sources. We do not plot the CCLS30 histogram because for a significant fraction of sources in the 30-70 net counts sample we fixed Γ=1.9 (see Table 2).
The mean and σ of the Γ distribution for Type 1 and Type 2 AGN in CCLS70 are the following: The error on the mean is computed as err = σ/√N, where σ is the distribution dispersion and N is the number of sources in each sample.
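The error on the mean defined above can be checked with a short calculation. The sketch below uses the dispersion and sample size quoted later in this section for the obscured and unobscured Type 2 subsamples (σ=0.39 with 185 sources, and σ=0.28 with 115 sources); the function name is illustrative.

```python
import math

def error_on_mean(sigma, n_sources):
    """Standard error of the mean: dispersion divided by sqrt(N)."""
    if n_sources <= 0:
        raise ValueError("the sample must contain at least one source")
    return sigma / math.sqrt(n_sources)

# Obscured and unobscured Type 2 subsamples with 1 < Gamma < 3.
err_obscured = error_on_mean(0.39, 185)
err_unobscured = error_on_mean(0.28, 115)
```

Both values round to 0.03, in agreement with the uncertainties quoted for those two subsamples.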
The probability that the two distributions are drawn from the same population, on the basis of a Kolmogorov-Smirnov (KS) test, is P=1.9×10^-9. Therefore, assuming that we are properly constraining Γ (which might not be true for sources with high N_H,z, see Figure 4), we find that Type 2 sources have a flatter photon index than Type 1 sources, a result already found in Lanzuisi et al. (2013).
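For reference, the two-sample KS test used above compares the empirical cumulative distributions of the two samples. The following is a minimal sketch, with the p-value taken from the standard asymptotic Kolmogorov series and a small-sample correction to the effective sample size; the function names are illustrative, and this is not necessarily the implementation used for the numbers quoted in the text.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Maximum distance between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    i = j = 0
    d_max = 0.0
    while i < n_a and j < n_b:
        x = min(a[i], b[j])
        # Advance both samples past every value not larger than x.
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / n_a - j / n_b))
    return d_max

def ks_pvalue(d, n_a, n_b, terms=100):
    """Asymptotic probability of a distance >= d under the null hypothesis."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The series converges to 1 here; return it directly.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))
```

A small p-value indicates that the two samples are unlikely to be drawn from the same parent distribution; note that the test uses only the best-fit values and ignores their uncertainties.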
To better understand if the difference in Γ between Type 1 and Type 2 AGN may be caused by extreme objects, we compute the mean and σ on Γ of the Type 1 and Type 2 samples taking into account only those sources with 1< Γ <3. In this way, we exclude from our analysis very soft objects and candidate highly obscured, reflection-dominated sources. In this subsample, Type 1 AGN have mean (median) photon index Γ=1.77±0.01 (1.75), with dispersion σ=0.28, while Type 2 AGN have mean (median) photon index Γ=1.70±0.02 (1.66), with dispersion σ=0.34. The difference between the two samples is now smaller, mainly because the Type 2 sample no longer contains the candidate CT AGN, which caused both a flattening of the average Γ and an increase in the dispersion. However, the KS test still excludes that the two distributions are drawn from the same population, with probability P=2.2×10^-6.
Finally, we point out that a fraction of Type 2 AGN are expected to be unobscured, low Eddington ratio sources lacking broad emission lines (see, e.g., Trump et al. 2011; Marinucci et al. 2012), and sources with low Eddington ratio are also expected to have a flatter photon index (see, e.g., Shemmer et al. 2008; Risaliti et al. 2009). To verify if this is the case for CCLS70, we divide the Type 2 sources into obscured and unobscured using the N_H,z=10^22 cm^-2 threshold, selecting only sources with 1< Γ <3 to reduce the number of sources where N_H,z is most likely poorly constrained. We find that the obscured Type 2 AGN subsample contains 185 sources and has Γ=1.76±0.03, with dispersion σ=0.39, while the unobscured one contains 115 sources and has Γ=1.71±0.03, with dispersion σ=0.28. Therefore, at least part of the observed discrepancy between Type 1 and Type 2 AGN photon index distributions may be driven by the Γ-λ_edd relation. However, such a result needs to be confirmed using the actual λ_edd, since Γ measurements may be biased in obscured objects.

Flux and luminosity
We report in Table 7 the mean and σ values of the 2-10 keV flux distribution of Type 1 and Type 2 sources. The two distributions are similar, but the hypothesis that the two populations are drawn from the same parent population is rejected for CCLS30 (p-value= 6.3×10 −4 ), even though it is worth mentioning that the KS-test does not take into account the uncertainties on the flux measurement, which can be significant for the sources with low counts statistics in CCLS30.
We report the mean and σ values of the Type 1 and Type 2 AGN 2-10 keV, absorption-corrected luminosity distributions in Table 8. Type 1 AGN are on average more luminous than Type 2 AGN, for both the net counts thresholds we adopted. The difference between the two distributions is mainly due to the fact that, as can be seen in Figure 8, Type 1 AGN are on average at higher redshifts (z=1.74 for CCLS30, z=1.57 for CCLS70) than Type 2 AGN (z=1.23 for CCLS30, z=1.15 for CCLS70), and our sample is flux-limited, i.e., sources at higher redshifts also have higher luminosities. Nonetheless, the difference between the two distributions is also an indication of a trend with 2-10 keV luminosity of the fraction of Type 1 to Type 2 AGN: in any sample complete in both z and L_X, the fraction of Type 2 AGN decreases for increasing luminosities, at any redshift, as already observed in several works (e.g., Lawrence & Elvis 1982; Ueda et al. 2003; Hasinger 2008; Buchner et al. 2015; Marchesi et al. 2016a).
Table 8. Logarithm of the 2-10 keV intrinsic, absorption-corrected luminosity distribution mean and standard deviation σ, for Type 1 and Type 2 AGN, for sources in CCLS30 and CCLS70. All values are in erg s^-1.

Photon index dependences
We searched for a potential correlation between the photon index and the redshift or the X-ray luminosity. In Figure 9, left panel, we show the distribution of Γ as a function of redshift, for the CCLS70 sample. Type 1 sources show a weak anti-correlation between z and Γ, with Spearman correlation coefficient ρ=-0.15 and p-value p=2.3×10^-3 for the hypothesis that the two quantities are unrelated. In Type 2 sources, instead, the correlation coefficient is smaller, ρ=-0.07, and the p-value p=0.12 does not allow us to rule out the hypothesis of no correlation.
Figure 8. 2-10 keV rest-frame, absorption-corrected luminosity as a function of redshift for the sources in CCLS30. Type 1 AGN are plotted in blue, Type 2 AGN are plotted in red, galaxies are plotted in green. Sources with spectroscopic (photometric) redshift are plotted with a circle (cross). The solid line represents the sensitivity limit of the Chandra COSMOS-Legacy survey. The four sources below the sensitivity limit have less than 40 net counts in the 0.5-7 keV band and their parameters are likely poorly constrained (e.g., N_H,z may be underestimated and consequently the rest-frame luminosity would be underestimated too).
The results obtained for the whole CCLS70 sample do not change significantly if we exclude from the computation sources with peculiar photon index (i.e., taking into account only objects with 1< Γ <3), for both Type 1 (ρ=-0.14 and p-value p= 3.8×10 −3 ) and Type 2 AGN (ρ=-0.08 and p-value p=0.10).
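The Spearman coefficient quoted above is the Pearson correlation of the ranks of the two quantities. The following is a minimal sketch of that computation, with tied values given their average rank; the function names are illustrative, and the p-value computation is deliberately left out.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

Because only the ranks are used, the coefficient is sensitive to any monotonic trend between redshift (or luminosity) and Γ, not just to a linear one.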
In Figure 9, right panel, we show the distribution of Γ as a function of the intrinsic, absorption-corrected 2-10 keV luminosity, for the CCLS70 sample. We do not find any evidence of correlation in either Type 1 (blue, ρ=0.01 and p-value p=0.78) or Type 2 (red, ρ=0.01 and p-value p=0.90) objects, and the lack of correlation remains also in the 1< Γ <3 subsample.
In Figure 10 (left panel) we show the distribution of N_H,z as a function of redshift for sources in CCLS30. As can be seen, the minimum N_H,z value significantly increases at increasing redshifts. Such a result was already observed by Civano et al. (2005), Tozzi et al. (2006) and Lanzuisi et al. (2013), among others, and is due to the fact that, moving toward high redshifts, the photoelectric absorption cut-off moves outside the limit of the observing band, 0.5 keV. Consequently, the measurement of low N_H,z values becomes more difficult.

Trend with 2-10 keV luminosity
In Figure 10 (right panel) we show the distribution of N H,z as a function of 2-10 keV rest-frame absorption-corrected luminosity. As can be seen, and as already discussed in previous sections, Type 2 AGN are dominant in the region of the plot with N H,z >10 22 cm −2 and L 2−10keV <10 44 erg s −1 , while Type 1 sources are dominant in the region with N H,z <10 22 cm −2 and L 2−10keV >10 44 erg s −1 .
In CCLS30, 268 sources out of 1844 sources (∼15%) lie in the obscured quasar region, i.e., have 2-10 keV luminosity L 2−10keV >10 44 erg s −1 and 90% confidence significant intrinsic absorption N H,z >10 22 cm −2 . 83 out of these 268 sources (∼31%) are classified as Type 1 AGN, and 61 of these 83 Type 1 sources have optical spectroscopy available. The mean redshift of these 61 objects is z =2.03. A significant fraction of obscured Type 1 AGN (∼15% of the whole population at L 2−10keV >10 44 erg s −1 ) was also found by Merloni et al. (2014) in XMM-COSMOS. We further discuss these obscured Type 1 AGN in the next section.
In CCLS30, a total of 199 out of 1111 (17.9%) Type 2 AGN have N H,z <10 22 cm −2 , i.e., are consistent with being unobscured AGN on the basis of their X-ray spectrum. The fraction of unobscured Type 2 AGN is a matter of extended debate in the literature, varying from only a few percent (<5%) in Risaliti et al. (1999), Malizia et al. (2009) and Davies et al. (2015) to 30% (Merloni et al. 2014) and up to 66% (Garcet et al. 2007). The first scenario suggests that optical and X-ray obscuration tend to occur at the same time, while the second points to a stronger independence between the obscuration processes in the two different energy ranges. The fraction we obtain is in an intermediate regime between the two scenarios, in good agreement with the results of Panessa & Bassani (2002), Akylas & Georgantopoulos (2009) and Koulouridis et al. (2016). We will further discuss the implications of this result in the next section.

Trend with 2-10 keV luminosity of the AGN obscured fraction
In Figure 11 we show the distribution of N H,z as a function of 2-10 keV absorption-corrected luminosity, in bins of 0.5 dex in both N H,z and luminosity. The color of each bin identifies the fraction of Type 2 sources f 2 =N 2 /N all , where N 2 is the number of Type 2 sources and N all is the total number of sources in each bin. In the left panel we show the results for the whole sample (i.e., taking into account both the spectroscopic and the photometric type), while in the right panel only sources with optical spectroscopic classification are taken into account. It is worth noting that there is generally good agreement between the combined classification and the spectroscopic one; therefore, the trends discussed below cannot be attributed to a poor SED-fitting classification.
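The binned Type 2 fraction f 2 =N 2 /N all described above can be sketched as follows. This is a minimal illustration assuming each source is given as a (log L, log N H,z , is_type2) tuple; that record layout, the function name and the demo values are hypothetical, and upper limits on N H,z are not treated separately.

```python
import math
from collections import defaultdict


def type2_fraction_map(sources, bin_width=0.5):
    """Map (log L bin edge, log N_H bin edge) -> N2 / N_all per bin.

    `sources` is an iterable of (log_lx, log_nh, is_type2) tuples; this
    record layout is an assumption made for the example.
    """
    n_all = defaultdict(int)
    n_type2 = defaultdict(int)
    for log_lx, log_nh, is_type2 in sources:
        # Integer index of the bin whose lower edge is index * bin_width.
        key = (math.floor(log_lx / bin_width), math.floor(log_nh / bin_width))
        n_all[key] += 1
        if is_type2:
            n_type2[key] += 1
    return {
        (k[0] * bin_width, k[1] * bin_width): n_type2[k] / n_all[k]
        for k in n_all
    }


# Hypothetical sources, for illustration only.
demo_sources = [
    (43.2, 22.1, True),
    (43.4, 22.3, False),
    (43.1, 22.4, True),
    (44.6, 21.2, False),
]
fraction_map = type2_fraction_map(demo_sources)
```

Each key is the lower edge of a 0.5 dex bin in log luminosity and log column density; bins containing no sources are simply absent from the map.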
We expect a fraction of the optically classified Type 1 sources with X-ray properties consistent with those of obscured quasars to be broad absorption line (BAL) quasars, with broad, blue-shifted absorption lines in their optical/UV spectra. Of the candidate BAL quasars with optical spectrum available, 41 out of 61 have z>1.5, i.e., high enough to observe the UV features (e.g., CIV at 1549Å) in optical spectra. However, the analysis of these objects is beyond the scope of this work, and these aspects will be investigated in a forthcoming paper (Marchesi et al. in preparation). Moreover, a fraction of these obscured Type 1 sources may not be BAL quasars, being instead sources with dust-free gas surrounding the inner part of the nuclei, therefore causing obscuration in the X-rays and not in the optical band (Risaliti et al. 2002;Maiolino et al. 2010;Fiore et al. 2012;Merloni et al. 2014).
In the previous section, we showed that 199 (∼18%) of the 1111 CCLS30 Type 2 sources have N H,z <10 22 cm −2 , i.e., are consistent with being unobscured AGN. In Figure 11 we can see that the distribution of these objects depends strongly on their X-ray luminosity: the fraction of Type 2 AGN with respect to the total number of sources in a bin is higher at log(L 2−10keV )=[42-43] (75-90%), and drops at L 2−10keV >10 44 erg s −1 (<15%), where there are only 19 unobscured Type 2 AGN (9.5% of the unobscured Type 2 AGN sample). A similar result was found by Merloni et al. (2014) in the XMM-COSMOS survey, where ∼40% of the sources with log(L 2−10keV )=[42.75-43.25] are unobscured Type 2 AGN, while the fraction of unobscured Type 2 AGN is <10% at L 2−10keV <10 44 erg s −1 . This agreement is expected: XMM-COSMOS has a flux limit ∼3 times higher than Chandra COSMOS-Legacy, so its sample is a "bright" subsample of the Chandra COSMOS-Legacy one, and there is a significant overlap between the sample used by Merloni et al. (2014) and CCLS30.
We point out that, according to our classification, a fraction of Type 2 sources are expected to be narrow-line Seyfert 1 galaxies (NLSy1, Osterbrock & Pogge 1985;Goodrich 1989), i.e., AGN with the properties of Seyfert 1 galaxies, but with only narrow, rather than broad, HI emission lines. These objects are by definition unobscured, although they lack broad lines. However, NLSy1 galaxies usually have very steep photon indices (see, e.g., Brandt et al. 1997), and only 16 out of 199 (8.4%) unobscured Type 2 AGN in CCLS30 have Γ >2, and only one source has Γ >2.5. Therefore, we expect that not more than 10% of the unobscured Type 2 AGN in CCLS30 are NLSy1 galaxies.
Besides NLSy1 galaxies, there are at least two possible reasons to explain the relatively high fraction of unobscured Type 2 AGN at low luminosities: (i) it is possible that a fraction of low luminosity sources is wrongly classified. A similar effect was described in Oh et al. (2015), who analyzed the spectra of galaxies at z <0.2 in the Sloan Digital Sky Survey Data Release 7 (SDSS DR7), and found a significant fraction of previously misclassified BLAGN, i.e., sources with a stellar spectral continuum and a broad Hα emission line. With this new classification, the number of Type 1 AGN in SDSS DR7 increased by 49%. However, we point out that Oh et al. (2015) analyzed low-luminosity AGN at z <0.2, i.e., a sample significantly different from CCLS30, where only 39 sources out of 1855 (2.1%) have z <0.2. (ii) AGN with low Eddington ratio, and therefore low luminosity, lack broad emission lines, even if they are intrinsically unobscured (see, e.g., Trump et al. 2011;Marinucci et al. 2012). If this is the case, the drop in BLAGN at N H,z <10 22 cm −2 and L 2−10keV <10 44 erg s −1 would imply an Eddington ratio threshold of λ Edd ∼10 −3 -10 −2 , assuming average BH masses M BH =10 8 -10 9 M ⊙ . This would have strong consequences for the unification scheme.

Figure 10 (caption, partially recovered). N H,z 90% confidence upper limits are plotted as triangles. Right : N H,z as a function of 2-10 keV absorption-corrected luminosity, for all sources in CCLS30, for Type 1 (blue) and Type 2 (red) AGN, and for galaxies (green). N H,z 90% confidence upper limits are plotted as triangles. The horizontal black dashed line marks the threshold (N H,z =10 22 cm −2 ) usually adopted to divide unobscured and obscured sources. The vertical black dotted line marks the luminosity threshold, L 2−10keV =10 42 erg s −1 , used to divide AGN from star-forming galaxies, while the vertical black dashed line marks the threshold usually adopted to separate quasars from Seyfert galaxies, L 2−10keV =10 44 erg s −1 .

Iron Kα equivalent width dependences
141 CCLS30 sources are best-fitted with a model which includes an iron Kα emission line at 6.4 keV. For each source, we compute the emission line equivalent width, EW, a measure of the line intensity obtained by integrating over energy the ratio between F l (E), the flux of the emission line at the energy E, and F c (E), the intensity of the spectral continuum at the same energy. The mean (median) equivalent width of the 141 CCLS30 sources is EW=0.49±0.03 (0.41) keV, with dispersion σ=0.39. We also checked how much the assumption we made on the line width, which we fixed to σ=0.1, affected the mean EW value. To do so, we refitted the 141 spectra using σ=0, i.e., the scenario where the line shows no relativistic broadening. We find that in this case the iron Kα mean (median) equivalent width only slightly decreases, being EW=0.44±0.03 (0.36) keV, with dispersion σ=0.34.
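As an illustration of the equivalent width computation, the sketch below integrates numerically the ratio of a line profile to a continuum. It assumes the common convention in which F l (E) is the line-only flux density, so that EW = ∫ F l (E)/F c (E) dE; if F l (E) instead denotes line plus continuum, the integrand becomes (F l − F c )/F c , which is the same quantity. The power-law continuum, the Gaussian profile, the width in keV and all normalizations are hypothetical values chosen for the example, not fitted parameters.

```python
import math


def equivalent_width(line_flux, continuum_flux, e_min, e_max, n_steps=4000):
    """Trapezoidal integral of line_flux(E) / continuum_flux(E) on [e_min, e_max]."""
    step = (e_max - e_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        energy = e_min + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end points
        total += weight * line_flux(energy) / continuum_flux(energy)
    return total * step


# Hypothetical spectrum: power-law continuum plus a Gaussian line at 6.4 keV.
GAMMA, K_CONT = 1.9, 1.0e-4                 # photon index, continuum norm
E_LINE, SIGMA, N_LINE = 6.4, 0.1, 1.5e-6    # line energy, width (keV), flux


def continuum(energy):
    """Power-law continuum K * E^-Gamma."""
    return K_CONT * energy ** (-GAMMA)


def gaussian_line(energy):
    """Gaussian line profile whose integral over energy is N_LINE."""
    norm = N_LINE / (SIGMA * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((energy - E_LINE) / SIGMA) ** 2)


# Integrate over +/- 10 sigma around the line centroid; the result is in keV.
ew_kev = equivalent_width(gaussian_line, continuum, E_LINE - 1.0, E_LINE + 1.0)
```

For a line much narrower than the scale over which the continuum varies, the result is close to the line flux divided by the continuum at the line energy, `N_LINE / continuum(E_LINE)`.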
In Figure 12 (left panel), we show the EW distribution as a function of the photon index Γ for the 101 sources for which we compute Γ (i.e., excluding those sources for which we fixed Γ=1.9). If we fit the data with a linear model (red dashed line), EW= aΓ + b, we find evidence of a significant inverse correlation, with a=-0.22±0.05.
However, a fraction of sources with prominent Iron Kα emission line are expected to be CT AGN. A proper characterization of these objects requires fitting models more complicated than those used in this work, to properly take into account Compton scattering processes. For this reason, we are performing an extended analysis of the CCLS30 sample using more appropriate torus models and a MCMC analysis to estimate the probability of a source having a certain N H,z value (Lanzuisi et al. in preparation). As a preliminary result, we find that 12 out of the 141 CCLS30 sources with significant iron Kα emission line have a significant probability to be CT AGN or heavily obscured sources (i.e., with Log(N H,z )>23.5). We plot these sources as black stars in Figure 12, left panel. If we re-fit the sample without these sources, which are likely to have a wrongly computed Γ using our basic models, the correlation between EW and Γ disappears (a=0.10±0.08; red solid line in Figure 12). This result is due to the fact that heavily obscured AGN are poorly fitted by basic models like those used in this work, which try to mimic the flat spectra of obscured sources with a flat photon index and no obscuration.

Figure 11. N H,z as a function of 2-10 keV absorption-corrected luminosity. The colormap shows the ratio between the number of Type 2 sources (N 2 ) against all sources (N all ), in each bin of N H,z and luminosity. N 2 /N all =1 is plotted in red, N 2 /N all =0 in blue. In the left panel we show the results for the whole sample (i.e., taking into account both the spectroscopic and the photometric type), while in the right panel only the spectroscopic type is taken into account.
In Figure 12 (central panel) we show the EW distribution as a function of N H,z . 99 sources (70%) have only an upper limit on N H,z , while the remaining 30% have a N H,z value significant at a 90% confidence level. We do not find any significant correlation between EW and N H,z . We remark that such a result is not unexpected, since a trend between EW and N H,z is observed only at N H,z >10 23 cm −2 (see, e.g., Makishima 1986), and only 12% of the sources analyzed in this section have intrinsic absorption values above this threshold. Finally, candidate CT AGN (plotted with full markers) have on average low N H,z values, a further indication that a standard spectral fitting procedure does not work properly with these extreme sources.

The X-ray Baldwin effect
We select the 33 Type 1 AGN in CCLS30 best-fitted with a model containing an Iron Kα line to check for the presence of the so-called "X-ray Baldwin" effect, i.e., the existence of an anti-correlation between the Iron Kα line EW and the AGN 2-10 keV luminosity (Iwasawa & Taniguchi 1993).
The existence of this anti-correlation is confirmed by our data, as can be seen in Figure 12 (right panel); the Spearman correlation coefficient is ρ=-0.72, with p-value p= 9.9×10 −6 , for the hypothesis that the two quantities are unrelated. The best-fit to our data is expressed by the relation EW(Kα)∝ L X (2-10 keV) −0.34±0.07 , in fair agreement with the result obtained by Iwasawa & Taniguchi (1993), who measured a trend expressed by the relation EW(Kα)∝ L X (2-10 keV) −0.20±0.03 .

Host galaxy mass and star formation rate dependences for Type 2 AGN
Suh et al. (in prep.) computed host galaxy properties such as mass (M * ) and star formation rate (SFR) for Type 2 AGN in Chandra COSMOS-Legacy. These quantities have been computed using SED-fitting techniques: the host galaxy properties are derived using a 3-component SED-fitting decomposition method, which combines a nuclear dust torus model (Silva et al. 2004), a galaxy model (Bruzual & Charlot 2003) and starburst templates (Chary & Elbaz 2001;Dale & Helou 2002). The SFR is then estimated by combining the contributions from UV and total host-galaxy IR luminosity computed with the SED fitting (L 8−1000µm ).
We first study the trend of the photon index Γ as a function of M * and SFR for Type 2 objects in CCLS70; we also perform a separate analysis on sources spectroscopically classified as non-BLAGN sources and on sources with only SED template best-fitting classification. We performed a Spearman correlation test for each of the subsamples we described: we report the results in Table 9. In the case of M * , we find no evidence of correlation in the SED template best-fitting subsample and in the whole Type 2 sample, and weak evidence of anti-correlation in the spectroscopic sample, although the hypothesis that the two quantities are uncorrelated cannot be ruled out (p-value=0.06). We obtain a similar result when correlating Γ and SFR: in this case, the hypothesis that Γ and SFR are uncorrelated in the spectroscopic sample is ruled out with 97% confidence. The SED template best-fitting and the whole Type 2 sample do not show evidence of correlation.
Sample   ρ (M * )   p-value (M * )   ρ (SFR)   p-value (SFR)
Spec     -0.12      0.06             -0.14     0.03
SED       0.03      0.73              0.04     0.66
All      -0.06      0.24             -0.07     0.16

Table 9. Spearman correlation coefficient ρ and p-value for the photon-index Γ in relation to M * or SFR, for sources with spectroscopic classification, sources with only best-fit SED template and for the whole CCLS70 Type 2 sample.
In Figure 13 we show the distribution of the intrinsic absorption N H,z as a function of M * (left) and SFR (right), for the 1011 Type 2 objects in CCLS30. Sources spectroscopically classified as Type 2 AGN are plotted as red circles, while sources with only SED template best-fitting classification are plotted with black squares. Upper limits on N H,z are plotted with triangles. We studied the existence of a correlation between N H,z and M * or SFR computing the Spearman correlation coefficient using the ASURV tool, Rev 1.2 (Isobe & Feigelson 1990;Lavalley et al. 1992), which implements the methods presented in Isobe et al. (1986), to properly take into account the 565 sources having only a 90% confidence upper limit on N H,z . We report the results of the fit in Table 10: we find a significant correlation between N H,z and both M * and SFR, i.e., the objects with higher N H,z values are also those with higher M * and SFR. We obtain the same result fitting separately only the 533 sources with spectral type and the 478 with SED template best-fitting type. We point out that SFR and M * are correlated, i.e., more massive galaxies have higher SFR, therefore the correlation with N H,z can be intrinsic only for one of the two parameters, more likely SFR.

Finally, we performed a partial correlation analysis to understand how much of the observed correlation between N H,z and M * (or SFR) is driven by a redshift selection effect, i.e., if the observed correlation is due to the fact that at high redshifts we observe only sources with significant N H,z values (as shown in Figure 10, left panel), and with high values of M * and SFR. To do so, we compute the partial Spearman correlation coefficient between N H,z and M * (or SFR), conditioned by the distance z, using the equation (Conover 1980):

$$\rho(a,b,\dot{c}) = \frac{\rho_{ab} - \rho_{ac}\,\rho_{bc}}{\sqrt{(1-\rho_{ac}^{2})(1-\rho_{bc}^{2})}}$$

where a is N H,z , b is M * (or SFR), c is the redshift, and ρ ab , ρ ac and ρ bc are the pairwise Spearman coefficients. The partial correlation coefficients we obtain are ρ(a, b,ċ) M * =0.06 and ρ(a, b,ċ) SFR =0.11.
Following Equation 6 of Macklin (1982), these values correspond to confidence levels of σ M * =1.58 and σ SFR =2.68. Therefore, the observed relation between M * and N H,z seems to be mainly driven by a redshift selection effect, while the relation between SFR and N H,z remains significant, although at <3σ, even taking into account the redshift contribution.
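The partial correlation step can be sketched with the standard first-order partial correlation formula, which takes the three pairwise coefficients as input. The function and argument names are illustrative, and the input values in the comments are hypothetical, not the coefficients measured in this work; the conversion to a confidence level through Equation 6 of Macklin (1982) is not reproduced here.

```python
import math


def partial_correlation(rho_ab, rho_ac, rho_bc):
    """Correlation between a and b once the dependence of both on c is removed.

    In the context above, a would be the column density, b the host mass
    or SFR, and c the redshift (an assumed mapping for this example).
    """
    denom = math.sqrt((1.0 - rho_ac ** 2) * (1.0 - rho_bc ** 2))
    return (rho_ab - rho_ac * rho_bc) / denom


# Hypothetical pairwise coefficients: a and b are correlated with each other
# (0.5), and both are correlated with c at the same level (0.5).
rho_partial_demo = partial_correlation(0.5, 0.5, 0.5)
```

When both quantities correlate with the third variable, part of their mutual correlation is removed, so `rho_partial_demo` is smaller than the input 0.5.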
Sample   ρ (M * )   p-value (M * )   ρ (SFR)   p-value (SFR)
Spec     0.17       3 × 10 −4        0.29      0
SED      0.13       3 × 10 −3        0.19      0
All      0.14       0                0.24      0

Table 10. Spearman correlation coefficient ρ and p-value for the intrinsic absorption N H,z in relation to M * or SFR, for sources with spectroscopic classification, sources with only best-fit SED template and for the whole CCLS70 Type 2 sample.

HIGH-REDSHIFT SAMPLE
In this section, we summarize the results of the spectral fitting of the 20 Chandra COSMOS-Legacy sources at z ≥3 in CCLS70. 15 of these sources have z spec , the remaining 5 have z phot . An extended analysis of the Chandra COSMOS-Legacy z ≥3 sample, which contains 174 sources, is reported in Marchesi et al. (2016b).
We first fitted our spectra with the best-fit model obtained with the procedure described in Section 4.1. For 19 out of 20 sources the best-fit is an absorbed power-law model, while for cid 83 a Fe Kα emission line is also required. For each source, we estimate the spectral slope Γ and the intrinsic absorption N H,z . We report the results of this fit in Table 11. The mean (median) photon index is Γ=1.50±0.08 (1.49), with dispersion σ=0.35.
15 out of the 20 sources only have an upper limit on N H,z , while the other 5 have a significant N H,z value at a 90% confidence level.
We then repeated the fit, this time using the pexrav model, which takes into account the presence of a reflection component caused by cold material close to the black hole accretion disk. We do so because at z >3 the reflection component, which contributes to the spectral emission at energies greater than 30 keV in the rest frame, is observed in the 2-10 keV band. Therefore, a lack of the reflection component in the fit produces an artificial flattening in the photon index estimate. We fix the reflection parameter to 1: since R=Ω/2π, where Ω is the solid angle of the cold material visible from the hot corona, R=1 is the case where the reflection is caused by an infinite slab illuminated by the isotropic corona emission. The results of this second fit are reported in Table 11: the presence of a reflection component implies a general steepening of the spectral slope, which now has mean (median) value Γ=1.65±0.07 (1.65), with dispersion σ=0.32.
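The reason why the reflection component matters at z >3 follows from the redshifting of photon energies, E obs = E rest /(1+z). A short worked example, with an illustrative function name:

```python
def observed_energy(rest_energy_kev, redshift):
    """Observed energy of a rest-frame feature: E_obs = E_rest / (1 + z)."""
    return rest_energy_kev / (1.0 + redshift)


# A rest-frame energy of 30 keV seen from z = 3 and z = 4.
e_obs_z3 = observed_energy(30.0, 3.0)  # 7.5 keV
e_obs_z4 = observed_energy(30.0, 4.0)  # 6.0 keV
```

At these redshifts rest-frame energies around 30 keV are therefore observed at a few keV, inside the 2-10 keV band.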
12 out of the 20 sources have only an upper limit on N H,z fitting the spectra with the pexrav model. Two sources, lid 1577 and cid 507, have N H,z >10 23 cm −2 even if the 90% confidence error is taken into account.
In Figure 14 we show the evolution with redshift of the X-ray spectral slope Γ, both without (red circles) and with (blue squares) the contribution of a reflection component: the photon index distribution has a large spread and no clear trend with redshift is observed.
Stacking of low statistics spectra
To complete our analysis of the z ≥3 sample, we perform a fit stacking the spectra of the 154 sources with less than 70 net counts in the 0.5-7 keV band.
Each spectrum was corrected for background, detector response and Galactic absorption in the same way as done in Iwasawa et al. (2012), and rebinned into 1 keV intervals in the 3-23 keV band in the galaxy rest frame. The spectral stacking is then a straight sum of these individual spectra. As for the sources in CCLS70, we first fit the data with an absorbed power-law and then with a pexrav model with reflection parameter R=1. We report the results of the fits in Table 12, for the whole sample and for different subsamples based on the optical classification of the sources. The stacked spectrum obtained combining all the 154 sources contains 3583 net counts in the 3-23 keV rest frame band, has Γ=1.44 +0.16 −0.14 and N H,z =5.22 +4.41 −3.23 ×10 22 cm −2 while fitting the data with an absorbed power-law, and Γ=1.63 +0.16 −0.13 and N H,z =7.02 +4.57 −3.32 ×10 22 cm −2 while fitting with a pexrav model. The indication of a flattening in the spectral slope, with respect to a typical value Γ=1.9, is confirmed even if we stack separately the spectra on the basis of their optical classification, taking into account both the classification in Type 1 and Type 2 AGN and the presence or the absence of a spectroscopic redshift.
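The rest-frame rebinning and straight sum described above can be sketched as follows. This is a simplified illustration only: it assumes each spectrum is a list of (observed energy, counts) pairs already corrected for background, detector response and Galactic absorption, and it does not reproduce the correction procedure of Iwasawa et al. (2012). The data layout, function name and demo values are hypothetical.

```python
def stack_rest_frame(spectra, e_min=3.0, e_max=23.0, bin_width=1.0):
    """Sum per-source counts into rest-frame bins of `bin_width` keV.

    `spectra` is a list of (redshift, [(observed_energy_kev, counts), ...]);
    this layout is an assumption made for the example.
    """
    n_bins = int(round((e_max - e_min) / bin_width))
    stacked = [0.0] * n_bins
    for redshift, channels in spectra:
        for e_obs, counts in channels:
            e_rest = e_obs * (1.0 + redshift)  # shift to the source rest frame
            if e_min <= e_rest < e_max:
                stacked[int((e_rest - e_min) // bin_width)] += counts
    return stacked


# Two hypothetical sources at z = 3 and z = 4.
demo_spectra = [
    (3.0, [(1.0, 5.0), (2.0, 3.0)]),   # rest-frame 4 keV and 8 keV
    (4.0, [(0.9, 2.0), (6.0, 1.0)]),   # rest-frame 4.5 keV and 30 keV
]
stacked_demo = stack_rest_frame(demo_spectra)
```

Counts whose rest-frame energy falls outside the 3-23 keV band, such as the 30 keV entry above, are discarded rather than added to an edge bin.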
To study how much the low counts population affects the stacking results, we re-fitted the data removing from the stacking those sources with less than 10 counts in the 3-23 keV rest-frame band: the difference with respect to the fit to the whole sample is completely negligible.
Finally, we repeat the fit after removing from the stacking also those sources having more than 45 counts in the 3-23 keV rest-frame band, to estimate how much the brightest sources affect the fitting. We find a good agreement, at a 90% confidence level, between the spectral parameters Γ and N H,z computed with and without the faintest and brightest sources in the sample.

CONCLUSIONS
We analyzed the X-ray spectra of the 1855 Chandra COSMOS-Legacy extragalactic sources with more than 30 net counts in the 0.5-7 keV band (CCLS30). 1273 out of 1855 sources (∼69%) have a spectroscopic redshift, while the remaining 582 have a photometric redshift.
90% of the sources are well fitted with a basic power-law model, while the remaining 10% show a statistically significant improvement when further components, such as an iron Kα line at 6.4 keV and/or a second power-law, are added to the basic model. The source spectra have been fitted together with the background spectra, which we reproduced with a complex multi-component model, described in Appendix A.
We now summarize the main results we obtained.
1. 37.7% of the CCLS30 sources are classified as Type 1 AGN and 60.3% are classified as Type 2 AGN, on the basis of either their spectroscopic or their SED template best-fitting classification. Finally, 2.0% of the sources are classified as low-redshift, passive galaxies.
2. The majority of sources in CCLS30 (67.2%) have only a 90% confidence upper limit on N H,z (see Figure 6, left panel). Type 2 AGN are also significantly more obscured than Type 1 AGN. 41.4% of Type 2 AGN are obscured (i.e., with N H,z >10 22 [...]

Table 11. Summary of the X-ray spectral fitting with an absorbed power-law for the 20 Chandra COSMOS-Legacy sources with z ≥3 and more than 70 net counts in the 0.5-7 keV band. "spec" indicates if the redshift is spectroscopic (Y) or photometric (N).

Figure 14. Left : Evolution with redshift of the photon index Γ, without (red circles) or with (blue squares) the contribution of a reflection component, for the 20 Chandra COSMOS-Legacy sources with z≥3 and more than 70 net counts in the 0.5-7 keV band. Right : same as in the left panel, but binned in four different bins of redshift. Note that the y-axes ranges are different in the two panels.

Table 12. Spectral parameters obtained fitting the stacked spectra of AGN at z ≥3 and with less than 70 net counts in the 0.5-7 keV band. N H,z is in units of 10 22 cm −2 . "spec" and "phot" indicate sources with spectroscopic redshift available and with only photometric redshift, respectively. "10-45 cts" indicates that we stacked only sources having between 10 and 45 net counts in the 3-23 keV rest-frame band. The source net counts are computed in the 3-23 keV rest-frame band.

[...] The majority (∼71%) of Type 2 sources have instead LX below the quasar threshold and are mainly Seyfert galaxies. However, this is not an intrinsic difference, but is caused by having a flux-limited sample, i.e., sources at higher redshifts also have higher luminosities, and by the fact that Type 1 AGN are on average at higher redshifts ( z =1.74 for CCLS30) than Type 2 AGN ( z =1.23). Nonetheless, the difference between the two distributions may also suggest a trend with 2-10 keV luminosity of the fraction of Type 1 to Type 2 AGN, Type 2 AGN being more numerous at lower luminosities.
6. A significant fraction of optical Type 2 sources lie in the L 2−10keV =[10 42 -10 44 ] erg s −1 , N H,z <10 22 cm −2 area, i.e., these sources have unobscured AGN X-ray properties. In CCLS30, 172 Type 2 AGN lie in the unobscured AGN area (15.5% of the whole Type 2 AGN population), while the fraction slightly increases (24.0%) if we take into account only objects with a spectral type. The fraction of unobscured Type 2 AGN strongly decreases with increasing 2-10 keV luminosity (Figure 11) and can be explained by a misclassification of low-luminosity BLAGN and/or by the lack of broad emission lines in intrinsically unobscured, low accretion AGN.
7. For the 141 CCLS30 sources best-fitted with a model which includes an iron Kα emission line at 6.4 keV, we computed the emission line equivalent width, EW. The mean (median) equivalent width of the 141 CCLS30 sources is EW=0.49±0.38 (0.39) keV. Fitting the EW distribution as a function of Γ with a linear model (EW= aΓ+b), we find a significant anti-correlation, with a=-0.22±0.05 ( Figure 12, left panel). However the correlation between EW and Γ disappears if we exclude from the sample the 12 candidate CT AGN we obtained on the basis of more complex fitting models, which include proper characterization of the dusty torus surrounding the SMBH. This implies that basic models may fail in fitting heavily obscured sources, fitting the flat spectra of these objects with flat, unphysical photon indexes and no N H,z .
8. We searched for a correlation between Γ and N H,z and the host galaxy mass (M * ) and star formation rate (SFR) for Type 2 AGN. M * and SFR have been computed using SED fitting techniques (Suh et al. in preparation). We do not find any significant correlation between Γ and either M * or SFR. We instead find a significant correlation between N H,z and both M * and SFR: objects with higher N H,z values also have higher M * and SFR (Figure 13). However, a partial correlation test showed that these relations are partially driven by a redshift selection effect.
9. We studied the properties of the 20 CCLS70 sources at z≥3. The mean (median) slope of these 20 sources is Γ=1.50±0.08 (1.49). 15 out of the 20 sources have only an upper limit on N H,z . Two sources have N H,z > 10 23 cm −2 . We also repeated the fitting using the pexrav model, to take into account the potential presence of a reflection component close to the black hole accretion disk. The reflection component affects the spectrum at energies greater than 30 keV in the rest-frame, i.e., inside the observed 0.5-7 keV energy range we use in our fitting. The presence of a reflection component implies a general steepening of Γ, with Γ =1.65±0.07. The evidence for a photon-index flatter than the standard value in our z ≥3 sample is confirmed by a fit of the stacked spectra of the 154 sources not in CCLS70, which have Γ=1.44 +0.16 −0.14 , while fitting the data with a power-law and Γ=1.63 +0.16 −0.13 adding a reflection component to the model.

This work is the first step towards building large samples of sources at high redshift (z>0.5) and medium-low luminosities (L 2−10keV =[10 42 -10 43 ] erg s −1 ), where X-ray spectral properties can be constrained. The next generation of X-ray missions, such as Athena+ (Nandra et al. 2013) or X-ray Surveyor (Vikhlinin et al. 2012), will provide even larger samples of such sources, significantly reducing selection biases and allowing statistical studies for column densities and SFR, searching for correlation between the central engine and the galaxy properties. Moreover, the improved statistics will allow the detection of iron emission lines in the high-z universe (Georgakakis et al. 2013) and even the measurement of redshifts from X-ray features (Comastri et al. 2004).

[...] are plotted as blue downwards triangles, sources with upper limit in Chandra but not in XMM-Newton are plotted as green leftwards triangles. Finally, upper limits on N H,z in both samples are shown as red stars.
We also plot the 1:1 relation as a black solid line and the standard threshold adopted to divide unobscured from obscured sources (N H,z =10 22 cm −2 ) as black dashed lines. 304 sources out of 1010 (30%) have a N H,z value significant at 1σ in both Chandra and XMM-Newton. For these sources, there is a general good agreement in the N H,z estimates, at any range of values. 417 sources (41%) have instead an upper limit on N H,z in both Chandra and XMM-Newton.

Figure 16. Left : photon index Γ obtained fitting the XMM-Newton spectra as a function of Γ computed for CCLS70, for the 287 sources with Γ left free to vary in the spectral fitting. The 1:1 relation is plotted with a red solid line; the y = x× 1.2 (y = x× 1.4) relation is plotted with a red dashed (dotted) line. Right : intrinsic absorption N H,z obtained fitting the XMM-Newton spectra as a function of N H,z computed for CCLS30, for the 1010 sources in CCLS30 with XMM-COSMOS counterparts. Sources with N H,z value significant at 1σ are plotted as magenta circles, sources with upper limit in XMM-Newton and significant detection in Chandra are plotted as blue downwards triangles, sources with upper limit in Chandra and significant detection in XMM-Newton are plotted as green leftwards triangles, and sources with upper limits on N H,z in both samples are shown as red stars. The 1:1 relation is plotted as a black solid line, and the standard threshold adopted to divide unobscured from obscured sources (N H,z =10 22 cm −2 ) as black dashed lines.
The fraction of sources with a significant N H,z value in XMM-Newton and an upper limit in Chandra (205, 20%) is higher than the fraction of sources with significant value in Chandra and upper limit in XMM-Newton (88, 8%). Moreover, 108 sources are classified as obscured in XMM-Newton while having only an upper limit on N H,z in Chandra, and 59 sources are obscured in Chandra and have an upper limit in XMM-Newton. In the whole sample, 351 sources (35%) are obscured according to the XMM-Newton spectral information, in reasonable agreement with the Chandra results, where 329 sources (33%) have N H,z >10 22 cm −2 .
In conclusion, there is generally good agreement between the N H,z measurements obtained with Chandra and XMM-Newton. While 15-20% of the sources have significantly different N H,z values from the two different telescopes, this discrepancy can be explained by instrumental effects, as observed in the photon index measurements, or by a lack of statistics.