A spatial panel wage curve for Spain

Most empirical studies of the Spanish wage curve have ignored possible spatial interaction effects between regions. This paper reconsiders the Spanish wage curve using more recent data than previous studies and taking into account the role of regional spillovers. From a methodological perspective, we apply the two-step procedure proposed by Bell et al. (2002) to estimate a dynamic wage curve with spatial spillovers. In the first stage, we use microdata from the Spanish Social Security records (Muestra Continua de Vidas Laborales) to obtain composition-corrected wages, which are used in the second stage to estimate a wage curve over the period 2000–2010, allowing for spatial effects of unemployment across regions. Contrary to previous studies, we find that the wage equation is highly autoregressive and that regional spillovers are relevant in explaining the relationship between unemployment and wages in the Spanish provinces.


Introduction and objectives
During the first half of the nineties, Blanchflower and Oswald (1990, 1994a, b, 1995) developed an ambitious research program on the relationship between individual wages and local unemployment rates, using several individual-level databases for a wide set of countries. The main result of these studies was the finding of an empirical regularity, which was called "the wage curve". This curve establishes an inverse relationship between individual wages and the local unemployment rate. More precisely, a worker living in an area of high unemployment earns a lower wage than a worker with identical characteristics living in an area with less unemployment. A more surprising result is that the elasticity of wages to unemployment is very close to −0.10, a value that is quite stable across countries (with diverse institutional frameworks), across a range of time periods, and across distinct samples or databases. Blanchflower and Oswald (2005) updated the analysis for the US labour market and found similar results. This finding contradicts macroeconomic studies which have concluded that one of the causes of European unemployment during the eighties and nineties was its relatively low wage flexibility when compared to the United States, Japan or Canada (Layard et al. 1991). In fact, this result seems to be quite robust, as shown by Nijkamp and Poot (2005). These authors applied meta-analytic techniques to a sample of 208 elasticities for 30 countries derived from 17 different studies published between 1990 and 2001, and they found that an unbiased mean estimate of the wage curve elasticity (i.e., one accounting for study heterogeneity such as the use of grouped versus microdata or disaggregated gender analysis) is about −0.07. Babecky et al. (2008) extended the research by Nijkamp and Poot (2005) by compiling a larger number of studies (64) and estimates (more than 1,400) and, more interestingly, by considering a larger number of countries (41) and more recent time periods, including studies published from 2001 to 2007. The analysis of more recent work was particularly relevant because wage curve estimates had become available for a wider set of emerging and transition economies (with clearly different institutional frameworks), and because it allowed the potential effect of the labour market reforms carried out in the 1990s to be considered. They conclude that there is evidence of significant differences among countries, time periods and groups of workers that cast doubt on the generalised validity of the "wage curve empirical law".
In recent years, research on this topic has advanced along two parallel lines: consolidating the theoretical basis of the wage curve and considering different approaches to its specification.
From a theoretical perspective, the negative relationship between wages and regional unemployment contradicts, at least at first sight, the theory of compensating wage differentials, as proposed initially by Adam Smith and more formally expressed at the territorial level by Harris and Todaro (1970) and by Hall (1972). According to the latter, there is a positive relationship between the two variables: the wage rate is higher in areas of high unemployment in order to equalize annual earnings, or wages plus unemployment insurance, across regions. The theoretical foundations of the wage curve are therefore related to non-competitive labour market models.
In particular, earlier works have offered a wide variety of explanatory models, including implicit contracts, union bargaining, efficiency wages, and labour turnover costs (Blanchflower and Oswald 1994b; Campbell and Orszag 1998; Montuenga and Ramos 2005; Campbell 2008). However, there is currently a wide consensus that the most plausible explanations are related to efficiency wages and/or labour turnover costs (Blien et al. 2013).
From a methodological perspective, recent studies increasingly use dynamic panel data models, with different methods to control for composition effects, and spatial econometric techniques to account for inter-regional spillovers. As Longhi et al. (2006) highlight, much of the existing wage curve literature has considered local labour markets as non-spatial "replications" of labour market outcomes within the national economy. However, if the wage curve is related to worker heterogeneity and monopsonistic competition, employment opportunities in surrounding regions and the interregional cost of commuting or migration also matter in explaining the relationship between unemployment and wages. Longhi et al. (2006) found that spatial effects matter in the wage curve for Western Germany, a result already found in the early work of Buettner (1999) and since confirmed by Elhorst et al. (2007) for East Germany and by Falk and Leoni (2011) for Austria.
From this perspective, the analysis of the relationship between wages and local unemployment in Spain is especially relevant because of the institutional characteristics of its labour market. Although recent reforms have tried to improve the functioning of the Spanish labour market, during the last decades it has been characterised by an intermediate collective bargaining system with high coverage, high firing costs, a quite generous benefit system, and very high employment volatility, with very high unemployment rates during economic crises. Consequently, a low elasticity of wages to unemployment, or even the absence of a wage curve, is to be expected. However, fixed-term contracts account for a very large share of private sector employees, implying high turnover and a certain monopsonistic power of firms to set wages according to the evolution of the unemployment rate. An additional reason to analyse the relationship between wages and local unemployment in Spain is the limitations of previous studies on this topic in the light of this recent literature.
The results of earlier studies of the Spanish wage curve did not deviate significantly from the previous international literature (see, for instance, Canziani 1997; García-Mainar and Montuenga-Gómez 2003; Montuenga et al. 2003; or Sanromá and Ramos 2005). Recent studies have, however, started to use more sophisticated econometric methods and to consider spatial aspects of the wage curve. In particular, García-Mainar and Montuenga-Gómez (2012) estimated a dynamic wage curve for Spain using panel data from the eight waves of the European Community Household Panel (ECHP) for the period 1994–2001. Their estimates suggest that, contrary to most earlier empirical research for other countries, a static wage curve model fits the Spanish case well, since the autoregressive parameter is not significantly different from 0, with an elasticity of wages to unemployment of −0.07. This can be interpreted as wages being only weakly sensitive to changes in unemployment but adjusting very rapidly to such changes, at least during the period under consideration. However, as the authors recognise, this result must be treated with caution, given that the period analysed, first, may be too short to capture the dynamic behaviour of wages and, second, coincides with an expansionary phase of the Spanish economy, during which unemployment fell sharply, employment increased strongly, and real wage growth was contained. Bande et al. (2012) estimated regional wage equations for the Spanish regions using data from the Structure of Earnings Survey for 1995, 2002 and 2006 and considering the spatial heterogeneity of the curve. Although they work with only three groups of regions, their results show clear differences, with regions with higher unemployment rates exhibiting lower wage flexibility.
This evidence is consistent with the model of collective bargaining in Spain because it reflects the important imitation effects that generate a weak sensitivity of individual wages to local labour market conditions. Taking this previous research as a starting point, the objective of the paper is twofold: first, to test the validity of the wage curve using more recent data for Spanish regions, covering a long expansion from 2000 to 2007 and the first years of the economic crisis up to 2010, which were characterised by a fast and strong increase in unemployment rates; and, second, to analyse the robustness of the results to a wage curve specification based on a dynamic panel with spatial spillovers. In particular, our study focuses on Spanish NUTS III provinces, using microdata from the Muestra Continua de Vidas Laborales (MCVL), and applies recently developed spatial econometric panel data techniques to estimate a dynamic wage curve including interregional spillovers. The main advantage of panel data over cross-sectional data is that they make it possible to control for unobservable heterogeneity through the inclusion of individual, regional and time fixed effects. Moreover, although space has always played an important role in this literature, regional linkages were not taken into account until some recent studies (Longhi et al. 2006; Elhorst et al. 2007), and no previous evidence exists for Spain. If spatial dependence is present (as is to be expected in regional data), it should be removed from the data, because the violation of the independence assumption may lead to misleading conclusions. Spatial econometric techniques make it possible to overcome this problem.
The rest of the paper is structured as follows: first, we describe the data sources used in the analysis; second, we describe the methodology applied; next, we focus on the empirical evidence obtained. The paper ends with some final remarks and ideas for further research.

Data
To design our study, we use the seven waves of the Continuous Sample of Working Histories (Muestra Continua de Vidas Laborales; hereafter MCVL) from 2004 until 2010. The data come from the registry of the Social Security System (SSS) for people active in the labour market and provide information from the computerized records of the Spanish Social Security and the Continuous Municipal Register. Since 2004, this database has provided annual information on more than one million people who have had some kind of work relationship with the social security in each year, regardless of the duration or nature of the relationship. It is an administrative data set with longitudinal information for a 4 % non-stratified random sample of the population affiliated with Spain's SSS. The MCVL is only representative of the population related to the SSS in the year of extraction and is not representative of the past, although it contains information on previous social security contributions by the individuals selected. In this way, the MCVL reproduces the labour history of affiliates starting from their first job. The MCVL is an appropriate database to study the labour market in Spain and, in fact, has several advantages over other data sources such as the wage structure survey (WSS): it provides greater territorial detail (the MCVL reports residential data at the municipality level) and it has a longitudinal structure (for more details about the MCVL, see García-Perez 2008). The data set provides information on all of the historical relationships of any individual with the SSS (in terms of work and unemployment benefits). We also have information on the type of contract, sector of activity, qualification and earnings, the dates of entering or leaving the job market, part-time or full-time status, and firm size.
Moreover, the database contains information on gender, nationality, country of birth, residence, date of birth and level of education. In addition, the MCVL gives details about the establishment (location, number of workers, industry and sector) in which a worker is hired. The temporal dimension of the MCVL panel (2005–2010) allows us to track individuals' entry into the labour market and avoids the attrition problem that arises when entries into and exits from the job market are studied using only one wave and looking back at the worker's past relationship with the SSS. To build our sample, we merged the 2010 wave with the waves from 2009 back to 2005 and, since we know workers' full work histories, we go back to 2000. Considering years before 2000 would increase the risk of attrition, because the database is representative only of the years in which the extraction is done. We have dropped people who do not report a labour relationship with the SSS. The main outcome variable of interest in our dataset is the wage. The MCVL has information on the monthly earnings on which mandatory contributions to the SSS pension schemes are paid and on the days worked. Based on this information, we calculate the logarithm of the daily wage, where the daily wage is the ratio of annual earnings to days worked. We have eliminated observations for which daily earnings were below the minimum base or above the maximum base, to avoid the problem of censored data. As the provincial unemployment data from the labour force survey of the Instituto Nacional de Estadística (INE) required in our second-step estimation are only available at quarterly frequency, only four data points per year could be used. In particular, we have kept people who report a wage in February, May, August or November for more than 2 years in the considered period.
This allows us to take into account seasonality across the year. In addition, it permits the inclusion of individual fixed effects in the first stage of our modelling approach. If an individual reported more than one job, we take the one with the greatest earnings. Quarterly averages of the provincial Consumer Price Indexes used to deflate wages were obtained from INE. We have only considered individuals aged 15 to 65, and we have taken a random sample of 97,903 individuals with a total of 3,144,929 observations. Table 1 provides summary statistics for the variables used to estimate the wage equation. In particular, apart from daily wages, we consider age, schooling levels, tenure, occupation, public sector worker, part-time, permanent contract, firm size (proxied by the number of workers in the firm), activity sector and province of residence (NUTS III regions).
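The sample selection rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`WageRecord`), the function names and the numeric bounds in the example are hypothetical and do not correspond to the MCVL file structure or to the actual contribution bases.

```python
import math
from dataclasses import dataclass

# Months matched to the quarterly unemployment data (February, May, August, November).
REFERENCE_MONTHS = {2, 5, 8, 11}


@dataclass
class WageRecord:
    """One job observed in a given month (hypothetical layout, not the MCVL schema)."""
    person_id: int
    year: int
    month: int
    earnings: float   # earnings reported for the period
    days_worked: int  # days worked in the same period


def best_job_per_period(records):
    """For each person and month, keep the job with the greatest earnings."""
    best = {}
    for rec in records:
        key = (rec.person_id, rec.year, rec.month)
        if key not in best or rec.earnings > best[key].earnings:
            best[key] = rec
    return list(best.values())


def log_daily_wage(record, min_base, max_base):
    """Return ln(daily wage), or None when the observation is dropped."""
    if record.month not in REFERENCE_MONTHS or record.days_worked <= 0:
        return None
    daily = record.earnings / record.days_worked
    # Observations outside the contribution bounds are dropped to avoid censoring.
    if daily < min_base or daily > max_base:
        return None
    return math.log(daily)


records = [
    WageRecord(1, 2005, 2, 1500.0, 30),
    WageRecord(1, 2005, 2, 900.0, 30),   # second job with lower earnings
    WageRecord(1, 2005, 3, 1500.0, 30),  # March is not a reference month
]
kept = best_job_per_period(records)
# The bounds 20.0 and 100.0 are placeholder values, not the legal bases.
wages = [log_daily_wage(r, min_base=20.0, max_base=100.0) for r in kept]
```

Deflating by the provincial price index and requiring more than 2 years of observations per person would be further filters applied on top of this step.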

Methodology
The specification of the wage curve used by Blanchflower and Oswald (1990, 1994a, b, 1995) consisted of regressing the logarithm of individual wages on a number of control variables related to personal and job characteristics and on the regional unemployment rate. However, a difficulty arises because this equation includes an explanatory variable of interest (the regional unemployment rate) that is defined at a higher level of aggregation than the dependent variable (the individual wage). As Moulton (1986) shows, estimating this kind of equation biases upward the individual significance test statistics for this variable. Moreover, the inclusion of additional variables to correct for the possible omission of relevant variables at the regional level usually induces collinearity problems. For these reasons, more recent studies usually follow a two-step procedure as in Bell et al. (2002). The first step consists of a Mincer equation estimated at the individual level, including individual fixed effects and time-varying regional dummies. These dummies can be interpreted as average wages in the local labour market, corrected for composition effects. In particular, the logarithm of wages is regressed on a number of control variables related to personal and job characteristics together with the regional dummies:

$$\ln(w_{irt}) = Z_{irt}\,\phi + \gamma_t + \delta_{rt} + \tau_i + \varepsilon_{irt} \qquad (1)$$

where $\ln(w_{irt})$ is the natural logarithm of the wage of individual $i$ living in region $r$ at time $t$; $Z_{irt}$ is a set of individual factors that can affect wages, such as the level of schooling, experience or other characteristics such as occupation, with coefficient vector $\phi$; $\gamma_t$ is a time trend that controls for all shocks common to the considered regions; and $\delta_{rt}$ are region-specific effects that can be interpreted as average wages in region $r$ at time $t$ corrected for composition effects.
As each individual is observed for at least 2 periods, individual fixed effects ($\tau_i$) have also been included in Eq. (1) to control for time-invariant unobserved heterogeneity. This implies, however, that some usual controls in the Mincer equation, such as gender, cannot be included when estimating Eq. (1). Finally, $\varepsilon_{irt}$ is a random error term that follows a normal distribution with zero mean and constant variance. In the second step, the wage curve is estimated using the composition-corrected wages $\delta_{rt}$ obtained in the first step as the endogenous variable, and the natural logarithm of the regional unemployment rate, $\ln(u_{rt})$, is introduced as an explanatory variable together with regional and time fixed effects ($\alpha_r$ and $\lambda_t$):

$$\delta_{rt} = \alpha_r + \lambda_t + \beta \ln(u_{rt}) + \nu_{rt} \qquad (2)$$

where $\beta$ is the elasticity of wages to unemployment and $\nu_{rt}$ is an error term. As highlighted by Longhi et al. (2006) and Elhorst et al. (2007), most empirical analyses of the wage curve assume that regions are isolated economies. However, theoretical and empirical arguments suggest that regions, as well as not being homogeneous, are also not independent. From a theoretical point of view, labour market conditions in neighbouring regions could influence commuting and/or migration decisions that, in turn, could affect wage bargaining in the considered region (see, for instance, Longhi 2012). Moreover, from an empirical perspective and as highlighted by Elhorst (2003), if the influence of spatial linkages is ignored, results could be biased and conclusions could therefore be misleading. For this reason, Eq. (2) is augmented to allow the evolution of unemployment in neighbouring regions to affect wages in the considered region. Recent studies adopting a similar approach incorporate substantive spatial dependence, meaning that spatial effects propagate to neighbouring regions by means of endogenous as well as exogenous variables. Taking this into account, Eq. (2) is augmented with the spatial lag of the endogenous variable (the values of the endogenous variable observed in neighbouring regions) as well as that of the regional unemployment rate:

$$\delta_{rt} = \mu \sum_{j \neq r} m_{jr}\,\delta_{jt} + \beta \ln(u_{rt}) + \theta \sum_{j \neq r} m_{jr} \ln(u_{jt}) + \alpha_r + \lambda_t + \nu_{rt} \qquad (3)$$

where $\mu$ is the spatial autoregressive coefficient, $\theta$ is the parameter associated with regional spillovers in the unemployment rate, and $m_{jr}$ is an element of the spatial weights matrix $M$¹ that describes the spatial arrangement of the different regions. In our empirical analysis, we consider two different time-invariant spatial weight matrices: first, a binary contiguity matrix and, second, a matrix based on geographical distance, more precisely the inverse of the great-circle distance² between province capitals³, which is exogenous to the analysed relationship⁴. As usual in the literature, and in order to normalize the outside influence upon each region, in both cases the weight matrix has been standardized so that the elements of each row sum to one. Last, following Baltagi et al. (2009, 2012), Eq. (3) is enlarged with the lag of the composition-corrected wages to account for wage inertia:

$$\delta_{rt} = \rho\,\delta_{r,t-1} + \mu \sum_{j \neq r} m_{jr}\,\delta_{jt} + \beta \ln(u_{rt}) + \theta \sum_{j \neq r} m_{jr} \ln(u_{jt}) + \alpha_r + \lambda_t + \nu_{rt} \qquad (4)$$

where $\rho$ is the coefficient on lagged wages. The next section shows the results of estimating the models described so far.
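The construction of the two row-standardized weight matrices can be illustrated with a short Python sketch. It is not the procedure used in the paper, which relies on STATA's globdist command: here the great-circle distance is approximated with the haversine formula on a spherical Earth, the function names are hypothetical, and the coordinates in the example are approximate values for three cities chosen only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def row_standardize(matrix):
    """Scale each row to sum to one; rows without neighbours stay at zero."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([x / total if total > 0 else 0.0 for x in row])
    return result


def inverse_distance_weights(coords):
    """Row-standardized inverse-distance weights with a zero diagonal."""
    n = len(coords)
    raw = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for j in range(n):
            if r != j:
                raw[r][j] = 1.0 / great_circle_km(*coords[r], *coords[j])
    return row_standardize(raw)


def contiguity_weights(neighbours, n):
    """Row-standardized binary contiguity weights from a neighbour mapping."""
    raw = [[1.0 if j in neighbours.get(r, ()) else 0.0 for j in range(n)] for r in range(n)]
    return row_standardize(raw)


# Approximate (latitude, longitude) of Madrid, Barcelona and Valencia.
coords = [(40.4168, -3.7038), (41.3874, 2.1686), (39.4699, -0.3763)]
M_distance = inverse_distance_weights(coords)
M_contiguity = contiguity_weights({0: {1}, 1: {0, 2}, 2: {1}}, 3)
```

After row standardization, each spatial lag in Eqs. (3) and (4) is a weighted average of the neighbours' values, which is what makes the spatial coefficients comparable across regions with different numbers of neighbours.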

Results
In this section, we present the results of estimating the models discussed in the previous one. In particular, the results of estimating Eq. (1) by panel data fixed effects are shown in Table 2. As Table 2 shows, the variables related to educational levels and tenure are significant and show the expected signs, indicating a positive relationship between human capital and wages. Relative to the reference individual, workers with educational levels above the elementary level receive higher wages. The accumulation of professional experience also has a positive effect on wages. People who work only part-time receive significantly lower wages than those who work full-time. The dummy variables related to occupations capture the effect of job characteristics as well as the additional qualification required. We took low-qualified jobs as the base category. The estimation results show that, when controlling for other factors, workers in high-qualified jobs earn significantly more. The information on industry sectors makes it possible, on the one hand, to control for the effect of the different productive and employment structures of the provinces and, on the other, to provide more information about job characteristics not previously considered.

¹ Although the spatial weight matrix is usually denoted by W, here, in order to avoid confusion with wages, we follow the notation of Kelejian and Robinson (1997) and denote it by M.
² The great-circle distance is the shortest distance between any two points on the surface of a sphere; it has been computed using STATA's globdist command.
³ Latitude and longitude data for the capital cities of the Spanish provinces have been obtained from the Instituto Geográfico Nacional (http://www.ign.es/ign/es/IGN/BBDD_GRAVIMETRICO.jsp).
⁴ The results are robust to different specifications of the matrix, such as the inverse of squared distance. The detailed results are available from the authors on request.
In Eq. (2), we use the estimates of the time-varying region-specific effects from Eq. (1) as the endogenous variable, and the natural logarithm of regional unemployment is introduced as an explanatory variable together with time and regional fixed effects.
The results of estimating Eq. (2) are shown in Table 3. It is worth mentioning that, although our empirical specifications incorporate regional externalities on the basis of theoretical considerations, we apply the usual modelling approach. First, we estimate a basic specification without spatial lags of the endogenous or the exogenous variables, including time-period and regional fixed effects. Second, we compute a Hausman test to select between fixed and random effects and test the joint significance of the effects. Next, we compute the LM and robust LM statistics [proposed by Anselin et al. (2006) and adapted by Elhorst (2009) to the context of panel data] in order to test the null hypotheses of no spatial lag of the endogenous variable and no spatial error in the models. If both groups of tests fail to reject the null hypothesis, this implies that there are no geographical spillovers in the wage curve. However, if the null hypothesis of no spatial lag or of no spatial error is rejected, it is necessary to include a spatial lag of corrected wages or to consider a spatial error model, respectively.
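The decision rule at the end of this sequence can be written down as a small Python sketch. It only encodes the rule as stated in the text; the function name, the 5 % threshold and the choice to treat a term as needed when either the classic or the robust LM test rejects are assumptions, since the text does not spell out how the two versions of each test are combined.

```python
def choose_spatial_terms(p_lm_lag, p_robust_lag, p_lm_error, p_robust_error, alpha=0.05):
    """Return the spatial terms suggested by the LM and robust LM p-values.

    An empty list means that no test rejects, i.e. no geographical
    spillovers are detected and the non-spatial model is retained.
    """
    terms = []
    # Assumption: rejection by either version of the test is enough.
    if min(p_lm_lag, p_robust_lag) < alpha:
        terms.append("spatial lag")
    if min(p_lm_error, p_robust_error) < alpha:
        terms.append("spatial error")
    return terms


# Hypothetical p-values, not the statistics reported in Table 3.
no_spillovers = choose_spatial_terms(0.50, 0.60, 0.40, 0.70)
lag_only = choose_spatial_terms(0.001, 0.002, 0.40, 0.70)
```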
Column 1 of Table 3 shows the results of estimating the basic specification of the wage curve, including regional fixed effects, a time trend and seasonal dummies (Eq. 2). The model is estimated by GMM following the Arellano and Bond (1998) proposal for dynamic panel models. According to these estimates, we find an elasticity of wages to unemployment of −0.068, lower in absolute value than the value found by Blanchflower and Oswald (1994a, b) but close to the value found in the meta-analysis by Nijkamp and Poot (2005). A Hausman test for choosing between the random and the fixed effects specification clearly favours the latter, and the LR test clearly rejects the hypothesis that the regional fixed effects are jointly insignificant. Turning to the LM and robust LM tests using the spatial weight matrix based on contiguity, the LM test for no spatial error rejects the null at the 1 % significance level, while the test for no spatial lag rejects it at any significance level; the robust test is also more significant for no spatial lag. Taking into account the results of these tests, as there are problems of spatial dependence, the estimates are inconsistent (Anselin 1988).⁵ Column 2 of Table 3 shows the results obtained when including the spatial lag of the endogenous variable using the spatial weight matrix based on contiguity (Eq. 3) and applying the GMM method proposed by Han and Phillips (2010) for spatial dynamic panels. The results are not substantially different from the previous ones. The spatial lag of the endogenous variable is positive and statistically significant, with a value clearly below unity, and the unemployment rate enters the equation with a negative and significant coefficient. The value of the coefficient is lower than in column 1, but the total effect of unemployment, including direct and indirect effects, is still around −0.07.
As pointed out by Lesage and Pace (2008) and Corrado and Fingleton (2012), the presence in the model of spatial interdependencies and simultaneous feedback leads to a total effect that differs from the usual regression coefficient ($\beta$). In particular, the total effect of unemployment on composition-corrected wages is the sum of the direct effect (the impact in the considered region arising from changes in unemployment in the same region) and the indirect effect (the impact in the considered region due to changes in unemployment in the neighbouring regions), also taking into account the fact that it is a dynamic model. It is calculated in the following way:

$$\frac{\partial \delta_t}{\partial \ln(u_t)'} = \left[(1 - \rho L)\,I - \mu M\right]^{-1}\left(\beta I + \theta M\right)$$

where $\delta_t$ and $\ln(u_t)$ stack the regional observations at time $t$, $I$ is the identity matrix, $L$ is the time lag operator, and $\rho$ is the coefficient on lagged wages (zero in the static specifications). As the outcome of this expression varies according to the region considered, these effects are summarised by their mean. The inclusion of the spatial lag of the unemployment variable (column 3 of Table 3) shows negative and significant geographical spillovers. The overall elasticity of wages to unemployment increases in absolute value to −0.09. Columns 4 and 5 of Table 3 show the results of estimating similar models using the spatial weight matrix based on the inverse of distance. No significant change from the previous results is observed.
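The decomposition into direct and indirect effects can be illustrated numerically. The Python sketch below assumes that the long-run effects matrix takes the form $((1 - \rho) I - \mu M)^{-1}(\beta I + \theta M)$, with $\rho = 0$ giving the static case; the function names are hypothetical, the inverse is approximated by a fixed-point iteration rather than an exact solver, and the coefficient values in the example are illustrative, not the estimates in Table 3.

```python
def matmul(A, B):
    """Plain matrix product for small lists of lists."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)] for row in A]


def spatial_effects(M, beta, theta, mu, rho=0.0, iterations=500):
    """Mean direct, indirect and total effects for a row-standardized M.

    Iterates X <- (B + mu * M X) / (1 - rho) with B = beta*I + theta*M,
    whose fixed point is ((1 - rho) I - mu M)^(-1) B.
    """
    if abs(mu) >= 1.0 - rho:
        raise ValueError("the iteration requires |mu| < 1 - rho")
    n = len(M)
    B = [[(beta if i == j else 0.0) + theta * M[i][j] for j in range(n)] for i in range(n)]
    X = [row[:] for row in B]
    for _ in range(iterations):
        MX = matmul(M, X)
        X = [[(B[i][j] + mu * MX[i][j]) / (1.0 - rho) for j in range(n)] for i in range(n)]
    direct = sum(X[i][i] for i in range(n)) / n     # mean own-region effect
    total = sum(sum(row) for row in X) / n          # mean row sum
    return direct, total - direct, total


# Three regions, each with the other two as equally weighted neighbours.
M = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
direct, indirect, total = spatial_effects(M, beta=-0.05, theta=-0.02, mu=0.3)
```

With a row-standardized $M$, the mean total effect reduces to $(\beta + \theta)/(1 - \rho - \mu)$, which provides a quick check on the numerical result.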
Column 1 of Table 4 shows the results of estimating the dynamic specification of the wage curve (Eq. 4). According to the results in column 1, there is a high degree of inertia in regional wages, which is related to the peculiarities of the Spanish collective bargaining system, where sectoral agreements at the regional level predominate and interregional wage differences persist (Simón et al. 2006). The elasticity of wages to unemployment is negative and statistically significant, and its value is substantially lower in absolute terms (−0.021). However, the long-run response is higher than the one found in the static model: −0.138. The inclusion of the spatial lag of corrected wages in models 2-3 and 5-6 changes the picture obtained when using the static model. In particular, the spatial lag of unemployment is no longer statistically significant, while the spatial lag of corrected wages is the most relevant explanatory variable. However, this could be due to multicollinearity between the two spatially lagged variables (wages and unemployment). In fact, in models 4 and 7, neighbours' unemployment seems to affect regional wage dynamics.
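The link between the short-run and long-run elasticities can be made explicit with a small Python sketch. It assumes the usual partial-adjustment relation for a non-spatial dynamic model, in which the long-run response equals the short-run coefficient divided by one minus the coefficient on lagged wages; the function names are hypothetical, and only the two elasticities quoted in the text (−0.021 and −0.138) are taken from the paper.

```python
def long_run_elasticity(short_run, rho):
    """Long-run response beta / (1 - rho) in a first-order dynamic model."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return short_run / (1.0 - rho)


def implied_persistence(short_run, long_run):
    """Coefficient on lagged wages implied by a short-run and a long-run elasticity."""
    return 1.0 - short_run / long_run


# Elasticities quoted in the text for column 1 of Table 4.
rho_implied = implied_persistence(-0.021, -0.138)
check = long_run_elasticity(-0.021, rho_implied)
```

Under this assumption, the gap between the two elasticities is driven entirely by the degree of wage inertia: the closer the lagged-wage coefficient is to one, the larger the long-run response relative to the short-run one.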

Final remarks
The objective of this paper was twofold: first, to obtain recent estimates of the Spanish wage curve and, second, to test whether dynamic models augmented with a spatially weighted unemployment rate fit the Spanish data. Our results confirm the finding of a dynamic wage curve, i.e. a significant coefficient on lagged wages that is, however, far from unity, with a wage elasticity with respect to unemployment that is relatively small but significant (−0.02) in the short run and more than double that, close to −0.1, in the long run. It is remarkable that the spatial lags of the unemployment rate and wages are relevant in explaining regional wage determination.
Table 4 Dynamic wage curve with regional spillovers
In fact, as previously mentioned, the high relevance of neighbours' wages implies very low regional wage differentiation, which is related to the peculiarities of the Spanish collective bargaining system (Simón et al. 2006). Further research will explore whether spatial heterogeneity is also present in the relationship between unemployment and wages in Spain. According to Deller (2011), no wage curve is found in some Southern US counties, a result in line with the Harris-Todaro model, while a wage curve is found in the Midwest and in a band running from Montana south to Texas, along with most of the California-Nevada border region. For the Spanish case, Bande et al. (2012) also advance in this direction. To analyse whether there are spatial differences in the relationship between wages and unemployment, geographically weighted regression (GWR) could be applied, as in Deller (2011). The main advantage of GWR is that it allows for regional variation in the coefficients while also considering the influence of neighbouring regions (Fotheringham et al. 2002). In fact, this technique assumes that the parameters of neighbouring regions are similar.
So, in the estimation of the parameters for a particular region, only a subset of neighbouring regions is used. However, although there are some recent methodological proposals on how to apply geographically weighted regressions when using panel data, there is no clear consensus on how to implement them (Yu 2012;Bruna and Yu 2013).