Human Capital Spillovers, Productivity and Regional Convergence in Spain

This paper analyses the differential impact of human capital, in terms of different levels of schooling, on regional productivity and convergence. The potential existence of geographical spillovers of human capital is also considered by applying spatial panel data techniques. The empirical analysis of Spanish provinces between 1980 and 2007 confirms the positive impact of human capital on regional productivity and convergence, but reveals no evidence of any positive geographical spillovers of human capital. In fact, in some specifications the spatial lag presented by tertiary studies has a negative effect on the variables under consideration.


Introduction and objectives
Assessing regional convergence is an important issue both at the supranational (see, for example, Arbia et al., 2009, for an analysis of the EU regions) and national levels. Spain provides a good example at the level of an individual country. The meta-analysis conducted by López-Bazo et al. (2001) included 19 studies and the number of published reports examining regional convergence in Spain has grown since then. The presence of significant gaps between Spain's northern and southern regions and the process of political decentralization in the country over the last few decades have attracted the attention of scholars interested in analysing the evolution and sources of this regional convergence (de la Fuente, 2002).
In this paper, we analyse the impact of human capital accumulation as a factor in accounting for regional differences in productivity and regional convergence in Spain. Indeed, the importance of human capital for economic growth has been highlighted in a number of studies. Mankiw et al. (1992) considered human capital an additional production factor in their proposed development of the Solow model, while endogenous growth models (Lucas, 1988; Romer, 1989) directly relate human capital to the adoption of technology. The main conclusion to be drawn from this strand of the literature is that countries and regions with higher levels of human capital can expect higher growth rates than territories with lower levels. However, despite the theoretical predictions of these models, the empirical evidence is inconclusive, with some studies reporting non-significant or even negative effects of human capital on growth (de la Fuente, 2006).
A number of explanations have been put forward to account for this; however, the main criticism would appear to be that most studies rely primarily on human capital stock, typically proxied by the average number of years of schooling or the percentage of the population having successfully completed secondary or tertiary studies 1. Recent papers have also suggested that different levels of schooling can have different effects on growth. Specifically, Petrakis and Stamatakis (2002) report that primary and secondary education matter more for growth in less developed countries than they do in more developed economies, where higher education acquires greater importance. Similar results are found by Vandenbussche et al. (2006) and by Pereira and St. Aubyn (2009). The only study to our knowledge to have considered this issue at the regional level is Di Liberto (2008). Turning her focus to Italy's regions, she reports that primary education seems to be important in the south, while tertiary studies have a negative impact on the country's northern regions. These results suggest that Italy has been unable to harness the positive returns from higher levels of education, with economic growth being more closely linked to low-tech activities in which a highly skilled labour force does not play a significant role.
Yet, the impact of human capital is not confined to just one particular territory: human capital in one region can also have an influence on its neighbours. Various studies in the field of urban (Rauch, 1993; Rosenthal and Strange, 2008) and regional (Fingleton and López-Bazo, 2006; López-Bazo et al., 2004) [...] Olejnik (2008) reports that the level of human capital in neighbouring locations has a negative influence on the level of per-capita income in a given region. According to Olejnik (op. cit.), a possible explanation for this is that an increase in the level of human resources in one region is caused largely by the migration of the educated population between neighbouring regions, which tends to have a negative impact.
Drawing on this earlier research, the objectives of the paper are twofold: first, to test the influence of different levels of schooling on productivity and regional convergence and, second, to analyse whether the impact of human capital on neighbouring regions differs according to its composition. Our study focuses on Spain's NUTS III provinces for the period 1980-2007. The role played by human capital in promoting Spanish economic growth has been analysed in a number of studies, with mixed findings. Thus, for example, while Gorostiaga (1999) found the estimated coefficient of human capital to be negative and significant, Serrano (1999) and Galindo-Martín and Álvarez-Herranz (2004) found that human capital has a positive impact on the production function. Other regional analyses, such as those conducted by de la Fuente (2002) and Freire-Serén (2002), have used NUTS II data (autonomous communities), which is probably not the most appropriate regional dimension to test for the existence of geographical spillovers. This paper seeks to make an additional contribution by applying recently developed spatial econometric panel data techniques. The main advantage of using panel data as opposed to cross-sectional data is that it enables us to control for unobservable heterogeneity by including regional and time fixed effects. Moreover, although space (the geographical location of a unit of analysis in relation to others) has always played a key role in studies of this type, the existence of regional linkages has not been taken into account until recently (Fingleton, 2003; Abreu et al., 2005; Rey and Janikas, 2005). If spatial dependence is present (as is expected in regional data), it should be removed from the data, since any violation of the independence assumption could result in misleading conclusions. The use of spatial econometric techniques allows us to avoid this problem.
The rest of the paper is structured as follows. Section two describes the methodology used in the study. Section three details our data sources and the definitions adopted for our variables, and section four presents the empirical results. The paper concludes with a summary of our main findings.

Methodology
In order to analyse the contribution of human capital to the growth of regional productivity, we draw on the model developed by de la Fuente and Domenech (2002), which has been used extensively in the literature on regional growth and human capital (de la Fuente et al., 2003, for Spain; Ciccone, 2004, for Italy; and the Committee of the Regions, 2005, for France and Germany). The model is built around a regional production function and a technical progress relation that allows for the diffusion of technical know-how across regions. Specifically, we assume that the educational attainment of the population is one of the inputs in a constant-returns Cobb-Douglas aggregate production function. Based on the availability of statistical sources that permit the use of panel data, the log specification of the production function is as follows:
$$y_{it} = \mu_i + \lambda_t + \beta_k k_{it} + \beta_h h_{it} + \varepsilon_{it} \qquad (1)$$
where $y_{it}$ is the log of output per employed worker in province $i$ at time $t$, $k_{it}$ is the log of the stock of physical capital per employed worker and $h_{it}$ is the log of the average number of schooling years. $\lambda_t$ are time-specific effects that control for all common shocks to the regions under consideration 3, while $\mu_i$ are region-specific effects 4 that control for all unobservable region-specific time-invariant effects. The regional fixed effects may capture permanent differences in relative total factor productivity that will presumably reflect differences in R&D investment and other omitted variables. Finally, $\varepsilon_{it}$ is an independently and identically distributed error term for $i$ and $t$ with zero mean and variance $\sigma^2$. $\beta_k$ and $\beta_h$ are the parameters that summarise the factor contribution to regional productivity.
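The role of the regional and time-period effects in the production function above can be illustrated with a toy within (two-way fixed-effects) estimator. This is only a sketch under strong assumptions: the panel is balanced, there is a single regressor, the data are invented and noise-free, and plain demeaning plus OLS stands in for the Maximum Likelihood routines actually used in the paper. All names (`two_way_demean`, `within_slope`, `true_beta_k`) are hypothetical.

```python
# Toy two-way fixed-effects (within) estimator with one regressor.
# Every number here is invented for illustration; none comes from the paper's data.

def two_way_demean(panel):
    """Remove region and period means from a balanced panel.

    panel[i][t] is the value for region i in period t.
    """
    n, periods = len(panel), len(panel[0])
    grand = sum(sum(row) for row in panel) / (n * periods)
    region_mean = [sum(row) / periods for row in panel]
    period_mean = [sum(panel[i][t] for i in range(n)) / n for t in range(periods)]
    return [
        [panel[i][t] - region_mean[i] - period_mean[t] + grand
         for t in range(periods)]
        for i in range(n)
    ]


def within_slope(y, x):
    """OLS slope of demeaned y on demeaned x (no intercept is needed)."""
    yd, xd = two_way_demean(y), two_way_demean(x)
    num = sum(a * b for ry, rx in zip(yd, xd) for a, b in zip(ry, rx))
    den = sum(b * b for rx in xd for b in rx)
    return num / den


# Hypothetical region effects, period effects and log capital per worker.
region_effect = [0.3, -0.1, 0.8]
period_effect = [0.0, 0.05, 0.02, 0.10]
k = [[1.0, 1.2, 1.5, 1.4],
     [0.8, 0.9, 1.3, 1.1],
     [2.0, 2.1, 2.0, 2.6]]
true_beta_k = 0.4

# Noise-free log productivity generated from an equation of the same form as (1).
y = [[region_effect[i] + period_effect[t] + true_beta_k * k[i][t]
      for t in range(4)] for i in range(3)]

beta_k_hat = within_slope(y, k)
print(beta_k_hat)
```

Because the demeaning removes anything that is constant within a region or within a period, the slope is recovered without ever estimating the individual effects, which is why unobserved regional heterogeneity does not bias the coefficient in this setting.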
Given that our specific objective here is to analyse the impact of different levels of schooling, we decompose the level of human capital into three components that indicate the relative contribution of primary ($p_{it}$), secondary ($s_{it}$) and tertiary ($t_{it}$) studies to the log of the average number of years of schooling in a particular province at time $t$ 5. Equation (1) is then modified to take into account the potentially different effect of each of these components:
$$y_{it} = \mu_i + \lambda_t + \beta_k k_{it} + \beta_{pri} p_{it} + \beta_{sec} s_{it} + \beta_{ter} t_{it} + \varepsilon_{it} \qquad (2)$$
As Fingleton and López-Bazo (2006) stress, most empirical analyses that seek to estimate production functions at the regional level consider regions as isolated economies. However, theoretical and empirical studies suggest that regions are neither homogeneous nor independent. If we ignore the influence of location on growth, our results could well be biased and any conclusions, therefore, misleading. For this reason, we have chosen to expand equation (2) to include the interaction between regions. Recent studies adopting a similar approach, such as Arbia et al. (2009), incorporate substantive spatial dependence, which means that spatial effects are propagated to neighbouring regions by means of endogenous as well as exogenous variables. Such specifications are strictly linked to theoretical growth models that consider spatial externalities in the form of technology transfer or knowledge diffusion resulting from the accumulation of factors in the surrounding area (see, for example, López-Bazo et al., 2004).
3 We have chosen to control for time-period fixed effects even though we are aware that, as stressed by Elhorst (2009), applied researchers often find only weak evidence in support of spatial effects when time-period fixed effects are also being considered. The explanation seems to be that most variables tend to increase and decrease in parallel in the different regions over time (i.e., in the presence of a common business cycle).
4 The regional specific effects may be treated either as fixed effects or as random effects. In line with the literature, the decision as to whether to include fixed or random effects is taken here on the basis of the results of the Hausman test. Their joint significance is also tested using Likelihood Ratio (LR) tests.
5 The average number of years of schooling is a weighted average of the number of years of schooling associated with each educational level, with the proportion of workers at each level as weights. The contribution of each level to the average number of years of schooling is the result of multiplying the proportion of people with that level by the number of years required to actually obtain that level. For example, if the number of years associated with primary education is 5, secondary 10 and tertiary 15, and the relative proportions of workers are 10%, 60% and 30%, then the average number of years of schooling is 11 and the contributions made by each level are 0.5, 6 and 4.5 respectively.
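The worked example on the decomposition of average years of schooling given above can be reproduced in a few lines. The sketch uses only the figures quoted in that example; the variable names are hypothetical.

```python
# Years required to complete each level and share of workers at each level,
# exactly as quoted in the worked example in the text.
years_required = {"primary": 5, "secondary": 10, "tertiary": 15}
worker_share = {"primary": 0.10, "secondary": 0.60, "tertiary": 0.30}

# Contribution of each level = share of workers at that level * years required.
contribution = {
    level: worker_share[level] * years_required[level]
    for level in years_required
}

# The average number of years of schooling is the sum of the contributions.
average_years = sum(contribution.values())

for level, value in contribution.items():
    print(f"{level}: {value:.1f}")
print(f"average years of schooling: {average_years:.1f}")
```

The three contributions add up to the average by construction, so the decomposition splits the schooling indicator without changing its total.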
If we now take this into account, equation (2) can be augmented to include a spatial lag of the endogenous variable (i.e., the values of the endogenous variable observed in the neighbouring regions):
$$y_{it} = \mu_i + \lambda_t + \rho \sum_{j} w_{ij} y_{jt} + \beta_k k_{it} + \beta_{pri} p_{it} + \beta_{sec} s_{it} + \beta_{ter} t_{it} + \varepsilon_{it} \qquad (3)$$
where $\rho$ is the spatial autoregressive coefficient and $w_{ij}$ is the element of the spatial weights matrix $W$ that describes the spatial arrangement of regions $i$ and $j$. In our empirical analysis, we define the elements of $W$ on the basis of geographical distance, more specifically the inverse of the great-circle distance 6 between provincial capitals 7, which is exogenous to the relationship being analysed 8. Following the literature, and in order to normalize the outside influence upon each region, the weights matrix has been row-standardized so that the elements of each row sum to one.
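The row-standardized inverse-distance weights described above, and the spatial lag they generate, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the haversine formula on a spherical Earth of radius 6371 km replaces the STATA command used in the paper, the three coordinates are rough values for Madrid, Barcelona and Sevilla rather than the IGN data, and the productivity figures are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def great_circle_km(a, b):
    """Haversine great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def inverse_distance_weights(coords):
    """Inverse-distance weights with a zero diagonal, rows scaled to sum to one."""
    n = len(coords)
    raw = [[0.0 if i == j else 1.0 / great_circle_km(coords[i], coords[j])
            for j in range(n)] for i in range(n)]
    return [[w / sum(row) for w in row] for row in raw]


def spatial_lag(weights, values):
    """Weighted average of the neighbours' values for each region."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Approximate coordinates, for illustration only.
capitals = {
    "Madrid": (40.42, -3.70),
    "Barcelona": (41.39, 2.17),
    "Sevilla": (37.39, -5.98),
}
weights = inverse_distance_weights(list(capitals.values()))

# Hypothetical log productivity for the three provinces.
log_productivity = [3.9, 3.8, 3.5]
lagged = spatial_lag(weights, log_productivity)
print(weights)
print(lagged)
```

With row standardization each spatial lag is a weighted average of the other regions' values, so the spatial autoregressive coefficient is read against a quantity measured on the same scale as the variable itself.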
Equation (3) is also augmented with spatially lagged independent variables that enable us to identify the existence of geographical spillovers among the regions under consideration. Specifically, the model that includes physical capital and human capital spillovers is as follows:
$$y_{it} = \mu_i + \lambda_t + \rho \sum_{j} w_{ij} y_{jt} + \beta_k k_{it} + \beta_{pri} p_{it} + \beta_{sec} s_{it} + \beta_{ter} t_{it} + \gamma_k \sum_{j} w_{ij} k_{jt} + \gamma_{pri} \sum_{j} w_{ij} p_{jt} + \gamma_{sec} \sum_{j} w_{ij} s_{jt} + \gamma_{ter} \sum_{j} w_{ij} t_{jt} + \varepsilon_{it} \qquad (4)$$
where the $\gamma$ coefficients capture the spillovers from physical and human capital in the neighbouring regions.
6 The great-circle distance is the shortest distance between any two points on the surface of a sphere and was computed here using STATA's globdist command.
7 Latitude and longitude data for the capitals of the Spanish provinces were obtained from the Instituto Geográfico Nacional (http://www.ign.es/ign/es/IGN/BBDD_GRAVIMETRICO.jsp).
8 The results are robust to different specifications of the matrix, including the inverse of the squared distance and the binary contiguity matrix for the 47 continental provinces. Detailed results are available from the authors on request.
Equation (4) can also be transformed to derive convergence equations. Here, growth in a region over a given period is inversely related to its initial income as a result of the mechanism of convergence towards its steady state caused by decreasing returns to capital accumulation 9. Regional fixed effects and the additional variables in the specification (physical capital and human capital) control for factors determining differences in the steady states across regions. In particular, the convergence equation in the context of this model would be as follows:
$$y_{it} - y_{it-1} = \mu_i + \lambda_t + \beta_y y_{it-1} + \rho \sum_{j} w_{ij} (y_{jt} - y_{jt-1}) + \beta_k k_{it-1} + \beta_{pri} p_{it-1} + \beta_{sec} s_{it-1} + \beta_{ter} t_{it-1} + \gamma_k \sum_{j} w_{ij} k_{jt-1} + \gamma_{pri} \sum_{j} w_{ij} p_{jt-1} + \gamma_{sec} \sum_{j} w_{ij} s_{jt-1} + \gamma_{ter} \sum_{j} w_{ij} t_{jt-1} + \varepsilon_{it} \qquad (5)$$
where $\beta_y$ is the coefficient on the initial level of productivity.
As Temple (2001) highlights, this specification is preferred to the analysis of the relation between the change in output and the change in education, as in the latter case causality could run from output (or anticipated output) to education, and not vice versa. As long-run changes in average educational attainment are driven by government policy, it seems plausible that as output and tax revenues increase, governments will often allocate more resources to education, and attainment will rise for a transitional period. This critique does not, however, apply to the specification between output growth and the initial level of human capital as considered here. Moreover, the use of the number of years of schooling (rather than enrolment rates) and panel data means that it is less likely that reverse causation can account for a positive and significant effect of human capital on growth (de la Fuente and Domenech, 2006) 10.
In the fourth section of the paper, we estimate the production function in levels and the convergence equation in order to test both the direct impact of human capital and its impact via geographical spillovers on regional development. In both cases, we apply Maximum Likelihood (ML) procedures for estimating spatial panel data models, as implemented in the MATLAB routines by Elhorst (2009) 11. One advantage of ML procedures over the Instrumental Variables/Generalized Method of Moments (IV/GMM) procedures proposed by Kelejian et al. (2006) is that the latter usually have to include spatially lagged independent variables, a requirement that would not allow us to test the influence of spatial spillovers.
9 The derivation of the convergence equation from a production function framework with regional externalities can be found in López-Bazo et al. (2004), pp. 46-50.
10 An additional issue related to the estimation is the potential endogeneity of $y_{it-1}$ in equation (5). However, the literature analysing the role of geographical spillovers has systematically ignored this issue. A notable exception is Badinger et al. (2004).

Data sources, variable definition and preliminary analysis
As stated above, we analyse the influence of human capital on Spanish regional productivity and convergence during a period in which there was a marked accumulation of education and physical capital, combined with an opening up of the country's trade following integration within the European Union. [...] levels of education has made it possible to decompose the variable into three components, namely primary, secondary and tertiary education.
11 These routines are freely available at http://www.regroningen.nl/elhorst/software.html
12 http://www.ine.es/jaxi/menu.do?type=pcaxis&path=%2Ft35/p010&file=inebase&L=0
13 http://www.fbbva.es/TLFU/microsites/stock08/fbbva_stock08_index.html
14 http://www.fbbva.es/TLFU/microsites/stock08/mult/El_stock_de_capital_NM_2005.pdf
15 http://www.ivie.es/banco/caphumser07.php
16 Two different calculations for the average number of years of schooling are provided in the dataset in order to take into account the reforms introduced in 1990 by the Ley Orgánica General del Sistema Educativo. The data used in the paper are based on the educational levels prior to the reform (Ley General de Educación, 1970), as most workers in our sample were educated under this earlier system.
Figure 1 shows the evolution in the standard deviation of the log of regional GDP per worker (the usual tool to check for sigma-convergence) between 1980 and 2007. As we can see, regional disparities in labour productivity have decreased substantially in the study period: this dispersion measure has dropped from around 8% in 1980 to 3% in 2007, although there has been a gradual stagnation since the mid-nineties. A similar conclusion can be reached when annualized growth rates of GDP per worker between 1980 and 2007 are regressed on the initial levels (Figure 2).
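The sigma-convergence check described above amounts to tracking the cross-regional standard deviation of log GDP per worker over time. A minimal sketch follows, assuming a population standard deviation and using invented figures for three hypothetical regions; the function name `sigma_dispersion` and all values are illustrative and do not reproduce Figure 1.

```python
import math
import statistics


def sigma_dispersion(levels):
    """Cross-regional standard deviation of the log of a positive variable."""
    return statistics.pstdev([math.log(value) for value in levels])


# Hypothetical GDP per worker for three regions in two years.
gdp_per_worker = {
    1980: [20.0, 25.0, 30.0],
    2007: [38.0, 40.0, 42.0],
}

dispersion = {year: sigma_dispersion(values)
              for year, values in gdp_per_worker.items()}

for year, value in dispersion.items():
    print(year, round(value, 4))

# Sigma-convergence holds in this toy example if dispersion falls over time.
sigma_convergence = dispersion[2007] < dispersion[1980]
```

Working in logs makes the measure scale-free, so a common proportional rise in productivity across all regions leaves the dispersion unchanged.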
An analysis of the evolution in capital stock per worker and human capital indicators shows that both factors have positively influenced growth between 1980 and 2007 (Figures 3 and 4), but the reduction in regional differences has been much more intense in terms of schooling indicators.
FIGURES 1 to 4
Table 1 shows the results of estimating beta-convergence regressions using panel data for the 1981-2006 period. The first column of the table shows the results for productivity without any control (unconditional convergence) and when including regional and time-period fixed effects (conditional convergence). The results reinforce the idea of regional convergence, at a speed of 3.4% in the first case and 5.4% in the second 17, 18. The time required for the provinces to make up half the gap that separates them from a common steady state is 20.5 years, against 12.8 years when the reference is their own steady state. Note that while the speed of conditional convergence for capital stock stands at around 3.8%, the value for the average number of years of schooling is above 8%. The average number of years of schooling of employed workers rose from 6.5 in 1980 to more than 11 by 2007.
17 The speed of convergence, interpreted as the annual rate of convergence, is measured as $-\ln(1 + T \beta_y)/T$, where $T$ is the number of years making up the period under consideration and $\beta_y$ is the estimated coefficient on the initial level of the variable. Half-life, defined as the time required for the economies to make up half the gap separating them from the steady state, is calculated as $-\ln(2)/\ln(1 + \beta_y)$.
18 As highlighted by Islam (1995), the natural rate of convergence in a panel data setup is generally believed to be substantially higher than the usual 2%. In particular, and according to the meta-analysis reported by Abreu et al. (2005), panel data usually provide a speed of convergence of around 6 percent. One possible explanation is that this approach allows (unobserved) technological differences across countries to be controlled for. Higgins et al. (2006) also argue that when the focus is on smaller regions, the speed of convergence increases.
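The speed of convergence and half-life definitions given above can be evaluated directly. The sketch below assumes annual observations (T = 1) and backs out the coefficient implied by the reported 3.4% and 5.4% speeds rather than using the actual Table 1 estimates, which are not reproduced here; `beta_from_speed` is a hypothetical helper introduced only for that purpose.

```python
import math


def convergence_speed(beta_y, T=1):
    """Annual rate of convergence: -ln(1 + T * beta_y) / T."""
    return -math.log(1 + T * beta_y) / T


def half_life(beta_y):
    """Years needed to close half the gap to the steady state: -ln(2) / ln(1 + beta_y)."""
    return -math.log(2) / math.log(1 + beta_y)


def beta_from_speed(speed):
    """Hypothetical helper: coefficient implied by a given annual speed when T = 1."""
    return math.exp(-speed) - 1


for label, speed in (("unconditional", 0.034), ("conditional", 0.054)):
    beta_y = beta_from_speed(speed)
    print(label, round(convergence_speed(beta_y), 4), round(half_life(beta_y), 1))
```

With T = 1 the half-life reduces to ln(2) divided by the speed, which gives values close to the 20.5 and 12.8 years reported in the text; small differences are to be expected because the speeds are quoted to one decimal place.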
The preliminary analysis of data seems to confirm the results of de la Fuente (2002) regarding the relevance of physical and human capital accumulation as a source of convergence between the Spanish regions in the period under review. In the next section, we estimate the models discussed in section 2 in order to confirm this preliminary evidence.

Results
In this section, we present our results after estimating the models discussed in section 2 on the data for the 50 Spanish provinces between 1980 and 2007. Specifically, the results of estimating the production function in levels are shown in Table 2, while the results of estimating convergence equations are shown in Table 3.
Here, it is worth mentioning that, although our empirical specifications incorporate regional externalities in terms of their theoretical considerations, we adopt the usual modelling approach: we begin by estimating a basic specification, without any spatial lags of the endogenous or exogenous variables, and then we include, first, time-period and, second, regional fixed effects. Next, a Hausman test is used to choose between fixed and random effects, and the joint significance of the effects is tested. We also compute the LM and robust LM statistics (proposed by Anselin et al., 2006, and adapted by Elhorst, 2009, in the context of panel data) in order to test the null hypotheses of no spatial lag of the endogenous variable and no spatial error in the models. If neither group of tests rejects its null hypothesis, this would imply that there are no geographical spillovers in the production function and the convergence equation. However, if the null hypothesis of no spatial lag or of no spatial error is rejected, it would then be necessary to include a spatial lag of regional productivity or to consider a spatial error model, respectively.
Column 1 in Table 2 shows the results of estimating the basic specification of the production function when regional and time-period fixed effects are included. According to these estimates, both physical capital stock and the tertiary component of the average number of years of schooling enter the equation with positive and significant coefficients. The coefficient for physical capital is around 0.7, which is clearly higher than estimates in previous studies (for example, de la Fuente et al., 2003, estimated the effect of physical capital at around 0.3). A Hausman test for choosing between the random and the fixed effects specifications clearly discriminates in favour of the latter, and the LR tests clearly reject the hypothesis that the regional fixed effects are jointly non-significant. Turning to the LM and robust LM tests, the LM test for no spatial error rejects the null at the 10% significance level, while the robust test for no spatial lag rejects it at any significance level, so the evidence is stronger for a spatial lag. In the light of these tests, the estimates in column 1 are inconsistent because they ignore spatial dependence (Anselin, 1988).

TABLE 2
Column 2 in Table 2 shows the results obtained when including the spatial lag of the endogenous variable. The results are not substantially different from those reported above. The spatial lag of the endogenous variable is positive and statistically significant, and the physical capital stock and tertiary studies indicator enter the equation with positive and significant coefficients. The inclusion of the spatial lags of the explanatory variables (column 3 in Table 2) shows positive and significant geographical spillovers associated with physical capital, but spillovers associated with tertiary studies are negative and significant.
In the case of convergence, column 1 in Table 3 shows the results of estimating the basic specification of the convergence equation with regional and time-period fixed effects. As with the production function, a Hausman test clearly discriminates in favour of regional fixed effects. As we can see from this table, the coefficient associated with the initial level of GDP per worker is negative and statistically significant, a result that reinforces our previous evidence of the existence of a convergence process between Spanish regions in the study period. The estimated speed of convergence is 7.2%. Physical capital stock and the indicator associated with secondary studies both have a positive and statistically significant influence on regional economic growth. However, the indicators associated with primary and tertiary studies are not significant at the usual levels. Again, the LM and robust LM tests indicate that the spatial lag model should be preferred in statistical terms to the spatial error model and, as a consequence, the estimates in column 1 are inconsistent.

TABLE 3
The inclusion of the spatial lag of the endogenous variable does not substantially affect the results (column 2 in Table 3). The coefficient associated with this variable is positive and statistically significant, which implies that economic growth in neighbouring provinces exerts a positive influence on convergence. However, the speed of convergence is not affected by the inclusion of this variable. In the case of geographical spillovers (column 3 in Table 3), physical capital stock exerts a positive and significant effect on growth, while a high level of tertiary education in neighbouring provinces negatively affects the growth rate of the province under consideration, a result similar to that obtained in the production function specification. The spatial lag of the endogenous variable is now only significant at the 10% level.
In short, the empirical analysis reported in this section allows us to affirm that the accumulation of physical capital has a positive effect on regional productivity and growth, not only for the specific province under consideration but also for its neighbours. In the case of human capital, our results depend on the particular level of education: tertiary and secondary studies have a significant and positive effect on productivity and growth, respectively. Primary education failed to exert any positive influence on either productivity or growth. The results are also robust in terms of the negative geographical spillovers from tertiary studies. A possible explanation for the negative effect of tertiary studies on a neighbouring region's growth (in a context of reduced geographical mobility of workers) is that the regions compete for highly qualified jobs in high added value sectors (Olejnik, 2008; Di Liberto, 2008). Fischer et al. (2009) provide a complementary explanation for these negative spillovers: they argue that it is relative regional advantages in human capital that matter most for labour productivity so, ceteris paribus, if neighbouring regions increase their human capital, the region under consideration will find itself in a worse relative position. Finally, it is worth mentioning that this evidence confirms our initial hypothesis regarding the different effects of the three levels of education. Moreover, our results are in line with those of Di Liberto (2008) for Italy and Pereira and St. Aubyn (2009) for Portugal.

Final remarks
This paper has considered the effects of human capital spillovers in the Spanish regions between 1980 and 2007. Specifically, we have tested the influence of different levels of schooling on regional productivity and growth and, then, analysed whether there were any differences in the effects of this human capital on neighbouring regions reflecting its composition.
To do this, we have specified a standard production function and convergence equation and we have applied recently developed spatial panel econometric techniques to estimate the relationships under consideration. We detected a positive impact of physical capital on regional productivity and growth both in the region being considered and in those that neighbour it. The composition of human capital was also found to improve regional productivity and growth: tertiary studies have a significant and positive effect on productivity and likewise secondary studies have a similar impact on growth, but primary studies were found to have no effect on the variables under review. The results also point to the existence of negative geographical spillovers associated with tertiary studies. A possible explanation for this result is that regions compete for highly qualified jobs in high added value sectors (which means it is the relative level of human capital that actually matters) or, alternatively, that neighbouring regions attract qualified workers so as to exploit agglomeration economies.
Further research needs to be devoted to analysing the mechanisms that lead to the results reported here.