Overeducation and Local Labour Markets in Spain

The objective of this paper is to analyse the influence of individual variables and of characteristics related to spatial mobility in regional labour markets on overeducation in Spain. To this end, we use microdata from the Spanish Family Budget Survey to estimate a logit model for the probability of overeducation, taking into account selection bias and the presence of data at different levels (individuals and territory). The results lead us to conclude that the size of local labour markets and the possibility of extending the job search to other labour markets through commuting are relevant factors in explaining overeducation in the Spanish labour market. Despite the differences in labour market institutions, our results are very similar to those obtained for other countries.


INTRODUCTION AND OBJECTIVES
One aspect that has scarcely been considered in the literature on overeducation is its relationship with territory. The link between the two is related to the hypothesis of differential overeducation.
The idea is that overeducation will mainly affect married women, as their job search is restricted to the local labour market where they live, while the husband can search a wider labour market for a job that better matches his schooling. Some studies, such as Büchel and Battu (2003), Büchel and Van Ham (2003), Hensen et al. (2009) and Quinn and Rubb (2011), have highlighted the role of the regional labour market (mainly aspects related to commuting) as a potential explanatory variable of overeducation in general, and not only of the differential overeducation of married women. In fact, Büchel and Van Ham (2003) and Hensen et al. (2009) have obtained empirical evidence that regional variables related to the spatial distribution of employment and to the size of local labour markets are relevant for overeducation. Their results show that the possibility of accessing wider geographical areas when searching for a job decreases the probability of being overeducated for both men and women (and not only for married women).
The objective of this paper is to test the influence of individual variables and of characteristics related to the spatial mobility of the labour force in regional labour markets on overeducation in the Spanish economy. To this end, we use microdata from the Encuesta de Presupuestos Familiares 1990-91 (Family Budget Survey). Although the primary purpose of this survey is to analyse the consumer expenditure of Spanish families, it also provides information about wages and about individual and workplace characteristics. This dataset provides greater territorial detail than more recent databases of similar content. In particular, data from this survey allow us to control for the territorial dimension at the provincial level, the territorial administrative unit corresponding to the NUTS-III level of classification. This territorial unit is sufficiently small to allow us to assume that provinces approximate local labour markets. Our results do not support the hypothesis of differential overeducation, but confirm that the size of the local labour market and the possibility of searching for a job in a wider area through commuting are important factors in explaining overeducation in the Spanish labour market.
The rest of the paper is structured as follows. The next section briefly summarises the literature on the topic. The third section describes the database and calculates a measure of overeducation for the Spanish case. The fourth section explains the econometric methodology and presents the empirical results. Finally, the paper concludes by summarising the main results.

LITERATURE REVIEW
Taking as a starting point the seminal contributions by Freeman (1976) and by Duncan and Hoffman (1981), overeducation has been analysed in several countries 1. Spain is no exception: since the study by Alba-Ramírez (1993), different authors have analysed the relevance of overeducation in the Spanish economy. However, an aspect that has not been considered in this literature is the relationship between overeducation and territory. The hypothesis of differential overeducation was first introduced by Frank (1978) for the United States. The objective of that study was to explain wage differences between men and women taking into account the role of overeducation. Frank (1978) assumes that, when searching for jobs, individuals try to maximise their income and minimise their overeducation (the difference between their educational level and the educational requirements of the job vacancy). This search process takes place in the global labour market for single individuals, as they can easily migrate, but not for couples. For couples, a joint decision must be taken in which the family member with the higher educational level and the higher number of working hours (the one with the higher income, usually the husband in the context of Frank's study) searches for a job in the global labour market, minimising his/her overeducation. The other family member (usually the wife) searches for the best possible job within the limits of the local labour market where the family lives. Only by chance could the wife also minimise her overeducation through an optimum match. This spatial restriction would usually make the wife's overeducation higher than the husband's.
Moreover, in this case, the wife's overeducation will be related to the size of the local labour market: the smaller it is, the fewer job vacancies she can access.
Under this hypothesis, and according to Frank (1978), wage differences by gender would be related to overeducation, as it would predominantly affect married women. The empirical evidence obtained by this author supports the hypothesis: the smaller the local labour market, the larger the wage differences by gender.
Two decades later, in a new context in which studies on overeducation had become widespread, McGoldrick and Robst (1996) tested Frank's hypothesis directly instead of estimating wage equations. Using data for the United States in 1985, they calculated overeducation measures using three different methods and, in all three cases, found that married women had higher levels of overeducation than their husbands, but that the size of the local labour market had no significant effect on this difference.

In contrast to Frank (1978), they did not find evidence in favour of the hypothesis of differential overeducation. García-Serrano and Malo (1997) test this hypothesis for the Spanish case using data from the Encuesta de Estructura, Conciencia y Biografía de Clase of 1991. Their results do not fully confirm the hypothesis of differential overeducation. They do not find a clear effect of gender on educational mismatch, although this variable is statistically significant when interacted with age or with a variable capturing cohabitation. So they find that there are differences in educational mismatch in Spain, but that this mismatch is not explained by the size of the local labour market. In fact, the size of the local labour market and being a woman living with a partner negatively affect the probability of being undereducated, but not the probability of being overeducated. Büchel and Battu (2003) open a new line of research by extending Frank's model to consider the possibility of commuting. Using this new framework, they test the hypothesis of differential overeducation using 1995 data for the German economy to calculate a subjective measure of overeducation, and considering the role of spatial mismatch. Their results seem to confirm the hypothesis, as the probability of being overeducated is higher for married women living in rural areas (fewer than 20,000 inhabitants). According to these authors, the probability that a woman finds a job in line with her educational level depends not only on the size of the local labour market (Frank's original idea) but also on the possibilities of accessing other local labour markets through commuting. When the commuting distance is included in the model, the probability of being overeducated decreases with distance. Moreover, and more importantly, there is then no evidence in favour of the hypothesis of differential overeducation.
The results show that, once commuting distance is controlled for, the risk of being overeducated is higher for couples in rural areas, for both members of the couple and not specifically for the woman.
Taking these results into account, Büchel and Van Ham (2003) develop a theoretical framework relating overeducation at the individual level (for both men and women) to the availability of job opportunities. Following Simpson (1992), they highlight that an individual searching for a job in a particular local labour market has three options when that market offers no appropriate job: the first is to reject the available jobs and continue searching (unemployment); the second is to accept a job in this local labour market with lower educational requirements than his/her own qualifications (overeducation); and the third is to accept a job in a different local labour market, probably with a longer commute than desired. The central aspect of the analysis by Büchel and Van Ham (2003) is the role of job opportunities in local labour markets (unemployment rates) and of commuting (availability of private transport and commuting time) in explaining the probability of being overeducated. As in Frank (1978), geographical restrictions play a key role in explaining overeducation, but Büchel and Van Ham (2003) extend the effect to all workers (and not only women) and consider the possibility of extending the job search to other local labour markets through commuting. Their results for the German labour market show that regional variables related to the spatial distribution of employment explain overeducation. Moreover, greater individual mobility (owning a car or commuting longer distances) increases the "effective" size of the labour market, which decreases the probability of being overeducated. A similar result is found by Hensen et al. (2009), although they do not explicitly consider gender differences when analysing the relationship between geographic mobility and education-job mismatch in the Netherlands 2.
More recently, Quinn and Rubb (2011) have used the Panel Study of Income Dynamics (PSID) for the United States, a longitudinal dataset that makes it possible to control for the timing of migration and its results in terms of the education-employment match. Their results suggest that migration often leads wives to exit full-time paid employment in larger numbers than husbands. In fact, migration tends to lower the level of overeducation for men more robustly than for women, a result supporting the differential overeducation hypothesis. Moreover, their analysis suggests that an overeducated wife may be more willing to migrate to improve her husband's career and/or to exit the labour force rather than remain overeducated.
Taking these contributions into account, the objective of the paper is to identify the factors that explain overeducation and, in particular, to analyse the effect of those related to territory. In a first stage, the hypothesis of differential overeducation by Frank (1978) will be tested; next, the effects of different territorial variables, such as the local labour market size or the possibilities of commuting, will be considered. For the Spanish case, the empirical literature on commuting decisions would be, a priori, in favour of the hypothesis of differential overeducation. Artís et al. (2000) estimate a multinomial logit model to analyse commuting using 1991 data for Catalonia. Their results show that the probability of commuting is clearly lower for wives, especially those who are also mothers. Romaní et al. (2001) analyse the interactions between the location of housing and the location of the workplace, confirming the previous results regarding marital status and motherhood. Casado (2000) obtains similar results using 1991 data for the Comunidad Valenciana. It is worth mentioning that, according to this author, the probability of commuting falls as the number of children increases. These results are in line with the asymmetric sharing of domestic work found in Spain at that time. Moreover, commuting distances may also differ, as there is evidence that being the main income provider in the household increases the probability of commuting and of commuting by private transport (car) (Matas, 1991 and Matas et al., 2009). This evidence of differential commuting makes the analysis in this paper even more interesting.
Before presenting the results of the econometric analysis, the next section describes the database used and presents the measure of overeducation calculated for the Spanish economy from this database.

The Spanish Family Budget Survey
The estimates presented here are based on individual data from the Encuesta de Presupuestos unemployed. This is the sample used for the econometric analysis of the following section.

The measurement of the educational mismatch
Given that the Encuesta de Presupuestos Familiares (EPF) does not contain subjective information about overeducation but does provide detailed information on schooling levels and occupation, we have used the statistical method 3 to obtain our overeducation measure. The EPF distinguishes 14 different schooling levels, ten of which can apply to individuals older than 16.
It also provides detailed information on 81 occupations, a 2-digit disaggregation that is coarser than the optimum (3-digit occupational classifications). Consequently, it is not possible to apply the corrected mode criterion, as the number of categories in which the mode does not account for at least 60% of all individuals is quite high. We have therefore used the average criterion.
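
A minimal sketch of how an average criterion of this kind can be computed. It assumes the common convention that a worker is classed as overeducated when his or her schooling exceeds the mean for the occupation by more than one standard deviation. That threshold, the use of years of schooling, and all function and variable names are illustrative assumptions and are not taken from the EPF or from this paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_overeducated(workers, threshold_sd=1.0):
    """Flag overeducated workers with a mean-based (average) criterion.

    workers: list of (occupation_code, years_of_schooling) tuples.
    threshold_sd: number of standard deviations above the occupation
        mean that defines the cutoff (assumed to be 1 here).
    Returns a list of booleans aligned with `workers`.
    """
    # Group schooling values by occupation.
    schooling_by_occupation = defaultdict(list)
    for occupation, years in workers:
        schooling_by_occupation[occupation].append(years)

    # Cutoff per occupation: mean plus threshold_sd standard deviations.
    cutoffs = {
        occupation: mean(values) + threshold_sd * pstdev(values)
        for occupation, values in schooling_by_occupation.items()
    }

    # A worker is flagged when schooling is strictly above the cutoff.
    return [years > cutoffs[occupation] for occupation, years in workers]


# Hypothetical toy data: occupation "A" has one clear outlier.
sample = [("A", 8), ("A", 8), ("A", 8), ("A", 16), ("B", 12), ("B", 12)]
flags = flag_overeducated(sample)
```

With coarse 2-digit occupations, the within-occupation spread of schooling is likely to be wider than with 3-digit codes, which would raise the cutoff and tend to lower the measured incidence.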

Applying this procedure to the 21,359 employed individuals who were not enrolled in formal education during 1990-1991 shows that 14.6% of them were overeducated. A comparison with the results of previous studies available for that period (table 1) shows that the objective method applied to the Spanish Labour Force Survey (EPA) yields clearly lower values, while the subjective method applied to the Encuesta de Estructura, Conciencia y Biografía de Clase (EECBC) yields a higher value. The results of the statistical procedure lie in between. The studies applying this method, with the average criterion, to databases referring to 1991 find percentages similar to the one found here: 15.9% for the EECBC and 8.9% for the EPA. The corrected method of Oliver and Raymond (2002) also provides a value similar to the one found here. So, in spite of the limitations of the applied procedure, the results obtained from the Encuesta de Presupuestos Familiares are in line with those of previous studies for the Spanish economy in the same period.
An empirical regularity in the overeducation literature is that the probability of being overeducated is higher for more educated workers. We have calculated the incidence of overeducation in the sample of 4,889 employed individuals with at least secondary education 4. The incidence of overeducation is four points higher than in the full sample: 18.7% versus 14.6%. The result by gender also changes, as overeducation is now slightly more prevalent among women (19.1%) than among men (18.5%).
Given that unemployment is higher among women, a higher unemployment rate could push women to accept jobs with lower educational requirements than their own qualifications. Both phenomena (employment opportunities and overeducation) could be clearly interrelated, an aspect that is developed further in the next section.
Source: Blanco (1997) and own elaboration

Methodology
The objective of the paper is to identify the factors that explain overeducation and, in particular, to analyse the possible effects of some variables related to territory. Specifically, we want to explain the behaviour of a dichotomous variable that takes the value 1 when the individual is overeducated and 0 otherwise. In this context, multiple linear regression models are not appropriate. It is much more interesting to answer questions such as: what is the probability of being overeducated given the characteristics of the individual, of his/her job or of the territory where he/she lives? Questions of this kind can be answered using logit models, which make it possible to estimate (by maximum likelihood) the change in the probability of being overeducated after marginal variations in the explanatory variables.
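
As an illustration of how a logit model turns characteristics into a probability and a marginal effect, the sketch below evaluates the standard logistic formula P = 1 / (1 + exp(-x'b)) and the corresponding marginal effect of a continuous regressor, b_k * P * (1 - P). The coefficient values and function names are hypothetical and are not estimates from this paper.

```python
import math


def logit_probability(coefs, x):
    """Probability implied by a logit model with coefficients `coefs`
    evaluated at the regressor vector `x` (same length, constant included)."""
    index = sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-index))


def marginal_effect(coefs, x, k):
    """Change in probability for a marginal change in regressor k,
    using the logit identity dP/dx_k = b_k * P * (1 - P)."""
    p = logit_probability(coefs, x)
    return coefs[k] * p * (1.0 - p)


# Hypothetical coefficients: a constant and one regressor.
example_coefs = [0.0, 1.0]
example_x = [1.0, 0.0]
p_example = logit_probability(example_coefs, example_x)
me_example = marginal_effect(example_coefs, example_x, 1)
```

Because the marginal effect depends on P, the same coefficient implies a different change in probability for individuals with different characteristics, which is why such effects are usually reported at a reference point or averaged over the sample.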
However, one additional aspect that should be considered is that some explanatory variables relate to individual characteristics and others to regional characteristics. If the presence of data at different levels is ignored, inference from the model could be seriously affected. In this context, the usual solution is to specify and estimate multilevel models (Goldstein, 1995; Hox, 1998; Goldstein and Rasbash, 1996). The results in the next section rely on the specification and estimation of models of this kind 5.
A further point to mention before starting the empirical analysis is that it is impossible to know whether an unemployed worker is overeducated, as unemployed workers cannot be assigned to any particular occupation. However, an unemployed worker may prefer to extend his/her job search when offered a job with lower educational requirements than his/her own qualifications. If this possibility is not considered in the empirical analysis, the results could be incorrect. This problem is known in the literature as "selection bias". To address it, we have applied the two-step procedure of Heckman (1979). The first step consists of analysing the probability of being employed, while the second consists of analysing the probability of being overeducated, including an additional explanatory variable known as Heckman's lambda (obtained in the first step as the inverse of the Mills' ratio 6). Given the binary nature of both variables, logit models are used in both steps.
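
The text does not give the formula used for lambda when the first step is a logit. The sketch below therefore shows only the textbook inverse Mills ratio, phi(z) / Phi(z), together with one common way of obtaining it from a first-step predicted probability p: map p to the standard normal index z with the same cumulative probability and evaluate phi(z) / p. That mapping is an assumption made for illustration, as are the function names.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def inverse_mills(z):
    """Textbook inverse Mills ratio phi(z) / Phi(z) for a selection index z."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)


def lambda_from_probability(p):
    """Selection term from a first-step predicted employment probability p.

    Assumption: p (0 < p < 1) is mapped to the normal index with the same
    cumulative probability, and the ratio phi(z) / p is returned.
    """
    z = _STD_NORMAL.inv_cdf(p)
    return _STD_NORMAL.pdf(z) / p


# Individuals very likely to be employed receive a small correction term.
lambda_likely = lambda_from_probability(0.95)
lambda_unlikely = lambda_from_probability(0.30)
```

In a second step, a term such as this would enter the overeducation equation as one more regressor; its significance is then read as evidence on whether selection into employment matters.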

Results
In this section, the results of estimating different models to identify the determinants of being employed and being overeducated in the Spanish labour market are presented. The results of testing the hypothesis of differential overeducation and the influence of variables related to the territory on overeducation for the Spanish economy are shown in tables 2 and 3.

The results of estimating a logit model explaining the probability of being employed or not (the first stage of the Heckman procedure) are shown in the first column of each table. The results of estimating a logit model for overeducation without taking into account the problem of selection bias are shown in the second column.
These results are reported in order to assess the effect of introducing Heckman's lambda into the model. Accordingly, the third column of each table reports the estimation of the second stage of the Heckman procedure.
The results of testing the hypothesis of differential overeducation for the Spanish case are shown in table 2.
To test this hypothesis, we have included in each of the three models mentioned above variables related to individual characteristics and also to the local labour market, such as the size of the town where the individual lives or the unemployment rate 7.
Educational level and experience have positive and significant effects on the probability of being employed, while being a woman living with a partner, or being a child living with one's parents, reduces that probability. Local labour market conditions are also relevant, as the coefficient on the provincial unemployment rate is negative and statistically significant. From the results of estimating this model, we have calculated the value of Heckman's lambda for each individual, which is used as an explanatory variable in the following models in order to control for selection bias.
The results of estimating explanatory models for overeducation are shown in the second and third columns of table 2. The two columns differ only in whether Heckman's lambda is included; it is not statistically significant. Two of these results deserve a brief comment. On the one hand, the probability of being overeducated increases with the educational level. This is a usual result in the international literature, and also for the Spanish economy. On the other hand, potential experience does not affect the probability of being overeducated. Although this is perhaps not a rigorous test, the result does not confirm the substitutability between education and other forms of human capital postulated by human capital theory 8.
In order to test the hypothesis of differential overeducation, we have introduced as explanatory variables interactions between gender, marital status and the size of the town of residence, taking married women living in small towns as the base category. The hypothesis of differential overeducation would not be rejected if the remaining categories showed a lower probability of overeducation. Although most signs are negative, the only negative and significant variables are those for married men living in big towns and single women living in big towns 9, with no significant differences among the other categories. These results lead us to reject the hypothesis of Frank (1978), a result similar to those of McGoldrick and Robst (1996), García-Serrano and Malo (1997) and Büchel and Battu (2003) after introducing the possibility of commuting.
However, the results suggest that overeducation could be more closely related to the size of the local labour market (town) than to gender or marital status. This is precisely the hypothesis of Büchel and Van Ham (2003). Following these authors, we have analysed the effect on the probability of being overeducated of the size of the local labour market and of the possibilities of accessing a wider area to search for jobs. In particular, we have proxied the local labour market size by including in the model the number of inhabitants of the individual's town of residence. The information available in the EPF has been summarised in two categories: towns of more than 50,000 inhabitants or provincial capitals, and all other towns. The possibility of widening the spatial job search in order to achieve a better match and so avoid overeducation has been considered by including two variables in the model: one at the individual level, the availability of private transport, and one at the provincial level, the number of road kilometres relative to the number of cars 10.
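
The interaction terms used to test differential overeducation can be sketched as a set of dummy variables, one per combination of gender, marital status and town size, with married women in small towns omitted as the base category. The two-way coding of each characteristic and all labels below are assumptions for illustration; the exact coding in the EPF may differ.

```python
from itertools import product

# Omitted reference group (assumed labels).
BASE_CATEGORY = ("woman", "married", "small")

# All combinations of the three characteristics except the base category.
CATEGORIES = [
    combo
    for combo in product(("woman", "man"), ("married", "single"), ("small", "big"))
    if combo != BASE_CATEGORY
]


def interaction_dummies(gender, marital_status, town_size):
    """Return a dict of 0/1 dummies, one per non-base category.

    An individual in the base category gets zeros everywhere, so each
    estimated coefficient is read as a difference relative to married
    women living in small towns.
    """
    individual = (gender, marital_status, town_size)
    return {"_".join(combo): int(individual == combo) for combo in CATEGORIES}


base_row = interaction_dummies("woman", "married", "small")
other_row = interaction_dummies("man", "married", "big")
```

Under this coding, the differential overeducation hypothesis corresponds to negative coefficients on all seven dummies, which is the pattern the estimates are checked against.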
The results of the first stage of the Heckman procedure (first column of table 3) are similar to the previous ones. The results regarding overeducation (columns 2 and 3) confirm that the probability of being overeducated is higher for more educated workers.
Heckman's lambda is statistically significant at the 10% level, which can be interpreted as evidence that workers in the Spanish labour market follow the strategy mentioned above: not accepting an "overeducated" job but waiting for an optimum one. This strategy is consistent with the fact that Spain has a very strong family protection network against unemployment and a fairly generous unemployment benefit system (in 1990-91 and relative to other developed countries; OCDE, 1994). However, if unemployment is high and long-lasting, there will be limits to the validity of the strategy, and to the significance of lambda.
Again, experience is not significant, reinforcing the earlier evidence against substitutability between human capital components. It is worth mentioning that gender does not seem to have any effect on overeducation. Although overeducation measures can differ between men and women, once other characteristics are controlled for there are no significant differences 11.
The territorial variables, the most relevant for this study, clearly affect overeducation with the expected signs. On the one hand, living in a small town, and consequently having fewer job opportunities, increases the probability of being overeducated. On the other hand, and especially interesting, the possibility of searching for a job in a wider area significantly reduces the probability of overeducation. In fact, the availability of private transport 12 and the level of infrastructure of the region considered (proxied by the number of road kilometres per vehicle) are statistically significant and have the expected negative sign 13. So the probability of being overeducated is partially explained by spatial factors, such as the size of the local labour market or the possibility of "increasing" its size by searching for a job in more distant labour markets thanks to the availability of private transport and a good road network.

These results are in line with those obtained for the German labour market by Büchel and Van Ham (2003) and for the United States by Quinn and Rubb (2011), and reinforce the idea that the spatial dimension of the problem is more relevant than the differential overeducation of women. Moreover, the results suggest that possible differences in overeducation between men and women could be determined by women's difficulties in commuting owing to their greater domestic responsibilities, especially when they have young children.

CONCLUSIONS
The objective of this paper has been to test the influence of local labour markets and other variables related to territory and spatial mobility on overeducation in the Spanish economy.
To this end, we have used microdata from the Encuesta de Presupuestos Familiares 1990-91 and applied appropriate econometric techniques (multilevel logit models with selection bias correction). The results have confirmed that the probability of being overeducated increases with the schooling level, but that this probability is not related to the potential experience of the individual. This result leads us to reject the existence of substitutability between the different human capital components. A second result to highlight is that, in the Spanish labour market, workers may have preferred not to accept a job for which they would be overeducated and to continue searching for an optimum job. As for the variables related to territory, the hypothesis of differential overeducation of Frank (1978) has been rejected, as the risk of overeducation is not higher for married women living in small towns. This result is in line with the available evidence for other countries. However, the most relevant result of this study is that overeducation can be partly explained by spatial factors, such as the size of the town where individuals live or the possibility of accessing a bigger labour market thanks to the availability of private transport, which makes commuting much easier, or the existence of adequate transport infrastructure (roads). If this is true, the individuals most affected by overeducation are not married women, but individuals living in small towns whose job search is geographically limited by their limited commuting possibilities.
In spite of the striking differences in labour market institutions, the results for Spain are in line with those obtained for the German labour market by Büchel and Van Ham (2003) and for the United States by Quinn and Rubb (2011), who highlight the relevance of the spatial dimension of overeducation as opposed to the differential overeducation of married women. In fact, possible differences in overeducation between men and women could perhaps be explained by the lower spatial mobility of women.