Relatedness, external linkages and regional innovation in Europe

ABSTRACT Relatedness, external linkages and regional innovation in Europe. Regional Studies. This paper analyses if the generation of new knowledge benefits from the combination of similar or dissimilar pieces of existing technologies, in terms of their technological content (related versus unrelated variety), for the case of European regions. Specifically, it analyses the relevance of variety in the case of local knowledge as well as in the case of the knowledge coming from other regions. At the local level, it shows that while related variety is conducive to regional innovation, unrelated variety does play a role, too, when it comes to radical innovations. Conversely, it also shows that external knowledge flows have a higher impact, the higher the similarity between these flows and the extant local knowledge base.


INTRODUCTION
It is now an established fact in the literature that the combination and recombination of previously unconnected ideas lead to new knowledge production, subsequent technological innovations, and ensuing economic growth and wellbeing (Aghion & Howitt, 1992; Jones, 1995; Weitzman, 1998). Following a well-settled tradition in evolutionary economic geography, this paper argues that not all types of existing knowledge can be combined equally successfully, and that the results of such processes depend on the technological content of the knowledge brought into contact, that is, on the degree of knowledge relatedness (Boschma, Eriksson, & Lindgren, 2008).
The innovation and economic geography literatures have long tried to understand whether firms located in agglomerations mainly learn from other local firms in the same industry or from other local firms in a range of other industries (Glaeser, Kallal, & Scheinkman, 1992). The former view dates back to Marshall's (1920) contributions on the benefits arising from spatial concentration. The latter relates to Jane Jacobs' contributions on cities, externalities and innovation (Jacobs, 1969; see also Glaeser et al., 1992). From her work we learn that a diversified economy brings benefits to local firms because it generates new knowledge and innovation stemming from the cross-fertilization of ideas across different industries. Following Frenken, Van Oort, and Verburg (2007) and a large number of studies after them, we argue that Jacobs' concept of diversification needs to be elaborated more thoroughly, by differentiating between diversification of related industries and diversification of unrelated industries, or related versus unrelated variety. Regions hosting related industries, with different but connected knowledge bases, can engage in recombinant innovation. By contrast, the combination of unrelated technologies is less likely to succeed in producing new ideas.
An issue that has been generally under-investigated by this literature is the role played by knowledge linkages across space in introducing variety into regions. Most of the related literature is silent on the role of linkages across regions, thus implicitly assuming that innovation production draws mainly on geographically localized knowledge sources (Audretsch & Feldman, 2004). Some scholars have recently posited, however, that at some point co-located agents may start to combine and recombine local knowledge that eventually becomes redundant and less valuable. As a result, processes of negative lock-in may begin to occur (Boschma, 2005; David, 1993). Conversely, firms looking for external sources of knowledge may find that the knowledge they require is available beyond the boundaries of the region where they are located (Bathelt, Malmberg, & Maskell, 2004; Bergman & Maier, 2009). In the ongoing age of globalization, characterized by predominantly open economies, it is naïve to assume that agents in regions source their knowledge inputs only from their immediate vicinity. In this scenario, this paper argues that not only does being connected to the outside world matter, but the degree of diversity between the external knowledge brought into the region and the existing knowledge base is also important (Boschma, Heimeriks, & Balland, 2014).
In order to fill this gap, the paper estimates a knowledge production function (KPF) for European regions, trying to ascertain which type of knowledge recombination, related or unrelated, is more conducive to regional innovation. Unlike previous studies, the paper takes into account the geographical breadth of such knowledge. That is, it analyses not only the relevance of local variety but also how the external-to-the-region knowledge flows fit into the local knowledge base. While adding the external dimension is crucial, this concern has been generally neglected by the related variety literature, and contributions introducing a 'more geographical wisdom in the study of regional diversification' are still scarce (Boschma, 2016; Content & Frenken, 2016). Only Boschma and Iammarino (2009), for Italian regional growth, and Tavassoli and Carbonara (2014), for Swedish regional innovation, have concluded that being connected to the outside world is not enough: it is different, yet related, connections that provide real learning opportunities and boost economic outcomes.
As a second objective, this study incorporates the idea that the combination of related technologies is not always a necessary condition for regional diversification, as unrelated diversification may occur, too. The paper uses regional innovation as an outcome variable, which allows us to regress not only innovation quantity but also innovation quality (i.e., breakthrough innovations) on related variety, unrelated variety and connectedness with other regions. 1 This allows us to test whether breakthrough innovations draw more on unrelated and distant pieces of knowledge, as ideas with high impact tend to stem from knowledge cross-fertilization and the combination of unrelated technologies (Fleming, 2001; Saviotti & Frenken, 2008). Again, very little systematic evidence exists in this respect, with Tavassoli and Carbonara (2014) and Castaldi, Frenken, and Los (2015) being the exceptions, for Swedish regions and US states respectively.
In sum, this paper draws on these ideas and studies the relevance of the degree of relatedness among previous existing pieces of knowledge for the generation of new ideas, while differentiating between relatedness in the local technological structure of regions (related versus unrelated variety) and relatedness between the internal knowledge base and the extra-regional sources of knowledge.
Using regional innovation intensity (patents per capita) as an outcome variable is an important departure from the majority of related studies, which focus on economic growth or employment, for several reasons. First, while most studies conclude that related variety facilitates knowledge spillovers, which are conducive to innovation, this specific relationship is barely tested; rather, it is implicitly assumed to exist in the link between the regional structure of employment or exports and economic growth. However, recent studies suggest that the growth effects of related variety may be specific to knowledge-intensive industries (Content & Frenken, 2016). Consequently, we focus here solely on innovation production and on patent-intensive sectors. Moreover, we compute variety indexes using the technological classification provided in patent documents. In particular, we use the international patent classification (IPC) codes contained in patent applications to the European Patent Office (EPO) to build the diversity indexes, establishing a more direct link between regional diversification and its underlying technological nature.
Second, and more importantly, our study is one of the few investigating cross-regional linkages and related variety, a topic for which trade data have mostly been used to depict linkages across regions (Boschma & Iammarino, 2009; Tavassoli & Carbonara, 2014). Our focus on innovative sectors allows us to use patent citations as a cleaner and more direct measure of knowledge flows across space. Patent citations point directly to the prior knowledge on which current innovations draw, and therefore represent a good proxy for cross-regional linkages and knowledge flows (Jaffe & Trajtenberg, 1999; Schoenmakers & Duysters, 2010).
Finally, using innovation as outcome variable allows us to exploit heterogeneity in patent quality and its relationship with related and unrelated diversification, as mentioned above.
This paper makes use of a large sample of European regions (255 NUTS-2 regions) belonging to 25 countries, which, to our knowledge, corresponds to the largest coverage in Europe among studies of this kind. Moreover, the study uses data for several years, allowing us to introduce time and region fixed effects (FE) to control for a large number of unobservables.
The paper is structured as follows. The next section reviews the related literature. The third section sets out the empirical analysis and describes the data. The fourth section presents the main results, and the fifth section concludes.

LITERATURE REVIEW
It is widely accepted in the literature that innovation is a process of accumulation and recombination of previously existing ideas (Weitzman, 1998). A key question, however, is whether any potential combination of existing knowledge is equally successful, or whether connecting different, but related, pieces of knowledge is most effective. Besides, it is established that innovation production draws mainly on geographically localized knowledge sources (Audretsch & Feldman, 2004). However, scholars have also signalled that the combination of local knowledge may eventually become redundant (David, 1993), leading firms to look for external sources of ideas. This section discusses theoretical and empirical contributions on the different roles of related and unrelated variety in regional outcomes, both within the region and across geographical areas.
Related and unrelated variety at the regional level
Much research on the geography of innovation and regional development has addressed the question of whether specialization or diversity boosts local innovation. Proponents of the former argue that firms tend to learn from other firms in the same industry, and therefore specialization facilitates knowledge spillovers and subsequent growth. Meanwhile, advocates of the latter contend that diverse economies facilitate exchanges of different pieces of knowledge across industries, which are more prone to produce innovations and economic prosperity, despite implying higher communication costs between agents. The concept of diversity is complex and subtle, as first signalled by . These authors pose the central question of whether it is related or unrelated diversity that is most relevant for growth. Related diversity, or variety, facilitates local knowledge spillovers across industries at a lower cost. This is because the cognitive distance across these industries is not too large, so that complementarities exist among them in terms of shared competences and capabilities, which enable effective connections as well as the sharing of knowledge and information. Conversely, unrelated variety may slow down the diffusion of ideas, given that unrelated industries draw on very different and completely disconnected knowledge bases, making it more uncertain and costly to engage in recombinant innovation and thereby hampering the production of new local innovation.
Frenken et al.'s (2007) pioneering study shows how related variety impacts regional economic growth in the Netherlands. Results are confirmed by studies in other countries: Bishop and Gripaios (2010) for Great Britain, Boschma and Iammarino (2009) and Quatraro (2010) for Italy, Hartog, Boschma, and Sotarauta (2012) for Finland, and Boschma, Minondo, and Navarro (2012) for Spain. The role of unrelated variety is more controversial.
Whereas Bishop and Gripaios (2010) find that unrelated variety affects employment growth in a larger set of industries than related variety, Boschma et al. (2012) and Hartog et al. (2012) do not find any growth effect. Meanwhile,  find that unrelated variety dampens unemployment growth, which the authors interpret as evidence of unrelated industries spreading the risks of potential negative shocks, known as the portfolio effect of variety. 2
Despite the emphasis that earlier studies put on related variety as a facilitator of knowledge spillovers, these studies implicitly assume that variety and employment or economic growth are linked to each other via innovation. Little work has been done, however, on directly examining the impact of technological variety on innovation performance. To our knowledge, only Tavassoli and Carbonara (2014) and Castaldi et al. (2015) analyse the role of related and unrelated variety in regional innovation output, for the Swedish and US cases respectively. Their findings suggest that, when it comes to variety of knowledge within regions or US states, unrelated variety does not affect regional innovation output in general, whereas the impact of related variety is robust and positive. 3
To reiterate, as  put it, related variety 'improves the opportunities to interact, copy, modify, and recombine ideas, practices and technologies across industries giving rise to Jacobs externalities' (p. 59). Therefore, in their search for recombination, agents focus mainly on the technological pieces in which they have prior experience (related variety), since this expertise allows them to understand better the nature of the new knowledge. As a consequence, when a region presents a diversity of related technologies, connections are established more effectively, given that related technologies are more easily recombined. We therefore expect related variety to be crucial in the generation of regional innovation.
In spite of the previous discussion, scholars have argued that truly important innovations may stem from the combination of previously unrelated technologies (Saviotti & Frenken, 2008). This is because combining more dissimilar capabilities, despite implying higher costs and risks, can result in radical breakthroughs, i.e., innovations with a high technological and economic impact (Boschma, 2016). As Fleming (2001) puts it, knowledge producers who experiment with new and unusual components and combinations may arrive at less useful innovations on average, but with large variability, which in turn results in both failures and breakthrough inventions. If successful, unrelated pieces of knowledge become related in the form of a new invention that paves the way for future technological developments and further innovation, leading to 'new operational principles, functionalities and applications' (Castaldi et al., 2015, p. 770). In consequence, we expect unrelated variety to be key in the generation of more radical innovations.
Relatedness and extra-regional linkages
An important debate that has recently emerged within the geography of innovation literature concerns the role of external knowledge in the process of regional knowledge creation. Indeed, the widely accepted assumption that agents usually source their innovations from their immediate vicinity might have limited our understanding of the ways in which knowledge flows across space and the way in which innovations are generated (Coe & Bunnell, 2003). Thus, the increasing importance of agents' needs to access extra-local knowledge pools to overcome potential situations of regional 'lock-in' has been highlighted (Boschma, 2005; Camagni, 1991; David, 1993; Grabher, 1993). Even local unrelated activities may become related when they are successfully combined, eventually becoming redundant, too (Boschma, 2016; Desrochers & Leppälä, 2011). Thus, recent empirical works have extensively documented the influence of extra-local knowledge sources on firms' and regions' innovative performance and knowledge acquisition (Bottazzi & Peri, 2003; Gertler & Levitte, 2005; Gittelman, 2007; Miguelez & Moreno, 2013; Moreno, Paci, & Usai, 2005; Owen-Smith & Powell, 2004; Rosenkopf & Almeida, 2003; Zhou & Li, 2012).
Yet it is not only being connected to the outside world that matters, but also the degree of relatedness between the external knowledge that is brought into the region and the existing knowledge base. While the external dimension is crucial to understand regional growth, it has been generally neglected by the related variety literature (Boschma, 2016; Content & Frenken, 2016), with only a few exceptions (Boschma & Iammarino, 2009; Tavassoli & Carbonara, 2014). This paper argues that in the current globalized world, characterized by predominantly open economies, it is naïve to assume that agents in regions source their knowledge inputs only from their local environment. Regions lacking certain capabilities could still diversify if they leverage knowledge inputs coming from external sources and allow the different unrelated sectors to find their way to interact with related sectors located beyond their regional borders. The scarce extant empirical evidence on the role of relatedness of extra-regional knowledge flows has approached the issue using regional trade data, either imports or exports (see Boschma & Iammarino, 2009, for Italian regional employment growth; and Tavassoli & Carbonara, 2014, for Swedish regional innovation). Their findings suggest that being connected to the outside world is not enough: it is different, yet related, connections that provide real learning opportunities and boost economic outcomes. 4 When the external knowledge basically integrates prior art from the same technologies as those within the region, it can be easily absorbed, but the new knowledge will not add much to the existing local one. By contrast, when the external knowledge brings technologies different from the local ones, it will be more difficult to understand, but once it is integrated, the chances that it leads to successful outcomes are higher.
All in all, in analogy to what is said above, we expect extra-regional knowledge inflows to be most effective when they are different from, but related to, the local knowledge base.

Empirical model
We test our hypotheses under a KPF framework at the regional level. Our point of departure is the simplest specification of this model:
$$Y_{it} = f(RD_{it}, Z_{it}) \qquad (1)$$
where Y is the innovative output of a given region, which depends on regional R&D expenditures (RD) as well as Z, a number of time-variant controls that account for specific features of region i at time t. Among them, we include measures of variety and relatedness, as explained below. Note that regional differences in size are accounted for by dividing the dependent and explanatory variables by total population. All in all, the following model is suggested:
$$\ln \mathit{Ypc}_{it} = \beta_1 \ln \mathit{RDpc}_{it} + \gamma' Z_{it} + \delta_i + \delta_t + \varepsilon_{it} \qquad (2)$$
where $\ln \mathit{Ypc}_{it}$ is the log transformation of the annual number of patent applications per 1 million inhabitants in region i and year t; $\ln \mathit{RDpc}_{it}$ is the log transformation of R&D expenditures per capita in region i and year t; and Z is a number of focal variables (explained below) and controls. For the latter, we include a proxy for human capital, measured as the share of human resources devoted to science and technology (HRST), as well as two variables accounting for differences in the economic structure of regions: the share of manufacturing employment (ShareInd), and the share of employment in high-technology manufacturing and knowledge-intensive, high-technology services (High-tech Empl). In addition, $\delta_i$ and $\delta_t$ stand for regional FE and time FE respectively.
In order to consider deviations from the theory, a well-behaved error term is also introduced: $\varepsilon_{it}$. Our empirical model (the regional KPF) draws mainly on a large number of contributions in regional science and innovation economics that try to understand the role played by regional innovative efforts (R&D) and the technological structure of regions in regional innovative output. We are aware that our reduced-form model does not account for all possible determinants of regional innovation intensity. Several studies have extended the regional KPF to include a larger number of potential non-technology determinants of regional innovation outputs. For instance, one interesting avenue of research is the role of institutions and social capital in innovation and, more importantly, how they influence regional variety's role in fostering regional innovation (see Boschma, 2016, for a call for research in this direction). However, this lies beyond the primary focus of the present analysis. Yet, unlike the large majority of empirical studies using the regional KPF, we control for region FEs, and therefore account for all time-invariant features of regions that may influence the regional production of innovations (with institutions or social capital variables, which evolve slowly over time, being partially controlled for through these FEs).
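The role of the region and time FEs can be illustrated with a minimal sketch of the within (two-way demeaning) estimator on a balanced panel. This is not the estimation code used in the paper: the function names, the synthetic data and the coefficient values are assumptions made purely for illustration, and any lagging of the regressors is left to the caller.

```python
import numpy as np


def two_way_within(values, n_regions, n_years):
    """Remove region and year means from a balanced panel.

    `values` has one row per region-year, ordered region by region
    (all years of region 0, then all years of region 1, ...).
    """
    panel = np.asarray(values, dtype=float).reshape(n_regions, n_years, -1)
    region_mean = panel.mean(axis=1, keepdims=True)
    year_mean = panel.mean(axis=0, keepdims=True)
    grand_mean = panel.mean(axis=(0, 1), keepdims=True)
    demeaned = panel - region_mean - year_mean + grand_mean
    return demeaned.reshape(n_regions * n_years, -1)


def fit_kpf(ln_ypc, regressors, n_regions, n_years):
    """OLS on the demeaned data, i.e. the two-way fixed-effects estimator."""
    y = two_way_within(ln_ypc, n_regions, n_years).ravel()
    x = two_way_within(regressors, n_regions, n_years)
    coefficients, *_ = np.linalg.lstsq(x, y, rcond=None)
    return coefficients


# Synthetic, noise-free panel: 6 hypothetical regions observed for 5 years.
rng = np.random.default_rng(0)
n_regions, n_years = 6, 5
regressors = rng.normal(size=(n_regions * n_years, 2))  # stand-ins for ln RDpc and one control
region_fe = np.repeat(rng.normal(size=n_regions), n_years)  # time-invariant region effects
year_fe = np.tile(rng.normal(size=n_years), n_regions)  # common year shocks
true_coefficients = np.array([0.4, 0.2])  # illustrative values only
ln_ypc = regressors @ true_coefficients + region_fe + year_fe

estimated = fit_kpf(ln_ypc, regressors, n_regions, n_years)
```

Because the additive region and year effects are wiped out by the demeaning step, the slope coefficients are recovered without estimating the effects themselves, which is why slowly evolving regional features are absorbed by the FEs.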

Related and unrelated variety
We start our analysis with a simple model that does not account for the influence of non-local capabilities, which will be introduced progressively (see below). Our first enquiry concerns the impact of knowledge diversification on regional patenting activity. In line with previous papers, as a proxy for diversified knowledge we measure variety as well as related and unrelated variety with entropy measures. We borrow from Castaldi et al. (2015) the use of the technological classification of patents in order to construct the measures of regional knowledge variety. Our entropy indicators are computed using information retrieved from applications to the EPO. In particular, we use the IPC system, which provides a hierarchical system of codes for the classification of patents according to the different areas of technology to which they pertain. The codes are assigned directly by the patent office, the EPO in this case. These codes are grouped into eight sections, which are the highest level of hierarchy of the classification. Each section is divided into three-digit classes and four-digit subclasses. The current version of the IPC classification contains 635 technological subclasses. 5 Scholars have reorganized these technological subclasses into meaningful fields and broad fields of technology, similar to the grouping of products or economic activities into sectors (such as the Standard International Trade Classification (SITC) used in trade or the International Standard Industrial Classification of All Economic Activities). The aim of this grouping is to allow comparisons of innovation activities over time and across countries, and it is based on minimizing technological heterogeneity within technology fields and broad fields.
Here we use the classification built by Schmoch (2008), which groups subclasses into 35 technology fields (35-field), which are further grouped into five broad fields (five-field), namely: Electrical engineering, Instruments, Chemistry, Mechanical engineering and Other fields. 6 Using the IPC codes and Schmoch's classification of technological fields, the variety variable measures the degree of knowledge diversification through the computation of an entropy measure at the four-digit level (subclasses):
$$\mathrm{VARIETY} = \sum_{j} p_j \log_2\left(\frac{1}{p_j}\right) \qquad (3)$$
where $p_j$ is the share of the four-digit sector j. The value of this index will be higher in regions characterized by a highly diversified sectoral composition of their knowledge base. We break down this measure into two different indicators. Following , if all four-digit subclasses j fall under a 35-field technology $S_g$, where $g = 1, \ldots, G$, it is possible to derive the 35-field shares, $P_g$, by summing the four-digit shares $p_j$:
$$P_g = \sum_{j \in S_g} p_j \qquad (4)$$
Related variety is then measured by the weighted sum of the entropy at the four-digit level within each 35-field technology:
$$\mathrm{RV} = \sum_{g=1}^{G} P_g H_g \qquad (5)$$
where:
$$H_g = \sum_{j \in S_g} \frac{p_j}{P_g} \log_2\left(\frac{1}{p_j / P_g}\right) \qquad (6)$$
Equation (6) measures the diversity of a region's portfolio at the finest level of disaggregation. Thus, it assumes that sectors that belong to the same 35-field technology are technologically related to each other and, as a consequence, can learn from each other through knowledge spillovers. Unrelated variety is proxied by the entropy of the five-field distribution. Formally, as K is the total number of five-field sectors ($k = 1, \ldots, K$) and $P_k$ is the share of five-field sector k, the unrelated variety index is given by:
$$\mathrm{UV} = \sum_{k=1}^{K} P_k \log_2\left(\frac{1}{P_k}\right) \qquad (7)$$
Thus, equation (7) measures the extent to which a region is diversified in very different types of activities. This measure assumes that technologies that do not share the same broad field (five-field) are unrelated to each other. Theoretically, high levels of this variable are associated with fewer knowledge spillovers. The indices of related and unrelated variety are not opposites.
One region can have both a high related variety (diversified into many specific subclasses in each field) and a high unrelated variety (diversified into unrelated broad five-field technologies). In fact, they tend to correlate positively, although this is not always the case. In addition, given the decomposable nature of the entropy measure, variety calculated at different digit levels can be included in a regression analysis without necessarily generating collinearity. Following the empirical model sketched above, we now include the indices proxying for related variety (RV) and unrelated variety (UV) in the Z vector, together with the controls that account for specific features of the region:
$$Z_{it} = (\mathrm{RV}_{it}, \mathrm{UV}_{it}, \mathrm{HRST}_{it}, \mathrm{ShareInd}_{it}, \mathrm{HighTechEmpl}_{it}) \qquad (8)$$
which, once inserted into the main equation, yields:
$$\ln \mathit{Ypc}_{it} = \beta_1 \ln \mathit{RDpc}_{it-1} + \gamma_1 \mathrm{RV}_{it-1} + \gamma_2 \mathrm{UV}_{it-1} + \gamma_3 \mathrm{HRST}_{it-1} + \gamma_4 \mathrm{ShareInd}_{it-1} + \gamma_5 \mathrm{HighTechEmpl}_{it-1} + \delta_i + \delta_t + \varepsilon_{it} \qquad (9)$$
Note that we introduce the subscript t − 1 into all explanatory variables in order to indicate that they have been lagged one period to lessen endogeneity concerns due to system feedbacks. The subsection below includes further details regarding the construction of all the variables used in the present analysis.
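The entropy indices described in this subsection can be sketched in a few lines of code. The sketch below is illustrative only: it assumes base-2 logarithms, and the subclass codes, field labels and patent counts are hypothetical placeholders rather than Schmoch's actual concordance or real REGPAT data.

```python
import math
from collections import defaultdict


def entropy(shares):
    """Shannon entropy (base 2) of a collection of shares that sum to one."""
    return sum(p * math.log2(1.0 / p) for p in shares if p > 0)


def variety_indices(counts, field_of, broad_of):
    """Variety, related variety and unrelated variety for one region.

    counts   : patents per four-digit subclass
    field_of : subclass -> technology field (35-field level)
    broad_of : technology field -> broad field (five-field level)
    """
    total = sum(counts.values())
    p = {j: c / total for j, c in counts.items() if c > 0}

    field_share = defaultdict(float)
    broad_share = defaultdict(float)
    for j, p_j in p.items():
        field_share[field_of[j]] += p_j
        broad_share[broad_of[field_of[j]]] += p_j

    # Weighted sum of the within-field entropies of the four-digit shares.
    related = sum(
        P_g * entropy([p_j / P_g for j, p_j in p.items() if field_of[j] == g])
        for g, P_g in field_share.items()
    )
    return {
        "variety": entropy(p.values()),
        "related": related,
        "unrelated": entropy(broad_share.values()),
        "field_entropy": entropy(field_share.values()),
    }


# Toy region with four hypothetical subclasses in three fields and two broad fields.
counts = {"a1": 10, "a2": 30, "b1": 40, "c1": 20}
field_of = {"a1": "F1", "a2": "F1", "b1": "F2", "c1": "F3"}
broad_of = {"F1": "B1", "F2": "B2", "F3": "B2"}
indices = variety_indices(counts, field_of, broad_of)
```

The extra `field_entropy` entry makes the decomposability mentioned in the text visible: four-digit variety equals the entropy of the 35-field shares plus related variety. Unrelated variety is computed one level higher, on the five broad fields, so it is not simply the remainder of that decomposition.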

Relatedness and external interactions
Here we extend our baseline model to account for the role of non-local knowledge sources in the process of regional knowledge creation. Although some studies at the level of European regions have consistently shown the importance of cross-regional interactions to the process of regional innovation (Maggioni & Uberti, 2009; Ponds, Van Oort, & Frenken, 2010), little attention has been paid to which kind of external interactions may be more beneficial. We conjecture that even if new variety may enter a region thanks to the interactions with other regions (in the form of, for example, trade linkages, foreign direct investment (FDI), research collaboration or labour mobility), extra-regional knowledge flows should be related, but not too similar, to the technological base of a region in order to impact the region's outcomes positively.
We directly look at the actual knowledge flows through the use of patent citations as a proxy for these flows. Patent citations point directly to prior art on which the patent is based (Trajtenberg, 1990) and, consequently, represent a 'paper trail' worthwhile for the analysis of knowledge diffusion (Jaffe, Trajtenberg, & Henderson, 1993). Since Jaffe et al.'s (1993) pioneering paper, patent citations have been considered to be useful to depict knowledge linkages between inventions, inventors and applicants along time, geographical space and technological fields, among other dimensions (Hall, Jaffe, & Trajtenberg, 2005; Jaffe & Trajtenberg, 1999; Schoenmakers & Duysters, 2010). In our case, since patents record the residence of the inventors, they are an exceptional source for studying knowledge flows across regions.
To build our variables, we use citations made by inventors resident in the focal region to EPO applications of inventors living outside the region. In particular, we look at backward citations listed in patents produced in a given region and collect the cited patents (alongside their technology codes) whose inventors all live outside the region. Even though the use of patent citations does not come without limitations, e.g., some citations are added by the examiner and not the applicant (Alcacer & Gittelman, 2006), they have been widely used in innovation economics as a proxy for knowledge flows (Criscuolo & Verspagen, 2008; Jaffe et al., 1993; Jaffe & Trajtenberg, 1999). Moreover, as citations relate cited patents to citing ones, they include detailed descriptions of technological characteristics and classification into technical domains (Popp, Hascic, & Medhi, 2011), allowing the computation of the necessary indexes.
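The selection of extra-regional backward citations described above can be sketched as a simple filter. The patent identifiers and the data layout are hypothetical, and the sketch assumes that a citing patent counts as produced in the focal region when at least one of its inventors lives there, which is one possible reading of the text.

```python
def extra_regional_cited(focal_region, citations, inventor_regions):
    """Cited patents that represent knowledge inflows into `focal_region`.

    citations        : iterable of (citing_id, cited_id) pairs
    inventor_regions : patent id -> list of NUTS-2 codes of its inventors
    """
    cited = []
    for citing_id, cited_id in citations:
        citing_regions = inventor_regions.get(citing_id, [])
        cited_regions = inventor_regions.get(cited_id, [])
        if focal_region not in citing_regions:
            continue  # the citing patent was not produced in the focal region
        if not cited_regions or focal_region in cited_regions:
            continue  # residence unknown, or at least one cited inventor is local
        cited.append(cited_id)
    return cited


# Hypothetical patents and inventor residences.
inventor_regions = {
    "P1": ["ES51"],
    "P2": ["FR10", "DE21"],
    "P3": ["ES51", "FR10"],
    "P4": ["ITC4"],
}
citations = [("P1", "P2"), ("P1", "P3"), ("P4", "P2"), ("P3", "P4")]
inflows = extra_regional_cited("ES51", citations, inventor_regions)
```

The technology codes of the retained cited patents would then feed the citation-based indices.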
We use an indicator of RELATEDNESS to account for knowledge inflows that are related to, but not the same as, the actual knowledge base of the region. This indicator is built in a similar fashion to Boschma and Iammarino (2009):
$$\mathrm{RELATEDNESS} = \sum_{j} \mathrm{CIT}^{M}_{4}(j) \times \mathrm{PAT}_{4}(j) \qquad (10)$$
where $\mathrm{CIT}^{M}_{4}(j)$ is the entropy measure obtained with data for extra-regional backward citations in four-digit technologies (subclasses) other than j, but within the same 35-field technology; and $\mathrm{PAT}_{4}(j)$ is the relative size of the four-digit patent technology j in the total regional patenting. The idea is that for each four-digit patent technology in a region (e.g., technology C07G), we measure the entropy of the citations to patents from the other four-digit subclasses (e.g., C07K, C12M, C12N, C12P, C12Q, C12R and C12S) pertaining to the same 35-field sector (e.g., the biotechnology field), excluding the focal four-digit subclass itself (i.e., subclass C07G).
In order to complement the analysis, and again in line with Boschma and Iammarino (2009), we also use an index to determine the similarity between the external knowledge entering a region and its existing knowledge base (SIMILARITY). In our case it is computed as the sum of the products of the absolute sizes of the four-digit subclass patents ($\mathrm{PAT}_{4}(j)$), as a proxy of the knowledge stock in a region, and the four-digit subclass extra-regional patents the former have cited ($\mathrm{CIT}_{4}(j)$):
$$\mathrm{SIMILARITY} = \sum_{j} \mathrm{PAT}_{4}(j) \times \mathrm{CIT}_{4}(j) \qquad (11)$$
This measure reaches its maximum when the region is specialized in just one technology and this technology coincides with that of the extra-regional patents cited. The lowest values are obtained when the region is more diverse in its patent portfolio as well as in the extra-regional patents it cites, and at the same time both profiles are less similar. When a region gets knowledge from other regions, but such knowledge comes from the same technologies present in the region, the knowledge base of the economy will be able to absorb it, but it will not add much to the existing knowledge.
Therefore, we expect SIMILARITY to have little or no effect on regional innovation. With these two indices (RELATEDNESS and SIMILARITY) we aim to measure how close the knowledge that flows into a region is to the current knowledge stock of that region, in order to infer the role of such relatedness in the creation of new knowledge.
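The two indices can be sketched as follows. This is a hedged illustration and not the authors' code: it assumes that the citation entropy is computed on shares normalized within the set of other subclasses of the same field, that base-2 logarithms are used, and that SIMILARITY uses raw counts ("absolute sizes"). All subclass codes, field labels and counts are hypothetical.

```python
import math


def entropy(weights):
    """Base-2 entropy of non-negative weights, normalized to shares internally."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    return sum((w / total) * math.log2(total / w) for w in weights if w > 0)


def relatedness(pat, cit, field_of):
    """Sum over subclasses j of (entropy of citations to other subclasses in the
    same field as j) times (share of j in regional patenting)."""
    total_pat = sum(pat.values())
    score = 0.0
    for j, n_j in pat.items():
        others = [c for k, c in cit.items() if k != j and field_of[k] == field_of[j]]
        score += entropy(others) * (n_j / total_pat)
    return score


def similarity(pat, cit):
    """Sum over subclasses of regional patents times extra-regional cited patents."""
    return sum(n_j * cit.get(j, 0) for j, n_j in pat.items())


# Hypothetical region: patents per subclass and extra-regional cited patents per subclass.
pat = {"a1": 10, "a2": 30, "b1": 60}
cit = {"a1": 5, "a2": 5, "a3": 10, "b1": 20}
field_of = {"a1": "F1", "a2": "F1", "a3": "F1", "b1": "F2"}

rel = relatedness(pat, cit, field_of)
sim = similarity(pat, cit)
```

In this toy case the subclass `b1` contributes nothing to RELATEDNESS, since all the external knowledge cited in its field comes from `b1` itself, whereas it dominates SIMILARITY.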

Data
We use a sample of 255 NUTS-2 European regions of 25 countries (the EU-27, except Cyprus and Malta, as well as Denmark and Greece, for which we have very little information at the NUTS-2 level, plus Norway and Switzerland) to estimate a regional KPF from 1999 to 2007. Our dependent variable, innovation output, is measured by patent applications, a variable widely used in the literature to proxy innovation outcomes. As widely documented, this proxy presents serious caveats, since not all inventions are patented, nor do they all have the same economic impact, as they are not all commercially exploitable (Griliches, 1991). In spite of these shortcomings, patent data have been considered useful for proxying inventiveness as they present minimal standards of novelty, originality and potential profits, and as such are a good proxy for economically profitable ideas (Bottazzi & Peri, 2003). We retrieve patent data at the regional level from the Organisation for Economic Co-operation and Development's (OECD) REGPAT database, July 2013 edition (Maraut, Dernis, Webb, Spiezia, & Guellec, 2008). When patents have been produced by inventors resident in different NUTS-2 regions, they have been assigned fractionally to those regions, according to the share of the inventors listed in the patent who live in each region (fractional counting).
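Fractional counting can be sketched as below. The patent identifiers and the input layout are hypothetical and do not reproduce the REGPAT file format.

```python
from collections import Counter, defaultdict


def fractional_counts(patents):
    """Regional patent counts where each patent is split across inventor regions.

    patents : patent id -> list with one NUTS-2 code per listed inventor
    """
    totals = defaultdict(float)
    for inventor_regions in patents.values():
        n_inventors = len(inventor_regions)
        if n_inventors == 0:
            continue  # no inventor residence recorded
        for region, k in Counter(inventor_regions).items():
            totals[region] += k / n_inventors
    return dict(totals)


# Two hypothetical patents: one with three inventors in two regions, one single-inventor.
patents = {"EP1": ["ES51", "ES51", "FR10"], "EP2": ["FR10"]}
regional_output = fractional_counts(patents)
```

Each patent contributes exactly one unit in total, so the regional counts add up to the number of patents.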
We slightly modify our dependent variable in order to account not only for the quantity of patents produced but also for their quality, as explained in previous sections. As largely argued in the related literature, the number of forward citations received presumably conveys information about the relevance of patents, thus providing a way of assessing the enormous heterogeneity in the value of patents (Hall et al., 2005). This point is confirmed by several studies that have found strong correlations between the number of forward citations received and the economic value of patents (Harhoff, Narin, Scherer, & Vopel, 1999; Lanjouw & Schankerman, 2004; Trajtenberg, 1990). We therefore use citations as an imperfect, but widely used, proxy for patent quality and weight the number of patents by the number of citations the patent has received in subsequent patent documents.7

As for the explanatory variables, R&D expenditures data (both private and public expenditures in regions) were collected from EUROSTAT and some national statistical offices. Data to measure ShareInd and High-tech Empl were also collected from EUROSTAT. As for the level of human capital of regions, which likely determines the regions' capacity to transform technological inputs into outputs, we use the variable HRST, which, according to EUROSTAT, includes all tertiary-educated workers employed in science and technology occupations (over all workers in the region).8

As mentioned above, variety indexes are constructed using the information of IPC codes listed in patent documents (again from the OECD REGPAT database, July 2013 edition). Based on the available data, there are 635 four-digit patent classes, 35 technological fields and five broad fields. Knowledge flows are proxied through patent citations as explained above. We use unit-record data retrieved from EPO patents to construct the patent citation variables (OECD Citations database, July 2013 edition; Webb, Dernis, Harhoff, & Hoisl, 2005).
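The citation weighting of the dependent variable can be sketched as below: each patent is multiplied by its number of forward citations plus 1 (so that uncited patents are not dropped) and the result is added up by region and year. The record layout, the function name and the numbers are assumptions made for illustration only.

```python
from collections import defaultdict


def citation_weighted_output(records):
    """Add up patent_share * (forward_citations + 1) by (region, year).

    `records` is an iterable of tuples:
    (region, year, patent_share, forward_citations).
    """
    totals = defaultdict(float)
    for region, year, patent_share, forward_citations in records:
        totals[(region, year)] += patent_share * (forward_citations + 1)
    return dict(totals)


# Hypothetical records; patent_share is the fractional count of the patent.
records = [
    ("ES51", 2003, 1.0, 0),  # uncited patent still counts once
    ("ES51", 2003, 0.5, 4),  # half a patent with four forward citations
    ("FR10", 2003, 1.0, 2),
]
weighted = citation_weighted_output(records)
```

In this toy case ES51 obtains 1.0 × 1 + 0.5 × 5 = 3.5 and FR10 obtains 1.0 × 3 = 3.0.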
All the patent data used to build the focal explanatory variables are retrieved for moving time windows of five years. Table 1 provides summary statistics of the variables used in the present analysis, whereas the correlation matrix of explanatory variables is given in Table A2 in the supplemental data online.9 We observe high correlations between some variables, although most do not jointly appear in the same regressions. For the remaining ones, Table A5, also online, shows additional regressions in which we remove some of the problematic variables to ensure that our results and conclusions hold.
Further, Figure A1, again online, maps the distribution of our variables (dependent and explanatory) as averages over the whole period. Interestingly, even if some of these variables seem to follow the same concentration pattern in core regions of Europe, others seem to be more spread across space.

Local variety and innovation
We estimate an unbalanced panel model of nine periods (1999-2007). Table 2 provides the two-way FE estimates for the regional KPF model, including all the controls listed in the third section. Columns (i)-(iii) use as the dependent variable the logarithmic transformation of the number of patents per 1 million inhabitants.
In all cases, the Hausman test rejects the null hypothesis that individual effects are uncorrelated with the independent variables, so the FE model is preferred over the random effects model (results are available from the authors upon request). In general, the KPF holds in the European regional case for the period under consideration. The elasticity of patents with respect to R&D expenditures presents significant values (0.13-0.22), which is in line with the values obtained in the literature (Bottazzi & Peri, 2003; Jaffe, 1989).
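The logic of a two-way FE estimate can be illustrated with the within transformation. The sketch below is a simplification under stated assumptions: it uses a balanced toy panel (the panel in the paper is unbalanced, where dummy variables or iterative demeaning would be needed), a single regressor, and made-up data generated with a known slope of 0.2. All names and values are hypothetical.

```python
def two_way_within(panel):
    """Remove region and year means from a balanced panel.

    `panel` maps (region, year) to a value; the result maps the same keys to
    value - region mean - year mean + grand mean.
    """
    regions = sorted({r for r, _ in panel})
    years = sorted({t for _, t in panel})
    grand = sum(panel.values()) / len(panel)
    region_mean = {r: sum(panel[(r, t)] for t in years) / len(years) for r in regions}
    year_mean = {t: sum(panel[(r, t)] for r in regions) / len(regions) for t in years}
    return {
        (r, t): panel[(r, t)] - region_mean[r] - year_mean[t] + grand
        for (r, t) in panel
    }


def fe_slope(x, y):
    """Slope of y on x after removing region and year fixed effects."""
    xd, yd = two_way_within(x), two_way_within(y)
    numerator = sum(xd[k] * yd[k] for k in xd)
    denominator = sum(xd[k] ** 2 for k in xd)
    return numerator / denominator


# Toy data: log R&D (x) and log patents (y) with region and year effects.
region_effect = {"A": 1.0, "B": 2.0, "C": 0.5}
year_effect = {1999: 0.0, 2000: 0.1, 2001: 0.3}
x_values = {"A": [1.0, 1.5, 1.2], "B": [2.0, 2.2, 3.0], "C": [0.5, 0.9, 0.7]}

x, y = {}, {}
for region, series in x_values.items():
    for year, value in zip(sorted(year_effect), series):
        x[(region, year)] = value
        y[(region, year)] = 0.2 * value + region_effect[region] + year_effect[year]

slope = fe_slope(x, y)
```

Because the toy outcome contains no noise, the within estimator recovers the slope used to generate the data, regardless of the region and year effects.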
With respect to the variety index, results indicate that the variety in knowledge stocks of regions is indeed positively and significantly related to regions' innovation output, similar to the results for the role of variety in employment and productivity (Boschma & Iammarino, 2009). Interestingly, once variety is split into related and unrelated, only related variety is significant. This result indicates that the higher the number of related technologies in a region, the larger the knowledge spillovers and, as a consequence, the greater the learning opportunities across them. That is, learning opportunities generated by a variety of technologies within the region are relevant when such technologies are related, which ultimately will generate more knowledge externalities across them. Meanwhile, if the technologies between which knowledge flows are far away from each other (unrelated variety), it will be more difficult to combine them and produce new ideas and innovation.
Columns (iv) and (v) of Table 2 look at patent quality, as explained above. All our results and conclusions with respect to columns (ii) and (iii) hold, except for the case of unrelated variety, whose point estimate increases considerably and now becomes highly significant. It seems, therefore, that when the combination of unrelated technologies is attained, not only is general innovation obtained (as suggested by the positive and significant parameter for related variety) but also knowledge of presumably high value and economic impact can be achieved, which accords with our expectations.
Overall, our results are qualitatively comparable with recent studies that have looked at related variety and innovation in regions (e.g., Castaldi et al., 2015; Tavassoli & Carbonara, 2014), even though we share with them neither the regions analysed (US states and Swedish functional regions respectively) nor their estimation method (negative binomial models).10 For comparison purposes, Table A3 in the supplemental data online presents negative binomial estimates of our preferred models. This implies taking the number of patents as the dependent variable instead of the number of patents per capita, and then having R&D as a regressor and not R&D per capita. We also add population as a control to account for size effects (in columns iii-vi). Results are comparable with our ordinary least squares (OLS) estimates, although Castaldi et al. (2015) do not find evidence of a positive effect of related variety on breakthrough innovations, as we do. Columns (v) and (vi) present random effects estimations to make our paper fully comparable with Tavassoli and Carbonara (2014). As expected, some of the coefficients become larger, making them closer to the ones found by these authors.11

Next, as argued in the introductory and theory sections, it is critical to account for the effect of extra-local knowledge sources on regions' innovative performance, as well as the degree of relatedness between the external knowledge that is brought into the region and the existing knowledge base. This issue is important from a methodological viewpoint, too, as estimates in earlier regressions could be biased if the external dimension is not accounted for. We turn to this next.

Technological relatedness and external linkages
This section looks at the role of external-to-the-region inflows of knowledge. To do so, we introduce a variable accounting for external flows of knowledge which are different from, but related to, the local knowledge base (RELATEDNESS), using data on patent citations to build it. For completeness, we also build a variable proxying for the amount of incoming knowledge flows that remain within the same technology (SIMILARITY). Table 3 shows the results when the RELATEDNESS and the SIMILARITY indices are included to consider explicitly to what extent the knowledge that flows from other regions is related to the knowledge stock of the host region. The remaining explanatory variables are those of Table 2. Reassuringly, as observed in column (i), the majority of coefficients do not change to a large extent, which indicates that the omission of the external dimension in Table 2 was not biasing our results concerning the role of related and unrelated variety.
From Table 3, column (i), we also learn that, contrary to our initial assumptions, RELATEDNESS does not significantly correlate with regional innovation. Thus, it seems that knowledge inflows that are different from, but related to, the local knowledge base do not create useful interconnections that can end up producing any significant innovation outcome. In turn, and against our expectations, the higher the SIMILARITY between the technological composition of the local knowledge and that of the cross-regional knowledge flows, the higher the impact on the regions' innovative output. In other words, if the knowledge that flows into a region comes from technologies in which the region already patents, there seem to be plenty of opportunities for using such knowledge in a creative way. As in Boschma et al. (2008), we interpret these results as evidence that the knowledge coming from other regions already conveys a certain degree of novelty as compared with the local knowledge base, which is not captured by the technological classification used in the present paper.
Interestingly, when the patents are weighted by their quality (column ii), the coefficient accompanying the RELATEDNESS index increases considerably and becomes statistically significant, suggesting that extra-regional knowledge that is complementary, but not similar, to the existing knowledge base in the region will particularly boost interactive learning that can bring about breakthrough innovations. We conclude, therefore, that in order to generate average innovations, it is necessary to have a certain level of technological similarity so as to have the opportunity to learn and absorb across technologies coming from different regions, whereas for the generation of more radical innovations, related, but not the same, incoming knowledge flows are also critical.

Robustness analysis
Several robustness analyses are presented in the supplemental data online. In Table A4, we test the theoretical statements discussed above through the use of a more general dependent variable on regional economic performance, namely the annual growth rate of gross domestic product (GDP) per capita. Although GDP growth is not a direct measure of innovation, its use avoids potential criticisms derived from the use of patent data to build both the dependent and independent variables, as we did in previous sections. Data on regional GDP per capita are retrieved from EUROSTAT, and the dependent variable is computed as the log of the ratio between per capita GDP at time t1 and per capita GDP at t0. Moreover, regressions include the log of per capita GDP at t0 as an additional control, as done in much of the growth literature.
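The construction of the growth variable and of the initial-level control can be written out as a short sketch. The function name and the GDP figures are hypothetical and serve only to illustrate the log-ratio definition given in the text.

```python
import math


def log_growth(gdp_pc_t0, gdp_pc_t1):
    """Log of the ratio between per capita GDP at t1 and per capita GDP at t0."""
    return math.log(gdp_pc_t1 / gdp_pc_t0)


# Hypothetical region whose per capita GDP rises by 3% between t0 and t1.
gdp_t0, gdp_t1 = 20000.0, 20600.0
growth = log_growth(gdp_t0, gdp_t1)  # dependent variable
initial_level = math.log(gdp_t0)     # additional control: log GDP per capita at t0
```

For small changes the log ratio is close to the percentage growth rate (here slightly below 0.03), which is why it is a common choice in growth regressions.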
Results reported in columns (i) and (ii) concerning related and unrelated variety are in line with much of the related literature for specific countries (see Frenken et al., 2007, for the Netherlands; Boschma & Iammarino, 2009, and Quatraro, 2010, for Italy; Bishop & Gripaios, 2010, for Great Britain; Hartog et al., 2012, for Finland; for Spain) even if in our regressions variety indicators are computed using technology fields from patent applications instead of employment by economic activities. The results reported show the significant impact of variety in both related and unrelated technologies. This evidence supports the hypothesis that economic growth benefits from diversification in technologies, too. Note that in previous tables we found that unrelated variety only impacts innovation if patents are weighted by their value using forward citations (breakthrough innovations). Interestingly, both related and unrelated variety strongly influence regional economic growth, which we attribute to the strong link between economic growth and breakthrough innovations, as witnessed by the recent report of the World Intellectual Property Organization (WIPO) (2015). Results concerning incoming knowledge flows and regional economic growth (column iii) are also consistent with the previous results presented in Table 3. Reassuringly, our results do not appear to be driven by mechanical correlation between dependent and independent variables, given that the use of an alternative measure not directly retrieved from patent documents, such as per capita GDP growth, supports our key findings.
Finally, as commented in the data section with respect to the high correlation between R&D expenditures and some of our focal variables, we now analyse the robustness of our results to potential collinearity problems. In Table A5 in the supplemental data online, we observe that after eliminating R&D expenditures from our models, the results are virtually unchanged. The same is true when the related and unrelated variety variables are removed from the equations (columns iv and v). This corroborates that potential collinearity problems do not drive the results obtained on the impact of variety and external relatedness on regional knowledge production.

CONCLUSIONS
This paper has investigated the role of variety on regional innovation for a sample of 255 NUTS-2 European regions from 25 countries from 1999 to 2007. In particular, it has looked at the differential role played by various degrees of relatedness, across different spatial scales, on regional patenting and on citations-weighted regional patenting.
According to our results, diversity of knowledge, or variety, is critical for regional innovation. However, only knowledge flowing from different but related technologies (related variety) will generate new knowledge that incrementally builds on established cognitive structures across related technologies, in line with the vast majority of previous studies. Notwithstanding these results, an interesting conclusion arises from our empirical approach when the patenting activity is weighted by the quality of such patents through the forward citations received, as an attempt to give more importance to breakthrough innovations. In this case, the more diversified a region is across unrelated technologies, the higher its output in terms of high-quality innovations. Thus, evidence supports the idea that general innovation benefits from diversification in related technologies, whereas more radical innovation also benefits from variety in unrelated technologies.
In addition, since knowledge can also be brought into a region from 'outside', we assess whether the degree of relatedness between incoming knowledge and the local knowledge base influences regional innovation performance. As is usually done in the related literature, knowledge flows are proxied through the use of backward patent citations. Our results show that extra-regional incoming knowledge flows have a higher impact the higher the similarity between these knowledge flows and the extant local knowledge base, which goes somewhat against our initial expectations. While this is true for the generation of average innovations, again differences emerge when accounting for the impact of the innovations produced: for radical innovations, the technological contents of the extra-regional linkages do not necessarily need to be very similar to the local technological base, but a certain degree of relatedness seems to be sufficient. This degree of relatedness ensures a certain cognitive proximity between agents located at a geographical distance, while at the same time bringing in the necessary variety to offer the building blocks for technological revolutions.

Regional diversification and relatedness are hot-button issues nowadays not only for academics but also for policymakers. These concepts have become especially relevant recently, as many European regions are still being hit by the economic crisis, which requires promoting new industries and economic activities (Boschma & Gianelle, 2013).
These academic concepts go hand in hand with the Smart Specialization Strategy policy. Smart specialization aims to focus policy support on key industries and economic activities that build on current national and regional strengths, thus avoiding picking sectors that do not match the actual and potential technological capabilities of regions (Boschma & Gianelle, 2013). The concept of relatedness is thus the appropriate academic tool for smart specialization policies, advocating the promotion of economic activities related to, but different from, the actual technological structure of regions (McCann & Ortega-Argilés, 2013). Notably, our results on the positive effects of unrelated variety, as well as on the role of similar versus related knowledge inflows from outside the region, have important policy implications in the framework of the European Union's smart specialization strategy, and must be accounted for.
Future research should look thoroughly at the effect of regional unrelated variety on breakthrough innovations. On the one hand, it would be interesting to analyse whether breakthrough innovations in a region, i.e., those at the upper tail of the citations distribution, actually combine technology classes that are unrelated, as defined through co-occurrence analysis (see Boschma, Balland, & Kogler, 2015, as an example of this type of analysis), but present in the region concerned. On the other hand, it is plausible that the impact of technological unrelated variety on the generation of breakthrough innovations is stronger in the long run, since the combination and recombination of previously unrelated technologies may take some time to materialize. Thus, it would be interesting to analyse the time profile of the impact of related and unrelated variety on the probability of producing breakthroughs.

FUNDING
The authors acknowledge the financial support from the project 'Redes de colaboración tecnológica e innovación. Determinantes y efectos sobre la competitividad de las empresas españolas' funded by the Fundación BBVA Ayudas a Proyectos de Investigación 2014. Ernest Miguelez acknowledges financial support from the Regional Council of Aquitaine (Chaire d'Accueil en Economie de

NOTES
1. This paper uses 'breakthrough innovations', 'radical breakthroughs' and 'radical innovations' interchangeably. They all try to convey the idea that not all inventions have the same technological and economic impact, and therefore this heterogeneity in innovation quality needs to be taken into account. In the empirical part of this study, heterogeneity is accounted for by weighting the number of patents produced in regions by the forward citations each receives.
2. A complementary perspective is offered by the branching literature (after Hidalgo, Klinger, Barabasi, and Hausmann (2007) at the country level), which looks at whether variety enhances regional diversification, that is, renewal and broadening of an economy's industrial base (Xiao, Boschma, & Andersson, 2016). Indeed, as Frenken and Boschma (2007) suggest, regions tend to diversify into economic activities related to the existing portfolio of local industries. Therefore, this idea of regional branching into related manufacturing industries is especially useful for understanding how new economic growth paths may be linked to pre-existing industrial structures in a region (Tanner, 2014). Evidence on how regions diversify over time is now large too and includes the case of Swedish regions (Neffke, Henning, & Boschma, 2011), Spanish regions (Boschma, Minondo, & Navarro, 2013) and US metropolitan areas (Boschma et al., 2015; Essletzbichler, 2015; Kogler, Rigby, & Tucker, 2013), yet showing at the same time that the process of technological transition is relatively slow (Rigby, 2015).
3. Other studies have also looked at the role of variety in patents (Kogler et al., 2013; Rigby, 2015; Tanner, 2016), scientific publications or new firm formation (Guo, He, & Li, 2015; Colombelli, 2016).
4. Recent case-study work has called attention to the relevance of external linkages for creating knowledge diversification. For instance, Binz, Truffer, and Coenen (2014) look at membrane bioreactor technology and show that networks transcending national borders are of great importance for innovation processes, and therefore deserve more attention in theoretical and empirical work. More systematic evidence is presented by Neffke, Hartog, Boschma, and Henning (2014), who argue that the unrelated diversification needed for structural change is mostly created via non-local firms and entrepreneurs, according to the evidence they obtain using Swedish matched employer-employee data.
5. Subclasses are further divided into groups and subgroups, so each IPC code can contain up to 10 digits.
6. See Table A1 in the supplemental data online for the list of the 35 fields and the five broad fields.
7. To compute this variable, we simply multiply the patents by the number of forward citations they received, and add up by region and year. In order to avoid eliminating a patent that has not received any forward citation, we multiply the number of patents by the number of citations plus 1, that is: Patents*(Citations + 1).
8. We have experimented with alternative measures of human capital, such as the share of tertiary-educated inhabitants (data from EUROSTAT), but the coefficient associated with this variable tends to be smaller and largely not significant. Results are available from the authors upon request. This result confirms the intuition that only those more directly involved in knowledge and innovation activities are likely to determine the regions' capacity to innovate.
9. In the empirical analyses, because of the existence of zero patents in some cases, a small constant, 1, is added before the logarithmic transformation.
10. Tavassoli and Carbonara (2014) estimate a panel negative binomial model employing data for the 81 Swedish functional regions (local labour markets) over the period 2002-07 and provide robust evidence that related variety of knowledge plays a greater role than unrelated variety. Castaldi et al. (2015), using patent data for US states in 1977-99, provide evidence that innovation in general benefits from diversification in related technologies, whereas states with higher unrelated variety would outperform states with lower unrelated variety in producing breakthrough innovations.
11. Our empirical model (the regional knowledge production function) draws mainly from a large number of contributions in regional science and innovation economics trying to understand the role played by regional innovative efforts (R&D) and the technological structure of regions in regional innovative output. Other approaches have extended the regional KPF to include a large number of potential non-technology determinants of regional innovation outputs. We face a trade-off here between the accuracy of our empirical model (we want all the potential controls to be there) and completeness (we want to analyse a large number of regions and years). For instance, one interesting hypothesis to test would be the role of institutions and social capital in innovation, and more importantly, how they influence regional variety's role in fostering regional innovation (see Boschma, 2016, for a call for research in this direction). However, institutions and social capital variables are usually available for fewer regions, or at the NUTS-1 level, or for short periods of time (normally, they are not available on a yearly basis). Given that this is not the primary focus of our analysis, we have chosen to go for a large sample of regions and years at the expense of not adding these types of variables. Yet, contrary to what is still the large majority of empirical studies using the regional KPF, we control for region fixed effects, and therefore account for all time-invariant features of regions that may influence the regional production of innovations (with institutions or social capital variables, which evolve slowly over time, being partially controlled for through these fixed effects).