Empirical modelling of survey-based expectations for the design of economic indicators in five European regions

In this study we use agents’ expectations about the state of the economy to generate indicators of economic activity in twenty-six European countries grouped in five regions (Western, Eastern, and Southern Europe, and Baltic and Scandinavian countries). We apply a data-driven procedure based on evolutionary computation to transform survey variables into economic growth rates. In a first step, we design five independent experiments to derive a formula using survey variables that best replicates the evolution of economic growth in each region by means of genetic programming, limiting the integration schemes to the main mathematical operations. We then rank survey variables according to their performance in tracking economic activity, finding that agents’ “perception about the overall economy compared to last year” is the survey variable with the highest predictive power. In a second step, we assess the out-of-sample forecast accuracy of the evolved indicators. Although we obtain different results across regions, Austria, Slovakia, Portugal, Lithuania and Sweden are the economies of each region that show the best forecast results. We also find evidence that the forecasting performance of the survey-based indicators improves during periods of higher growth.


Introduction
Agents' expectations about the state of the economy are instrumental for economic modelling. Business and consumer surveys, also known as tendency surveys, are directly addressed to economic agents as a means to measure their expectations. Respondents are asked about the expected direction of change of a wide range of variables (capital expenditures, private consumption, exports, imports, etc.). Accordingly, survey results provide important information about agents' economic expectations, allowing comparisons among different countries' business cycles. On the one hand, sectoral results of the surveys have often been used as partial indicators for the construction of more general aggregate economic indicators and for the estimation of macro magnitudes through their introduction in econometric models (Abberger, 2007; Bruestle and Crain, 2015; Graff, 2010; Hansson et al., 2005; Lehmann and Wohlrabe, 2017; Wilms et al., 2016). On the other hand, survey-based expectations have also been introduced into behavioural equations postulated by economic theory, such as the Phillips curve, and used to evaluate the formation of expectations, as they provide a direct measure of expectations to test the rationality of agents (Altug and Çakmakli, 2016; Bovi, 2013; Jean-Baptiste, 2012; Lee, 1994; Miah et al., 2016; Paloviita, 2006). See Pesaran and Weale (2006) for a review of the uses of survey data for testing and modelling of expectations.
Survey results are available ahead of the publication of quantitative official data, which makes them very useful for monitoring the evolution of the economy. Nevertheless, the fact that survey-based expectations are qualitative in nature has focused research on the development of different approaches to transform survey responses into quantitative measures of agents' expectations. See Driver and Urga (2004), Nardo (2003) and Pesaran (1987) for a review of methods for the quantification of survey results. Recent developments in empirical modelling have made it possible to develop conversion approaches based on evolutionary computation. This study extends previous research by Claveria et al. (2016), who proposed an evolutionary-based two-step procedure to generate estimates of economic growth. The authors derived preliminary building blocks defined as simple combinations of survey variables, and then linearly combined the functions to generate estimates of economic growth in Central and Eastern European economies, finding that the forecasting performance of evolved survey-based indicators could be improved by designing ad-hoc quantification procedures for countries with similar characteristics.
These findings have led us to use evolutionary computation to generate indicators of economic growth that combine different survey variables of 26 European countries grouped into five major European regions (Western, Eastern, and Southern Europe, and Baltic and Scandinavian countries). First, we design five independent experiments that link survey expectations to economic growth, limiting the preliminary functions to the main mathematical operations with the aim of facilitating the implementation of the evolved economic indicators. Once we obtain the optimal combination of survey variables that best replicates the evolution of economic activity in each region, we rank the expectations according to the relative weight of each one in the evolved indicators. In a second step, we assess the out-of-sample forecast accuracy of the obtained economic indicators country by country.
Some features of empirical modelling make it particularly well suited to the problem at hand. First, empirical modelling is especially suitable for finding patterns in large data sets with little or no prior information about the system. Second, empirical modelling allows us to simultaneously evolve both the structure and the parameters of the model without imposing any assumptions regarding agents' expectations. In a recent study, Lahiri and Zhao (2015) found significant improvements in the forecasting performance of quantified expectations when relaxing the assumptions of quantification methods of qualitative survey data.
The empirical modelling approach applied in this research is based on symbolic regression (SR) via genetic programming (GP). While SR is a modelling approach characterised by the search of the space of mathematical expressions that best fit a given dataset, GP is a soft computing search technique for problem-solving (Cramer, 1985). GP is based on the implementation of genetic algorithms (GAs), which are a specific type of evolutionary algorithm (EA). Evolutionary computation can be regarded as a subfield of artificial intelligence, and is being increasingly applied to automated problem solving in economics.
The main aim of this study is twofold. On the one hand, we implement GP to find the optimal combinations of survey expectations to forecast economic growth at a regional level, restricting the integration schemes to the four main mathematical operations so as to obtain easily replicable expressions. This allows us to rank survey variables according to their predictive capacity. On the other hand, we assess the forecasting performance of the evolved economic indicators in each country and compare it to a benchmark model. The structure of the paper is as follows. The next section reviews the existing literature. In Section 3 we present the methodological approach, describing the data and the experimental set-up. Empirical results are provided in Section 4. Finally, conclusions are drawn in Section 5.

Literature review
Economic expectations have been widely studied (Pesaran, 1987; Visco, 1984; Wren-Lewis, 1986). Tendency surveys ask respondents whether they expect a variable to rise, to remain constant, or to fall. The relationship between quantitative data and survey results was first formalised by Anderson (1952), who regressed the actual average percentage change of an aggregate variable on the percentage of respondents expecting a variable to rise and to fall. Carlson and Parkin (1975) developed the theoretical framework for quantifying survey expectations by assuming that respondents report a variable to go up if the mean of their subjective probability distribution lies above a threshold level, also known as indifference interval (Theil, 1952).
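To illustrate the type of regression attributed to Anderson (1952) above, the following minimal sketch regresses a realised percentage change on the shares of respondents expecting a rise and a fall. The no-intercept specification, the function name `anderson_regression` and the synthetic data are assumptions made for illustration only; they are not taken from the survey data used in this study.

```python
import numpy as np

def anderson_regression(y, rise, fall):
    """Least-squares fit of y_t on the shares of respondents expecting a
    rise (R_t) and a fall (F_t); no intercept is assumed in this sketch."""
    X = np.column_stack([rise, fall])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # (alpha, beta) such that y_t ~ alpha * R_t + beta * F_t

# Synthetic illustration: hypothetical shares (in %) and a known relationship.
rng = np.random.default_rng(0)
rise = rng.uniform(10, 60, size=40)
fall = rng.uniform(5, 40, size=40)
y = 0.08 * rise - 0.05 * fall + rng.normal(0, 0.1, size=40)

alpha, beta = anderson_regression(y, rise, fall)
quantified = alpha * rise + beta * fall  # quantitative series implied by the fit
```

The fitted coefficients convert the qualitative response shares into a quantitative series, which is the basic idea that later quantification methods refine.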
This relationship has also been explored by matching individual responses with firm-by-firm realisations, both empirically (Lahiri and Zhao, 2015; Lui et al., 2011a, b; Mitchell et al., 2002, 2005a; Mokinski et al., 2015; Müller, 2010) and experimentally via Monte Carlo simulations (Claveria et al., 2006; Nardo, 2003). Common (1985) used experimental expectations to test the rational expectations hypothesis. Muth (1961) assumed that rationality implied that expectations had to be generated by the same stochastic process that generates the variable to be predicted. While Common (1985) rejected the presence of rational agents, Miah et al. (2016) have recently found survey expectations in 18 emerging economies to be mostly unbiased and efficient. Simulation experiments have also been used to assess the forecasting performance of different quantification methods of survey expectations (Claveria, 2010; Löffler, 1999; Nardo and Cabeza-Gutés, 1999; Terai, 2009).
In this study we fill this gap by linking survey data and economic growth in a SR setting solved by means of evolutionary computation. This approach is based on the implementation of GAs, which adopt Darwinian principles of the theory of natural selection in the context of expensive optimisation (Fogel et al., 1966). GAs are the most common type of EA, and were initially proposed by Holland (1975). GP allows the model structure to vary during the evolution, which makes it particularly suitable for non-linear and empirical modelling. See Banzhaf et al. (2008), Dabhi and Chaudhary (2015) and Poli et al. (2010) for a review of the state of the art in GP.
Most economic applications of evolutionary computing are in finance (Chen and Kuo, 2002; Fogel, 2006; Goldberg, 1989). GAs have been used to predict the financial failure of firms (Acosta-González and Fernández, 2014), to explain the 2008 financial crisis (Acosta-González et al., 2012), to model exchange rates (Lawrenz and Westerhoff, 2003), to evaluate the convergence to the rational expectations equilibrium (Maschek, 2010), to optimize the signals generated by technical trading tools (Thinyane and Millin, 2011), to forecast stock price trends in Taiwan (Wei, 2013), etc. See Drake and Marks (2002) for a review of the applications of GAs in financial forecasting.
Regarding GP, Vasilakis et al. (2013) proposed a GP-based technique to predict returns in the trading of the euro/dollar exchange rate. GP has also been applied to model short-term capital flows (Yu et al., 2004), to forecast exchange rates (Álvarez-Díaz and Álvarez, 2005), and to forecast stock prices (Chen et al., 2008; Kaboudan, 2000; Larkin and Ryan, 2008; Wilson and Banzhaf, 2009). Wilson and Banzhaf (2009) compared a developmental co-evolutionary GP approach to standard linear GP for interday stock price prediction. Alexandridis et al. (2017) have recently compared the forecasting performance of GP in the context of weather derivatives pricing with other state-of-the-art machine learning algorithms and classic linear approaches, finding that non-linear methods outperformed the alternative linear models significantly.
Up until now there have been very few applications of GP in macroeconomics. The first GP application is that of Koza (1992), who used GP to solve a SR problem designed to reassess the exchange equation, relating the price level, gross national product, money supply, and the velocity of money. More recent macroeconomic applications of GP have focused on forecasting (Chen et al., 2010; Duda and Szydło, 2011). Ferreira (2011) developed a version of GP known as gene expression programming (GEP). Recently, Peng et al. (2014) proposed an improved GEP algorithm especially suitable for dealing with SR problems. Gandomi and Roke (2015) compared the forecasting performance of artificial neural network models to that of GEP techniques.
SR is an empirical modelling technique used to construct regression models. Given a predetermined set of operations and functions, SR searches appropriate models from the space of all possible mathematical expressions that best fit the data. Zelinka et al. (2005) introduced analytical programming in order to synthesise suitable solutions in SR.
Given its versatility, SR has been increasingly used in different areas (Barmpalexis et al., 2011; Ceperic et al., 2014; Sarradj and Geyer, 2014; Vladislavleva et al., 2010; Wu et al., 2008; Yang et al., 2015; Yao and Lin, 2009; Zameer et al., 2017), but there have been very few SR applications in macroeconomics. Claveria et al. (2016) implemented SR via GP to derive a set of building blocks used to estimate economic activity. Kľúčik (2012) used SR to estimate total exports and imports to Slovakia. Kotanchek et al. (2010) implemented SR via GP to predict economic activity. By means of SR, Kronberger et al. (2011) identified interactions between economic indicators in order to estimate the evolution of prices in the US. The authors suggested using SR for the exploration of variable interplay when approaching complex modelling tasks, as it provides a quick overview of the most relevant interactions and can help to identify new unknown links between variables.
In this study we design five independent SR experiments and apply GP in order to find the optimal combinations of survey expectations that best fit the actual evolution of economic activity in each region. We also assess the forecast accuracy of the evolved economic indicators and compare it with several benchmark models.

Data and Methodology
In this study we use SR via GP to formalize the optimal interactions between survey variables that best predict economic growth, restricting them to the main mathematical operations (addition, subtraction, multiplication, and division). In order to do so, we need to combine two types of information: qualitative survey expectations and quantitative official statistics from 2000:Q2 to 2016:Q3. Regarding the former, we make use of survey data on expectations from the World Economic Survey (WES) carried out quarterly by the Ifo Institute for Economic Research. As a proxy of economic activity we use the year-on-year growth rates of the Gross Domestic Product (GDP) retrieved from the Organisation for Economic Co-operation and Development (OECD) (https://data.oecd.org/gdp/quarterly-gdp.htm#indicator-chart).
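As a brief illustration of the target variable, a year-on-year growth rate for quarterly data compares each quarter with the same quarter of the previous year. The sketch below assumes that GDP is available in levels; the function name `yoy_growth` and the figures are hypothetical and are not OECD data.

```python
def yoy_growth(levels, lag=4):
    """Year-on-year percentage growth; lag=4 assumes quarterly observations."""
    return [100.0 * (levels[t] / levels[t - lag] - 1.0)
            for t in range(lag, len(levels))]

# Hypothetical quarterly GDP levels (illustrative numbers only).
gdp = [100.0, 101.0, 102.0, 103.0, 104.0, 103.02]
growth = yoy_growth(gdp)  # one growth rate per quarter from the fifth onwards
```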
The analysis is carried out for 26 European economies grouped into five regions based on the criteria used for statistical processing purposes by the United Nations Statistics Division. As a result, Austria, Belgium, France, Germany, Ireland, the Netherlands and the United Kingdom (UK) are grouped as Western Europe (1); Bulgaria, the Czech Republic, Hungary, Poland, Romania and the Slovak Republic as Eastern Europe (2); Croatia, Greece, Italy, Portugal, Slovenia and Spain as Southern Europe (3); Estonia, Latvia and Lithuania as the Baltic countries (4); Denmark, Finland, Norway and Sweden as the Scandinavian countries (5). In Table 1 we present the twelve survey variables used in the study, denoted as $X_{it}$, where $i$ refers to each country and $t$ to the time period. Survey variables can be divided into judgements, perceptions and expectations, depending on whether they refer to the expected value in the present, in the present compared to last year, or for the next six months. See Kudymowa et al. (2013), Hutson et al. (2014), and Garnitz et al. (2015) for an appraisal of the WES data.
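To make the set-up more concrete, the following minimal sketch shows how SR via GP restricted to the four basic operations might be implemented. It is not the implementation used in this study: the tree encoding, the protected division, tournament selection, elitism, the mutation rate, the size limit `max_nodes`, the RMSE fitness function and the synthetic data are all assumptions made for illustration.

```python
import math
import operator
import random

def protected_div(a, b):
    # Assumption: "protected" division returns 1.0 when the divisor is ~0.
    return a / b if abs(b) > 1e-9 else 1.0

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": protected_div}

def random_tree(n_vars, depth, rng):
    """Grow a random expression tree; leaves are survey variables or constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.8:
            return ("x", rng.randrange(n_vars))
        return ("c", rng.uniform(-1.0, 1.0))
    op = rng.choice(list(OPS))
    return (op, random_tree(n_vars, depth - 1, rng), random_tree(n_vars, depth - 1, rng))

def evaluate(tree, row):
    """Evaluate a tree on one observation (a list of survey variable values)."""
    if tree[0] == "x":
        return row[tree[1]]
    if tree[0] == "c":
        return tree[1]
    return OPS[tree[0]](evaluate(tree[1], row), evaluate(tree[2], row))

def rmse(tree, X, y):
    """Fitness: root mean squared error; non-finite results count as infinite."""
    try:
        sq = [(evaluate(tree, row) - target) ** 2 for row, target in zip(X, y)]
        err = math.sqrt(sum(sq) / len(y))
    except OverflowError:
        return float("inf")
    return err if math.isfinite(err) else float("inf")

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the tree."""
    yield path, tree
    if tree[0] in OPS:
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    children = list(tree)
    children[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(children)

def crossover(a, b, rng):
    """Replace a random node of a with a random subtree of b."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

def mutate(tree, n_vars, rng):
    """Replace a random node with a freshly grown small subtree."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(n_vars, 2, rng))

def evolve(X, y, pop_size=200, generations=50, max_nodes=60, seed=0):
    """Evolve expressions for a fixed number of generations (stopping criterion)."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    pop = [random_tree(n_vars, 3, rng) for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        scored = sorted(((rmse(t, X, y), t) for t in pop), key=lambda p: p[0])
        history.append(scored[0][0])
        nxt = [scored[0][1]]  # elitism: the best individual survives unchanged
        while len(nxt) < pop_size:
            p1 = min(rng.sample(scored, 3), key=lambda p: p[0])[1]  # tournament
            p2 = min(rng.sample(scored, 3), key=lambda p: p[0])[1]
            child = crossover(p1, p2, rng)
            if rng.random() < 0.2:
                child = mutate(child, n_vars, rng)
            if sum(1 for _ in subtrees(child)) > max_nodes:
                child = p1  # reject oversized offspring to limit bloat
            nxt.append(child)
        pop = nxt
    return min(pop, key=lambda t: rmse(t, X, y)), history

# Synthetic illustration with three hypothetical survey variables.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(40)]
y = [row[0] * row[1] + row[2] for row in X]
best, history = evolve(X, y, pop_size=60, generations=20)
```

Because the best individual is carried over unchanged, the best fitness cannot deteriorate from one generation to the next. Capping the tree size is one simple way of handling the trade-off between accuracy and simplicity of the evolved expressions.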
By means of GP we evolve a symbolic expression for each region combining the different survey variables for each country until a stopping criterion is reached. This criterion can be either a predetermined value of the fitness function or a given number of generations. As there is a trade-off between accuracy and simplicity, we have chosen a maximum of 50 generations as the stopping criterion. In Table 2 we summarize the steps for implementing the experiment in each of the regions.
[...] European countries and found that the question related to production expectations was more useful in improving the forecasting performance than the aggregated confidence and sentiment indicators.
The MASE, proposed by Hyndman and Koehler (2006), scales the forecast errors by the mean absolute in-sample error obtained with a benchmark model. This statistic presents several advantages over other forecast accuracy measures. On the one hand, it is independent of the scale of the data. On the other hand, it is easy to interpret: values less than one indicate that the average in-sample prediction computed with the benchmark model is worse than the estimates obtained with the proposed method.
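The MASE described above, together with the PLAE ratio introduced in the next paragraph, can be sketched as follows. This is an illustrative implementation under stated assumptions: the scaling uses the in-sample one-step naive forecast, ties count against the model under evaluation in the PLAE, and all series are hypothetical.

```python
def mase(actual, forecast, insample):
    """Mean absolute scaled error: out-of-sample MAE divided by the in-sample
    MAE of the one-step naive forecast (assumed benchmark for the scaling)."""
    scale = sum(abs(insample[t] - insample[t - 1])
                for t in range(1, len(insample))) / (len(insample) - 1)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale

def plae(actual, model_forecast, benchmark_forecast):
    """Proportion of periods in which the model's absolute forecast error
    is strictly lower than that of the benchmark."""
    wins = sum(abs(a - m) < abs(a - b)
               for a, m, b in zip(actual, model_forecast, benchmark_forecast))
    return wins / len(actual)

# Hypothetical growth rates (illustrative numbers only).
insample = [1.0, 2.0, 4.0, 3.0]
actual = [3.0, 2.0, 4.0, 5.0]
model = [2.5, 2.5, 3.0, 5.5]
naive = [insample[-1]] + actual[:-1]  # previous observation as the forecast

mase_value = mase(actual, model, insample)
plae_value = plae(actual, model, naive)
```

A MASE below one and a PLAE above one half would both point to the model under evaluation outperforming the benchmark in this toy example.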
With the aim of finding an easy-to-interpret measure to compare the forecast accuracy between two models, Claveria et al. (2015) proposed the PLAE statistic, which is also a dimensionless measure. The PLAE is based on the CJ statistic proposed by Cowles and Jones (1937) for testing market efficiency and the 'percent better' measure proposed by Makridakis and Hibon (2000) to compare the forecast accuracy of the models to a random walk. The PLAE consists of a ratio that calculates the proportion of periods in which the model under evaluation obtains a lower absolute forecasting error than the benchmark model:
$$\mathrm{PLAE} = \frac{1}{n}\sum_{t=1}^{n}\lambda_t, \qquad \lambda_t = \begin{cases} 1 & \text{if } |e_{t,A}| < |e_{t,B}| \\ 0 & \text{otherwise} \end{cases}$$
where $e_{t,A}$ is the forecast error of the model under evaluation in period $t$, $e_{t,B}$ is that of the benchmark model, and $n$ is the number of forecast periods.
When comparing the obtained out-of-sample forecasts with the models used as a benchmark (last two columns of Table 3), MASE values show that in the case of Germany, Ireland, Italy and Greece, the forecast accuracy of the evolved indicators does not improve on that of the in-sample average prediction of the naive method. The PLAE values obtained in these four countries also highlight that the percentage of out-of-sample periods in which the proposed regional indicator generates lower absolute forecasting errors is very low. Conversely, the high PLAE values obtained for both the naive method and the AR model for Croatia, Spain, Lithuania and Sweden are indicative of the good forecasting performance of the generated indicators for these countries. In most cases, the PLAE values obtained for both benchmarks are very similar, with the exception of Bulgaria and Slovakia, where the relative performance of the AR model improves. In Fig. 1 we graphically compare actual and predicted economic growth.
By discriminating between these two states of growth, we can graphically determine whether there are notable differences in the accuracy of the estimates of economic activity across regions. In Fig. 2 we present the boxplots for each region.
We want to note that empirical correlation values in the smaller samples containing the extreme values are likely to be higher than in the subsets containing the remaining larger samples.
In Fig. 2 we can observe that the highest correlations during periods of high growth rates are obtained in Western Europe, with the exception of Ireland. It can also be seen that in all regions the performance of the evolved indicators seems to vary depending on the level of dispersion: during periods of average growth the correlation between estimates and actual values is lower than during periods of high growth rates.
In the present study we also find evidence regarding the informative value of survey-based expectations. Our results are in line with recent findings by Altug and Çakmakli (2016), Klein and Özmucur (2010), Kłopocka (2017) and Lehmann and Wohlrabe (2017). Altug and Çakmakli (2016) found survey expectations useful to improve inflation forecasts. Klein and Özmucur (2010) found evidence that survey expectations improved the forecasting performance of autoregressive time series models in European countries. Kłopocka (2017) showed the usefulness of survey indicators to forecast household saving and borrowing rates in Poland. Lehmann and Wohlrabe (2017) found that consumers' unemployment expectations and new orders improved predictions of employment growth in Germany.
While there is ample evidence in the literature in favour of the usefulness of expectations to improve the predictive capacity at the macroeconomic level (Batchelor and Dua, 1992, 1998; Batchelor and Orr, 1988; Christiansen et al., 2014; Dees and Brinca, 2013; Girardi, 2014; Hansson et al., 2005; Ivaldi, 1992; Kumar et al., 1995; Leduc and Sill, 2013; Lemmens et al., 2005; Müller, 2009; Qiao et al., 2009; Schmeling and Schrimpf, 2011), several authors have recently proposed refinements in order to enhance the explanatory power of survey expectations in forecasting models. Bruestle and Crain (2015) have shown that controlling for significant versus insignificant changes in consumer confidence improved the accuracy of household expenditure forecasting models. Wilms et al. (2016) have suggested selecting survey indicators from the most predictive industries in order to improve the predictive capacity of survey data. Similarly, Dreger and Kholodilin (2013) have noted that better performing survey-based indicators should be built upon pre-selection methods and data-driven approaches to determine the weights.
In this work we have shown the appropriateness of the SR framework for empirical modelling. SR makes it possible to address complex modelling issues in large data sets where the potential relationships between variables are unknown. In these circumstances, the implementation of SR via evolutionary computation provides researchers with an overview of the most relevant interactions and helps to identify new unknown links between variables. These features make this approach particularly suitable for non-linear modelling.
By means of GP we have simultaneously evolved the structure and the parameters of the models without imposing any a priori assumptions. In this regard, Bruno (2014) has recently noted the importance of avoiding restrictive assumptions about the functional form when modelling using survey indicators. Thus, a SR via GP approach can be of particular interest when it comes to quantifying survey expectations, constructing data-driven survey-based indicators or testing economic hypotheses about the formation of agents' expectations.

Conclusion
This paper proposes an empirical modelling approach to design survey-based economic indicators at a regional level. By means of SR via GP we find the optimal combination of survey variables that best tracks the evolution of the economic activity in twenty-six European countries grouped in five regions (Western Europe, Eastern Europe, Southern Europe, Baltic countries and Scandinavian countries). This data-driven approach based on evolutionary computation allows us to transform qualitative survey expectations into quantitative estimates of economic activity.
We have used survey variables regarding expectations about the economic situation from the World Economic Survey in order to find the most relevant interactions in each region. This exercise allows us to rank the expectations according to the relative weight of each one in the evolved economic indicators. Although results differ across regions, agents' "perception about the overall economy compared to the same time last year" is the best predictor of economic activity.
In a second step, we assess the out-of-sample forecast accuracy of the evolved survey-based indicators in each region. The best forecasting performance is obtained for Austria and the UK in Western Europe, for Slovakia in Eastern Europe, for Portugal in Southern Europe, for Lithuania in the Baltic countries, and for Sweden in the Scandinavian countries. At the regional level we obtain the best results for the Baltic and the Scandinavian countries.
Finally, we evaluate if there are differences in the accuracy of the estimates of economic activity across regions depending on the level of growth. We find that during periods of average growth rates the correlation between estimates and actual values is lower in all regions. The highest correlations during periods of high variability are obtained in Western Europe.
In spite of the novelty of the proposed approach, this research is not without limitations. On the one hand, given that we used a data-driven method, the evolved economic indicators are not grounded in any theoretical background. On the other hand, extending the analysis to other survey data would allow us to examine the extent of the similarities in the derived functional forms. Another issue left for further research is testing whether the implementation of alternative algorithms could improve the forecast accuracy of empirically generated quantitative estimates of expectations.