Getting life expectancy estimates right for pension policy: period versus cohort approach

Abstract In many policy areas it is essential to use the best estimates of life expectancy, but it is vital to most areas of pension policy. This paper presents the conceptual differences between static period and dynamic cohort mortality tables, estimates the differences in life expectancy for Portugal and Spain, and compares official estimates of both measures for Australia, the United Kingdom, and the United States for 1981, 2010, and 2060. These comparisons reveal major differences between period and cohort life expectancy in and between countries and across years. The implications of using wrong estimates for pension policy, including financial sustainability, are explored.


Motivation and roadmap
It is critical to know and apply the best estimate of individuals' remaining life expectancy in many areas of public policy, but this estimate is vital for pension policy in both the public and private sectors. Policy makers and private sector managers need to know with high confidence the average number of years that the newly born are expected to live and how this estimate will change in the decades to come. For pension policy, the best estimates of remaining life expectancy at retirement are crucial for determining the initial benefit or for pricing retirement income products. With the long-term decrease in age-specific mortality rates (the flip side of life expectancy estimates), the long-term increase in life expectancy at a specific age is increasingly anchored in public law and private sector contracts (for example, linking qualifying conditions, initial benefits, or retirement age to an estimate of life expectancy); in turn, the correct estimates are critical for establishing the financial sustainability of public and private sector schemes and for developing new retirement products.
Two main approaches are used to estimate life expectancy: one relies on period life tables, the other on cohort life tables. The period approach is simpler, as it uses mortality information across all ages for a recent period (e.g., a three-year average) to estimate mortality rates, and from these, life expectancy at a specific age. This approach ignores past and likely future improvements, that is, the trend in the reduction of mortality rates and the consequent increase in life expectancy. The cohort approach incorporates the expected mortality improvement unique to each specific birth cohort, estimating the expected development in mortality rates and life expectancies for each birth cohort, by gender. This approach is much more ambitious and depends on many more assumptions. For this reason, most countries shy away from offering official cohort tables. But even reliable and country-produced official period tables are the exception rather than the rule for most countries across the world. Low-income and many emerging countries typically rely on United Nations (UN) estimates that are grounded in period tables; for projections, they apply a robust cohort-type approach adjusted to typically low-quality data.² Thus the differences between the conjectured higher and more reliable cohort life expectancies and their lower and biased period estimates are normally unknown.
This paper is structured as follows. The second section presents the conceptual difference between period and cohort life expectancy (CLE) measures and the methodological approach used by the authors to quantify it. A brief overview of the different estimation methods used in the literature is included. The third section presents the authors' estimates of period and cohort life tables and life expectancy at birth and at an assumed retirement age of 65 for Portugal and Spain. The fourth section broadens the international comparison to official estimates of both period and CLE for distant past, recent, and future years for Australia, the United Kingdom, and the United States. These comparisons reveal major differences between estimates of period and CLE in and between countries and across years that amount to a subsidy rate on benefits of 30% and more if the inadequate lower estimate is chosen. The fifth section discusses the main policy implications of these differences, analyzing: the scope of the differences in economic terms between individuals; what this means for public pension schemes' financial sustainability and their balancing mechanism; how these differences add to the observed heterogeneity of longevity in period life considerations; and how this affects recent pension reforms that link life expectancy to scheme parameters. The final section summarizes the results and implications and proposes a simple way forward.

Differences between period and cohort life expectancy measures: methodological approach
Life expectancy is the most common statistical measure of the average remaining lifetime an individual is expected to live, given his current age, year of birth, sex, and other demographic and socioeconomic factors, including education, income, and job (Ayuso et al., 2017a). Life expectancy is critical in assessing a number of public policies, including pension schemes and health care systems. To compute life expectancy, the usual procedure involves building an ordinary life table, a tabular statistical tool that summarizes the survival and mortality experiences of a population and yields additional understanding about longevity prospects. In the past, analytical methods and mortality laws (e.g., De Moivre, Gompertz, Makeham, Weibull, logistic) were used to compute life expectancy estimates (Bravo, 2007).
Period life tables represent the mortality risks experienced by the different cohorts of an entire population during a single, relatively short period of time, usually no longer than three years. The corresponding period life expectancy (PLE) assumes that the mortality rates observed at a given moment in time apply throughout the remainder of a person's life; that is, it neglects any future expected changes in the longevity prospects of the population. PLE is purely a synthetic longevity measure, artificially engineered, that refers to a hypothetical cohort living its entire life according to the mortality rates observed at a single period in time. Period measures are useful if one wants to compare trends in mortality by gender, over time, by socioeconomic risk factors, within regions of a country, or with other countries, but they do not actually represent the longevity prospects of individuals born in a given year. In fact, real individuals age and survive/die as members of cohorts experiencing time-changing mortality rates.
² See UN (2017) for the presentation of the methodology used in their most recent 2017 demographic projections to 2100, including the patterns of mortality decline (PMD) method. The latter is a dynamic version of the period table or a simplified cohort approach. A companion technical paper by Gu et al. (2017) compares the performance of the PMD method and the three variants of the modified Lee-Carter (MLC) method as applied to age- and sex-specific death rates ($m_x$) from 1950-1955 to 2010-2015 and used to project $m_x$ from 2015-2020 to 2095-2100 for 155 countries. Overall, the MLC method, regardless of its variants, generally worked well for countries with good-quality data, whereas the PMD method performed better for countries with lower-quality data. Their study suggests that the MLC method produces less stable results for future age-sex-specific death rates for countries with relatively low-quality data.
Cohort or generation life tables represent the mortality experienced by a cohort of individuals born during a relatively short period of time (typically one year) over the course of their entire lifetime. They require age-specific probabilities of death computed using mortality data from the cohort only. Although cohort life tables based entirely on observed mortality are quite rare in practice since they require consistent quality data over more than a century, cohort life tables based on a combination of past and expected future mortality for the cohort are more common, particularly in actuarial practice and population projection exercises. Contrary to PLE indicators, CLE measures take into account both observed and projected longevity improvements for the cohort throughout its remaining lifetime and are therefore considered a more appropriate measure of an individual's future longevity prospects.
Let $T_0(t)$ be the lifetime of an individual from some population born in year $t$. Let $T_x(t)$ be the remaining lifetime of an individual aged $x$ in calendar year $t$. This individual will thus die at age $x + T_x(t)$. Denote by ${}_n p_x(t)$ the $n$-year survival probability for an individual aged $x$ in year $t$, and by ${}_n q_x(t)$ the corresponding $n$-year death probability (the distribution function of $T_x(t)$), i.e.,
$${}_n q_x(t) = P\left(T_x(t) \le n\right) = 1 - {}_n p_x(t). \tag{1}$$
The force of mortality at age $x$ in calendar year $t$, denoted as $\mu_x(t)$, is defined by
$$\mu_x(t) = \lim_{h \to 0^+} \frac{P\left(T_x(t) \le h\right)}{h}, \tag{2}$$
from which the survival function of $T_x(t)$ can be written as
$${}_n p_x(t) = \exp\left(-\int_0^n \mu_{x+\xi}(t+\xi)\, d\xi\right). \tag{3}$$
Assuming a piecewise constant force of mortality, i.e., $\mu_{x+\xi}(t+\xi) = \mu_x(t)$ for $0 \le \xi < 1$, the following relation between $q_x(t)$ and $\mu_x(t)$ holds
$$q_x(t) = 1 - \exp\left(-\mu_x(t)\right), \tag{4}$$
and the maximum likelihood estimate of the force of mortality is given by
$$\hat{\mu}_x(t) = m_x(t) = \frac{d_x(t)}{E_x(t)}, \tag{5}$$
where $m_x(t)$, $d_x(t)$, and $E_x(t)$ denote, respectively, the central death rate, the total death count, and the exposure-to-risk at age $x$ in year $t$. From the above definitions and assumptions, the complete CLE for an $x$-year old in year $t$, $\dot{e}^C_x(t)$, is given by the expected remaining lifetime at time $t$, i.e.,
$$\dot{e}^C_x(t) = \frac{1}{2} + \sum_{k=1}^{\omega - x} \exp\left(-\sum_{j=0}^{k-1} \mu_{x+j}(t+j)\right), \tag{6}$$
whereas the complete PLE for an $x$-year old in year $t$, $\dot{e}^P_x(t)$, can be computed using
$$\dot{e}^P_x(t) = \frac{1}{2} + \sum_{k=1}^{\omega - x} \exp\left(-\sum_{j=0}^{k-1} \mu_{x+j}(t)\right), \tag{7}$$
where $\omega$ denotes the highest attainable age. Contrary to $\dot{e}^C_x(t)$, which is a random variable and requires mortality projections, the computation of $\dot{e}^P_x(t)$ for past $t$ is objective, i.e., it is not subject to model risk. From (6) and (7) it is clear that PLE will only match CLE if age-specific mortality rates do not change over time. If mortality rates are expected to decline (increase) over time, CLE will always be higher (lower) than PLE. During the last two centuries, the life expectancy frontier of developed countries experienced a persistent (almost linear) increase in measured PLE (Oeppen and Vaupel, 2002).
Past trends provide overwhelming evidence to suggest that declines in mortality rates are expected to continue in the future, making PLE a systematically lagged CLE indicator, thus systematically underestimating the remaining lifetime of individuals.
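The distinction between the two measures can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a toy mortality surface (a Gompertz-type age profile with a constant yearly improvement, both hypothetical), a piecewise constant force of mortality, and the common approximation that complete life expectancy equals one half plus the sum of whole-year survival probabilities. The function names are illustrative only.

```python
import numpy as np

def period_life_expectancy(mu, age, year_idx):
    """Complete period life expectancy at `age`.

    mu[a, y] is the force of mortality at age a in calendar-year index y.
    Only the column `year_idx` is used: rates are frozen at one year.
    """
    hazards = mu[age:, year_idx]
    survival = np.exp(-np.cumsum(hazards))   # k-year survival, k = 1, 2, ...
    return 0.5 + survival.sum()

def cohort_life_expectancy(mu, age, year_idx):
    """Complete cohort life expectancy at `age`.

    Follows the diagonal of the surface: age x+j is evaluated in year t+j.
    The surface must extend far enough in time to cover the cohort's
    remaining lifetime, otherwise the sum is truncated.
    """
    n_ages, n_years = mu.shape
    steps = min(n_ages - age, n_years - year_idx)
    ages = age + np.arange(steps)
    years = year_idx + np.arange(steps)
    hazards = mu[ages, years]
    survival = np.exp(-np.cumsum(hazards))
    return 0.5 + survival.sum()

# Hypothetical surface: ages 0..110, 120 calendar years,
# rates falling by 1.5% per year at every age (assumed, not estimated).
ages = np.arange(111)
years = np.arange(120)
base = 0.00005 * np.exp(0.095 * ages)
mu = base[:, None] * (0.985 ** years)[None, :]

ple65 = period_life_expectancy(mu, 65, 0)
cle65 = cohort_life_expectancy(mu, 65, 0)
print(f"PLE at 65: {ple65:.2f}  CLE at 65: {cle65:.2f}  gap: {cle65 - ple65:.2f}")
```

With declining rates the cohort figure exceeds the period figure; on a surface with no improvement the two coincide, which mirrors the statement that PLE matches CLE only if age-specific mortality rates do not change over time.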
Understanding the relationship between period and CLE, quantifying the magnitude of the differences, analyzing how this link has been changing over time, and identifying its determinants are critical issues for pension design and reform. The objective here is to define a measure of the systematic correspondence between period and CLE and express it in terms of the well-known Lee-Carter (LC) stochastic mortality model (Lee and Carter, 1992).
Define by the life expectancy gap, $\dot{e}^{Gap}_x(t)$, the difference between period and CLE at age $x$ in year $t$, that is,
$$\dot{e}^{Gap}_x(t) = \dot{e}^C_x(t) - \dot{e}^P_x(t). \tag{8}$$
The size of the gap (in years of life) expresses by how much PLE at age $x$ in year $t$ differs from the life expectancy of the cohort attaining age $x$ in year $t$. In populations experiencing a regular improvement in mortality, the gap will be positive. When positive, the gap represents the average extra lifetime a cohort is expected to enjoy as a result of expected future mortality declines.³
To quantify the magnitude of $\dot{e}^{Gap}_x(t)$, period and CLE must be estimated. Computing period and CLE depends on forecasting age-specific mortality rates. In recent years, considerable attention has been paid to the development of stochastic mortality methods, taking into account its three key dimensions: age, period, and cohort (or year of birth). Much of this work emerged from the first and still most widely used age-period mortality model, the LC model. This model assumes that the force of mortality has a log-bilinear structure combining age and period parameters, the latter representing a general common time trend in mortality to be modeled using time series methods to produce mortality projections and generate prospective life tables. In the original log-bilinear LC model the central death rate $m_x(t)$ is of the form
$$m_x(t) = \exp\left(\alpha_x + \beta_x k_t\right), \tag{9}$$
where $\alpha_x$ denotes the general shape of the mortality schedule, $\beta_x$ represents the age-specific patterns of mortality change, and $k_t$ represents the overall mortality time trend index. The sign and magnitude of $\beta_x$ determine which mortality rates will be more impacted by a change in $k_t$. To forecast mortality rates, the authors assume that vectors $\alpha_x$ and $\beta_x$ remain constant over time and forecast future values of $k_t$ using a standard univariate (random walk with drift) time series model.
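To make the log-bilinear structure concrete, the sketch below fits the three parameter sets to a matrix of log death rates using the classic least-squares approach of Lee and Carter (1992), i.e., a singular value decomposition of the centered log rates with the usual identification constraints (the age loadings sum to one, the time index sums to zero). This is an illustration on synthetic, noise-free data and is not necessarily the estimation procedure used by the authors; the function name and the synthetic parameter values are assumptions.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Least-squares Lee-Carter fit via SVD.

    log_m: array of shape (n_ages, n_years) holding log central death rates.
    Returns (alpha, beta, kappa) with beta summing to 1 and kappa summing to 0.
    """
    alpha = log_m.mean(axis=1)                 # average log rate by age
    centered = log_m - alpha[:, None]          # rows now sum to zero over years
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    beta = u[:, 0]                             # leading age pattern
    kappa = s[0] * vt[0, :]                    # leading time index
    scale = beta.sum()                         # assumed non-zero
    return alpha, beta / scale, kappa * scale  # rescaling also fixes the sign

# Synthetic example (all values hypothetical): ages 0-95, years 1980-2015.
ages = np.arange(0, 96)
years = np.arange(1980, 2016)
alpha_true = np.log(0.0001) + 0.08 * ages
beta_true = np.linspace(2.0, 1.0, ages.size)
beta_true = beta_true / beta_true.sum()
kappa_true = -1.2 * (years - years.mean())     # linear decline, zero mean
log_m = alpha_true[:, None] + beta_true[:, None] * kappa_true[None, :]

alpha, beta, kappa = fit_lee_carter(log_m)
print(np.allclose(beta, beta_true), np.allclose(kappa, kappa_true))
```

Because the synthetic surface is exactly rank one after centering, the decomposition recovers the generating parameters; with real data the first singular component is only an approximation, and likelihood-based fits are a common alternative.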
According to this model, the dynamics of $k_t$ follow
$$k_t = k_{t-1} + \mu + \epsilon_t, \quad \text{with iid } \epsilon_t \sim N(0, \sigma^2). \tag{10}$$
The point estimate of the stochastic forecast of the time index at time $t_0 + s$, given all data available up to $t_0$, is given by
$$\hat{k}_{t_0+s} = E[k_{t_0+s} \mid k_1, \ldots, k_{t_0}] = k_{t_0} + s\mu, \tag{11}$$
with conditional variance $V[k_{t_0+s} \mid k_1, \ldots, k_{t_0}] = s\sigma^2$. Here $\mu$ denotes the drift of the time index, not the force of mortality $\mu_x(t)$. Combining equations (6)-(9), the expected life expectancy gap at time $t_0$ can be expressed in terms of the LC model as follows
$$\dot{e}^{Gap}_x(t_0) = \sum_{k=1}^{\omega - x} \left\{ \exp\left(-\sum_{j=0}^{k-1} m_{x+j}(t_0) \exp\left[\beta_{x+j}\left(k_{t_0+j} - k_{t_0}\right)\right]\right) - \exp\left(-\sum_{j=0}^{k-1} m_{x+j}(t_0)\right) \right\}, \tag{12}$$
with $m_{x+j}(t_0) = \exp(\alpha_{x+j} + \beta_{x+j} k_{t_0})$. Note that since the dynamics of the overall mortality time trend index follows (10), the mortality rates expressed in (9) develop over time following a stochastic process such that the life expectancy gap is also a random variable (Denuit, 2007). Replacing $k_{t_0+j}$ by its mean estimate, $E[k_{t_0+j} \mid k_1, \ldots, k_{t_0}]$, the life expectancy gap can be expressed as the following non-random quantity
$$\dot{e}^{Gap}_x(t_0) = \sum_{k=1}^{\omega - x} \left\{ \exp\left(-\sum_{j=0}^{k-1} m_{x+j}(t_0) \exp\left[\beta_{x+j}\, j \mu\right]\right) - \exp\left(-\sum_{j=0}^{k-1} m_{x+j}(t_0)\right) \right\}. \tag{13}$$
From (12), it is clear that in an LC setting cohort mortality rates are obtained from period mortality rates via a stochastic reduction factor $RF(x, t_0 + j) = \exp[\beta_{x+j}(k_{t_0+j} - k_{t_0})]$, as widely used, e.g., in the UK and in the US. For ARIMA(0,1,0) with drift parameter $\mu$, the reduction factor reduces to the following non-random quantity: $RF(x, t_0 + j) = \exp[\beta_{x+j}\, j \mu]$. Assuming all the $\beta_{x+j}$'s are positive (as normally observed in empirical studies using this model) and that the estimate of $\mu$ is negative (as observed in a mortality decline scenario), the life expectancy gap is an increasing function of the absolute value of $\mu$, i.e., the steeper the (negative) slope of the random walk with drift model, the higher the positive difference between cohort and PLE at a given age in a given year.⁴ With continuous survival improvements, observed life expectancy gaps are consistently positive. However, as shown below, the gap is not constant over time since rates of mortality improvement change due to, for example, age shifts in the profile of mortality decline and/or turning points in the long-term trend of mortality.
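The role of the drift and of the reduction factor can be sketched in a few lines. The example below is illustrative only: it assumes hypothetical Lee-Carter parameters, replaces the future time index by its mean forecast (current value plus the number of steps times the drift), and again approximates complete life expectancy as one half plus the sum of whole-year survival probabilities. The function names are not from the paper.

```python
import numpy as np

def drift_estimate(kappa):
    """Drift of a random walk with drift: the mean of the first differences."""
    return (kappa[-1] - kappa[0]) / (len(kappa) - 1)

def life_expectancy_gap(alpha, beta, kappa_t0, drift, age):
    """Cohort minus period life expectancy at `age` under the mean forecast.

    alpha, beta: Lee-Carter age parameters indexed by age (0, 1, 2, ...).
    kappa_t0:    value of the time index in the base year.
    drift:       drift of the time index (negative when mortality declines).
    """
    a = alpha[age:]
    b = beta[age:]
    j = np.arange(a.size)                    # years elapsed since the base year
    m_period = np.exp(a + b * kappa_t0)      # rates frozen at the base year
    reduction = np.exp(b * j * drift)        # non-random reduction factor
    m_cohort = m_period * reduction          # rates the cohort is expected to face
    ple = 0.5 + np.exp(-np.cumsum(m_period)).sum()
    cle = 0.5 + np.exp(-np.cumsum(m_cohort)).sum()
    return cle - ple, ple, cle

# Hypothetical parameters: Gompertz-type level, equal age loadings.
ages = np.arange(111)
alpha = np.log(0.00005) + 0.095 * ages
beta = np.full(ages.size, 1.0 / ages.size)

for drift in (0.0, -1.5, -3.0):
    gap, ple, cle = life_expectancy_gap(alpha, beta, 0.0, drift, age=65)
    print(f"drift={drift:5.1f}  PLE={ple:6.2f}  CLE={cle:6.2f}  gap={gap:5.2f}")
```

In this toy setting the gap is zero when the drift is zero and widens as the drift becomes more negative, in line with the argument that a steeper decline in the time index produces a larger difference between cohort and period life expectancy.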
In very particular circumstances (constant yearly improvements at all ages within a Gompertz mortality model), linear increases in PLE correspond to linear increases in the respective CLE (Missov and Lenart, 2011).
From equation (12), observe that the magnitude and dynamics of $\dot{e}^{Gap}_x(t_0)$ in the LC framework are severely constrained by the model assumptions.⁵ The model extrapolates past mortality trends assuming that the age pattern of mortality decline is constant over time and that the overall time trend declines steadily at a constant rate, inducing systematic forecast errors. The underestimation of the life expectancy gap becomes larger if a country experiences long-term trend changes in mortality, namely acceleration in the rates of mortality decline. The model's assumption of time invariance in the rate of change in mortality is also hard to sustain in practice and several attempts have been made to move away from this restrictive assumption (see, e.g., Lee and Miller, 2001; Booth and Tickle, 2008). Alho et al. (2012) provide empirical evidence for eight countries of the deficiency of the time invariance assumption and conclude that the cause of error is due to the pronounced tendency for the rate of decrease in mortality to accelerate since the 1970s (with Japan as the most flagrant example), which is expected to continue for the age groups constituting the pool of pensioners.⁶ Observed period and CLE increases result from the continuous linear shift from younger to older ages in the distribution of mortality reductions. In the future, this will lead to an underestimation of the longevity of retired cohorts with an increasing impact on social welfare systems.
The LC model motivated numerous variants, extensions, and alternatives to provide more robust statistical properties and to improve the model's goodness-of-fit and forecasting performance. For example, a number of alternative estimations and modeling approaches have been proposed.⁷ Extensions adding linear and non-linear cohort effects have been developed.⁸ Multipopulation mortality models were suggested to identify the underlying long-term mortality trends.⁹ Alternative explanatory approaches (e.g., mortality forecasting by cause of death) have also been tested.¹⁰ Palmer et al. (2018) highlight the need to design statistical models that capture the dynamics of the accelerating decrease in mortality rates across industrialized countries, particularly at higher ages. Their ex-post and ex-ante evaluations against 2600 birth cohort data of eight countries suggest a sizable and rising underestimation of CLE using existing methods.

Estimating the life expectancy gap for Portugal and Spain
This section presents novel results for the magnitude of the life expectancy gap in Portugal and Spain. Expected future mortality developments are modeled using the log-bilinear LC model under a Poisson setting (Brouhns et al., 2002a; Renshaw and Haberman, 2003). To calibrate the model, data for the overall populations of Portugal and Spain from 1980 to 2015 and for ages 0-95 are used. Data on deaths and exposures are obtained from Human Mortality Database (2017). Parameter estimates are obtained using maximum likelihood (ML) methods and an iterative method for estimating log-bilinear models developed by Goodman (1979), considering the usual identification constraints. It is then assumed that the age vectors $\alpha_x$ and $\beta_x$ remain constant over time and future values of $k_t$ are forecasted using a standard univariate time series ARIMA($p$,$d$,$q$) model. Finally, to close the prospective life tables at high ages and to establish the highest attainable age $\omega$, the simple and efficient method proposed by Denuit and Goderniaux (2005) is applied. We use bootstrap simulation methods to derive confidence bands for the mortality rates. Once the matrix of observed and projected mortality rates $q_{x,t}$, $x \in [x_{\min}, x_{\max}]$, $t \in [t_{\min}, t_{\max}]$, is generated, complete period and cohort life expectancies are computed using (6) and (7).
The LC parameter estimates (Figure 1), the forecasted mortality rates for some representative ages (Figure 2), and the forecasted period and cohort life expectancies (Figure 3) for the Portuguese and Spanish female populations are exhibited below. Figure 1 shows that the general shape of mortality across ages (as represented by the $\alpha_x$ parameter estimates) exhibits similar patterns in Portugal and Spain between 1980 and 2015.
As is common in developed countries, average mortality rates are relatively high for newborns and children, then decrease rapidly toward their minimum (around age 12), increasing thereafter with age, reflecting higher mortality at older ages.¹¹ The time trend parameter estimates $k_t$ exhibit a clear decreasing tendency (approximately linear) in both countries, indicating the significant mortality improvements registered for all ages and both sexes over the last 35 years. The pace at which mortality improvements have taken place is not homogeneous across ages, however, as observed from the $\beta_x$ parameter estimates. Observed mortality improvements have been more significant for youth, particularly in Portugal due to better infectious diseases control, better health care systems, and improved living conditions, but are also relevant for adults and the elderly.
The forecasted mortality rates project into the future past trends observed in mortality across all ages (Figure 2). As can be observed, the Poisson-LC method projects a continued decline in mortality at these ages, with increased volatility around the general trend more significant at birth. Figure 3 reports period and cohort life expectancies computed at birth and at age 65 for Portuguese and Spanish females for the period between 1980 and 2060. In both populations the life expectancy gap is significant, with PLE indicators clearly underestimating future longevity prospects. The difference is, as expected, more significant at birth (13.1 years in 1980 in Portugal and 12.8 years in Spain) than at age 65 (1.7 years in 1980 in Portugal and 1.9 years in Spain).¹² The life expectancy gap is expected to continue to be significant in both countries in the future, although the magnitude of $\dot{e}^{Gap}_x(t_0)$ is forecasted to decline at birth and to slightly increase at age 65.
⁷ See, e.g., Lee and Miller (2001).
⁸ See, e.g., Renshaw and Haberman (2006), Currie (2006), Plat (2009) and Blake (2014, 2015).
⁹ See, e.g., Li and Lee (2005).
¹⁰ See, e.g., Hanewald (2011), Gourieroux and Lu (2015).
¹¹ The only minor exception is the well-known 'mortality hump' around ages 20-25, normally more pronounced in the male population, a phenomenon normally associated with accident- or suicide-related mortality.

Period versus cohort life expectancy estimates: international results
Cohort life expectancies currently exceed period life expectancies due to the observed decreases in mortality rates that started in the 18th century in some advanced countries and continue in the 21st century worldwide (see Ayuso et al., 2015). As explained in the preceding section, period tables are static tables built on the basis of the mortality behavior observed in the population during one period, while cohort tables incorporate projections of the future trend in mortality, taking into account observed changes over time, at birth and at different ages for different generations. The different demographic institutes across countries do not construct cohort tables as frequently as they do period tables. In fact, for most countries, information on observed and projected life expectancy based on static calculations (typically jointly collected for different countries by international organizations such as the UN,¹³ the World Bank,¹⁴ Eurostat,¹⁵ and OECD¹⁶) can be found, and is systematically used in calculations related to pensions, health, long-term care, and welfare status; by contrast, it is rare to find life expectancy estimates based on cohort tables. This section compares the limited comparable country data on period and CLE that exist from official sources for Australia, the United Kingdom, and the United States, supplemented by the estimates for Portugal and Spain from the preceding section.¹⁷
¹⁷ The period and cohort life expectancy estimates for New Zealand and their differences and trends support the analysis. Their results are not included here as the available years do not coincide with this analysis. For New Zealand estimates, see http://archive.stats.govt.nz/browse_for_stats/health/life_expectancy/cohort-life-expectancy.aspx.
Table 1 confirms that CLE estimates are always greater than those for PLE for all compared countries and years. These differences exceed 10 years for several countries and periods analyzed (e.g., 12.65 years for Australian men in 2010). This is the case for the United Kingdom, Australia, and Portugal, which have the greatest differences between the two estimates. Differences also exist in the United States and Spain, but are less pronounced (e.g., in the United States, the difference in life expectancy at birth between period and cohort estimates in 2010 is about 5.5 years). In all five countries, differences between the two values are projected to decrease over time, probably due to the smaller margin expected to improve survival probabilities (taking into account the high probabilities already reached in advanced ages). Differences between countries are also observed by gender (Table 1, bottom panel). In general, greater differences are seen between the estimates of period and CLE at birth in men than in women. For instance, in 1981 the difference between the male and female life expectancy gaps was 2.86 years in the United States and 4.44 years in Australia. Only in Australia is a greater difference projected for women in 2060, a phenomenon that also occurred in Portugal and Spain in 1981. However, a reduction in the differences between men and women is generally projected over time (except in Spain and Portugal, where they remain essentially unchanged), which could be driven by the reduction in the gender gap in life expectancy in these countries.
A similar analysis for life expectancy at age 65 is presented in Table 2. The results again show that estimated life expectancies from cohort tables are higher than those obtained from period tables in all five countries. The biggest life expectancy gap values are observed in Australia (up to an 11-year difference in life expectancy), followed by the United States (up to a 5-year difference). Again, projected values reflect a reduction in the (albeit still positive) differences between cohort and period estimates over time.
Analyzing the differences between men and women (Table 2, bottom panel), only small differences between genders tend to persist over time. In the United Kingdom, the United States, and Australia, men show greater differences between cohort and PLE estimates at 65 years; the opposite result is found in Spain and Portugal, which show slightly higher differences for women.

Implications for pension policy
Applying estimates of remaining life expectancy at retirement by using the lower PLE instead of the higher CLE has two key implications for pension policy. First, at the individual level, doing so fails to establish an actuarially fair link between contributions and benefits, thus distorting individuals' labor supply and saving decisions (contrary to the goal of recent reform attempts). Assuming that CLE is the correct estimate, using the lower PLE to calculate the initial benefit implies a subsidy for individuals that may not only bias one's labor supply while young and one's retirement decision when older; it may also affect one's saving and dis-saving decisions over a lifetime. While the bias can go in both directions, the income effect is likely to dominate the substitution effect, which may lower one's labor supply when young and advance the retirement age; for saving, one may lower accumulation efforts when young and decumulate faster when retired.
Second, at the pension scheme level, the use of the lower PLE makes the pension scheme financially unsustainable, as it incorrectly signals solvency; that is, that liabilities are smaller or at most equal to assets, while in reality this is not the case. This is valid for both unfunded and funded schemes: in funded schemes, assets are essentially unchanged by an underestimation of remaining life expectancy, while liabilities increase. In unfunded (nonfinancial) schemes, the contribution asset is negatively affected as it represents the present value of the difference between future contributions and the liabilities thereby created; if the life expectancies are actually higher, the pay-as-you-go (PAYG) asset must be lower and hence both liabilities and assets deteriorate, with liabilities higher and PAYG assets lower than assumed.
Of course, underestimation of actual life expectancy does not remain unnoticed, but emerges only gradually in periodic asset/liability checks or more often in the annual cash-flow comparison when expenditure due to longer periods of benefit payment exceeds contribution revenues. The policy reaction is typically an ad hoc adjustment in nonfinancial defined benefit (NDB) schemes' parameters (such as increasing the contribution rate, playing with the indexation parameters of benefits under disbursement, or calling for an increase in the retirement age; otherwise government transfers are increased). In nonfinancial defined contribution (NDC) schemes, the key policy options are a lower notional interest rate for the annual account accumulation and a lower indexation for benefits in disbursement. In both cases, the consequences of systematically underestimated life expectancy are shared in an ad hoc manner between the working and retired populations. Such a disruptive approach does not create the confidence in the scheme that pension economists consider important for a smooth and successful operation.
This section explores four policy areas: (a) How important is the difference between period and CLE for individual decisions and financial sustainability underestimation?; (b) What is the effect of the wrong life expectancy choice on the balancing mechanism recently implemented in a number of NDB and NDC schemes?; (c) If there is a difference between these schemes with regard to the choice of life expectancy estimation, what happens if heterogeneity in longevity exists?; and (d) How relevant are life expectancy changes as policy triggers after recent reforms?

How important is the difference between period and cohort life expectancy?
The fourth section offers the estimated magnitudes of period and cohort life expectancies at birth (age 0) and retirement (assumed at age 65). It is this latter age that matters most for pension policy considerations. Table 2 offers the scope of and differences between both approaches. These magnitudes and their ratio can be given a simple welfare economic interpretation through the concept of pension wealth (see Ayuso et al., 2017b). Pension wealth at any age is the present value of future benefit streams at this age. Assuming that the benefit indexation equals the discount rate (an assumption that broadly holds for wage-indexed pensions), then pension wealth at retirement is the pension benefit at retirement multiplied by life expectancy. If the scheme is actuarially fair, accumulation at retirement should equal pension wealth; that is, the divisor of the accumulation needs to be the correct life expectancy.
$$PW_{65}(t) = b_{65}(t)\, \dot{e}_{65}(t) \tag{15}$$
$$PW_{65}(t) = AK_{65}(t) \quad \text{for an actuarially fair scheme} \tag{16}$$
where $PW_{65}(t)$ is pension wealth, $\dot{e}_{65}(t)$ life expectancy, $AK_{65}(t)$ accumulated contributions (plus interest), and $b_{65}(t)$ initial pension benefit, all at retirement age 65 in year $t$. Combining (15) and (16), the actuarially fair initial benefit is $b_{65}(t) = AK_{65}(t) / \dot{e}_{65}(t)$. If the initial benefit is calculated by using the too-low PLE, then actual pension wealth exceeds the value of the accumulation by the ratio of the CLE to the PLE. Expressed as the excess of this ratio over one, the difference amounts to a subsidy that the generation would receive (unless corrective actions were undertaken). From Table 2, one can calculate the implicit subsidy rate at retirement that is behind the differences between period and CLE for the five countries and the three years of estimation (Table 3). The reason this can be called a subsidy rate is very simple: this is the rate at which own accumulations would need to increase to achieve the same benefit level as that derived from applying the PLE rate to own accumulations.
Table 3 indicates both major differences and commonalities in the implicit subsidy rate between countries for which both period and cohort life expectancies are available. First, all countries have a declining relative gap between period and CLE and thus a shrinking subsidy. While for the age cohort of 1981 the average difference is 32% for men and 24% for women, the difference is projected to reduce to slightly above 10% for both genders for the age cohort of 2060. Second, the differences between genders are reduced across 1981, 2010, and 2060 for Australia, the United Kingdom, and the United States; in Portugal and Spain they remain broadly constant. Third, the differences between countries are also reduced. While the differences in 1981 between the highest and lowest country value were about 10:1, this ratio reduces to 2:1 in 2060.
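The subsidy logic can be traced with a short numerical sketch. The figures below (an accumulation of 200,000, a PLE of 16 years, and a CLE of 20 years at age 65) are purely hypothetical and are not taken from the tables in this paper; the function names are illustrative, and benefit indexation is assumed to equal the discount rate so that pension wealth is simply the benefit multiplied by life expectancy.

```python
def initial_benefit(accumulation, life_expectancy):
    """Annual benefit when the accumulation is divided by a life expectancy."""
    return accumulation / life_expectancy

def implicit_subsidy_rate(ple, cle):
    """Rate by which pension wealth exceeds the accumulation when the
    benefit is set with PLE but is actually paid over CLE years."""
    return cle / ple - 1.0

# Hypothetical inputs, for illustration only.
accumulation = 200_000.0
ple, cle = 16.0, 20.0

benefit_ple = initial_benefit(accumulation, ple)   # benefit set with the low PLE
benefit_fair = initial_benefit(accumulation, cle)  # actuarially fair benefit
pension_wealth = benefit_ple * cle                 # what the scheme actually owes
subsidy = implicit_subsidy_rate(ple, cle)

print(f"benefit with PLE: {benefit_ple:,.0f}  fair benefit: {benefit_fair:,.0f}")
print(f"pension wealth: {pension_wealth:,.0f}  subsidy rate: {subsidy:.0%}")
```

In this example the accumulation would have to be a quarter larger to finance the PLE-based benefit over the cohort's expected lifetime, which is the sense in which the ratio of CLE to PLE minus one acts as a subsidy rate.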
It is not clear to what extent these reduced differences are related to the application of similar or common estimation models, or whether these common trends actually constitute common underlying developments. In any case, the magnitudes are relevant and comparable to the heterogeneity created by differences in lifetime income (see the fourth section). For the current mid-career generation of 1981 or the current primary school generation of 2010, the subsidy created by applying PLE instead of CLE is sizable and may distort their labor supply and saving decisions, working against the objectives of recent systemic and comprehensive parametric reforms in these five countries.
The balancing mechanism in defined benefit and defined contribution schemes
Most advanced and some emerging economies undertook systemic or comprehensive parametric reforms with the objective of making their pension system financially sustainable (or more specifically, their main earnings-related pension scheme(s)). To deal with future financial disequilibria, various countries introduced a balancing mechanism, that is, a rule-bound mechanism of parametric adjustments to the scheme that is triggered when financial disequilibria emerge.18 The adjustment may be in the level of benefit indexation, a reduction in the nominal benefit level, a decrease in the annual accrual rate (in NDB schemes), or a change in the annual account indexation rate (in NDC schemes). What triggers the application of the balancing mechanism may simply be differences in expenditure and revenues of the scheme, or some measure of an actuarial imbalance based on the present value of deficits, or very rarely the application of more elaborate asset/liability comparisons. To establish financial (un)soundness, some countries undertake annual or periodic actuarial assessments (such as Japan, Sweden, the United Kingdom, and the United States).
How are these balancing mechanisms influenced if the 'wrong' mortality/life expectancy data are selected? What is the scope of underestimated liabilities? Are assets also affected? Is there a difference between benefit type (DB/DC) and funding mechanism? These issues are discussed next in turn.
18 A number of countries introduced balancing mechanisms to assure or at least support financial sustainability of the mandated schemes. Some are, in principle, automated through the move toward an NDC scheme and the choice of the account indexation rate (such as in Italy, Latvia, Norway, Poland, and Sweden), with only the last of these (Sweden) actually having a formal balancing mechanism. A number of NDB countries (e.g., Finland, Germany, Japan, Portugal, and Spain) introduced sustainability factors to the same end (see OECD, 2012, 2015, 2017). However, an assessment of their actual working and effectiveness is still outstanding.

The scope of underestimated liabilities
For many policy makers and pension observers, this is quite likely the key question, as the size of the underestimation of liabilities may determine the speed and type of corrective interventions. A full answer is, of course, country specific, but the results of Table 3 suggest the magnitude. The birth cohort of 1981 (i.e., the cohort of those currently 37 years old) can be taken as the lower-bound estimate for the current generation. If correct and without taking into account future corrective interventions, the subsidy rates can proxy the difference between liabilities estimated with cohort mortality rates/life expectancy and with the corresponding period estimates. The difference in the all-country average is well above 20%; for Australia, it is well above 50%; and for Spain, it is surprisingly low, at about 10%.
Clearly, the scope of underestimation of the true liabilities can be sizable, but the implications depend on the country and its scheme. In the United States, the 30% underestimation is likely to prevent the adjustment mechanism established by the social security law from kicking in. As this scheme has a sizable trust fund that is running down, the policy reaction may only occur when the trust fund resources are closer to expiration (currently foreseen by 2034; see Board of Trustees, 2018). In Australia, there are no direct consequences. Its earnings-related scheme is funded and does not provide any annuity at all, nor is its purchase required, and purchase of voluntary life annuities is minimal. If individuals underestimate their remaining life expectancy and decumulate too fast, the consequences will be realized in the universal old-age pension, which is means- and asset-tested but accessed by some two-thirds of the eligible older population. In Spain, the (small) underestimation is only gradually felt through the cash-flow gap, as no actuarial estimation process and trigger is established.

The differences between DB/DC and funded/unfunded schemes
In the typical Bismarckian NDB schemes that still dominate much of the world, the differences between PLE and CLE have little importance even if periodic actuarial assessments are undertaken. In almost all cases, it is the cash balance outcome (i.e., the difference between period revenue and period expenditure) that triggers actions (with or without a balancing mechanism in place). Actuarial assessments with the too-low life expectancy data may trigger a late and insufficient reform when the actual insolvency has already existed for some time. And actuarial assessments of NDB schemes are typically built on discounted cash balance approaches, not on asset/liability comparisons. Conceptually, NDB schemes may give rise to the largest underestimation of actual liabilities, because they underestimate not only the liabilities of those already retired, but also those of active contributors. These estimates also rely on using the correct mortality data. This differs from the liability estimates in NDC schemes, in which liabilities for the active workforce are covered by their accumulated contributions, and assets and liabilities fully match. In the estimation of implicit pension debt, liabilities toward active workers typically amount to two-thirds of overall liabilities (Holzmann et al., 2001). Hence the underestimation of liabilities in NDC schemes applies only to pensions in disbursement, through the too-low life expectancy applied when converting individual accumulations into the initial pension benefit, and amounts to only one-third of the full liability. However, compared to financial defined contribution (FDC) schemes, where the assets are, in principle, immune to mortality rate misestimation, the wrong choice of mortality rates/life expectancy estimates should impact the estimate of assets in NDC schemes. This is due to the PAYG asset, which is calculated as the present value of future contributions minus the liabilities created; the latter depends, of course, on the selected mortality/life expectancy estimates. Applying the too-low PLE estimates underestimates these future liabilities and thus overestimates the PAYG asset.19 A proper projection of LE would also impact the estimate of assets in NDC schemes through a liquidity effect (the turnover time of a unit of money), since higher life expectancy values translate, ceteris paribus, into higher expected turnover duration via an increase in the money-weighted average age of retirees. For a given accumulated notional capital, benefit payments on the savings account would have to be spread out over a longer period of time (Palmer, 2012).20
For FDB and FDC schemes, periodic and often annual actuarial assessments typically take place. While assets are, in principle, immune to the selected mortality estimates, liabilities are affected by the incorrect selection, and the scope of this underestimation has been hinted at above. In funded schemes, however, additional underestimation of the liability may take place through the choice of a too-high discount rate. Sometimes the choice of the discount rate is determined by the return on the asset side of the funded scheme, for which highly optimistic assumptions are made. For most pension economists, there is no link to the rate of return, as the discount rate is governed by other considerations.
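The way a too-short payout horizon and a too-high discount rate compound each other in a funded scheme can be illustrated with a deliberately crude sketch. It replaces the survival-weighted annuity with an annuity certain paid over the life expectancy, which is a simplification introduced here and not the approach of any particular actuarial assessment. All names and figures are hypothetical.

```python
def annuity_certain_pv(benefit: float, years: int, discount_rate: float) -> float:
    """Present value of a level benefit paid at the end of each of `years` years.

    Simplifying assumption: payments are certain over the life expectancy,
    instead of being weighted by survival probabilities.
    """
    if discount_rate == 0.0:
        return benefit * years
    return benefit * (1.0 - (1.0 + discount_rate) ** (-years)) / discount_rate


# Hypothetical inputs, chosen only for illustration.
benefit = 10_000.0
ple_years, cle_years = 17, 20      # period vs cohort horizon at retirement
low_rate, high_rate = 0.02, 0.05   # prudent vs optimistic discount rate

liability_reference = annuity_certain_pv(benefit, cle_years, low_rate)
liability_short_horizon = annuity_certain_pv(benefit, ple_years, low_rate)
liability_both_errors = annuity_certain_pv(benefit, ple_years, high_rate)

for label, value in [
    ("cohort horizon, low rate", liability_reference),
    ("period horizon, low rate", liability_short_horizon),
    ("period horizon, high rate", liability_both_errors),
]:
    print(f"{label}: {value:,.0f}")
```

In this sketch each simplification lowers the measured liability, so the two errors reinforce rather than offset each other.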

Adding heterogeneity considerations to estimation of life expectancy
For pension schemes the issue of correct estimation of future life expectancy of a retiree cohort is complicated by the increasing recognition that this mean estimate has a dispersion that is linked to the level of lifetime income and accumulated savings (Ayuso et al., 2017a, 2017b). Hence, not taking account of both issues may lead to multiple and interrelated distortions at both the individual and scheme level. For example, the wrong mean estimate for life expectancy leads to underestimation of the pension scheme's liability. If the same common life expectancy is applied to all cohort members at retirement, a second underestimation of the liability is introduced, as richer individuals have a higher life expectancy. For the highest income decile, the individual subsidy rate may reach 30% for women and 15% for men. In contrast, those in the lowest income decile may face a tax rate of 20% or more on a much lower pension. The scope of the aggregate underestimation of liabilities will depend on the distributive characteristics and choice of the mean value (average or median), but is hypothesized to be 10-15%.
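A minimal sketch of how the two effects combine at the individual level follows. It assumes that an individual's implicit subsidy (positive) or tax (negative) rate is the ratio of the group's own life expectancy to the life expectancy applied in the benefit formula, minus one. The function name and the life expectancy values are hypothetical and do not reproduce the estimates cited above.

```python
def implicit_rate(own_life_expectancy: float, applied_life_expectancy: float) -> float:
    """Positive values are an implicit subsidy, negative values an implicit tax."""
    return own_life_expectancy / applied_life_expectancy - 1.0


# Hypothetical life expectancies at retirement, in years.
cohort_mean = 20.0   # correct (cohort) mean for the whole retiree cohort
period_mean = 17.0   # too-low period mean actually applied
high_income = 23.0   # own life expectancy, highest income group
low_income = 16.0    # own life expectancy, lowest income group

# Heterogeneity effect alone: the correct mean is applied to everyone.
het_high = implicit_rate(high_income, cohort_mean)
het_low = implicit_rate(low_income, cohort_mean)

# Combined effect: the too-low period mean is applied to everyone.
total_high = implicit_rate(high_income, period_mean)
total_low = implicit_rate(low_income, period_mean)

print(f"high income: heterogeneity {het_high:+.1%}, combined {total_high:+.1%}")
print(f"low income:  heterogeneity {het_low:+.1%}, combined {total_low:+.1%}")
```

Under these illustrative values the too-low mean enlarges the subsidy to the high-income group and shrinks the implicit tax on the low-income group.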
The interaction of both effects on individuals and their labor supply and saving decisions has not yet been worked out. For lower-income groups, the tax rate effect of heterogeneity is counteracted by an incorrect lower mean value. For higher-income groups, the subsidy effect of heterogeneity is fortified by the subsidy effect of a too-low mean estimate. How this affects individuals' decisions will also depend on their perceptions of their own life expectancy. Understanding these mechanisms is important for designing appropriate policy interventions. For correct mean and heterogeneity estimates, the effectiveness of some policy proposals has already been estimated (Holzmann et al., 2019).
19 In reality, of all the NDC countries only Sweden has a formal (automatic) balancing mechanism and it approximates the PAYG asset from cross-sectional revenue data. Hence, underestimation of life expectancy will also affect its NDC scheme as the PAYG asset will tend to be overestimated.
20 We are indebted to an anonymous referee for this suggestion.

Life expectancy measures in recent pension reforms
In recent decades, most OECD countries responded to continuous growth in life expectancy with pension reforms in which a common feature is to create an automatic link between future pensions and changes in life expectancy. The link between life expectancy and pension benefits has been accomplished in at least six different ways (Whitehouse, 2007; OECD, 2017):
• By introducing FDC plans as a (often partial) replacement for unreformed NDB pensions (e.g., Mexico, Poland, Sweden);
• By introducing an automatic link between life expectancy and pension benefits, for example through demographic sustainability factors (e.g., Finland, Portugal, Spain);
• By linking the normal retirement age to life expectancy (so far 10 countries including Denmark, Italy, the Netherlands, and Portugal);
• By connecting years of contributions needed for a full pension to life expectancy (e.g., France);
• By substituting traditional NDB public schemes with NDC schemes that replicate some of the features of FDC plans, namely the way in which pension (annuity) benefits are computed (e.g., Sweden, Poland, Latvia, Italy, Norway);
• By linking penalties (bonuses) for early (late) retirement to years of contributions and normal retirement age (e.g., Portugal).
These reforms represent a fundamental change in the way longevity risk is shared between contributors and retirees, and between current and future generations: the sharing has become more explicit and, in principle, based on automatic rules rather than ad hoc changes as in the past. However, the legislated automaticity in countries must still stand the test of time. And even if automatically introduced and moving in the right direction, these measures are almost always incomplete and insufficient to address the financial sustainability issues triggered by life expectancy changes. This adds to the issues of not selecting the correct life expectancy under these mechanisms and the incentive distortions involved. In almost all cases and countries, the PLE measure has been used to link longevity and pension benefits, which, as discussed above, results in underestimating remaining lifetime at retirement. This option has consequences for the way longevity risk is shared between generations. The financial consequences of underestimating life expectancy during retirement are ultimately borne by those who fund the pension scheme; that is, younger cohorts in NDB/NDC schemes and private contributors/sponsors in FDB/FDC schemes.
For instance, demographic sustainability factors introduced in Finland, Portugal, and Spain automatically link initial pension benefits to life expectancy observed at the time of retirement, leading, in most cases, to a reduction in pension entitlements. These sustainability factors are computed as a simple ratio between period life expectancy observed at some reference age (e.g., 65 in Portugal and 67 in Spain) in some (past) reference year (e.g., 2000 in Portugal and 2012 in Spain) and period life expectancy observed at the time of retirement (in Spain according to mortality tables for the pensioner population as they are designed by the social security system). By design, these sustainability factors are not consistent with an actuarially neutral pension scheme since they do not guarantee that by adjusting (reducing) initial pension benefits but paying them for a longer period the scheme is financially neutral. To the extent that trends in period and cohort life expectancies differ between past and current generations, the system could redistribute in favor of older cohorts and have a negative impact on the public pension system's sustainability.
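The mechanics of such a factor can be sketched as follows. The sketch reads the factor as the reference-year period life expectancy divided by the period life expectancy at retirement, as described above, and then compares the resulting pension wealth measured with a period and with a cohort life expectancy. It assumes indexation equal to the discount rate. The function name and all values are hypothetical and are not the legislated parameters of any country.

```python
def sustainability_factor(ple_reference_year: float, ple_at_retirement: float) -> float:
    """Ratio of period life expectancy in the reference year to that at retirement."""
    return ple_reference_year / ple_at_retirement


# Hypothetical life expectancies at the reference age, in years.
ple_reference = 17.0   # period life expectancy in the (past) reference year
ple_now = 19.0         # period life expectancy observed at retirement
cle_now = 22.0         # cohort life expectancy of the retiring cohort

unadjusted_benefit = 10_000.0
factor = sustainability_factor(ple_reference, ple_now)
adjusted_benefit = unadjusted_benefit * factor

# Pension wealth with indexation equal to the discount rate.
wealth_period_view = adjusted_benefit * ple_now   # what the factor holds constant
wealth_cohort_view = adjusted_benefit * cle_now   # what the cohort is expected to receive

print(f"factor: {factor:.3f}, adjusted benefit: {adjusted_benefit:,.0f}")
print(f"pension wealth, period view: {wealth_period_view:,.0f}")
print(f"pension wealth, cohort view: {wealth_cohort_view:,.0f}")
```

On this reading the factor stabilizes pension wealth only when measured with period life expectancy. Measured with the cohort value, pension wealth still exceeds the reference level by the ratio of CLE to PLE at retirement.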
Linking the normal retirement age to period life expectancy instead of CLE will extend working lives but will be insufficient to preserve actuarial neutrality between contributions and benefits, thus maintaining major distortions in individual labor supply and saving decisions, and forgoing the macroeconomic impact of higher levels of employment on investment, gross domestic product, consumption, and public finances. Moreover, the lower-than-consistent-with-actuarial-neutrality increase in the retirement age reduces the size of the positive effect on pension adequacy resulting from longer contribution careers and higher pension accruals.
Linking penalties (bonuses) for early (late) retirement to years of contributions and normal retirement age using period life expectancy measures does not ensure actuarial neutrality between contributions and benefits. Actuarial neutrality depends on the parameters that determine the annuity factor (survival probabilities, indexation rate, and discount rate), which reverts to life expectancy when the indexation rate equals the discount rate. Increases in life expectancy require higher penalties for early retirement and lower bonuses for late retirement to keep up with actuarial neutrality. Adopting a period approach in measuring life expectancy to assess the work incentives around retirement ages systematically underestimates the magnitude of the penalties (bonuses) for early (late) retirement needed to ensure actuarial neutrality and a fair share of longevity risk between generations, and is likely to reduce labor supply.
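The statement that the annuity factor reverts to life expectancy when indexation equals the discount rate can be checked with a short sketch. It assumes annual payments at the end of each year and a curtate life expectancy (the sum of the survival probabilities to each future year). These timing conventions, the function name, and the survival probabilities are hypothetical choices made here for illustration.

```python
from typing import Sequence


def annuity_factor(
    survival_probs: Sequence[float],
    indexation_rate: float,
    discount_rate: float,
) -> float:
    """Expected present value of one unit of benefit per year.

    survival_probs[k - 1] is the probability of being alive k years after
    retirement; payments are made at the end of each year and are indexed
    and discounted annually.
    """
    growth = (1.0 + indexation_rate) / (1.0 + discount_rate)
    return sum(p * growth**k for k, p in enumerate(survival_probs, start=1))


# Hypothetical (and deliberately short) survival schedule after retirement.
survival = [0.95, 0.85, 0.70, 0.50, 0.25, 0.05]
curtate_life_expectancy = sum(survival)

equal_rates = annuity_factor(survival, indexation_rate=0.02, discount_rate=0.02)
discounting_dominates = annuity_factor(survival, indexation_rate=0.0, discount_rate=0.03)

print(f"life expectancy: {curtate_life_expectancy:.2f}")
print(f"annuity factor, indexation = discount rate: {equal_rates:.2f}")
print(f"annuity factor, discount rate above indexation: {discounting_dominates:.2f}")
```

When the discount rate exceeds the indexation rate, the annuity factor falls below life expectancy, so a neutrality check based on life expectancy alone would no longer apply.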

Conclusion
While the general discussion about the change in life expectancy of the population and its past and projected future increases has finally reached policy makers and the public at large, more technical and political attention is needed on the selection of estimates of life expectancy and their application. This paper substantiates why it is so important to get life expectancy estimates right for pension policy, how it can be done, and how to overcome the key obstacles.
At the technical level, few arguments arise for not estimating and applying cohort life expectancies that take account of past and expected future declines in mortality rates. Technical issues remain regarding how best to estimate cohort values, and the best methods depend on high-quality data. But even for countries with excellent data quality, estimates seemingly differ across groups of countries. Some recent methods that take account of the acceleration in the decrease in mortality rates, particularly at higher ages, suggest a noticeable and rising underestimation of CLE by conventional methods. Yet the potential estimation errors that may emerge from a commonly applied approach (e.g., across the European Union) are likely to be dwarfed by the magnitude of the differences generated between the cohort and period approaches.
The paper presents estimates of such differences for five countries: three have official estimates of both cohort and period life expectancy (Australia, the United Kingdom, and the United States), while two have only official period tables, supplemented by cohort values estimated herein (Portugal and Spain). Significant differences are estimated between cohort and period life expectancies: differences reach 8-15 years at birth that are reduced to 2-4 years at age 65, and all projected differences tend to decrease over time but do not disappear. Differences in estimated life expectancies by gender are also found but are mostly moderate and are not systematic across countries or over time.
Translated into pension wealth, these differences in life expectancy at age 65 (around the median of current standard retirement ages) amount to an implicit subsidy to the average retiree that can reach 30% or more of the pension wealth of the current working generation in some countries. That is, using period life expectancy to calculate the initial benefit at retirement offers a too-generous benefit level that is not consistent with actual financial sustainability. To address the implications at a later stage will require additional contributions or budget transfers by future working generations or a partial default for those currently working (i.e., future retirees).
Selection of the correct life expectancy estimate is also increasingly important for day-to-day pension policy. Most countries have undertaken some kind of reform that legally links pension schemes to the development of the officially measured change in life expectancy. This naturally includes NDC countries and calculation of the initial benefit, but also includes the many NDB countries that have linked benefit levels and/or retirement ages to such a life expectancy measure, and the fewer countries that have a financial stability mechanism for their pension scheme. In all cases, inadequate choice of the life expectancy measure leads to incentive distortions and miscalculations of financial sustainability.
The proposed solution is simple as well as effective: convince governments that it is in their interest to apply the best estimates of cohort life expectancies. The estimation can be performed by national statistical offices in close cooperation with academic and partner institutions in other countries to compare, to learn, and to progress.