
Perhaps unexpectedly, the ratio of observed deaths to exposures does not decline much over time. The reason is that the average age of the population also increases over time: over the years, the number of people aged 90 increased considerably relative to the number of people aged 45. As shown before, a higher age increases mortality rates. Apparently, the two effects cancel each other out to some extent. Calculating the trend in observed mortality per age might therefore provide valuable insights, because it isolates the decrease in death rates at each age. The results are shown in Figure 5. It appears that, mostly for females, the logarithm decreases smoothly by approximately one. Note that Figures 3 and 4 show a decrease in log mortality of approximately 0.4.

Figure 5

The logs of observed deaths and exposures in the whole country per year for males (left) and females (right) at certain ages.

3.2.1 Simulating numbers of deaths

As we aim to investigate how much data is necessary for this model to perform appropriately, we construct data sets that differ in the number of years and in the years of exposure per year. The yearly numbers of deaths are simulated per age x, per pension fund i, per year t, per income class k and per gender g. As derived earlier, under the assumption of piecewise constant mortality, the yearly total number of deaths $D^i_{k,t,x}$ is Poisson distributed with a mean equal to the years of exposure $E^i_{k,t,x}$ multiplied by the fund-specific mortality rate. The fund-specific mortality rates are obtained by multiplying the baseline mortality $\dot{\mu}_{k,t,x}$ by fund-specific age-dependent factors $\dot{\Theta}^i_x$. The dots indicate that these quantities are presumed and do not have to be estimated in the model. Even though the numbers of deaths are simulated for both genders separately, the gender index is omitted for simplicity. The yearly total numbers of deaths are therefore drawn from a Poisson distribution as follows:

$$D^i_{k,t,x} \sim \text{Poisson}\left(E^i_{k,t,x}\,\dot{\mu}_{k,t,x}\,\dot{\Theta}^i_x\right). \qquad (25)$$
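The draw in Equation (25) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the array layout (funds, income classes, years, ages), the 46 ages, the six years and all placeholder values are hypothetical and are not taken from the data used here.

```python
import numpy as np

def simulate_deaths(exposure, mu, theta, rng):
    """Draw D ~ Poisson(E * mu * theta), as in Equation (25).

    exposure: shape (funds, classes, years, ages), years of exposure E
    mu:       shape (classes, years, ages), baseline mortality rates
    theta:    shape (funds, ages), fund-specific age-dependent factors
    """
    # Broadcast mu over funds, and theta over income classes and years.
    mean = exposure * mu[None, :, :, :] * theta[:, None, None, :]
    return rng.poisson(mean)

rng = np.random.default_rng(seed=1)

# Hypothetical dimensions and placeholder inputs (assumptions, not real data).
n_funds, n_classes, n_years, n_ages = 13, 7, 6, 46
exposure = np.full((n_funds, n_classes, n_years, n_ages), 20.0)
mu = np.full((n_classes, n_years, n_ages), 0.01)
theta = np.ones((n_funds, n_ages))

deaths = simulate_deaths(exposure, mu, theta, rng)
```

Each cell of `deaths` is an independent Poisson count, so small exposures produce many zeros, which is exactly the low-data regime the experiment is meant to probe.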

3.2.2 Exposure for data generating process

As the aim is to assess the performance of the model on data with characteristics similar to those of the available data, we simulate data sets with the same year groups and the same distribution of exposure over the ages. We simulate nine data sets, combining three levels of exposure per year with three numbers of historical years. The data set with the fewest historical years and the lowest exposures is similar to the data set obtained from the pension funds. As in the pension fund data, two groups of years are implemented: two pension funds have simulated historical data from the year 2011 onwards, and eleven pension funds from the year 2014 onwards, so that there are thirteen pension funds in total. Hence, the first group of years, pop1, consists of the years 2014 to 2019, and pop2 includes the years 2011 to 2013. In the two other varieties, we artificially expand group pop1. In the second variety, pop2 consists of the years 2007 to 2009, and pop1 includes the years 2010 to 2019. In the variety with the most historical years, the years 2003 to 2005 are in pop2, and the remaining years up to 2019 belong to pop1. Artificially expanding pop1 by four years is comparable to the data set that would result if four additional years of data from all pension funds could be used on top of the currently available data set, provided that all pension funds remain willing to participate.

There are also three varieties in terms of years of exposure. By construction, the exposure per pension fund is the same for all thirteen pension funds, all seven income classes and all years. However, to imitate the size of the aggregated pension fund data, the total years of exposure in the first variety is set approximately equal to that of the pension fund data and distributed over the ages accordingly. The total exposure is distributed over the ages as depicted in Figure 6.

Figure 6

The distribution of the years of exposure over the ages in the simulated data sets in fractions.

In this way, the total exposure for the year 2019 is approximately 88,000. For the two other varieties, the exposure is higher: the total exposures $E^i_{k,t,x}$ are simply multiplied by 11.9 and 142.9. The last variety thus corresponds to a yearly exposure of approximately 12,558,000, and the total exposures are equidistant on the logarithmic scale.
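The claim that the three exposure levels are equidistant on the logarithmic scale can be checked with a short calculation. The sketch below uses the rounded base of 88,000 quoted in the text, so the largest total it produces differs slightly from the 12,558,000 reported; the variable names are illustrative only.

```python
import math

base_total = 88_000                 # approximate yearly exposure, first variety (rounded)
multipliers = [1.0, 11.9, 142.9]    # scaling factors of the three varieties

totals = [base_total * m for m in multipliers]

# Distance between consecutive varieties on the log scale.
log_steps = [math.log(b / a) for a, b in zip(totals, totals[1:])]
```

Both entries of `log_steps` are close to 2.48, because 142.9 is approximately 11.9 squared.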

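3.2.3 Fund-specific mortality rates for data generating process

The fund-specific mortality rates given by $\dot{\mu}_{k,t,x}\dot{\Theta}^i_x$ are calculated as follows:

$$\dot{\mu}_{k,t,x}\,\dot{\Theta}^i_x = \exp\left(\dot{\alpha}_x + \dot{\beta}_x\dot{\kappa}_t + \dot{\gamma}_k\right)\dot{\Theta}^i_x.$$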

The parameters $\dot{\alpha}_x$, $\dot{\beta}_x$ and $\dot{\kappa}_t$ are obtained by applying the Lee-Carter model to the country-wide population, as discussed in the previous section. Whereas the frequentist estimates of $\dot{\alpha}_x$ and $\dot{\beta}_x$ are used directly, we adjust $\kappa_t$ by making it perfectly smooth: the value of $\dot{\kappa}_t$ is linear in t, with the maximum of $\kappa_t$ as the starting point and the minimum as the ending point. In this way, we ensure that the mortality rates strictly decrease over time. Also, Lee & Carter (1992) suggest applying the constraint that the $\kappa_t$ sum to zero, which is often done (Antonio et al., 2015, Brouhns et al., 2002, Czado et al., 2005, Renshaw & Haberman, 2006, Van Berkum, 2018). Under this constraint, $\beta_x$ affects the age-specific time trend but barely affects the average age-specific mortality rates. Accordingly, we subsequently subtract the mean from $\dot{\kappa}_t$ in order to meet this constraint.
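The smoothing and centring of the period index described above can be sketched as follows. The function name `smooth_kappa` and the input values are hypothetical; real values would come from the Lee-Carter fit on the country-wide population.

```python
import numpy as np

def smooth_kappa(kappa_hat):
    """Replace the estimated period index by a straight line running from
    its maximum to its minimum, then subtract the mean so it sums to zero."""
    kappa_hat = np.asarray(kappa_hat, dtype=float)
    line = np.linspace(kappa_hat.max(), kappa_hat.min(), num=kappa_hat.size)
    return line - line.mean()

# Made-up estimates for eight years (assumption, for illustration only).
kappa_hat = np.array([4.1, 4.6, 2.0, 2.4, 0.3, -1.9, -1.2, -3.8])
kappa_dot = smooth_kappa(kappa_hat)
```

The result is strictly decreasing in t and sums to zero, so with positive age loadings the simulated mortality rates fall every year while the average level stays governed by the age effect.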

As mentioned earlier, there are seven income classes. The dependence between the index of the income class and the log of mortality is assumed to be linear, decreasing from 1/12 to −1/12. In this way, the mortality rate of an individual from the lowest income class is approximately 18% higher than the mortality rate of an individual from the highest income class, since exp(1/6) ≈ 1.18. We set the value of $\dot{\gamma}_4$ to 0, so that the fourth income class can be interpreted as the benchmark.
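The income effect can be reproduced numerically. The sketch below reads the endpoints of the linear pattern as 1/12 and −1/12 on the log scale, an assumption that matches the stated difference of approximately 18% between the lowest and the highest income class.

```python
import numpy as np

n_classes = 7

# Linear income effect on log mortality, from the lowest to the highest class
# (endpoints 1/12 and -1/12 are an assumed reading of the text).
gamma = np.linspace(1 / 12, -1 / 12, n_classes)

# Mortality ratio of the lowest to the highest income class.
ratio = np.exp(gamma[0] - gamma[-1])
```

With seven equally spaced values, the middle entry `gamma[3]` (the fourth class) is zero up to floating-point error, and `ratio` equals exp(1/6), roughly 1.18.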

The parameter $\dot{\Theta}^i_x$ captures the heterogeneity in mortality among the pension funds. As no information is available, the patterns of $\dot{\Theta}^i_x$ are chosen to be smooth but otherwise arbitrary. The parameters are chosen such that their mean is equal to one for all ages. In this way, the $\dot{\Theta}^i_x$ are approximately equal to the actual ratio between the mortality rates of the pension fund and the baseline mortality. The two pension funds that are included in pop2 are the pension funds with the lowest factors. The parameters are shown in Figure 7, where the blue lines represent the factors of the two pension funds with the most historical years. We assume that these have the lowest mortality. The pension funds with the fewest historical years have the factors depicted by the red line and the green lines. The results for estimating the fund-specific trend are analyzed for one pension fund, whose factors are represented by the red line.
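One way to build smooth but arbitrary fund factors whose mean equals one at every age is sketched below. The age range 45 to 90, the linear shape of the patterns and the parameter ranges are all assumptions made for illustration; they are not the patterns shown in Figure 7, and the mean is taken across pension funds.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
ages = np.arange(45, 91)   # assumed age range
n_funds = 13

# Arbitrary smooth patterns: a fund-specific level plus a gentle linear tilt in age.
level = rng.uniform(0.8, 1.2, size=(n_funds, 1))
tilt = rng.uniform(-0.002, 0.002, size=(n_funds, 1))
raw = level + tilt * (ages - ages.mean())

# Rescale so that, at every age, the factors average to one across funds.
theta = raw / raw.mean(axis=0, keepdims=True)
```

Dividing by the cross-fund mean at each age keeps the baseline mortality interpretable as the average over funds, so each factor can be read as a fund's mortality relative to that baseline.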

Figure 7

The age-specific factors for all pension funds. The blue lines represent the factors of the two pension funds that have the most historical years. The red line and the green lines represent the factors of the pension funds with the fewest historical years, of which the red line corresponds to the pension fund for which the fund-specific mortality is estimated in the analysis.

After obtaining all values for $E^i_{k,t,x}$ and $D^i_{k,t,x}$, we calibrate the Bayesian model on the simulated data in order to recover the parameters that were used for simulation. When the simulated data set is sizeable, both in terms of the number of historical years and the years of exposure per year, the resulting densities of the parameters are expected to be centred on, and concentrated close to, the parameters used in the data generating process.