Sensitivity analysis as a tool to optimise Human Development Index

Research background: Composite indicators are commonly used as an approximation tool to measure economic development, the standard of living, competitiveness, fairness, effectiveness and many other phenomena, and they are willingly implemented in many different research disciplines. However, in most cases the variable weighting procedure is either avoided or erroneous, since so-called ‘weights by belief’ are applied. As research shows, weights frequently do not equal importance in composite indicators. As a result, biased rankings or groupings of objects are obtained.
Purpose of the article: The primary purpose of this article is to optimise and improve the Human Development Index, which is the most commonly used composite indicator for ranking countries in terms of their socio-economic development. The optimisation is done by re-scaling the current weights so that they express the real impact of every single component taken into consideration in the HDI calculation process.
Methods: In order to achieve the purpose mentioned above, sensitivity analysis tools (mainly the first-order sensitivity index) were used to determine the appropriate weights in the Human Development Index. In the HDI resilience evaluation process, Monte Carlo simulations and fully Bayesian Gaussian processes were applied. Based on the adjusted weights, a new ranking of countries was established and compared with the initial ranking using, among others, the Kendall tau correlation coefficient.
Findings & Value added: Based on the data published by UNDP for 2017, it has been shown that the Human Development Index is built incorrectly by putting equal weights on all of its components. The weights proposed by the sensitivity analysis better reflect the actual contribution of individual factors to HDI variability. The re-scaled Human Development Index constructed on the basis of the proposed weights allows for better differentiation of countries in terms of their socio-economic development.
Equilibrium. Quarterly Journal of Economics and Economic Policy, 14(3), 425–440


Introduction
The Human Development Index is probably the most prominent composite indicator ever used. This well-known index was created in 1990 by the United Nations Development Programme and has been published every year since then. As one can read on the UNDP website, 'The HDI was created to emphasize that people and their capabilities should be the ultimate criteria for assessing the development of a country, not economic growth alone' (UNDP, 2019). Therefore, it is often classified as a measure 'replacing GDP', and it ought to quantify social progress in a more direct way than GDP.
However, being prominent is not synonymous with being faultless and correct. Despite its popularity, the Human Development Index is perceived by Ravallion (2010, pp. 1-32) as one of the examples of 'mashup indices'. Ravallion (2010, pp. 1-32) defines mashup indices as those for which 'existing theory and practice provides little or no guidance for its design'. The lack of a strong theoretical background, both in terms of data selection and the aggregation function, is pointed out as the main problem of composite indicators. According to Ravallion (2010, pp. 1-32), many composite indicators were built while being constrained only by data availability, ending up as a set of processed data without useful meaning.
It should be noted that the concept of HDI has evolved, as a result of which some modifications to the HDI calculations were made in 2010 and 2014. To be more precise, since 2010 HDI has no longer been the arithmetic mean of three determinants: life expectancy at birth, adult literacy rate and real GDP per capita in PPP ($). The change in methodology was the result of criticism directed towards HDI (McGillivray, 1991, pp. 1461-1468; Sagar & Najam, 1998, pp. 249-264). This criticism referred mainly to combining variables that represent flow, stock, input and output, and to doubts about the normalisation and aggregation formulas used (Zavaleta & Tomkinson (Eds.), 2015, pp. 1-37). Currently, the Human Development Index consists of four variables arranged into three dimensions (Figure 1):
− long and healthy life - life expectancy at birth (in years) (LE),
− knowledge - mean years of schooling (in years) (MYS) and expected years of schooling (in years) (EYS),
− a decent standard of living - Gross National Income per capita (PPP US$) (GNI).
The HDI aggregation formula was also changed from the arithmetic mean to the geometric mean of the three dimension indices.
The primary purpose of this article is to check whether the change in methodology has eliminated the structural defects of HDI indicated by researchers (Despotis, 2005, pp. 969-980; Neumayer, 2001, pp. 101-114). Do the weights of individual variables truly reflect the significance of each factor? Going further, does the Human Development Index in its new form have a good discrimination ability? Additionally, does it precisely capture differences between countries in terms of their socio-economic development? In order to answer the above questions, sensitivity analysis was applied. Uncertainty and sensitivity analysis was previously used to investigate the correctness of HDI construction by Aguna and Kovacevic (2011, pp. 1-65). They conclude that 'the HDI is a relatively robust index with the most sensitivity exhibited to the choice of weights for income and education component' (Aguna & Kovacevic, 2011, p. 40). Nevertheless, they do not indicate what specific values the weights should take to reflect the real meaning of the HDI components. An attempt to fill this gap is also made in this article.
The paper is organised into five sections. Section 2 focuses on the literature review regarding composite indicators and the problems concerning weighting procedures. The third section describes the data and methodology used in the empirical research. The fourth section presents the results of the sensitivity analysis and the re-calculated Human Development Index values based on data from 2018. The final section concludes and outlines possibilities for further investigations.
It is worth mentioning that Bandura (2008, pp. 1-95) lists 178 indicators that aim to assess countries' performance in various areas of broadly understood socio-economic development. Some scientists harshly call this eagerness to measure everything at all costs 'measure-mania' (Diefenbach, 2009, p. 900) and the synthetic indices themselves 'mashup indices' (Ravallion, 2010, pp. 1-32).
It should not be surprising that in the vast majority of cases, the synthetic variable is created while evading the stage of variable weighting. That, basically, is tantamount to giving different determinants the same weights, tacitly assuming that they are equally crucial for the analysed phenomenon. In some cases, weights are given subjectively by researchers or based on experts' opinions. Relatively seldom are weights established on the basis of factor analysis (Zizka, 2013, pp. 1093-1098), principal component analysis (Perisic, 2015, pp. 29-42), multiple-criteria decision analysis (Pietrzak & Balcerzak, 2017, pp. 310-318), multidimensional IRT models (Gnaldi & Del Sarto, 2018, pp. 1139-1156), data envelopment analysis (Zhou et al., 2010, pp. 169-181) or regression analysis. Some researchers (Pietrzak, 2016, pp. 69-86), when analysing spatial objects, decided to assign weights based on spatial autocorrelation, but this does not solve the problem of weighting non-spatial objects.
The usefulness of sensitivity analysis in the evaluation of synthetic measures has been presented, among others, using the examples of the Technology Achievement Index (Saisana et al., 2005, pp. 307-323), the Resource Governance Index (Becker et al., 2017, pp. 12-22), the Good Country Index (Becker et al., 2017, pp. 12-22), the Water Retention Index (Becker et al., 2017, pp. 12-22), the Environmental Performance Index (Saisana & Saltelli, 2010, pp. 1-34) and the PISA ranking (Dobrota et al., 2015, pp. 41-58). In all of the papers mentioned above, the analysis carried out by the authors pointed to the existence of an erroneous assumption of equal weighting of partial variables.

Research methodology
As mentioned in the introduction, the Human Development Index is currently calculated as the geometric mean of three individual dimension indices (Zavaleta & Tomkinson (Eds.), 2015, pp. 11-14):
$$HDI = \sqrt[3]{I_{Health} \cdot I_{Education} \cdot I_{Income}}$$
where: $HDI$ - the value of the Human Development Index, $I_{Health}$ - health dimension index, $I_{Education}$ - education dimension index, $I_{Income}$ - income dimension index.
The individual dimension indices are calculated by normalising the underlying variables (LE, MYS, EYS and GNI) according to the formulas given in Zavaleta and Tomkinson (Eds.) (2015, pp. 11-14). Using a geometric mean instead of an arithmetic one made it possible to get rid of the flattening of results. However, it is still assumed that the health, education and income dimensions are equally relevant to countries' socio-economic development. Therefore, the so-called 'weights by belief' are still in place.
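The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the goalposts (minimum and maximum values) and the capping of each index at 1 are assumptions taken from the UNDP technical notes rather than from this paper, and the function names are hypothetical.

```python
import math

# ASSUMPTION: goalposts (min, max) follow the UNDP technical notes;
# they are not stated in this paper.
GOALPOSTS = {
    "LE": (20.0, 85.0),       # life expectancy at birth, years
    "EYS": (0.0, 18.0),       # expected years of schooling
    "MYS": (0.0, 15.0),       # mean years of schooling
    "GNI": (100.0, 75000.0),  # GNI per capita, PPP US$
}


def min_max(value, lo, hi):
    """Min-max normalisation, clipped to the [0, 1] interval."""
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)


def dimension_indices(le, eys, mys, gni):
    """Return the (health, education, income) dimension indices."""
    health = min_max(le, *GOALPOSTS["LE"])
    # Education: arithmetic mean of the two schooling sub-indices.
    education = (min_max(eys, *GOALPOSTS["EYS"]) + min_max(mys, *GOALPOSTS["MYS"])) / 2
    # Income: normalised on the logarithmic scale.
    lo, hi = GOALPOSTS["GNI"]
    income = min_max(math.log(gni), math.log(lo), math.log(hi))
    return health, education, income


def hdi(health, education, income):
    """Equally weighted geometric mean of the three dimension indices."""
    return (health * education * income) ** (1.0 / 3.0)


# Illustrative (made-up) country profile.
print(round(hdi(*dimension_indices(le=75.0, eys=14.0, mys=10.0, gni=20000.0)), 3))
```

The equal exponent of 1/3 on each dimension is exactly the 'weights by belief' assumption that the remainder of the paper questions.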
When building synthetic variables for any kind of object, the approach promoted by the Competence Centre on Composite Indicators and Scoreboards (COIN) may be useful. The approach promoted by COIN's members, based on applying sensitivity analysis in the process of constructing composite indicators, is also supported by Becker et al. (2016, pp. 1-33), Becker et al. (2017, pp. 12-22), Greco et al. (2019, pp. 61-94) and Paruolo et al. (2013, pp. 609-634).
The approach proposed by the researchers mentioned above is based on the use of Pearson's correlation ratio as a first-order sensitivity measure commonly applied in global sensitivity analysis (Paruolo et al., 2013, pp. 609-634). In that approach, a composite indicator is considered as an output variable, and its components are considered as input variables. A variance-based Pearson's correlation ratio then expresses the strength of the dependence between the output and an input variable, accounting for possible nonlinearity of the dependence (Becker et al., 2016, p. 3). The procedure follows Becker et al. (2017, pp. 13-15):
1. The composite indicator is understood as a, not necessarily linear, function of the determinants describing the analysed phenomenon:
$$y_j = f(x_{j1}, x_{j2}, \ldots, x_{jd}) + \varepsilon_j$$
where: $y_j$ - output variable, $x_{ji}$ - input variables, $\varepsilon_j$ - error term.
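2. Pearson's correlation ratio is used to measure the influence of each input variable, assuming that all other input variables are fixed:
$$S_i = \frac{V_{x_i}\left(E_{x_{\sim i}}(y \mid x_i)\right)}{V(y)}$$
In the subsequent optimisation of the weights, $\tilde{S}_i^{*}$ denotes the target normalised correlation ratio and $\tilde{S}_i$ the normalised correlation ratio.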
6. Calculating the re-scaled Human Development Index as a weighted geometric mean using the optimal weights.
7. Assessing the conformity of the HDI and re-scaled HDI rankings using the Kendall tau correlation coefficient.
The set of tools presented above makes it possible to answer the following research question: Does the 'new' version of HDI, keeping equal weights, fully reflect the actual significance of individual components?
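To make the first-order sensitivity measure more tangible, the sketch below estimates Pearson's correlation ratio, i.e. the share of the output variance explained by the conditional mean of the output given one input. It is only an illustration under stated assumptions: the paper estimates these effects with Gaussian processes, whereas this sketch swaps in a much cruder equal-count binning estimator, and the function names and the number of bins are hypothetical.

```python
from statistics import fmean, pvariance


def correlation_ratio(x, y, n_bins=10):
    """Binned estimate of Var(E[y | x]) / Var(y).

    ASSUMPTION: equal-count bins along x stand in for the smooth
    nonparametric fit used in the literature.
    """
    n = len(y)
    order = sorted(range(n), key=lambda k: x[k])
    total_mean = fmean(y)
    total_var = pvariance(y)
    between = 0.0
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue
        bin_mean = fmean([y[k] for k in idx])
        # Between-bin sum of squares, weighted by bin size.
        between += len(idx) * (bin_mean - total_mean) ** 2
    return between / (n * total_var)


def normalise(indices):
    """Rescale the sensitivity indices so that they sum to one."""
    total = sum(indices)
    return [s / total for s in indices]


# Illustrative use: an output that depends on x only.
x = [k / 99 for k in range(100)]
y = [v ** 2 for v in x]
print(round(correlation_ratio(x, y), 3))
```

Comparing the normalised indices with the intended shares (one third each under equal weighting) shows whether the nominal weights match the actual influence of the components.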

Results
Based on data retrieved from the United Nations Development Programme concerning the individual factors shaping HDI in 2018, it was investigated whether the three pillars of HDI share equal importance or whether their significance, as implied by the variance, is uneven. The procedure presented in the previous section was applied in all calculations. The data set includes statistics for 189 countries.
As it was mentioned before, the Human Development Index is currently calculated as a geometric mean of three sub-indices. The HDI's creators assumed that all components are equivalent. Referring to the terminology contained in the previous chapter, HDI will be denoted as an output variable and health, education and income indices as input variables.
Taking into consideration the relations presented in Figure 2, one can observe that both the output variable (HDI) and the input variables have negatively skewed distributions, which means that for all analysed variables more than 50% of countries have values higher than the average. Analysing the same figure, it can be observed that the strongest linear relation between the output variable (HDI) and the input variables occurs in the case of the GNI index. This suggests that the indicated variable will potentially have a more significant impact on the output variable. Table 2 is also worth attention: for each pair of variables there is a strong, statistically significant, positive correlation. It should be emphasised that HDI has the strongest correlation with GNI, although the coefficient is only slightly higher than in the case of the education index. Nevertheless, the most crucial stage of this analysis is to estimate the first-order indices correctly. The results included in Table 3 were obtained using the 'tgp' R package, and they present the estimated values of the correlated and uncorrelated main effect of each input variable on the output variable. One should bear in mind that, according to the intention of the creators, the impact of each variable should be equal, whereas it is not. As expected from the analysis of the previous data, the income index has the strongest influence on HDI, while the education index shows the weakest impact. It is, therefore, clear that there is no justification for giving them equal weights.
The lack of equality was, therefore, the premise for trying to establish adequate weights using a simplex search method. The Nelder-Mead method was used, as it does not require prior knowledge of trends in the analysed process. A comparison of the original and optimised weights is included in Table 4. It is not surprising that, as a result of the optimisation procedure, the highest weight was obtained for the income index and the lowest for the education index.
The change in the weighting system means that the re-scaled Human Development Index has better discrimination features (compare Table 1 and Table 5) without significantly changing the ordering of countries (see Table 6 and Figure 3), while maintaining the level of correlation between the sub-indices and the re-scaled HDI (see Table 7). The original HDI considerably flattens the differences in socio-economic development between the analysed 189 countries. Thus, HDI values used as an explanatory variable in other analyses may, due to their low variability, contribute little to the study. The re-scaled HDI, which largely maintains the original ranks, provides greater diversity and asymmetry of the composite indicator values.
Kendall's tau correlation coefficient, based on both rankings, reached the value of 0.969, which proves the high compatibility of the orderings. Among all the analysed countries, the average difference between the position in the HDI ranking and in the re-scaled HDI ranking is only 1.68, while 54 out of 189 countries have precisely the same position in both rankings. The most significant differences were observed for Kuwait (9 places), Ukraine, Equatorial Guinea and Eritrea (7 places), and Palau, Turkey and Maldives (5 places).
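The weight search can be illustrated with the following sketch, which should be read as one possible setup rather than the author's implementation. It assumes NumPy and SciPy are available, uses synthetic data instead of the UNDP figures, estimates the correlation ratio with a simple binning estimator instead of Gaussian processes, and takes equal target shares purely for illustration; the softmax parameterisation of the weights and all function names are likewise assumptions.

```python
import numpy as np
from scipy.optimize import minimize


def correlation_ratio(x, y, n_bins=10):
    """Binned estimate of Var(E[y | x]) / Var(y) (assumed estimator)."""
    bins = np.array_split(y[np.argsort(x)], n_bins)
    between = sum(len(b) * (b.mean() - y.mean()) ** 2 for b in bins)
    return between / (len(y) * y.var())


def normalised_indices(weights, X):
    """Normalised first-order indices of a weighted geometric mean."""
    y = np.exp(np.log(X) @ weights)  # weighted geometric mean, weights sum to 1
    s = np.array([correlation_ratio(X[:, i], y) for i in range(X.shape[1])])
    return s / s.sum()


def squared_gap(weights, X, target):
    """Sum of squared differences between target and actual shares."""
    return float(np.sum((target - normalised_indices(weights, X)) ** 2))


def to_weights(theta):
    """Softmax: maps unconstrained values to positive weights summing to 1."""
    e = np.exp(theta - theta.max())
    return e / e.sum()


def optimise_weights(X, target):
    """Nelder-Mead search; the start point corresponds to equal weights."""
    result = minimize(
        lambda theta: squared_gap(to_weights(theta), X, target),
        x0=np.zeros(X.shape[1]),
        method="Nelder-Mead",
        options={"xatol": 1e-4, "fatol": 1e-8},
    )
    return to_weights(result.x)


# Synthetic sub-indices with deliberately unequal spread (illustration only).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(0.2, 1.0, 300),
    rng.uniform(0.5, 1.0, 300),
    rng.uniform(0.8, 1.0, 300),
])
target = np.full(3, 1.0 / 3.0)  # ASSUMPTION: equal target shares
equal = np.full(3, 1.0 / 3.0)
w_opt = optimise_weights(X, target)
print("shares under equal weights:", normalised_indices(equal, X).round(3))
print("optimised weights:", w_opt.round(3))
```

Nelder-Mead is a derivative-free method, which suits an objective built from a nonparametric estimator; the softmax mapping is simply one convenient way to keep the weights positive and summing to one without adding explicit constraints.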
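The ranking comparison reported above relies on three simple statistics: Kendall's tau, the mean absolute rank shift and the number of unchanged positions. The sketch below computes them for two rankings without ties (the tau-a variant); the five-country rankings are made-up illustrative values, not the paper's data, and the function names are hypothetical.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a for two rankings of the same objects (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


def rank_shift_summary(rank_a, rank_b):
    """Return (mean absolute rank difference, number of unchanged positions)."""
    diffs = [abs(a - b) for a, b in zip(rank_a, rank_b)]
    return sum(diffs) / len(diffs), diffs.count(0)


# Hypothetical rankings of five countries (illustration only).
hdi_rank = [1, 2, 3, 4, 5]
rescaled_rank = [1, 2, 3, 5, 4]
print(kendall_tau(hdi_rank, rescaled_rank))         # 0.8
print(rank_shift_summary(hdi_rank, rescaled_rank))  # (0.4, 3)
```

With one swapped pair out of ten, nine pairs are concordant and one is discordant, which gives tau = (9 - 1) / 10 = 0.8.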

Discussion
The HDI concept has been the subject of criticism since the very beginning, mainly due to the limitation of socio-economic development to three dimensions of equal importance. As it was mentioned in the introduction, some researchers argue that the HDI is redundant, bringing no new information.
This study uses sensitivity analysis to check the stability of the HDI results from 2018. The results presented in the paper are consistent with the research by Mazouch et al. (2016, pp. 5-18), who state that this 'finding directly negates the base of the calculation of the index where all dimensions are supposed to be equal'. It turns out that the change in the methodology for calculating HDI (from the arithmetic to the geometric mean) did not make the treatment of individual indicators correspond to their actual importance. Sensitivity analysis is a remedy for this problem.

Conclusions
The analysis conducted in this paper indicated that equal weights in HDI construction are not the optimal solution. It seems, therefore, that Ravallion's statement that HDI is a 'mashup index' is not groundless. The article proposes adjusted weights that better illustrate the influence of each factor on the final ranking of countries in terms of their socio-economic development. Additionally, the re-scaled HDI has better discriminatory properties than its original version while maintaining statistically significant compatibility with the original ranks. The presented paper is another example of the usefulness of applying sensitivity analysis in the construction of composite indicators. The main disadvantage of the presented method is its high degree of complexity and the necessity of recalculating the weights each time. The recalculation is needed as the final set of weights is sensitive to the variance of the variables. It seems, however, that this is a justified effort because it allows for obtaining robust results and helps to avoid the most common defect in the use of composite indicators, i.e. the arbitrariness of weights.
The conducted analysis is the starting point for constructing the author's own measure of the standard of living at the regional level, in which the weights of individual measures will reflect their actual significance.
Source: author's study based on data from the United Nations Development Programme.
Source: Zavaleta and Tomkinson (Eds.) (2015, p. 5).
Figure 3. The relation between HDI and re-scaled HDI ranks