Because the EU-SILC estimates are based on sample surveys, point estimates should be accompanied by standard errors and confidence intervals, especially when indicators are used for policy purposes or when estimates are based on relatively small sample sizes. Moreover, the EU-SILC indicators are calculated not only for the national population but also for many sub-populations of interest. There is also an increasing demand for EU-SILC indicators at the regional level. Because such breakdowns rely on smaller sample sizes, the question of standard error estimation is even more pressing for them.
However, the computation of standard errors for EU-SILC estimates faces several challenges:
1. Standard error estimation for cross-sectional measures
Standard error estimation for cross-sectional measures is a common problem with an extensive literature. Osier (2009) implemented linearisation techniques with the package Poulpe to estimate standard errors for the main EU-SILC indicators. This method has strong advantages: the linearisation technique takes account of the complex non-linear structure of the EU-SILC indicators, while Poulpe is powerful enough to take account of the main sample design features (stratification, clustering and unequal probabilities of selection) as well as unit non-response and adjustments to external data sources (calibration). However, this solution turned out to be hard to implement from the second wave onwards, especially because of the rotational structure of the sample. Alternative strategies therefore need to be found.
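As an illustration of the linearisation idea only, the sketch below treats the simplest non-linear case, a weighted ratio, under strong simplifying assumptions: a single-stage design analysed as if sampled with replacement, with no stratification, clustering, non-response adjustment or calibration. It is not the Poulpe implementation, and the function name and the data are hypothetical. The EU-SILC poverty indicators are less smooth than a ratio and need the more general approach described in Osier (2009).

```python
from math import sqrt


def ratio_linearised_se(y, x, w):
    """Estimate R = sum(w*y) / sum(w*x) and its linearised standard error.

    Assumption (not from the source): single-stage design treated as
    with-replacement sampling, no strata, no calibration.
    """
    n = len(y)
    x_hat = sum(wk * xk for wk, xk in zip(w, x))
    y_hat = sum(wk * yk for wk, yk in zip(w, y))
    r_hat = y_hat / x_hat

    # Linearised variable: first-order Taylor expansion of the ratio.
    z = [(yk - r_hat * xk) / x_hat for yk, xk in zip(y, x)]

    # Weighted contributions; their total is zero by construction.
    u = [wk * zk for wk, zk in zip(w, z)]
    u_bar = sum(u) / n

    # With-replacement variance estimator of the total of the linearised variable.
    var = n / (n - 1) * sum((uk - u_bar) ** 2 for uk in u)
    return r_hat, sqrt(var)


# Hypothetical data: y flags a characteristic, x = 1 counts persons.
y = [1, 0, 1, 1, 0, 0, 1, 0]
x = [1] * 8
w = [100, 120, 80, 100, 110, 90, 100, 100]

rate, se = ratio_linearised_se(y, x, w)
print(f"estimated rate = {rate:.3f}, linearised SE = {se:.3f}")
```

The non-linear estimator is replaced by the total of a linearised variable, so any variance estimator for totals that reflects the actual design can be applied to it in place of the simplified one used here.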
2. Standard error estimation for measures of changes
In order to monitor progress towards agreed policy goals, we are interested in the evolution of social indicators. This is even more relevant in the context of the new Europe 2020 poverty reduction target indicators. However, interpreting differences between point estimates from different waves may be misleading. It is therefore necessary to estimate the standard error of these differences, taking account of the dependence between samples (Berger and Priam, 2011), in order to judge whether or not the observed differences are statistically significant.
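The role of the dependence between samples can be seen from the standard identity Var(θ2 − θ1) = Var(θ1) + Var(θ2) − 2·Cov(θ1, θ2). The sketch below applies it to hypothetical figures. The function names, the estimates, the standard errors and the correlation of 0.6 are all illustrative assumptions, and estimating that correlation from rotating panel data is precisely the hard part, which this sketch does not address.

```python
from math import sqrt


def se_of_change(se1, se2, corr):
    """Standard error of (estimate2 - estimate1) given the two standard
    errors and the correlation between the two cross-sectional estimates."""
    var = se1 ** 2 + se2 ** 2 - 2.0 * corr * se1 * se2
    return sqrt(max(var, 0.0))  # guard against tiny negative rounding errors


def is_significant(est1, est2, se1, se2, corr, z_crit=1.96):
    """Two-sided test at roughly the 5% level, normal approximation."""
    return abs(est2 - est1) > z_crit * se_of_change(se1, se2, corr)


# Hypothetical at-risk-of-poverty rates (in %) for two consecutive waves.
est1, est2 = 16.0, 15.0
se1, se2 = 0.5, 0.5

# Wrongly assuming independent samples (corr = 0).
print(se_of_change(se1, se2, 0.0), is_significant(est1, est2, se1, se2, 0.0))

# Allowing for a positive correlation from the overlapping sample.
print(se_of_change(se1, se2, 0.6), is_significant(est1, est2, se1, se2, 0.6))
```

With these illustrative numbers the one-point decrease is not significant under the independence assumption but becomes significant once the positive correlation is taken into account, which is why ignoring the overlap between waves can lead to the wrong conclusion.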
3. Standard error estimation for longitudinal measures
Even though the survey is now in its 8th wave (for countries that were part of the EU-SILC survey in 2003) and a large amount of longitudinal data has been collected, the issue of standard error estimation for EU-SILC longitudinal indicators (mainly the at-persistent-risk-of-poverty rate) has not been resolved.
4. Standard error estimation under imputation
In EU-SILC, income variables have been heavily imputed. A common mistake in standard error estimation is to treat imputed values as if they were exact, observed values. Doing so may lead to underestimating the variance, particularly when the proportion of missing values is not small.
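A toy example may make the underestimation concrete. The sketch below uses mean imputation on hypothetical income values and compares the naive standard error of the mean, computed as if the imputed values were observed, with the standard error based on respondents only. Mean imputation and simple random sampling are assumptions chosen for simplicity; they are not the imputation methods used in EU-SILC, and the respondent-only figure is a point of comparison, not a recommended estimator.

```python
from math import sqrt
from statistics import mean, variance


def se_mean(values):
    """Standard error of the mean under simple random sampling."""
    return sqrt(variance(values) / len(values))


# Hypothetical observed incomes (in thousands) and number of missing values.
observed = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0]
n_missing = 4

# Mean imputation: every missing value is replaced by the respondent mean.
imputed = observed + [mean(observed)] * n_missing

se_naive = se_mean(imputed)          # treats imputed values as exact
se_respondents = se_mean(observed)   # uses the six real observations only

print(f"naive SE = {se_naive:.3f}, respondent-only SE = {se_respondents:.3f}")
```

The imputed values add no spread but inflate the apparent sample size, so the naive standard error is always the smaller of the two in this setting, and the gap widens as the share of imputed values grows.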
5. Standard error estimation from users’ perspective (outside NSIs)
The current versions of the EU-SILC User Data Base (UDB) do not convey enough information to allow data users outside NSIs to compute reliable standard error estimates for a given set of indicators (Goedemé, 2010a, 2010b). More precisely, the issues are: the limited documentation of sample design variables, several shortcomings of the sample design variables available to Eurostat, and the limited availability of sample design variables for EU-SILC users outside NSIs.
This thematic workshop is due to be held at the Eurostat premises in Luxembourg on 29 and 30 March 2012. Participants are expected to come from the Commission, NSIs, universities and research centres. The objective is to discuss the main issues and recommendations in relation to standard error estimation in EU-SILC. On the basis of the workshop, a ‘handbook’ will be produced. The handbook will be very practical and solution-oriented. It will provide: (a) suggestions concerning concrete implementation procedures for computing standard errors at NSI level (production database) and at database user level, i.e. outside NSIs; (b) concrete recommendations for better recording of sample design variables (e.g. suitable documentation and metadata), after reviewing current practices for the sample design variables in the micro-data.
Berger, Y. G. and Priam, R. (2011). Estimation of Correlations between Cross-Sectional Estimates from Repeated Surveys - an Application to the Variance of Change. Proceedings of Statistics Canada Symposium, 2010.
Goedemé, T. (2010a). The standard error of estimates based on EU-SILC. An exploration through the Europe 2020 poverty indicators. Working Paper, Centrum voor Sociaal Beleid Herman Deleeck. http://www.centrumvoorsociaalbeleid.be/sites/default/files/CSB%20Working%20Paper%2010%2009_december%202010-1.pdf
Goedemé, T. (2010b). The construction and use of sample design variables in EU-SILC. A users’ perspective. Report prepared for Eurostat, November 2010, Antwerp, Herman Deleeck Centre for Social Policy, University of Antwerp, 16 p.
Osier, G. (2009). Variance estimation for complex indicators of poverty and inequality using linearization techniques. Survey Research Methods, Vol. 3, N°3 http://w4.ub.uni-konstanz.de/srm/article/view/369