ST2_1 Overlapping numerical variables without a benchmark: Integration of administrative sources and survey data through Hidden Markov Models for the production of labour statistics

Document date: Monday, 28 August 2017

The increased availability of large amounts of administrative information at the Italian Institute of Statistics (Istat) makes it necessary to investigate new methodological approaches for producing estimates that combine administrative data with statistical survey data.
Traditionally, administrative data have been used as auxiliary sources of information in different phases of the production process, such as sampling, calibration, and imputation. This classical approach, which could be described as supervised, relies on the assumption that the survey data provide correct measures of the target variables, at least after some data editing procedures have removed occasional measurement errors. Under this assumption, the use of external sources is essentially limited to reducing the sampling error, because the measures provided by administrative sources usually do not correspond to the target variables. On the other hand, although surveys are designed to meet statistical requirements, survey data can also be affected by measurement errors that may seriously compromise the accuracy of the target estimates.