In general, data validation is the process of ensuring that data are clean, correct and useful. Eurostat performs data validation by verifying whether data are in accordance with certain basic criteria that serve to assess the plausibility of the given data.
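As a rough illustration of checking data against basic plausibility criteria, the sketch below applies a few named rules to a single record and lists the rules it fails. The field names (`age`, `status`, `country`) and the rules themselves are hypothetical examples, not criteria taken from Eurostat.

```python
from typing import Callable

Record = dict[str, object]
Rule = tuple[str, Callable[[Record], bool]]

# Hypothetical plausibility rules; each returns True when the record passes.
RULES: list[Rule] = [
    ("age_in_range",
     lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120),
    ("employed_implies_adult",
     lambda r: r.get("status") != "employed"
     or (isinstance(r.get("age"), int) and r["age"] >= 15)),
    ("country_present",
     lambda r: bool(r.get("country"))),
]


def validate(record: Record) -> list[str]:
    """Return the names of all rules that the record violates."""
    return [name for name, check in RULES if not check(record)]


if __name__ == "__main__":
    print(validate({"age": 34, "status": "employed", "country": "NL"}))
    print(validate({"age": 7, "status": "employed", "country": ""}))
```

Keeping each rule as a named predicate means the list of failed rule names can be reported per record without changing the checking logic.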
The ESSnet ValiDat Integration examines different ways to implement a common infrastructure for data validation in the ESS. Our work includes theoretical groundwork, data structures and languages and their integration into the data validation subprocess of statistical production. We also look at the architecture and interoperability of distributed data validation in the ESS.
Validation is a key process in statistical production and has a major impact on data quality and productivity.
The ESSnet project ValiDat Foundation has worked on harmonisation and standardisation in this field.
Methods for balancing the national accounts: simple illustration of principle
Paper reviewed: Knottnerus, P. and C. van Duin (2006), Variances in Repeated Weighting with an Application to the Dutch Labour Force Survey. Journal of Official Statistics 22, pp. 565–584.
Di Consiglio L., Tuoto T. (2015). Coverage evaluation on probabilistically linked data, Journal of Official Statistics, Vol. 31, No. 3
S. Gerritse, P.G.M. van der Heijden, B.F.M. Bakker. Sensitivity of Population Size Estimation for Violating Parameter Assumptions in Log-linear Models. Journal of Official Statistics, Vol. 31, No. 3, 2015, pp. 357–379, http://dx.doi.org/10.1515/JOS-2015-0022.
Paper reviewed: Fosen, J. and Zhang, L.-C. (2011), Quality assessment of register-based census employment status, Proceedings of the International Statistical Institute, World Congress, Dublin.
Administrative data are used more and more in official statistics as a replacement for survey data.
When data sets are linked at individual level, for instance survey data with administrative data, often no unique linkage keys are available. In that case, probabilistic linkage may be used. With probabilistic linkage, linkage errors will occur. These errors may have an impact on subsequent statistical analysis.
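To make the idea of probabilistic linkage concrete, here is a minimal sketch of Fellegi-Sunter-style match weights, one common formulation; the text above does not say which model is used. The `m` and `u` probabilities (chance that a field agrees for a true match and for a non-match, respectively), the field names and the thresholds are all illustrative assumptions.

```python
import math


def match_weight(agreements: dict[str, bool],
                 m: dict[str, float],
                 u: dict[str, float]) -> float:
    """Sum log2 likelihood ratios over the compared fields.

    m[f]: assumed probability that field f agrees for a true match.
    u[f]: assumed probability that field f agrees for a non-match.
    """
    total = 0.0
    for field, agrees in agreements.items():
        if agrees:
            total += math.log2(m[field] / u[field])
        else:
            total += math.log2((1 - m[field]) / (1 - u[field]))
    return total


def classify(weight: float, lower: float = 0.0, upper: float = 4.0) -> str:
    """Hypothetical thresholds: link, send to clerical review, or reject."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "review"


if __name__ == "__main__":
    m = {"surname": 0.9, "birth_year": 0.9}
    u = {"surname": 0.1, "birth_year": 0.1}
    w = match_weight({"surname": True, "birth_year": False}, m, u)
    print(w, classify(w))
```

Because the decision rests on probabilities rather than a unique key, some pairs above the upper threshold will be false links and some below the lower threshold will be missed links; these are the linkage errors that can affect later analysis.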
Guarnera U., Varriale R. Estimation from contaminated multi-source data based on latent class models. Statistical Journal of the IAOS, vol. Preprint, no. Preprint, pp. 1-8, 2015
Darcy Steeg Morris, A Comparison of Methodologies for Classification of Administrative Records (Quality for Census Enumeration), JSM 2014 - Survey Research Methods Section, pp. 1729-1743, http://ww2.amstat.org/sections/SR
Schnetzer, M., Astleithner, F., Cetkovic, P., Humer, S., Lenk, M., and Moser, M. (2015), Quality Assessment of Imputations in Administrative Data, Journal of Official Statistics, Vol. 31, No. 2, pp. 231–247, http://dx.doi.org/10.1515/JOS-2015-0015
Asamer E., Astleithner F., Ćetković P., Humer S., Lenk M., Moser M. and Rechta H. (2016a): Quality Assessment for Register-based Statistics - Results for the Austrian Census 2011. Austrian Journal of Statistics Vol. 45, No. 2, pp.
The description of a generic, machine-readable validation report structure.