Statistical confidentiality and personal data protection - Access to microdata
Statistical data are subject to 2 data protection frameworks:
- the general personal data protection framework, applicable whenever information about individuals is collected for whatever purpose
- the specific framework for the protection of data collected for statistical purposes.
The general data protection framework applies to personal data collected for all kinds of purposes: administrative, commercial, statistical or any other. The General Data Protection Regulation strengthens the rights of data subjects and the obligations of data controllers (organisations that collect and process the data). The personal data protection aspects (data security, data traceability, data access) should be a key part of the design of any data collection. Further information is available here.
The protection of data collected for statistical purposes – statistical confidentiality – is a fundamental principle of official statistics. Statistical confidentiality means that data on individuals (or businesses) may be used only for statistical purposes and that rules and measures must be applied to prevent the disclosure of information on an individual or business entity.
Terms and definitions used in the personal data protection framework and the statistical framework
|Personal data protection framework||Statistical framework|
|‘Personal data’ means any information relating to an identified or identifiable natural person or data subject. An identifiable person is someone who can be identified, directly or indirectly, in particular with reference to an identification number or to one or more factors specific to their physical, physiological, mental, economic, cultural or social identity.||‘Confidential data’ means data that allow statistical units to be identified, either directly or indirectly, thereby disclosing individual information. To determine whether a statistical unit is identifiable, all relevant means that might reasonably be used by a third party to identify the statistical unit must be taken into account.|
|‘Data subject’ is the person whose personal data are collected, held or processed by the data controller.||‘Statistical unit’ means the basic observation unit, namely a natural person, household, economic operator and other undertakings, referred to by the data.|
|Purpose of data collection: data collected for different purposes.||Purpose of data collection: data collected for statistical purposes.|
|Scope: data on persons.||Scope: all data collected for statistical purposes on the basis of the applicable law; data on persons, households and businesses|
Examples of data in the scope and out of scope of the respective legal frameworks
Example 1: statistically confidential personal data
The statistical data collected through questionnaires are stored in files in which each record contains information about an individual respondent. These files are called microdata files. They act as the basis for compiling statistics or indicators. When these files contain information about natural persons and these persons are identifiable, the data fall under the scope of both statistical confidentiality and personal data protection.
Example 2: statistically confidential but not personal data
Business data are considered confidential if they lead to disclosure of information pertaining to a particular company. For example, the aggregated turnover of a specific type of company located in a given region would be considered confidential if this region only has 1 or 2 companies of this type. These data are subject to the statistical data protection framework (statistical confidentiality). They are outside the personal data protection framework because they concern legal persons, not natural persons.
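The turnover example can be illustrated with a minimal sketch of small-cell suppression. The minimum of 3 contributing companies is an assumption chosen only to match the "1 or 2 companies" example; the function name, field layout and sample figures are hypothetical, and the thresholds actually applied depend on the dataset.

```python
from collections import defaultdict

# Assumption: a cell is publishable only with at least this many companies.
MIN_CONTRIBUTORS = 3


def aggregate_turnover(records, min_contributors=MIN_CONTRIBUTORS):
    """Sum turnover per (region, company_type) cell.

    Cells with fewer than `min_contributors` companies are returned as None,
    meaning the aggregate is treated as confidential and not published.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, company_type, turnover in records:
        key = (region, company_type)
        totals[key] += turnover
        counts[key] += 1
    return {
        key: (totals[key] if counts[key] >= min_contributors else None)
        for key in totals
    }


# Hypothetical sample: two breweries in North, three in South.
records = [
    ("North", "brewery", 120.0),
    ("North", "brewery", 80.0),
    ("South", "brewery", 50.0),
    ("South", "brewery", 70.0),
    ("South", "brewery", 30.0),
]
published = aggregate_turnover(records)
```

In this sketch the North cell is suppressed because publishing the total for two companies would let each of them derive the other's turnover.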
Example 3: personal data but not statistically confidential data
Data on natural persons collected for purposes other than statistics (for example administrative or commercial purposes) fall under the scope of personal data protection but not under statistical confidentiality.
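The three examples can be summarised as a simple decision rule. The sketch below is a simplification for illustration only, not a legal test; the `Dataset` class, its field names and the framework labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    about_natural_persons: bool      # False for data on businesses (legal persons)
    units_identifiable: bool         # directly or indirectly identifiable
    collected_for_statistics: bool   # collected for statistical purposes


def applicable_frameworks(d: Dataset) -> set:
    """Return the protection frameworks that apply under the simplified rule."""
    frameworks = set()
    if d.about_natural_persons and d.units_identifiable:
        frameworks.add("personal data protection")
    if d.collected_for_statistics and d.units_identifiable:
        frameworks.add("statistical confidentiality")
    return frameworks


# Example 1: identifiable persons in survey microdata.
example_1 = applicable_frameworks(Dataset(True, True, True))
# Example 2: identifiable businesses in statistical data.
example_2 = applicable_frameworks(Dataset(False, True, True))
# Example 3: identifiable persons in data collected for other purposes.
example_3 = applicable_frameworks(Dataset(True, True, False))
```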
Why is the protection of personal data important in the context of access to microdata for scientific purposes?
Microdata files for researchers (scientific use files) contain information about people. These microdata are prepared to reduce the risk of respondents being identified. Microdata files released by Eurostat never contain direct identifiers like name, address or identification number. The information on respondents is reduced to ensure their anonymity. Example of protection measures applied to microdata (Labour Force Survey):
AGE: by 5-year bands
NATIONALITY/COUNTRY OF BIRTH: up to 15 predefined groups
NACE: at 1-digit level
ISCO: at 3-digit level
INCOME: only provided as (national) deciles and from 2009
HHNUM: household numbers are randomised per dataset, not allowing respondents to be tracked across time
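Some of the measures listed above can be sketched in code. This is an illustrative sketch only, not the procedure Eurostat uses: the function names, the record layout and the decile cut points are hypothetical, and the occupation code is assumed to be a 4-digit ISCO-08 string.

```python
import bisect
import random


def age_band(age, width=5):
    """Replace an exact age with its 5-year band, e.g. 37 -> '35-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def income_decile(income, cut_points):
    """Map an income to a decile 1..10, given nine ascending cut points."""
    return bisect.bisect_left(cut_points, income) + 1


def isco_3digit(code):
    """Keep only the first three digits of a 4-digit occupation code."""
    return code[:3]


def randomise_household_numbers(hhnums, rng):
    """Build a fresh random mapping of household numbers for one dataset.

    Members of the same household keep a common number within the dataset,
    but a new mapping per dataset prevents tracking across time.
    """
    unique = sorted(set(hhnums))
    labels = list(range(1, len(unique) + 1))
    rng.shuffle(labels)
    return dict(zip(unique, labels))


def protect_record(rec, decile_cuts, hh_map):
    """Apply the reductions to one respondent record (hypothetical layout)."""
    return {
        "AGE": age_band(rec["age"]),
        "ISCO": isco_3digit(rec["isco"]),
        "INCOME": income_decile(rec["income"], decile_cuts),
        "HHNUM": hh_map[rec["hhnum"]],
    }


# Hypothetical national decile boundaries and one respondent.
decile_cuts = [10, 20, 30, 40, 50, 60, 70, 80, 90]
hh_map = randomise_household_numbers([101, 101, 205, 309], random.Random())
protected = protect_record(
    {"age": 37, "isco": "2512", "income": 25, "hhnum": 101}, decile_cuts, hh_map
)
```

The protected record keeps enough detail for analysis while dropping the exact age, income and occupation that could single out a respondent.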
Microdata files for researchers fall within the scope of both the personal data protection framework and the statistical confidentiality framework. The authorised users of microdata (researchers who have fulfilled all conditions) are therefore bound by the same obligations as other recipients of personal data, for example to use the data for an agreed purpose, for a specific period of time and in compliance with security rules. The researchers also have to follow the requirements of EU statistical legislation, namely to use the data for scientific purposes only, to respect reliability and confidentiality thresholds, and to destroy the original data after use.
How is the protection of personal data ensured in the context of access to microdata for scientific purposes?
Eurostat provides access to microdata to researchers belonging to research entities (universities, research institutes, research departments or other organisations) that have been accredited by Eurostat. This accreditation is based on an assessment of the organisation applying for access and the purpose for which access is requested.
Once accredited, research entities sign an agreement with Eurostat. In line with the rules in force for the protection of personal data, the agreement distinguishes between:
- Recipients in jurisdictions recognised by the European Commission as providing an adequate level of personal data protection. These are the EU and European Economic Area (EEA) countries. In addition, the Commission has recognised Andorra, Argentina, Canada (commercial organisations), Faroe Islands, Guernsey, Israel, Isle of Man, Jersey, New Zealand, Switzerland and Uruguay as providing adequate protection.
- Recipients in other jurisdictions. For these recipients, the agreement template includes an additional commitment that recipients ‘have no reason to believe, at the time of entering into clauses, in the existence of any laws to which they are subject that would have a substantial adverse effect on the guarantees provided for under the clauses, and that they will inform Eurostat if they become aware of any such laws’.
The table below provides information on the legal acts for personal data protection and for statistical confidentiality applicable in the EU.
|Personal data protection laws||Statistical laws|
|Legal acts applicable in EU Member States||Legal acts applicable in the EU institutions||National - covering all data collected in the countries||European - covering European statistics|
|Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation - GDPR)||Regulation (EU) 2018/1725 of the European Parliament and of the Council of 23 October 2018 on the protection of natural persons with regard to the processing of personal data by the Union institutions, bodies, offices and agencies and on the free movement of such data||Separate laws in the EU/EEA/EFTA countries, more details can be found here (see Partners / European Union)||Regulation 223/2009 on European statistics. For microdata access: Regulation (EU) No 557/2013 on access to confidential data for scientific purposes|