Structure of Earnings Survey - Access to microdata

Structure of Earnings Survey (SES)

Description of the dataset

The Structure of Earnings Survey (SES) is conducted in EU Member States as well as EU candidate countries and European Free Trade Association (EFTA) countries.

The survey aims to provide accurate earnings data that are comparable across countries and over time, for policy-making and research purposes. It is a large sample survey of enterprises on the relationships between the level of pay and the individual characteristics of employees (sex, age, occupation, length of service, highest educational level attained, etc.) and those of their employer (economic activity, size and location of the enterprise).

The statistics refer to enterprises with at least 10 employees operating in all areas of the economy except public administration, as defined in the Statistical classification of economic activities in the European Community (NACE). The SES microdata cover the business activities in NACE Rev. 2 sections B to S excluding O (NACE Rev. 1.1 sections C to O excluding L until reference year 2006).

Some countries also voluntarily provide information on public administration (NACE Rev. 1.1 Section L until 2006 and NACE Rev. 2 Section O from 2010) and on enterprises with fewer than 10 employees.
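
As an illustration of the core coverage described above, the following minimal sketch checks whether an enterprise falls within the mandatory scope under NACE Rev. 2. The function name `in_core_scope` and its arguments are hypothetical and not part of any Eurostat tool; voluntary data on Section O or on smaller enterprises would fall outside this check.

```python
import string

# NACE Rev. 2 sections B to S, excluding O (public administration).
CORE_SECTIONS = set(string.ascii_uppercase[1:19]) - {"O"}


def in_core_scope(section: str, employees: int) -> bool:
    """Return True if an enterprise is in the mandatory SES scope.

    Hypothetical helper: `section` is a NACE Rev. 2 section letter and
    `employees` is the number of employees of the enterprise.
    """
    return section in CORE_SECTIONS and employees >= 10


print(in_core_scope("C", 25))   # manufacturing, 25 employees
print(in_core_scope("O", 500))  # public administration
```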

The national statistical institutes are responsible for selecting the sample, preparing the questionnaires, conducting the survey and forwarding the results to Eurostat in accordance with the common coding scheme. Eurostat then processes the data.

What data are available?

The 4-yearly SES microdata sets are available for reference years 2002, 2006, 2010, 2014 and 2018. SES 1995 is also available for a limited set of EU Member States (IE-ES-FR-IT-LU-SE).

  • The scientific-use files (SUFs) contain anonymised microdata for 2002, 2006, 2010, 2014 and 2018 from the following countries: BE-BG-CZ-DE (not in SES 2002) and EE-EL-ES-FR-HR-IT-CY-LV-LT-LU-HU-NL-PL-PT-RO-SK-FI-SE-UK-NO-IS (since 2018).
  • The non-anonymised SES microdata (secure-use files) can be accessed via Eurostat's Safe Centre in Luxembourg and contain data for the following years: 1995 (IE-ES-FR-IT-LU-SE), 2002, 2006, 2010, 2014 and 2018 (CZ-EE-IE-EL-ES-FR-HR-IT-CY-LV-LT-LU-HU-MT-NL-PL-PT-RO-SI-SK-FI-SE-NO-IS). The computers in the Safe Centre are equipped with a standard configuration (SAS, CSV, XLS and STATA) for accessing the microdata.

These general anonymisation rules are used to anonymise SES data in SUFs:

  • Recoding the categorical quasi-identifiers NACE, NUTS and SIZE of the enterprise so that the ratio between the number of sensitive NACE-NUTS-SIZE combinations (defined as those for which fewer than 3 enterprises exist in the Member State's sample) and the total number of combinations in the SES is below the threshold of 10% in the large majority of cases. The resulting coding mixes NACE sections, sub-sections or divisions and NUTS 0 or 1 levels as well as a maximum of 3 size classes (<50, 50 to 249 and 250+), depending on the Member State.
  • Removing citizenship and performing global recoding on the age variable (2.2) to restrict its values to 6 intervals: 14-19, 20-29, 30-39, 40-49, 50-59, 60+.
  • Providing additional protection to employees by applying unconstrained individual ranking micro-aggregation to the SES metric variables (absence days and earnings) in groups of at least 3 employees. This means that the latter variables are averaged for categories/combinations that include fewer than 3 employees in order to hide the information relating to individuals.
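
The first two rules above can be sketched as follows. This is a minimal illustration, not Eurostat's implementation: the function names, the tuple layout and the sample records are hypothetical, and the ratio is computed over the combinations observed in the sample, which is an assumption about how the "total number of combinations" is counted.

```python
from collections import Counter

# Age bands named in the rule above; the last band (60+) is open-ended.
AGE_BANDS = [(14, 19), (20, 29), (30, 39), (40, 49), (50, 59)]


def recode_age(age: int) -> str:
    """Map an age in years to one of the six published intervals."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    if age >= 60:
        return "60+"
    raise ValueError(f"age {age} is below the lowest band")


def sensitive_ratio(enterprises, min_count=3):
    """Share of observed NACE-NUTS-SIZE combinations that are sensitive.

    `enterprises` holds one (nace, nuts, size) tuple per enterprise.
    A combination is sensitive if fewer than `min_count` enterprises share it.
    """
    counts = Counter(enterprises)
    if not counts:
        return 0.0
    sensitive = sum(1 for n in counts.values() if n < min_count)
    return sensitive / len(counts)


# Hypothetical sample: three combinations, two of them with fewer than 3 enterprises.
sample = [
    ("C", "DE1", "<50"), ("C", "DE1", "<50"), ("C", "DE1", "<50"),
    ("C", "DE2", "250+"),
    ("G", "DE1", "50-249"), ("G", "DE1", "50-249"),
]

print(recode_age(34))
print(sensitive_ratio(sample) < 0.10)  # does the sample meet the 10% threshold?
```

In this sample two of the three combinations are sensitive, so the ratio is well above 10%; in practice the categories would be coarsened (for example, merging size classes or moving from NUTS 1 to NUTS 0) and the ratio recomputed.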

Variables to which micro-aggregation is applied:

 

Variable code   Variable label
B33             Annual days of holiday leave (in full days)
B34             Other annual days of paid absence
B41             Gross annual earnings in the reference year
B411            Annual bonuses and allowances not paid in each pay period
B42             Gross earnings in reference month
B421            Earnings related to overtime
B422            Special payments for shift work
B43             Average gross hourly earnings in the reference month (to 2 decimal places)

 

  • Suppressing the grossing-up factor for local units (A51) and removing the key identifying the enterprise (KEY_B).
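
The micro-aggregation rule above can be illustrated with a minimal sketch of individual ranking: each variable is sorted on its own, consecutive records are grouped in sets of at least 3, and every value is replaced by its group mean. The function name and the sample values are hypothetical, the handling of the final short group is an assumption, and the exact procedure used for the SUFs may differ.

```python
def microaggregate(values, k=3):
    """Individual-ranking micro-aggregation of one numeric variable.

    Records are ranked by value and split into consecutive groups of k.
    A tail of fewer than k records is folded into the preceding group, so
    every group has at least k members (unless there are fewer than k
    records overall, in which case they form a single group).
    Returns the group means in the original record order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # remaining tail too short to form its own group
            end = n
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
        start = end
    return result


# Hypothetical gross monthly earnings (variable B42) for seven employees.
b42 = [10, 50, 20, 40, 30, 60, 70]
print(microaggregate(b42))
```

Each of the listed variables would be processed independently in this way. Group means preserve the total of the variable while no published value corresponds to a single employee.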

In addition to the general anonymisation rules, some countries requested further specific anonymisation to protect individual data. Precise information is provided together with the SUFs.

Economic activities according to NACE are provided in the SUFs at section level (1-digit level) and at division level (2-digit level). At division level, further aggregation is performed to protect sensitive information (e.g. NACE divisions 23, 24 and 25 are merged into group 23_to_25).
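
A minimal sketch of this kind of division-level aggregation is shown below. Only the 23_to_25 grouping comes from the example above; the mapping name, the function and the pass-through behaviour for other divisions are assumptions, since the actual groupings are documented with each SUF.

```python
# Only the grouping named in the text; real SUFs define further groupings.
MERGED_DIVISIONS = {
    "23": "23_to_25",
    "24": "23_to_25",
    "25": "23_to_25",
}


def recode_nace_division(division: str) -> str:
    """Return the aggregated group for a NACE division, or the division itself."""
    return MERGED_DIVISIONS.get(division, division)


print(recode_nace_division("24"))
print(recode_nace_division("10"))
```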

A regional breakdown (variable A11) is available at NUTS 1 level. However, for some countries the variable is recoded to protect sensitive information.