
Census 2021 round (cens_21)


National Reference Metadata in Euro SDMX Metadata Structure (ESMS)

Compiling agency: Statistics Estonia


The data present the results of the 2021 EU census on population and housing, following Regulation (EC) 763/2008, Regulation (EU) 2017/543, Regulation (EU) 2017/712 and Regulation (EU) 2017/881.

27 December 2022

The information is given separately for each census topic. See the sub-concepts 3.4.1 - 3.4.37.

The EU programme for the 2021 population and housing censuses includes data on persons, private households, family nuclei, conventional dwellings and living quarters.

The persons enumerated in the 2021 census are those who were usually resident in the territory of the reporting country at the census reference date.

According to Regulation (EU) No 1260/2013 on European demographic statistics, article 2(c) and 2(d), ‘Usual residence’ means the place where a person normally spends the daily period of rest, regardless of temporary absences for purposes of recreation, holidays, visits to friends and relatives, business, medical treatment or religious pilgrimage. The following persons alone shall be considered to be usual residents of a specific geographical area:

  1. those who have lived in their place of usual residence for a continuous period of at least 12 months before the reference time; or
  2. those who arrived in their place of usual residence during the 12 months before the reference time with the intention of staying there for at least one year.
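
As a minimal sketch, the two criteria above can be read as a simple decision rule. The function name, its parameters and the assumption that the person has lived continuously at the place since the arrival date are illustrative only and are not part of the Regulation or of the Estonian methodology, which relies on the residency index described below.

```python
from datetime import date


def is_usual_resident(arrival: date, reference: date,
                      intends_to_stay_one_year: bool) -> bool:
    """Apply the two usual-residence criteria to one person.

    Assumes (hypothetically) continuous residence since `arrival`.
    """
    if arrival > reference:
        return False  # not yet living there at the reference time

    # Date twelve months before the reference time.
    try:
        cutoff = reference.replace(year=reference.year - 1)
    except ValueError:
        # Reference date is 29 February; fall back to 28 February.
        cutoff = reference.replace(year=reference.year - 1, day=28)

    if arrival <= cutoff:
        return True  # criterion 1: at least 12 months of residence
    return intends_to_stay_one_year  # criterion 2: recent arrival


reference_date = date(2021, 12, 31)
print(is_usual_resident(date(2021, 6, 1), reference_date, True))
```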

The usually resident population is determined using the residency index, a methodology based on the sign-of-life approach. More information: Implementation of the residency index in demographic statistics.

The method assigns each potential inhabitant of Estonia an index that expresses the person’s likelihood of being a permanent inhabitant of Estonia, i.e. a resident. The value of the index ranges between 0 and 1; the greater the value, the more likely it is that the person is a resident of Estonia. A threshold is used to distinguish residents from non-residents: those whose index value is above the threshold are considered residents.

To calculate the index, a wide range of Estonian administrative registers and sub-registers are used, including the Estonian Education Information System, the State Pension Insurance Register, the health insurance database, etc. Each register or sub-register gives a person one so-called sign of life. The signs of life are not equally informative, so each sign of life has been assigned a weight. For example, a person who permanently lives in a care home in Estonia is a definite resident, whereas an Estonian driving licence may also be issued to a person who has come here for a shorter period.

All persons whose index was 0 at the beginning of the year and 1 at the end of the year are recorded as having been born in or having immigrated to Estonia in the respective year. In the opposite situation, where a person’s index was 1 at the beginning of the year and 0 at the end of the year, the person is considered to have died or emigrated in the respective year. To distinguish migration from births and deaths, register data are used and supplemented with the data of the Police and Border Guard Board.

An internal migration event occurs if the person’s place of residence at the beginning of the year differs from the place of residence at the end of the year (in the case of a death, if the place of residence at the beginning of the year differs from the place of death; in the case of a birth, if the place of residence of the mother differs from the place of residence at the end of the year).
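
The logic of the index can be illustrated with a small sketch. The source does not give the formula, the weights or the threshold, so everything numeric below is a hypothetical placeholder, and the capped weighted sum is only one simple way to obtain a value between 0 and 1; it should not be read as the method used by Statistics Estonia.

```python
# Hypothetical weights per sign of life (illustration only).
SIGN_WEIGHTS = {
    "care_home": 1.0,
    "pension_register": 0.7,
    "education_system": 0.6,
    "health_insurance": 0.4,
    "driving_licence": 0.1,
}
THRESHOLD = 0.5  # hypothetical threshold


def residency_index(signs_of_life):
    """Return a value in [0, 1]: capped sum of the weights of distinct signs."""
    total = sum(SIGN_WEIGHTS.get(sign, 0.0) for sign in set(signs_of_life))
    return min(1.0, total)


def is_resident(signs_of_life, threshold=THRESHOLD):
    """A person counts as a resident when the index exceeds the threshold."""
    return residency_index(signs_of_life) > threshold


def classify_year(resident_at_start, resident_at_end):
    """Derive the demographic event implied by a change in resident status."""
    if not resident_at_start and resident_at_end:
        return "birth_or_immigration"
    if resident_at_start and not resident_at_end:
        return "death_or_emigration"
    return "no_change"


start = is_resident(["driving_licence"])
end = is_resident(["driving_licence", "pension_register"])
print(classify_year(start, end))
```

In this sketch a driving licence alone does not make a person a resident, while residence in a care home does, which mirrors the example given in the text. Separating births from immigration and deaths from emigration would need the additional register data mentioned above.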

The usually resident population includes all population groups who have enough signs of life during the reference year. This includes tertiary-level students, people without a permanent address (e.g. homeless people), asylum seekers, refugees and people under temporary protection.

Data are available at different levels of geographical detail in EU countries: national, NUTS2/NUTS3 regions, local administrative units (LAU) and grids.

Information is provided in the sub-concepts 5.1 - 5.3.

This section describes the methodology used to estimate data on the topic. The sub-concepts 13.1.1 - 13.1.35 describe the methodological steps taken when compiling the data for a topic and, where relevant, highlight the number of unknowns.

Counts of statistical units should be expressed in numbers and, where needed, as rates per inhabitants enumerated in the country.

Data compilation is described in detail in the Methodology report. 

The R programming language was used to process the data.

Principles of data processing: 

  • The population is formed at person/dwelling/household/family level
  • Data for each sub-domain (person/dwelling/household/family) is held in separate databases but can be linked
  • When multiple data sources are used for a topic, they are used in specific priority order defined separately for each topic 
  • Document-based data sources are preferred over survey-based data sources
  • For some topics, age checks are applied when assigning information from data sources. For example, age checks are carried out for the education topic, which means that if the level of education indicated in the source is too high considering the person’s age, this information is disregarded. 
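
The priority-order and age-check principles above can be sketched as follows. The source names, education levels and minimum ages are hypothetical placeholders, since the actual priority orders and age limits are defined per topic and are not listed here. Python is used for illustration only, although the production processing was done in R.

```python
# Hypothetical minimum plausible age for each education level.
MIN_AGE_FOR_LEVEL = {"primary": 10, "secondary": 15, "tertiary": 20}


def resolve_education(age, sources_in_priority_order):
    """Return (source, level) from the first source with a plausible value.

    `sources_in_priority_order` is a list of (source_name, level) pairs,
    highest priority first; `level` is None when the source has no value.
    """
    for source_name, level in sources_in_priority_order:
        if level is None:
            continue  # this source has nothing for the person
        min_age = MIN_AGE_FOR_LEVEL.get(level)
        if min_age is not None and age < min_age:
            continue  # level too high for the person's age: disregard
        return source_name, level
    return None, None  # value stays unknown


# A 12-year-old with an implausible tertiary record in the first source.
print(resolve_education(12, [("register_a", "tertiary"),
                             ("register_b", "primary")]))
```

Here the implausible value from the higher-priority source is skipped and the next source in the priority order is used instead.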

Models used: see sub-concept 18.6. Adjustment. 

 

Capturing: Administrative data are received via X-Road, an FTP-server and by (encrypted) e-mail.

Coding:  Where applicable, data in the databases is coded according to the classifications:

Identifying variable(s): Administrative data sources are used in annual statistics across all fields at Statistics Estonia. Therefore, there is collective institutional knowledge about the data sources, which is used when identifying new potential data sources and the variables they contain. In addition, annual population statistics are based on the same set of data sources as the census output. There is also a public administration system for state information, which improves the reuse and findability of data.

Record editing and record deletion: Source data (administrative data sources) are never edited. Processed and cleaned data is used separately from the original data.

Record imputation and estimation: Imputation of data was needed for the topics Occupation and Industry. If the occupation of an employed person remained unknown after combining data from data sources, the information was assigned on the basis of persons with known occupations (excluding armed forces occupations). The results were compared against the same topics from the Labour Force Survey. No outliers or abnormalities were detected.
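
The exact imputation procedure is described only in general terms here. As one possible reading, the sketch below fills unknown occupations by drawing at random from the occupations of persons with known values, leaving armed forces occupations out of the donor pool. The random-donor approach, the function name and the use of ISCO-08 major-group codes (where "0" denotes armed forces occupations) are assumptions made for illustration.

```python
import random

ARMED_FORCES = "0"  # ISCO-08 major group 0; the coding used here is assumed


def impute_occupation(occupations, seed=0):
    """Fill unknown occupations (None) from the pool of known ones.

    `occupations` maps a person identifier to an occupation code or None.
    Donors are drawn with their observed frequency, so common occupations
    are imputed more often. Armed forces codes are never used as donors.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    donors = [code for code in occupations.values()
              if code is not None and code != ARMED_FORCES]
    if not donors:
        return dict(occupations)  # nothing to impute from
    return {
        person: code if code is not None else rng.choice(donors)
        for person, code in occupations.items()
    }


employed = {"p1": "2", "p2": "0", "p3": None, "p4": "5"}
completed = impute_occupation(employed)
print(completed["p3"] in {"2", "5"})
```

A production method would more likely condition the draw on auxiliary characteristics such as industry, education or age; the unconditional draw is used here only to keep the example short.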

Record linkage including identifying variable(s) used for the record linkage: To compile high-quality statistics for the census topics, data needed to be combined from over 30 data sources (see 18.1. Source data). Common denominators such as (pseudonymised) personal identification codes are used to link data from various sources.
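
Linking on a common key can be sketched as below. The source names, the key values and the dictionary layout are hypothetical, and the example assumes that the pseudonymised code is identical across sources.

```python
def link_sources(population_ids, sources):
    """Attach values from each source to the persons in the population.

    `population_ids` is an iterable of pseudonymised person codes.
    `sources` maps a source name to a dict of {person_code: value}.
    Records for codes outside the population are ignored.
    """
    linked = {pid: {} for pid in population_ids}
    for source_name, records in sources.items():
        for pid, value in records.items():
            if pid in linked:
                linked[pid][source_name] = value
    return linked


population = ["a1f3", "b7c9"]  # hypothetical pseudonymised codes
sources = {
    "education_register": {"a1f3": "tertiary", "zzzz": "primary"},
    "employment_register": {"a1f3": "employed", "b7c9": "employed"},
}
print(link_sources(population, sources))
```

Persons without a record in a given source simply have no entry for it, which corresponds to the value staying unknown unless another source supplies it.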

Generation of households and families: A partnership and location index is used to compile usual residence, household and family statistics. See Section 13.1.6. Accuracy overall - Household status.

Measures to identify or limit unit-no-information: A combination of data sources was used to minimise the number of unknowns for each topic. However, when the information was not present in any of the data sources (including previous censuses), the value remained unknown. The percentage of unknowns can be reduced when additional data sources become available. 

Post-enumeration survey: No post-enumeration survey was carried out. Over-coverage of the census population was analysed using data from survey samples, and under-coverage was analysed using the 2021 census sample survey results. See section 11.2.1. Coverage assessment.

 See below.

Decennial.

Data retrieved from the registers refer to the census reference date (31 December 2021) or to the year 2021.

For all topics, data is comparable across all administrative levels.

There are no regional differences in the quality of the geographical data.

Not applicable.