Reference metadata describe statistical concepts and methodologies used for the collection and generation of data. They provide information on data quality and, since they are strongly content-oriented, assist users in interpreting the data. Reference metadata, unlike structural metadata, can be decoupled from the data.
Description of the main topics disseminated for population, households, families, living quarters and dwellings.
3.1.1. Impact of the COVID-19 pandemic on census methodology
The 2021 National Census of Population and Housing was implemented in particularly difficult conditions.
As a result of the epidemic caused by the SARS-CoV-2 virus and the associated restrictions, the Act on the National Census of Population and Housing 2021 had to be amended. The amendments related, inter alia, to extending the census duration by three months (until 30 September 2021 instead of 30 June 2021, as originally planned) and to increasing the efficiency of the census methods, i.e. the ways of obtaining data from natural persons included in the census, in particular in cases where some methods could not be used. Extending the data collection deadline to the end of Q3 2021 made it possible to use a period of significantly lower epidemic risk than Q2 2021 and to carry out the census with very good completeness.
The COVID-19 pandemic had no impact on the census methodology.
3.2. Classification system
The 2021 Population and Housing Census used the following population groups and classifications:
Groupings and classifications, as defined in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017 laying down the rules for the application of Regulation (EC) No 763/2008 on population and housing censuses concerning the technical specifications of the topics and their breakdowns;
International Standard Classification of Education (ISCED 2011);
International Standard ISO 3166-1, Codes for the representation of names of countries and their subdivisions, Part 1: Country code (ISO 3166-1:2020);
ISCO-08, NACE Rev. 2;
Classification of Territorial Units for Statistics (NUTS).
3.3. Coverage - sector
Not applicable.
3.4. Statistical concepts and definitions
The information is given separately for each census topic. See the sub-concepts 3.4.1 - 3.4.37.
3.4.1. Statistical concepts and definitions - Usual residence
Data for the topic ‘Place of usual residence’ were processed as specified in Regulation 2017/543.
Residents include:
a. permanent residents, with the exception of persons staying outside their place of residence for a period of at least 12 months – regardless of their place of residence (in Poland or abroad);
An exception was made for Polish diplomats at institutions and soldiers stationed on foreign military missions, who, despite being abroad, were counted as permanent residents of Poland.
b. persons temporarily resident in the municipality for a period of at least 12 months, arriving from another place in Poland or abroad (foreigners without permanent residence in Poland).
The following were used as a criterion for population movement when separating categories of residents: study, work, family and housing conditions, treatment and rehabilitation, stay in a nursing home. This means that people staying, for example, in prisons or detention centres – regardless of the time of absence – are counted as residents of the places where they lived before their “forced” departure.
Students were considered residents of the place (municipality) of their studies; those studying abroad were not counted among the residents of Poland.
Foreigners studying in Poland were included as residents of Poland. Residents of the commune of permanent residence (family house) included young people studying in secondary schools, regardless of the location of the school.
Homeless people were counted among the residents of the gmina in which they were enumerated.
Residents of the commune, in which collective living quarters were located, included permanent residents of Poland living in collective living quarters (CLQ). Foreigners staying in CLQ were considered residents if their actual or planned period of residence was at least 1 year. If it was less than 12 months, they were not considered to be residents of Poland.
If a person regularly lives in more than one place during a year, the place where he or she spends most of the year is deemed to be his or her place of usual residence, whether it is elsewhere in the same country or abroad. However, a person who works away from home during the week and returns for the weekend to the home where he or she lives with his or her family is considered to reside in the family home, regardless of whether his or her place of work is elsewhere in the same country or abroad.
On the basis of the definition of place of residence, persons residing in the place of enumeration who, at the time of the census, are or will be absent for less than one year are considered temporarily absent and are included in the total population. By contrast, persons who live or will live outside the place of enumeration for one year or longer are not considered temporarily absent and are therefore excluded from the total population. This is independent of the duration of any visits to the family.
Persons who are enumerated but do not meet the criteria for usual residence in the place of enumeration, i.e. persons who do not live or will not live there for a continuous period of at least 12 months, are considered temporarily present and are therefore not included in the total population residing in the place concerned.
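The 12-month rule described above can be illustrated with a minimal sketch. The function name, its parameters and the label for long-term absentees are hypothetical, and the special cases mentioned in this section (diplomats, soldiers on foreign missions, persons in prisons, secondary-school pupils) are deliberately left out.

```python
RESIDENCE_THRESHOLD_MONTHS = 12  # threshold stated in the text


def population_status(stay_months, absence_months=0):
    """Return (label, counted_in_total_population) for one enumerated person.

    stay_months: actual or intended continuous stay at the place of enumeration.
    absence_months: actual or intended absence from that place (0 if present).
    Both parameters are illustrative assumptions, not census variables.
    """
    if stay_months < RESIDENCE_THRESHOLD_MONTHS:
        # Enumerated here, but not living here for at least 12 months.
        return ("temporarily present", False)
    if absence_months == 0:
        return ("resident", True)
    if absence_months < RESIDENCE_THRESHOLD_MONTHS:
        # Absent for less than one year: still part of the total population.
        return ("temporarily absent", True)
    # Hypothetical label: the text only says such persons are excluded.
    return ("absent for one year or longer", False)
```

For example, `population_status(24, absence_months=3)` would be treated as temporarily absent and counted, whereas `population_status(6)` would be treated as temporarily present and not counted.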
3.4.2. Statistical concepts and definitions - Sex
Males, females
3.4.3. Statistical concepts and definitions - Age
The age of individuals is the number of completed years, determined by comparing the full date of birth with the date of the census (known as the critical moment, March 31, 2021).
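As an illustration only, age in completed years at the critical moment could be computed as follows. The function and constant names are hypothetical; only the reference date is taken from the text.

```python
from datetime import date

CENSUS_REFERENCE_DATE = date(2021, 3, 31)  # critical moment stated in the text


def completed_years(date_of_birth, reference_date=CENSUS_REFERENCE_DATE):
    """Return the age in completed years at the reference date."""
    if date_of_birth > reference_date:
        raise ValueError("date of birth is after the reference date")
    years = reference_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

A person born on 1 April 2000 has completed 20 years at the critical moment, while a person born on 31 March 2000 has completed 21.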
3.4.4. Statistical concepts and definitions - Marital status
Marital status: in Polish law, marriage is defined as the union of a man and a woman.
Legal marital status was determined for persons aged 15 or more and was defined as marital status according to the law in force in Poland (the Law on Civil Status Acts). Under Polish law, there are four categories of marital status:
single – persons who have never been legally married;
married – persons whose marriage was contracted in accordance with secular law;
widowed – persons whose legal marriage has ceased to exist because of the death of a spouse;
divorced – persons whose marriage has been dissolved by a court decision.
Persons who have been legally separated by a court continue to be classified as married.
Polish law allows women to marry from the age of 16 and men from the age of 18. Polish law does not provide for same-sex marriages; therefore, the 2021 Population and Housing Census did not collect data on persons in same-sex marriages. Polish legislation also does not provide for registered partnerships; therefore, data on persons in registered partnerships were not collected.
3.4.5. Statistical concepts and definitions - Family status
A family nucleus is defined in the narrow sense as two or more persons who live in the same household and who are related as husband and wife, as cohabiting partners, or as parent and child.
According to the above definition, the following types of biological families are distinguished:
marriage without children,
marriage with children,
informal relationships without children,
informal relationships with children,
lone mother with children
lone father with children.
A child is defined as a person of any age who remains in the household together with both parents or one of them. Stepsons/stepdaughters and adopted children are also counted as children.
A person who lives with a spouse, with a partner in an informal union, or with one or more own children is not considered to be a child. A person under the age of 15 who has children of his or her own is treated, according to the facts, as a parent with children. According to the documentation of Civil Status Offices in Poland, in 2021 the youngest woman who gave birth to a child was 13 years old. A child who alternates between two households (for instance, if his or her parents are divorced) is assigned to the household where he or she spends the majority of the time. Where an equal amount of time is spent with both parents, the household is the one where the child is found on census night or, alternatively, the household where the child has his or her legal or registered residence.
An informal relationship means the maintenance of psychological, physical and economic ties of a marital nature (without marriage) between two persons. Information about being in an informal relationship was collected on a voluntary basis for persons aged 18 or over.
‘Skip-generation households’ (households consisting of a grandparent or grandparents and one or more grandchildren, but no parent of those grandchildren) are not included in the definition of a family.
3.4.6. Statistical concepts and definitions - Household status
Member States shall apply the ‘housekeeping concept’ to identify households or, if that is not possible, the ‘household-dwelling’ concept. The 2021 National Population and Housing Census adopted the ‘household-dwelling’ concept, according to which all persons living in one dwelling (whether related or unrelated) are considered members of the same household.
It is important to note that households can only consist of persons identified as residents. The use of the household-dwelling concept to identify households, combined with the fact that households can only consist of persons classified as residents, has resulted in cases of persons under the age of 15 forming single-person households. This happened, for example, where parents were not identified as residents or had an established place of residence different from that of their children. At the same time, as stated in section 3.4.5, ‘skip-generation’ households (grandparents with grandchildren) are not considered a family under the accepted definition. A couple may also share a household with another member of the extended family or with an unrelated person.
Each person is defined by the position of the person in the household:
1. Persons living in the household - a category including:
a) persons in a biological family belonging to a household in which there is a biological family of which they are a member. By type of biological family a distinction is made between:
persons in marriages,
persons in informal relationships,
single parents,
sons/daughters.
The term 'sons/daughters' is in line with the definition of the term 'child'. The definition of ‘Child’ given for the topic ‘Family status’ (FST) also applies to the topic ‘Household status’ (HST).
b) persons outside the biological family belonging to non-family households or to family households if they are not members of any biological family in that household. Within this category, persons are distinguished:
living alone (in single-person non-family households),
living with other persons (in non-family multi-person households and in family households that are not members of any biological family).
2. Persons not living in a household - a category including:
persons in collective living quarters,
persons not living in the household (including homeless persons), category undetermined.
Homeless persons are persons who, for various reasons - economic, family or administrative - declare they have no permanent place of residence. Homeless persons do not include those who are homeless due to random accidents (disasters, floods, fires, etc.).
3.4.7. Statistical concepts and definitions - Current activity status
The definitions used to prepare census data for this topic do not differ from the definitions contained in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. Only the upper age limit for unemployed persons was specified, which was 74 years.
‘Current activity status’ is the current relationship of a person to economic activity, based on a reference period of one week, which may be either a specified, recent, fixed, calendar week, or the last complete calendar week, or the last seven days prior to enumeration.
The ‘labour force’ comprises all persons who fulfil the requirements for inclusion among the employed or the unemployed.
‘Employed’ persons comprise all persons aged 15 years or over who during the reference week:
(a) performed at least one hour of work for pay or profit, in cash or in kind, or
(b) were temporarily absent from a job in which they had already worked and to which they maintained a formal attachment, or from a self-employment activity.
Employees temporarily not at work were considered to be in paid employment provided they had a formal job attachment. They did not work for reasons such as illness, holiday, reduction in economic activity or strike, but formally they had a job as employees or self-employed persons.
The formal job attachment is determined on the basis of one or more of the following criteria:
(a) a continued receipt of wage or salary; or
(b) an assurance of return to work following the end of the contingency, or an agreement as to the date of return; or
(c) the elapsed duration of absence from the job which, wherever relevant, may be that duration for which workers can receive compensation benefits without obligations to accept other jobs.
Self-employed persons were considered ‘employed’ if they worked as such during the reference week or if they were temporarily absent from work while their enterprise continued to exist.
Contributing family workers were considered ‘employed’ at work on the same basis as other employed persons; that is irrespective of the number of hours worked during the reference period.
The ‘unemployed’ comprise all persons aged 15 years or over who were:
(a) ‘without work’, that is, were not in wage employment or self-employment during the reference week; and
(b) ‘currently available for work’, that is, were available for wage employment or self-employment during the reference week and for two weeks after that; and
(c) ‘seeking work’, that is, had taken specific steps to seek wage employment or self-employment within four weeks ending with the reference week.
‘Outside of the labour force’ includes persons below the minimum age for employment (under 15 years) and persons aged 15 years or over who were not classified as employed or unemployed, i.e. persons who in the reference week:
a) were not working, did not have a job and were not looking for a job because they: were retired; were on a disability pension; received capital income (shares, bonds or property income); continued their education as an apprentice or student; took care of a child or an adult; had a health condition that did not allow them to work; had exhausted all possibilities of finding a job; or for any other reason;
b) were not working, were looking for work, but were not able/ready to work between April 1 and April 15.
In ascribing a single activity status to each person, priority was given to the status of ‘Employed’ in preference to ‘Unemployed’, and to the status of ‘Unemployed’ in preference to ‘Outside of the labour force’.
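The priority rule above can be sketched as a simple decision cascade. This is an illustrative simplification, not the census processing code: the function and its boolean parameters are hypothetical, and treating the upper age limit of 74 years for unemployed persons as inclusive is an assumption.

```python
def current_activity_status(age, employed, available_for_work, sought_work,
                            max_unemployed_age=74):
    """Assign one activity status: Employed takes priority over Unemployed,
    which takes priority over Outside of the labour force.

    employed: worked at least one hour, or was temporarily absent from a job.
    available_for_work: available in the reference week and two weeks after.
    sought_work: took specific steps within four weeks ending with that week.
    """
    if age < 15:
        # Below the minimum age for employment.
        return "Outside of the labour force"
    if employed:
        return "Employed"
    # Assumption: the 74-year upper limit is inclusive.
    if age <= max_unemployed_age and available_for_work and sought_work:
        return "Unemployed"
    return "Outside of the labour force"
```

Because the employment check comes first, a person who worked one hour in the reference week while also seeking another job is classified as employed.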
3.4.8. Statistical concepts and definitions - Occupation
The definitions used to prepare census data for this topic do not differ from the definitions contained in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
‘Occupation’ refers to the type of work done in a job. ‘Type of work’ is described by the main tasks and duties of the work.
Persons doing more than one job are allocated an occupation based on their main job, which is to be identified according to:
(1) the time spent on the job or, if not available,
(2) the income received.
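A minimal sketch of the main-job rule, assuming each job carries optional information on hours and income. The `Job` type and its field names are hypothetical. The text does not say how ties or missing income are handled, so the sketch returns the first job with the highest value and returns `None` when neither criterion is available.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    occupation: str                 # illustrative label, not an ISCO-08 code
    hours: Optional[float] = None   # time spent on the job, if known
    income: Optional[float] = None  # income received, if known


def main_job(jobs: List[Job]) -> Optional[Job]:
    """Pick the main job by time spent or, if that is not available, by income."""
    if not jobs:
        raise ValueError("at least one job is required")
    if all(job.hours is not None for job in jobs):
        return max(jobs, key=lambda job: job.hours)
    if all(job.income is not None for job in jobs):
        return max(jobs, key=lambda job: job.income)
    return None  # neither criterion is available; not covered by the text
```

The same ordering of criteria (time first, income second) is stated in this report for the topics ‘Industry’ and ‘Status in employment’.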
The breakdown by occupation is available for persons aged 15 or over that were employed during the reference week.
3.4.9. Statistical concepts and definitions - Industry
The definitions used to prepare census data for this topic do not differ from the definitions contained in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
Industry (branch of economic activity) refers to the kind of production or activity of the establishment or similar unit in which the job of an employed person is located.
Persons doing more than one job are allocated an industry (branch of economic activity) based on their main job which is to be identified according to:
(a) the time spent on the job or, if not available,
(b) the income received.
The reference time for the classification of persons aged 15 or older by ‘Industry’ coincided with the reference week (from March 25 to 31, 2021). ‘Industry’ was not collected in the census for unemployed and economically inactive persons, only for persons who were employed during the reference week.
3.4.10. Statistical concepts and definitions - Status in employment
The definitions used to prepare census data for this topic do not differ from the definitions contained in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
An ‘employee’ is a person who works in a ‘paid employment’ job, that is a job where the explicit or implicit contract of employment gives the incumbent a basic remuneration, which is independent of the revenue of the unit for which he/she works (this unit may be a corporation, a non-profit institution, government unit or a household). Persons in ‘paid employment’ jobs are typically remunerated by wages and salaries, but may be paid by commission from sales, by piece rates, bonuses or in-kind payment such as food, housing or training.
An ‘employer’ is a person who, working on his or her own account or with a small number of partners, holds a ‘self-employment’ job and, in this capacity, on a continuous basis (including the reference week) has engaged one or more persons to work for him/her as ‘employees’.
If a person is both employer and employee, he/she is allocated to only one group according to:
the time spent on the job or, if not available,
the income received.
An ‘own-account worker’ is a person who, working on his/her own account or with one or a few partners, holds a ‘self-employment’ job and has not engaged, on a continuous basis (including the reference week), any ‘employees’.
‘Other employed persons’ includes persons who are ‘contributing family workers’ and ‘members of producers’ cooperatives’.
A ‘contributing family worker’ is a person who
holds a ‘self-employment’ job in a market-oriented establishment operated by a related person, living in the same household, and
cannot be regarded as a partner (that is an employer or own-account worker) because the degree of commitment to the operation of the establishment, in terms of working time or other factors to be determined by national circumstances, is not at a level comparable to that of the head of the establishment.
A ‘member of a producers’ cooperative’ is a person who holds a ‘self-employment’ job in an establishment organised as a cooperative, in which each member takes part on an equal footing with other members in determining the organisation of production, sales and/or other work, the investments and the distribution of the proceeds among the members.
3.4.11. Statistical concepts and definitions - Place of work
The location of the place of work is the geographical area in which a currently employed person does his/her job. The place of work of those mostly working at home is the same as their usual residence. The term ‘working’ refers to work done as an ‘employed person’ as defined under the topic ‘Current activity status’. ‘Mostly’ working at home means that the person spends all or most of the time working at home, and less, or no, time in a place of work other than at home.
Location of the place of performance of the main job is the actual place of performance of the job taking into account the division into work performed in Poland (details), outside Poland (name of the country) or no fixed location (when it is not possible to indicate a specific location geographically). This information, combined with residence data, is used to determine the scale and directions of commuting.
In the case of remote work at home, if this mode of work was forced by the COVID-19 pandemic, the actual place of work before the COVID-19 pandemic was provided. However, if work was performed before the COVID-19 pandemic, both in the home office system and at the company's headquarters, only one place of work was indicated based on the time spent at a given place of work.
3.4.12. Statistical concepts and definitions - Educational attainment
Educational attainment refers to the highest level successfully completed in the educational system of the country where the education was received. All education which is relevant to the completion of a level shall be taken into account even if this was provided outside schools and universities.
Persons aged 15 years or over shall be classified under only one of the categories according to their educational attainment (highest completed level). Persons under the age of 15 years shall be classified under ‘Not applicable’.
The level of education is determined on the basis of the ISCED 2011 classification.
3.4.13. Statistical concepts and definitions - Size of the locality
A locality is defined as a distinct population cluster, that is an area defined by population living in neighbouring or contiguous buildings. Such buildings may either:
form a continuous built-up area with a clearly recognisable street formation; or
though not part of such a built-up area, comprise a group of buildings to which a locally recognized place name is uniquely attached; or
though not meeting either of the above two criteria, constitute a group of buildings, none of which is separated from its nearest neighbour by more than 200 meters.
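The third criterion can be approximated by chaining together buildings that lie within 200 metres of one another. The sketch below is one possible reading of the rule, not the procedure used in the census. It assumes planar coordinates in metres, uses a hypothetical function name, and does not cover the first two criteria (street formation and place names).

```python
from math import hypot


def group_buildings(points, max_gap=200.0):
    """Group buildings so that each is linked to its group by gaps of at most
    max_gap metres. points is a list of (x, y) planar coordinates in metres.
    Returns sorted lists of point indices, one list per group.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            if hypot(x1 - x2, y1 - y2) <= max_gap:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With buildings at 0 m, 150 m, 300 m and 1000 m along a line, the first three form one group and the last stands alone. The pairwise comparison is quadratic in the number of buildings, so a spatial index would be needed for real data.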
3.4.14. Statistical concepts and definitions - Place of birth
Information on the ‘Place of birth’ was collected according to the place of usual residence of the mother at the time of the birth or, if not available, the place in which the birth took place. Information on the country of birth shall be collected on the basis of international boundaries existing on 1 January 2021.
‘EU Member State’ means a country that is a member of the European Union on 1 January 2021.
For reporting countries that are EU Member States, the sub-category ‘Other EU Member State’, referring to their Member State, does not apply.
For reporting countries that are not EU Member States, the category ‘Other EU Member State’ is replaced by the category ‘EU Member State’.
The category ‘Other’ includes persons whose country of birth existed at the time of their birth but does not exist at the time of the census, and which cannot be attributed unequivocally to another country existing at the time of the census, i.e. according to the internationally recognised borders currently in force.
The ‘Unidentified’ category includes persons for whom the mother’s place of residence is unknown at the time of birth and who were born outside the borders of any country, e.g. at sea or in the air.
3.4.15. Statistical concepts and definitions - Country of citizenship
Citizenship is defined as the particular legal bond between an individual and his/her State, acquired by birth or naturalisation, whether by declaration, option, marriage or other means according to the national legislation.
A person with two or more citizenships shall be allocated to only one country of citizenship, to be determined in the following order of precedence:
1. reporting country; or
2. if the person does not have the citizenship of the reporting country: other EU Member State; or
3. if the person does not have the citizenship of another EU Member State: other country outside the European Union.
Where there are cases of dual citizenship where both countries are within the European Union but neither is the reporting country, Member States shall determine which country of citizenship is to be allocated.
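The order of precedence above can be sketched as follows. The function name, its parameters and the `tie_breaker` hook are hypothetical. The text leaves the choice between two non-reporting EU citizenships to each Member State, so the default of taking the first listed country is only a placeholder, and the list of EU members is supplied by the caller rather than assumed here.

```python
def allocate_citizenship(citizenships, reporting_country, eu_members,
                         tie_breaker=None):
    """Allocate one country of citizenship to a person with one or more.

    citizenships: list of country codes held by the person.
    eu_members: set of EU Member State codes as of the reference date.
    tie_breaker: callable choosing among several candidates (placeholder).
    """
    if not citizenships:
        raise ValueError("no citizenship supplied")
    if tie_breaker is None:
        tie_breaker = lambda candidates: candidates[0]  # placeholder rule

    # 1. The reporting country always takes precedence.
    if reporting_country in citizenships:
        return reporting_country
    # 2. Otherwise another EU Member State.
    eu = [c for c in citizenships if c in eu_members]
    if eu:
        return eu[0] if len(eu) == 1 else tie_breaker(eu)
    # 3. Otherwise a country outside the European Union.
    return citizenships[0] if len(citizenships) == 1 else tie_breaker(citizenships)
```

For a census reported by Poland, a person holding both German and Polish citizenship would be allocated to Poland, and a person holding German and United States citizenship would be allocated to Germany.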
The list of countries by “Country of citizenship” applies only for statistical purposes.
For reporting countries that are EU Member States, the sub-category “Citizenship of an EU Member State other than the reporting country” referring to their Member State does not apply.
For reporting countries that are not EU Member States, the category ‘Citizenship of an EU Member State other than the reporting country’ is replaced by ‘Citizenship of an EU Member State’.
Persons who are not citizens of any country or are stateless and have certain but not all rights and obligations related to nationality are classified as ‘unidentified citizenship’.
3.4.16. Statistical concepts and definitions - Year of arrival in the country
The year of arrival shall be the calendar year in which a person most recently established usual residence in the country.
The year 2021 refers to the period from 1 January to the census reference date (in Poland 31 March).
3.4.17. Statistical concepts and definitions - Residence one year before
The topic ‘Place of usual residence one year prior to the census’ reports the relationship between the current place of usual residence and the place of usual residence one year prior to the census.
Children under one year of age shall be classified under ‘Not applicable’.
Information collected on the topic ‘Previous place of usual residence and date of arrival in the current place’ classifies all persons who have changed their usual residence more than once within the year prior to the reference date according to their previous place of usual residence, i.e. the place of usual residence from which they moved to their current place of usual residence.
3.4.18. Statistical concepts and definitions - Housing arrangements
The topic 'Housing arrangements' covers the whole population and refers to the type of housing in which a person usually resides at the time of the census. This covers all persons who are usual residents in different types of living quarters, or who do not have a usual residence and stay temporarily in some type of living quarters, or who are roofless, sleeping rough or in emergency shelters, when the census is taken.
Occupants are persons with their usual residence in the places listed in the respective category.
'Conventional dwellings' are structurally separate and independent premises at fixed locations which are designed for permanent human habitation and are, at the reference date, either used as a residence, or vacant, or reserved for seasonal or secondary use.
'Separate' means surrounded by walls and covered by a roof or ceiling so that one or more persons can isolate themselves. 'Independent' means having direct access from a street or a staircase, passage, gallery or grounds.
'Other housing units' are huts, cabins, shacks, shanties, caravans, houseboats, barns, mills, caves or any other shelter used for human habitation at the time of the census, irrespective of whether it was designed for human habitation.
'Collective living quarters' are premises which are designed for habitation by large groups of individuals or several households and which are used as the usual residence of at least one person at the time of the census.
'Occupied conventional dwellings', 'other housing units' and 'collective living quarters' together represent 'living quarters'. Any 'living quarter' must be the usual residence of at least one person.
The sum of occupied conventional dwellings and other housing units represents 'housing units'.
‘Homeless persons’ are persons who, for various reasons - economic, family or administrative - declare they have no permanent place of residence. Homeless persons do not include those who are homeless due to random accidents (disasters, floods, fires, etc.).
3.4.19. Statistical concepts and definitions - Type of family nucleus
The definitions of the terms ‘Family nucleus’, ‘Informal relationship’ and ‘Child’ given for the topic ‘Family status’ (FST) also apply to the topic ‘Type of family nucleus’ (TFN).
3.4.20. Statistical concepts and definitions - Size of family nucleus
The definition of the term ‘Family nucleus’ given for the topic ‘Family status’ (FST) also applies to the theme ‘Size of family nucleus’ (SFN).
3.4.21. Statistical concepts and definitions - Type of private household
The definitions of household terms given for the topic ‘Household status’ (HST) also apply to the topic ‘Type of private household’ (TPH).
3.4.22. Statistical concepts and definitions - Size of private household
The 2021 National Census adopts the ‘household-dwelling’ concept, according to which all persons living in one dwelling (whether related or unrelated) are considered members of the same household.
A person living alone forms a one-person household.
3.4.23. Statistical concepts and definitions - Tenure status of households
The topic 'Tenure status of households' refers to the arrangements under which a private household occupies all or part of a housing unit.
Households that are in the process of paying off a mortgage on the housing unit in which they live or purchasing their housing unit over time under other financial arrangements are classified under 'Households of which at least one member is the owner of the housing unit'.
Households of which at least one member is the owner of the housing unit and at least one member is a tenant of all or part of the housing unit are classified under the category 'Households of which at least one member is the owner of the housing unit'.
3.4.24. Statistical concepts and definitions - Type of living quarter
A living quarter is housing which is the usual residence of one or more persons. The terms ‘Conventional dwellings’, ‘Other housing units’ and ‘Collective living quarters’ are defined as under the topic ‘Housing arrangements’.
'Separate' means surrounded by walls and covered by a roof or ceiling so that one or more persons can isolate themselves. 'Independent' means having direct access from a street or a staircase, passage, gallery or grounds.
'Other housing units' are huts, cabins, shacks, shanties, caravans, houseboats, barns, mills, caves or any other shelter used for human habitation at the time of the census, irrespective of whether it was designed for human habitation.
'Collective living quarters' are premises which are designed for habitation by large groups of individuals or several households and which are used as the usual residence of at least one person at the time of the census.
'Occupied conventional dwellings', 'other housing units' and 'collective living quarters' together represent 'living quarters'. Any 'living quarter' must be the usual residence of at least one person.
3.4.25. Statistical concepts and definitions - Occupancy status
‘Occupied conventional dwellings’ are conventional dwellings which are the usual residence of one or more persons at the time of the census.
‘Unoccupied conventional dwellings’ are conventional dwellings which are not the usual residence of any person at the time of the census.
Dwellings reserved for seasonal or secondary use, vacant dwellings, as well as conventional dwellings with persons present but not included in the census shall be classified under the category ‘Unoccupied conventional dwellings’.
3.4.26. Statistical concepts and definitions - Type of ownership
The topic ‘Type of ownership’ refers to the ownership of the dwelling and not to that of the land on which the dwelling stands. It is intended to show the tenure arrangements under which the dwelling is occupied.
‘Owner-occupied dwellings’ are those where at least one occupant of the dwelling owns parts or the whole of the dwelling.
‘Rented dwellings’ are those where at least one occupant pays a rent for the occupation of the dwelling, and where no occupant owns parts or the whole of the dwelling.
Unoccupied conventional dwellings shall be classified under ‘Not applicable’.
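The classification rule above can be read as a short decision procedure. The following Python sketch is illustrative only and is not part of the census processing system: the `Occupant` class, its field names and the function name are hypothetical, and the final fallback label for occupied dwellings that are neither owned nor rented is an assumption, since this section does not name that category.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Occupant:
    """Hypothetical record for one usual resident of a dwelling."""
    owns_dwelling: bool = False  # owns part or the whole of the dwelling
    pays_rent: bool = False      # pays rent for occupying the dwelling


def type_of_ownership(occupants: List[Occupant]) -> str:
    """Classify a conventional dwelling by the tenure of its occupants."""
    if not occupants:
        # Unoccupied conventional dwellings are 'Not applicable'.
        return "Not applicable"
    if any(o.owns_dwelling for o in occupants):
        # One owning occupant is enough, even if others pay rent.
        return "Owner-occupied"
    if any(o.pays_rent for o in occupants):
        # Reached only when no occupant owns any part of the dwelling.
        return "Rented"
    # Assumption: label for the remaining cases, not defined in this section.
    return "Other"
```

Ownership is tested before rent because a dwelling counts as rented only when no occupant owns any part of it.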
3.4.27. Statistical concepts and definitions - Number of occupants
The number of occupants of a housing unit is the number of people for whom the housing unit is the usual residence.
3.4.28. Statistical concepts and definitions - Useful floor space
Useful floor space is defined as:
— the floor space measured inside the outer walls excluding non-habitable cellars and attics and, in multi-dwelling buildings, all common spaces; or
— the total floor space of rooms falling under the concept of ‘room’.
3.4.29. Statistical concepts and definitions - Number of rooms
A ‘room’ is defined as a space in a housing unit enclosed by walls reaching from the floor to the ceiling or roof, of a size large enough to hold a bed for an adult (4 square metres at least) and at least 2 metres high over the major area of the ceiling.
3.4.30. Statistical concepts and definitions - Density standard (floor space)
The topic ‘Density standard’ relates the useful floor space in square meters or the number of rooms to the number of occupants, as specified under the topic ‘Number of occupants’. Member States shall report on the density standard measured by the ‘useful floor space’, or, if not possible, by the ‘number of rooms’.
3.4.31. Statistical concepts and definitions - Density standard (number of rooms)
The topic ‘Density standard’ relates the useful floor space in square meters or the number of rooms to the number of occupants, as specified under the topic ‘Number of occupants’. Member States shall report on the density standard measured by the ‘useful floor space’, or, if not possible, by the ‘number of rooms’.
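As an illustration of the reporting rule, the sketch below computes the density standard from useful floor space when it is known and falls back to the number of rooms otherwise. The function name, parameter names and result labels are hypothetical, and expressing the measure per occupant (square metres or rooms per person) is an assumption about the direction of the ratio.

```python
from typing import Optional, Tuple


def density_standard(
    occupants: int,
    useful_floor_space_m2: Optional[float] = None,
    rooms: Optional[int] = None,
) -> Optional[Tuple[str, float]]:
    """Return (measure, value per occupant), or None if it cannot be computed."""
    if occupants <= 0:
        # No usual residents: the ratio is undefined.
        return None
    if useful_floor_space_m2 is not None:
        # Preferred measure: useful floor space per occupant.
        return ("floor_space_m2_per_occupant", useful_floor_space_m2 / occupants)
    if rooms is not None:
        # Fallback measure when floor space is not available.
        return ("rooms_per_occupant", rooms / occupants)
    return None
```

For example, a housing unit with 60 square metres of useful floor space and four occupants yields 15 square metres per occupant.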
3.4.32. Statistical concepts and definitions - Water supply system
Not defined.
3.4.33. Statistical concepts and definitions - Toilet facilities
Not defined.
3.4.34. Statistical concepts and definitions - Bathing facilities
A bathing facility is any facility designed to wash the whole body and includes shower facilities.
3.4.35. Statistical concepts and definitions - Type of heating
A conventional dwelling is considered centrally heated if heating is provided either from a community heating centre or from an installation built in the building or in the conventional dwelling, established for heating purposes, without regard to the source of energy.
3.4.36. Statistical concepts and definitions - Type of building
The topic ‘Dwellings by type of building’ refers to the number of dwellings in the building in which the dwelling is placed.
3.4.37. Statistical concepts and definitions - Period of construction
The topic ‘Dwellings by period of construction’ refers to the year when the building in which the dwelling is placed was completed.
3.5. Statistical unit
In the 2021 National Population and Housing Census, the statistical units were: building, dwelling, person. Secondarily derived units (based on person data) were household and family.
3.6. Statistical population
"Target population" means the set of all statistical units in a defined geographical area at the reference date that are eligible for a survey on one or more specified topics. The target population includes each valid statistical unit exactly once.
3.7. Reference area
In accordance with the provisions of the Census Act 2021, the Population and Housing Census was conducted on the territory of the Republic of Poland and covered the following population groups:
Polish citizens residing in Poland who have their place of residence (understood as the place of permanent or temporary registration or as a place of permanent or temporary residence) in dwellings, buildings, other premises other than a dwelling or in collective living quarters;
foreigners residing in Poland permanently and staying temporarily (whether registered or not) in dwellings, buildings, other premises other than a dwelling or in collective living quarters;
Polish citizens residing abroad (regardless of the period of residence) who had not deregistered from permanent residence in Poland;
homeless persons living in the streets without shelter – Polish citizens and foreigners;
moreover:
dwellings, buildings, other occupied premises other than a dwelling.
A person who deregistered from permanent residence in Poland due to a permanent stay abroad was not obliged to participate in the National Population and Housing Census 2021.
The census did not include:
heads and foreign staff of diplomatic representations and consular offices of foreign countries, their family members and other persons enjoying privileges and immunities under the law, international agreements or generally recognized international customs;
apartments, buildings, facilities and premises owned by diplomatic representations and consular offices of foreign countries.
3.8. Coverage - Time
The data relate to the census reference date, 31 March 2021. Data on ‘Current activity status’ are based on a reference period of one week (25 to 31 March 2021) in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
3.9. Base period
Not applicable.
Counts of statistical units are expressed in numbers and, where needed, as rates per inhabitants enumerated in the country.
Information is provided in the sub-concepts 5.1 - 5.3.
5.1. EU census reference date
31 March 2021
5.2. National census reference date
31 March 2021
5.3. Differences between reference dates of national and EU census publications
Data on the population in a 1 km2 grid were transferred on 15 December 2022. Data on the resident population (for HC2 in the cross-section for NTS3) by gender and age groups were transferred in accordance with the deadline set by Eurostat - by the end of February 2023.
6.1. Institutional Mandate - legal acts and other agreements
Legal acts regulating issues related to the conduct of the National Population and Housing Census in 2021 were:
1. National legal acts regulating issues related to the conduct of the National Census of Population and Housing in 2021:
a) Act of 29 June 1995 on public statistics (Journal of Laws of 2021, item 955);
b) Act of August 9, 2019 on the national census of population and housing in 2021 (Journal of Laws of 2021, item 1143);
c) Act of 10 May 2018 on the protection of personal data (Journal of Laws of 2019, item 1781).
The subpoints above give the publication references of the normative acts that were in force during the National Census 2021.
2. International legal acts regarding censuses:
a) Regulation (EC) No 763/2008 of the European Parliament and of the Council of 9 July 2008 on population and housing censuses (OJ EU L 218, 13 August 2008, p. 14);
b) Commission Implementing Regulation (EU) 2017/543 of 22 March 2017 laying down rules for the application of Regulation (EC) No 763/2008 of the European Parliament and of the Council on population and housing censuses, as regards the technical specifications of the topics and of their breakdowns (OJ L 78, 23 March 2017, p. 13);
c) Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (OJ EU L 119 of 4 May 2016, p. 1), abbreviated as GDPR, regarding the principles of personal data processing;
d) Regulation (EC) No 223/2009 of the European Parliament and of the Council of 11 March 2009 on European statistics and repealing Regulation (EC, Euratom) No 1101/2008 of the European Parliament and of the Council on the transmission of data subject to statistical confidentiality to the Statistical Office of the European Communities, Council Regulation (EC) No 322/97 on Community statistics, and Council Decision 89/382/EEC, Euratom establishing a Committee on the Statistical Programmes of the European Communities (Text with EEA and Swiss relevance) (OJ L 87 of 31 March 2009, p. 164), regarding the principles of developing, producing and disseminating statistics and the confidentiality of statistical information;
e) Commission Regulation (EU) 2017/712 of 20 April 2017 establishing the reference year and the programme of the statistical data and metadata for population and housing censuses provided for by Regulation (EC) No 763/2008 of the European Parliament and of the Council (OJ L 105, 21 April 2017, p. 1);
f) Commission Implementing Regulation (EU) 2017/881 of 23 May 2017 implementing Regulation (EC) No 763/2008 of the European Parliament and of the Council on population and housing censuses, as regards the modalities and structure of the quality reports and the technical format for data transmission, and amending Regulation (EU) No 1151/2010 (OJ L 135, 24 May 2017, p. 6);
g) Commission Implementing Regulation (EU) 2018/1799 of 21 November 2018 on the establishment of a temporary direct statistical action for the dissemination of selected topics of the 2021 population and housing census geocoded to a 1 km² grid (OJ L 296, 22 November 2018, p. 19).
3. Issues regarding the popularization of the National Census 2021 by public media were regulated by the Regulation of the Council of Ministers of April 27, 2020 on the detailed conditions and method of disseminating programs popularizing the national census of population and housing in 2021 (Journal of Laws of 2020, item 837).
4. The obligation of municipalities to verify, update and supplement the address and housing list resulted from:
a) Act on the National Population and Housing Census 2021, Art. 24 section 1 point 1,
b) Regulation of the Council of Ministers of April 15, 2020 on the detailed scope of data for the address and housing list, to be verified, updated and supplemented by gminas in connection with the national census of population and housing in 2021 (Journal of Laws 2020, item 737).
5. List of legal acts based on which the Central Statistical Office is obliged to prepare data for national needs:
a) Act of November 13, 2003 on the income of local government units (Journal of Laws of 2010 No. 80, item 526, and later changes) - population data for communes (in accordance with Article 2, these data constitute the basis for subsidizing each commune; in accordance with Article 28(6), these data are used to distinguish rural areas or towns with up to 5,000 inhabitants in order to determine the division of the educational part of the general subsidy for local government units);
b) Act of February 20, 2009 on the village fund (Journal of Laws No. 52, item 420, and later changes) - data on the population in the commune (pursuant to Article 2, this data constitutes one of the elements used to calculate the amount of the village fund);
c) Act of November 19, 2009 on gambling (Journal of Laws No. 201, item 1540, and later changes) - data on the population of cities (in accordance with Article 15, these data are used to determine the number of casinos and bingo halls that can operate);
d) Regulation of the Council of Ministers of May 13, 2003 (Journal of Laws No. 88, item 808 and later changes) regarding the algorithm for transferring funds from the State Fund for the Rehabilitation of Disabled Persons to voivodeship and poviat governments - data on the number of people with disabilities by age groups at the poviat level;
e) Act of January 6, 2005 on national and ethnic minorities and on the regional language (Journal of Laws No. 141 of January 31, 2005, and later changes) - data on the number of inhabitants of a commune belonging to a minority (in accordance with Article 14);
f) Resolution No. 115/2016 of the Council of Ministers of September 27, 2016 on the adoption of the National Housing Program - use of data to assess the implementation of the National Housing Program, introduction of anti-smog resolutions.
6.1.1. Bodies responsible
The census work, i.e. the preparation, organisation and conduct of the censuses, as well as the preparation of the results, sharing and dissemination of the resulting statistical information, was managed by:
President of Statistics Poland, as General Census Commissioner and Personal Data Administrator,
Director of the organisational unit of Statistics Poland responsible for censuses - as Director of the Central Census Office and Deputy General Census Commissioner,
Voivodes - as Voivodship Census Commissioners,
Directors of locally competent statistical offices - as Deputy Voivodship Census Commissioners (ZWKS),
City Presidents/Mayors/Commune heads - as Commune Census Commissioners (GKS).
Census tasks within the functioning organizational structure during the 2021 Census were performed by:
a) Central Census Office - CBS (including task forces and the Census Management Center with central dispatchers). The CBS was established by order of the General Census Commissioner, which specified its organizational structure and work organization, taking into account the need to ensure the correct and efficient performance of census work. The task of the Central Census Office was to prepare, organize and conduct the census and to develop its final results. It continuously analyzed the status of the census implementation and the threats and problems emerging during the 2021 Census, and proposed remedial actions to solve them. Census staff meetings were held on average once a week to analyze and resolve these issues. The final decisions on them were made by the General Census Commissioner. The Census Management Center (CZS) was established to manage the census in the country. Its members, the central dispatchers, were the first line of support for Voivodship Census Offices.
b) Voivodship Census Offices - WBS (including Voivodship Census Management Centers with voivodship dispatchers and telephone enumerator coordinators). Each WBS monitored the progress of the census in its voivodeship on an ongoing basis and took remedial actions, both its own and those recommended by the CBS. If any problems or significant issues requiring decisions at a higher organizational level occurred, they were reported at census staff meetings. A very important role, especially at the stage of data collection by census enumerators, was played by voivodeship dispatchers, whose task was to plan and manage the work of enumerators in the subordinate area (based on daily targets provided by CZS), control the quality and progress of this work, and monitor and report the course of the census. They also constituted the first line of support for Commune Census Offices. As part of the WBS, census work was also performed by employees designated by ZWKS (including statistical interviewers), acting as telephone enumerators, who operated the "Register by telephone" line and conducted telephone interviews with respondents (CATI).
c) Commune Census Offices – GBS - ensured the correct and efficient performance of census work in the subordinated area. Work at GBS was performed by employees of local government units from a given office, performing various functions depending on the scope of work, including commune coordinators. The coordinators played a key role during the preparatory work and during the implementation of the census, constituting the first line of support for census enumerators.
d) census enumerators - their task was to obtain data collected as part of the population and housing census from people who did not complete the online self-enumeration, by conducting telephone and face-to-face interviews. The census enumerators were:
employees of public statistics units, including statistical interviewers appointed by ZWKS - carried out telephone interviews (CATI);
natural persons appointed by ZWKS as a result of external recruitment, organized on the terms and in the manner specified in the 2021 Census Act, on the basis of a contract of mandate - carried out both telephone interviews (CATI) and direct interviews with the respondents (CAPI).
The work of the WBS was supervised by the General Census Commissioner and that of the GBS by the deputy of the locally competent Voivodship Census Commissioner. Supervision was exercised with regard to compliance with the provisions of the Census Act 2021. If a violation of the Act was found, either the General Census Commissioner or the ZWKS were obliged to order immediate corrective measures, setting a deadline for their implementation.
A simplified diagram of the organizational structure is presented in the figure attached.
In order to use data from administrative sources and to adapt official registers and public administration information systems to perform tasks in the censuses, the Consultative Council for the Agricultural Censuses and the National Population and Housing Censuses was established by Order No. 204 of the Prime Minister of 20 December 2017.
The Consultative Council then consisted of:
Chairman - the President of Statistics Poland,
Deputy Chairmen and other members - representatives of designated ministries, at the rank of Secretary of State or Undersecretary of State,
Secretary - director of the organisational unit responsible for the organisation of the census in Statistics Poland.
The main task of the appointed body was to systematise the data contained in the registers and information systems of the public administration, to define the conditions for their cooperation and data exchange (in accordance with the requirements of interoperability) and the manner of making the information contained in them available for the needs of the tasks carried out, for the benefit of the censuses, by state authorities and by citizens and entrepreneurs. Moreover, the task of the system managers was to ensure the functioning of information and IT standards that guarantee the coherence and interoperability of the public administration information structure in public registers and public administration information systems.
The measures agreed by the Consultative Council were implemented in the Act on the National Population and Housing Census 2021 and accompanying documents. The acquisition of data and meta-information from information systems for the census was regulated in the above Act.
The data provided were used for the preparation and updating of the population and housing list and as a direct source of census data for the 2021 Census.
The entities obliged to submit data to the President of Statistics Poland as part of the census work, as well as the detailed scope of data and the deadlines for their submission, are specified in Annex No. 2 to the Act on the Census 2021. Due to the prevailing pandemic conditions and the amendment to the Act on Census 2021, the list of entities operating information systems that provide data in the census was expanded to include providers of publicly available telecommunications services in order to supplement the list with current telephone numbers of respondents. This change made it possible to minimize face-to-face interviews in favor of telephone interviews.
In addition, technical and organisational issues are defined in:
Ordinance of the Council of Ministers of 15 April 2020 on the detailed scope of data for the address-housing list provided for verification, updating and supplementation by municipalities in connection with the National Population and Housing Census 2021 (Journal of Laws 2020, item 737);
the annually prepared Programme of Public Statistics Surveys (PBSSP) with regard to the deadlines for the submission of data from information systems.
7.1. Confidentiality - policy
Personal data, from the moment they are collected directly from respondents or from information systems of public administration and official registers or non-public information systems for the purpose of performing tasks specified in the Act (including census administration), are treated as statistical data and are handled to ensure statistical confidentiality. In the course of preparations for the 2021 Census, personal data were obtained from entities obliged to provide data to the President of Statistics Poland in the course of census administration (a detailed list of such entities is included in Annex 2 to the Act on the 2021 Census). Personal data can only be processed in compliance with the need-to-know principle by persons authorized by Data Administrators.
In Poland, ensuring confidentiality of information about natural persons is one of the key principles for handling data collected during statistical surveys, including the census. The obligation to maintain statistical confidentiality of official statistics is specified in the Act on Official Statistics (consolidated text: Journal of Laws of 2021, item 955):
Art. 10: “Identifiable individual data collected in statistical surveys require absolute protection. Such data may only be used to prepare statistical studies, compilations and analyses, and to create sampling frames for statistical surveys; it is prohibited to release such data or use them for purposes other than those specified in this act (statistical confidentiality)”.
Art. 35: “It is prohibited to use such data for purposes other than those specified in the Act, in particular to make decisions regarding a specific natural person.”
Art. 38: “1. Identifiable unit data obtained in statistical surveys may not be published or released for public use.”
The basic confidentiality regulations are described above. In addition, there is also “A policy for handling statistical data”, introduced by virtue of an internal regulation No. 32 issued by the President of Statistics Poland on December 4, 2020. It includes regulations and recommendations regarding the specification of needs regarding statistical data, survey design, construction and modification of systems and applications used in the process of statistical production, data collection, processing, analysis and release as well as the evaluation of obtained information.
7.2. Confidentiality - data treatment
As already mentioned, the rules for how to process data to ensure confidentiality of statistical information and how to prevent unauthorized disclosure are included in the documents listed above.
In the work currently underway, the starting point is the 3-anonymity rule for frequency tables, which is stipulated in the act on official statistics. It means that a cell with the risk of primary disclosure is one that includes fewer than three units. Cells with the risk of secondary disclosure are those containing a zero that needs to remain confidential, or those in which the relevant row/column contains at least one cell with the risk of primary disclosure. Cells with the risk of primary disclosure are suppressed. In the work involving data on the resident population broken down by sex, age and 1km-grid, in the case of age groups, cells with the risk of secondary disclosure were suppressed at random, based on the distribution of age groups in the population. For the population dataset broken down by commune of residence, sex, age, place of birth and place of residence one year before the census, the method of targeted record swapping is being tested, which was used to swap territorial codes of communes (the 1km-grid data set was treated as microdata). The loss of information resulting from the use of SDC methods (both globally and for individual variables) was assessed. Tests were carried out using the cell key method for hypercubes (HC) from the 2021 census.
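The 3-anonymity rule for frequency tables can be illustrated with a minimal Python sketch. It is not the procedure used by Statistics Poland: the function names and the table layout (a dictionary keyed by row and column labels) are hypothetical, and treating empty cells as not at primary risk is an assumption made for this example. The sketch flags cells with a count below three, lists the remaining cells that share a row or column with a flagged cell as candidates for secondary protection, and suppresses only the primary cells.

```python
from typing import Dict, Optional, Set, Tuple

Cell = Tuple[str, str]  # (row label, column label)


def primary_cells(table: Dict[Cell, int], threshold: int = 3) -> Set[Cell]:
    """Cells whose count is positive but below the threshold."""
    # Assumption: zero counts are not treated as primary risk here.
    return {cell for cell, count in table.items() if 0 < count < threshold}


def secondary_candidates(table: Dict[Cell, int], threshold: int = 3) -> Set[Cell]:
    """Non-primary cells sharing a row or column with a primary cell."""
    primary = primary_cells(table, threshold)
    rows = {row for row, _ in primary}
    cols = {col for _, col in primary}
    return {
        (row, col)
        for (row, col) in table
        if (row, col) not in primary and (row in rows or col in cols)
    }


def suppress_primary(
    table: Dict[Cell, int], threshold: int = 3
) -> Dict[Cell, Optional[int]]:
    """Replace primary-risk counts with None; other cells are unchanged."""
    primary = primary_cells(table, threshold)
    return {cell: (None if cell in primary else count) for cell, count in table.items()}
```

In a table with counts 5 and 2 in one row and 0 and 7 in the next, only the cell with count 2 is suppressed, while its row and column neighbours become candidates for secondary protection.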
8.1. Release calendar
The calendar for sharing the results of the 2021 Census has been developed and published on the website of Statistics Poland. The information needs of national audiences have been taken into account in the timing of availability. The schedule includes the following information:
date of publication;
type of output (news release, publication, data set in database systems);
type of the data (preliminary/final);
the scope of information included in the output;
territorial cross-sections of data.
As needed, the release calendar was updated: additional outputs were added and/or publication dates were revised.
Dissemination of information from censuses is carried out in accordance with the applicable procedures, similarly to those adopted for the results of other statistical surveys. The basis for designing solutions for sharing the resulting statistical information from the censuses was to identify groups of data recipients and define, to the widest possible extent, their needs regarding the resulting information and the methods of its transmission. For the purpose of sharing the resulting information from the 2021 Census, three main groups of users have been distinguished:
international organizations,
domestic external users,
internal users.
Data on population and housing censuses are disseminated every decade.
Information is provided in the sub-concepts 10.1 - 10.7.
10.1. Dissemination format - News release
Based on the results of the 2021 Census for national users, the following news releases were issued:
Preliminary results of the National Population and Housing Census 2021 (date of publication 27 January 2022)
National Population and Housing Census 2021. Preliminary Results (date of publication 26 April 2022)
Population by social characteristics - preliminary results of the National Census 2021 (date of publication 31 May 2022)
Equipping dwellings and buildings with technical installations and devices - preliminary results of the Census 2021 (date of publication 30 June 2022)
Labour market status of the population - preliminary results of the National Census 2021 (date of publication 29 July 2022)
Information on the results of the National Population and Housing Census 2021 at the voivodship, poviats and gminas level (date of publication 20 September 2022)
Usually resident population – information on the results of the National Population and Housing Census 2021 (date of publication 21 December 2022)
Family - National Population and Housing Census 2021. Preliminary Results (date of publication 30 January 2023)
Preliminary results of the National Population and Housing Census 2021 in terms of national-ethnic structure and the language of home contacts (date of publication 11 April 2023)
International migration 2021 - results National Population and Housing Census 2021 (date of publication 31 August 2023)
Visualization of census data is possible through the Geostatistical Portal - an interactive tool for presenting statistical data in spatial terms, using cartograms and cartodiagrams.
An open access (OA) database available to all people via the Internet is the Local Data Bank (BDL website). This database contains aggregated data from the National Census of Population and Housing from such sections as: population, buildings, dwellings, economic activity of the population, households and families (BDL website) - category NATIONAL CENSUSES. The data in this database cover various information and territorial scope, as well as result information from the 2011, 2002 and 1988 censuses. The database allows you to generate your own reports according to available variables within the sections indicated above.
10.4. Dissemination format - microdata access
Access to non-identifiable microdata from censuses is granted on the basis of a written contract concluded with the research and scientific institution requesting access to these data in order to conduct advanced analysis. The contract includes the terms and costs of the data release as well as obligations to maintain statistical confidentiality, certified by the handwritten or electronic signature of a representative of the requesting institution.
The original microdata sets from the censuses, in order to obtain a version safe for release in the form of non-identifiable microdata, are modified by the units of official statistics responsible for these data. These data sets are then made available on a special computer workstation (so-called Scientist Workstand) isolated from the Internet, located in a separate room of the statistical office, equipped with a secure IT environment, intended for advanced analyses conducted for research and scientific purposes.
The obtained results of the analyses, after positive verification in terms of maintaining statistical confidentiality by the unit of official statistics responsible for data preparation, are transferred to the research and scientific institution via the TransGUS platform (a secure ICT channel for electronic data transmission) or an external storage medium.
The Analytical Microdata Base (ABM) is designed as the main element for processing result information and for making data available.
The process of making data available under the ABM system includes:
the preparation of products to be made available;
the management of products made available;
the monitoring and analysis of questions asked by users.
Different forms of dissemination of census data, and in particular an extensive set of tables published and of pre-defined tables, available in the ABM system and in other databases, meet essential requirements of a wide range of users at the national and regional levels.
Users of the results can also produce simple correlation tables themselves through access to ABM and to the Metainformation Subsystem (PM). Tables which require data processing (calculation) according to individual, special requests made by recipients, covering non-typical territorial sections, wider data correlation, or non-standard grouping of data, will be prepared by specialist statistical units equipped with an appropriate IT structure and staffed with trained personnel.
Scientists, as well as representatives of national and international institutions, can make use of the so-called ‘Scientist Workstand’, where non-identifiable individual data are made available on special request.
10.5. Dissemination format - other
The resulting information from the census is disseminated using the following catalogue:
Information Portal of Statistics Poland – stat.gov.pl,
census portal - census.gov.pl,
database - BDL,
a tool for spatial visualization of statistical data - Geostatistical Portal,
data sharing environments: for scientists, for internal users,
data transmission programs to international organizations,
electronic communication tools - social media, e-mail, newsletter, RSS channel.
In addition to providing access to the resulting census information using the above-mentioned online tools, dissemination of statistical information also takes place through the following channels:
Statistical Information Center,
network of Provincial Regional Research Centers,
Central Statistical Library,
Sales Kiosk of Statistics Poland publications,
Statistical hotline,
presentation of information during various events (e.g. seminars, conferences, meetings).
10.6. Documentation on methodology
The methodology used for the 2021 National Census was disseminated in the publication "National Census of Population and Housing 2021. Methodology and organization of the survey."
The publication contains the objectives of the census, legal basis, methods of conducting the census, scope of data and data sources, scope of thematic areas, methods and forms of data dissemination. It is available online:
For internal needs of people involved in census work, the "Methodological Instruction for the National Census of Population and Housing in 2021" (2021) was developed.
In addition, an instruction for the online self-enumeration was also developed, which was a document for external use by people who carried out the census on their own by using the census application available on the website dedicated to 2021 CENSUS. The document was developed in several language versions, i.e. in addition to the version in Polish, the website also contains a document in English, Ukrainian and Russian.
10.7. Quality management - documentation
As part of the preparatory work for the Census 2021, a Quality Team was in place at Statistics Poland. The Team developed extensive documentation containing issues related to measuring the quality of data sources and data at different stages of the census, with the aim of exploring the feasibility of using registers and register data in censuses.
In addition, with the aim of obtaining the best possible quality of data from respondents and facilitating the data collection process, numerous strategic documents were developed, including organisational and methodological guidelines and instructions. The various types of documents (instructions, guidelines, procedures) were intended for the enumerators carrying out the census by telephone interview, the voivodship and commune census offices, as well as the persons working on the census helpline.
The organisational documentation was intended to standardise the conduct of those involved in carrying out census work during the 2021 Census. It contained the most important legal issues, described in detail the organisational structure of the census apparatus, the tasks of individual units and persons performing different functions in the structure, a list of training courses, presented key dates related to the census and the organisation of the data collection stage, and described the applications and systems used in the Census 2021.
11.1. Quality assurance
Quality assurance at the various stages of data acquisition and processing was achieved by developing and implementing very detailed guidelines and actions covering the issues described below.
Organization of the census apparatus
The use of an organisational structure at central, provincial and municipal levels has enabled efficient and effective management of the census work (a simplified organisational diagram of the structure of the census apparatus is presented in chapter 6.1.1).
It also made it possible to assign responsibility and decision-making to individual organisational levels, which contributed significantly to the quality of the tasks carried out throughout the census.
Census apparatus training
Before the census began, training was conducted for all people involved in the census work to prepare them for their tasks. Due to the pandemic threat related to COVID-19, all training was conducted remotely, using Lync/Skype or Webex. In accordance with the adopted cascade training model, central training was carried out in the first stage for substantive and organizational trainers, central and voivodship dispatchers, census helpline consultants, directors and deputy directors of statistical offices and other members of the Voivodship Census Offices. The centrally trained trainers then trained candidates for census enumerators and members of Gmina Census Offices, and provided additional training for enumerators where necessary.
Theoretical training and practical workshops were conducted in a very wide range of topics, including:
a. organization of training,
b. organization of the census, including the work schedule, organizational structure of the census apparatus,
c. census methodology,
d. legal basis,
e. security of personal data in 2021 CENSUS, including personal data protection,
f. popularization of 2021 CENSUS,
g. selected social engineering issues, including coping with stress and difficult situations in contacts with clients, assertive attitude training,
h. learning how to conduct remote training and the tools used for this purpose,
i. practical work with census applications by completing them based on training examples and test data,
j. practical work with other applications used to manage and monitor the census.
Recordings of the training sessions and all necessary materials were made available to the trainees and those involved in the census work on the e-learning application and the Redmine reporting system, among others.
In the opinion of the census offices, the type and substantive scope of the training exhausted the subject of the census and the proposed form of online training turned out to be a good solution.
IT systems and applications supporting the census work
During the census work, members of the census apparatus used various IT systems and applications. They were used to manage the quality of census processes and monitor the current progress of work in order to achieve high completeness of the census. The most commonly used IT tools were:
CORSTAT_census management application
The CORstat_census system brought together all the resources and functions related to managing the tasks of the census participants: enumerators, dispatchers, telephone consultants and those monitoring the course and progress of the census campaign. For each user of the system, the range of available functions and the data space within which that user carried out their tasks were defined. CORstat_Census interacted with the data acquisition systems and applications, collecting data and information on the implementation of data acquisition processes and distributing them to users.
The system enabled:
a) monitoring the status of the census and reporting on its progress,
b) controlling the workflow between the different data collection channels,
c) allocating work to the enumerators,
d) providing data on the progress of the census of dwellings and persons,
e) ensuring data flow between census systems,
f) managing data availability,
g) assigning the statuses that indicate the state of the census,
h) managing the user base (access to the system was strictly limited).
Through the implementation of the above-mentioned functionalities, the system ensured confidentiality and security, as well as high data quality and made it possible to control the completeness of the census.
To confirm that a person had been enumerated, complete data on that person (a correct individual form) had to be obtained; a person was therefore marked as enumerated when the first correct individual form for that person was registered in the census system. In addition, the CORstat_census system generated detailed daily reports on the progress of the census at the national level and at the level of individual regions, districts and municipalities.
Redmine reporting system
The Redmine reporting system was a tool to facilitate the clarification of problems and communication between different levels of the census apparatus. It also allowed previously developed solutions to be used for those who encountered a problem for the first time (access to the knowledge base). It also allowed information to be communicated quickly to all participants.
Call Center
The Call Centre system was used to handle the Census Helpline and outbound calls for the census by computer-assisted telephone interview method, using an electronic form to record the data.
IC Business Manager
A system that allowed the load on the Helpline to be checked on an ongoing basis. This application allowed checking the number of calls from respondents, the waiting time for a call from a consultant or an enumerator, call times and the number of serving consultants and enumerators. The Central Census Bureau, during the census, after analysing the statistics available in the IC Business Manager, made decisions on staffing changes on the "Write by Phone" line to eliminate excessive waiting times for respondents.
Information dashboard
The dashboard was developed to provide access to information on the completeness of the census and selected recorded phenomena. It provided statistics, charts and maps at the national level, as well as for individual regions, municipalities, districts and census tracts. The dashboard also calculated and made available daily targets, i.e. the number of dwellings and persons that should be enumerated on consecutive days in order to achieve the assumed completeness of the census. The targets were a key tool for planning future activities at the various levels of the organisation to ultimately achieve the required completeness.
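The source does not describe how the daily targets were calculated. As a minimal sketch only, assuming the units still needed to reach the assumed completeness are spread evenly over the remaining collection days, the calculation could look as follows. The function name, its parameters and the example figures are hypothetical and are not taken from the census systems.

```python
import math
from datetime import date


def daily_target(total_units: int, enumerated: int,
                 target_completeness: float,
                 today: date, end_date: date) -> int:
    """Units to enumerate per remaining day (hypothetical even-spread rule)."""
    days_left = (end_date - today).days + 1  # counts both today and the end date
    if days_left <= 0:
        raise ValueError("the collection period has already ended")
    # Units required to reach the assumed completeness level.
    required = math.ceil(total_units * target_completeness)
    # Nothing is left to do once the required number has been reached.
    remaining = max(required - enumerated, 0)
    return math.ceil(remaining / days_left)


# Illustrative figures only, not census data.
example = daily_target(1000, 400, 0.75, date(2021, 9, 21), date(2021, 9, 30))
```

Recomputing the target every day from the current number of enumerated units means that any shortfall on one day is automatically redistributed over the days that remain.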
E-learning application
The e-learning application, mentioned earlier in the training scope, was also prepared for the examination to qualify the census enumerator candidates and for them to continuously improve their knowledge. The system provided training materials for the candidates, as well as the possibility to take tests and fill in data for the contracts. The application included census methodological materials and sample tests. It was also extended with an additional module for sign enumerators - people with knowledge of Polish Sign Language (PSL).
An application for verifying the identity of census enumerators
The application was prepared and published prior to the launch of telephone and face-to-face interviews in the Census 2021. During the interview, the respondent could verify the enumerator by entering the enumerator's first name, surname and three randomly drawn digits of the enumerator's ID. If the data were entered correctly, the application confirmed the verification and also indicated the Statistical Office where the enumerator was employed.
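The verification step can be illustrated with a short sketch. It assumes a lookup keyed by first name and surname and a check of three randomly drawn digit positions of the ID; the register contents, the ID format and all function names are hypothetical, and the real application may work differently.

```python
import random

# Hypothetical register: (first name, surname) -> (ID number, statistical office).
ENUMERATORS = {
    ("Anna", "Nowak"): ("4820917", "Statistical Office in Example City"),
}


def draw_positions(id_length: int, rng: random.Random) -> list[int]:
    """Draw three distinct digit positions (0-based), in ascending order."""
    return sorted(rng.sample(range(id_length), 3))


def verify(first: str, last: str, positions: list[int], digits: str):
    """Return the employing office if the entered digits match, else None."""
    record = ENUMERATORS.get((first, last))
    if record is None or len(digits) != len(positions):
        return None
    id_number, office = record
    # Compare the entered digits with the ID digits at the drawn positions.
    expected = "".join(id_number[p] for p in positions)
    return office if digits == expected else None
```

Asking for only three digits at drawn positions lets the respondent confirm the enumerator's identity without the full ID number being read out during the interview.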
The "Order contact" application
The application was developed to ensure high completeness of the census by making it possible to carry out the Census 2021 interview with a deaf or hard of hearing person. It allowed respondents to make an appointment with a sign-language enumerator. The completed contact order form went to the relevant staff in the Statistical Offices, whose job it was to arrange an interview between the enumerator and the respondent via instant messaging.
Schedules
In addition to a set of documents containing organisational and methodological guidelines and instructions, a work schedule was drawn up for the implementation of the census. It contained descriptions of the tasks of the individual task groups of the Central Census Bureau, the leaders responsible for their implementation and other units participating in a given task. The schedule was divided into 32 areas, in which tasks covering a given scope of work were grouped. The activities included in the document were carried out in accordance with the deadlines indicated therein, and the completion of these tasks on time was monitored by the Central Census Bureau.
The schedule built in this way brought together very labour-intensive and responsible tasks, but at the same time led to optimal solutions. The document made it possible to plan all the census activities over time, helped define their scope and interrelationships, and facilitated the identification of alternative sequences for these activities. Moreover, it made it possible to supervise the implementation of the census and to detect risks early.
As the experience of the censuses shows, it is necessary to plan the individual activities appropriately and at an early stage, with deadlines for execution and the unit responsible for the tasks in question, and then to carry out the individual processes in accordance with the approved schedule.
Security of data processing (data protection, security measures used by statistics, principles of personal data processing)
The data collected and processed in the course of the Census 2021 were subject to special protection in accordance with the provisions of the law. The Data Administrator, the President of Statistics Poland, implemented appropriate technical and organisational measures to ensure that the processing took place in accordance with Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data and repealing Directive 95/46/EC (GDPR, known in Poland as RODO).
The census, like all statistical surveys conducted by Statistics Poland, was carried out with high security standards, based on modern ICT techniques providing, among other things, advanced protection against cyber attacks and serving to manage information and security events.
The security tools and procedures used in the census met the highest standards and ensured the legally required protection of the information collected.
During 2021 CENSUS, security design and management standards such as ISO/IEC 27001 (for information security system management), ISO/IEC 27002, ISO/IEC 27005 and good network and security design practices, e.g. CERT, IETF/RFC guidelines, were relied upon.
Moreover, all persons performing census work made a written promise as follows: "I promise that I will perform my work for public statistics with full reliability, in accordance with the professional ethics of a statistician and I will keep the individual data learned during my work secret from third parties."
As a result of developing and implementing very detailed guidelines and measures, a very high census completion rate was achieved and the data collected were complete and of good quality. The census ran according to schedule. Care was also taken to ensure the health safety of all those taking part.
11.2. Quality management - assessment
The use of large register resources and a high degree of completion of the questionnaire provided sufficient access to data on all units that could potentially belong to the surveyed populations. This applies to persons (indirectly households and families) but also to buildings and dwellings.
It is estimated that only a small margin of units may lie outside the field of view of official statistics, i.e. units that left no trace in the resources collected as part of the census work. As regards the population, this is confirmed by the small number of persons identified in the course of the questionnaire survey (which, as already noted, had a high rate of implementation) for whom there had previously been no data in the register resources.
Notwithstanding the above, it should be borne in mind that the key in terms of adequate coverage of the target population in the census results was the correct determination, within the data obtained, of the relevant target population, i.e. the correct selection from the total available in the datasets of such units that actually belonged to the target population. The latter concerns analyses aimed at assessing the quality of coverage.
For persons, inclusion in the target population (resident population) was based on the respondents’ declarations (regarding the place, nature and time of residence) and on register data on registration of residence, declarations of trips abroad and signs of life in appropriate types of registers.
Two methods were used to assess population coverage: 1) demographic analysis and 2) comparative analysis of the subpopulation of persons residing abroad (details in section 18.2.1). The second method showed an underestimation of the subpopulation of persons residing abroad (emigrants), which in turn led directly to an overestimation of the resident population, since emigrants should be excluded from it. In the assessment of national statistics, the magnitude of this error is generally low. Analyses of the causes of this overestimation are still ongoing; the main hypothesis being verified is that some persons staying abroad tended to declare in the questionnaire survey that they were staying in their dwelling in Poland.
On the other hand, with regard to the characteristics of the surveyed populations, it can be concluded in general that the majority of census themes (topics) obtained satisfactory completeness of the data, and imputation of missing values was required for only a few characteristics. Based on analyses of the control survey and on comparisons of census results with data from various previous studies, the distributions of characteristics obtained in the census are generally considered to be reliable.
11.2.1. Coverage assessment
Demographic analysis, based on methods described in the literature, was used to assess population coverage in the 2021 Census. It consisted of using the balance method (a population balance based on the 2011 census results and a retrospective balance based on the 2021 census results) to compare population structures by gender, age and territorial breakdown between the last two censuses, i.e. 2011 and 2021, as well as over the intercensal period. This analysis was carried out for a wider population than residents, i.e. it also took into account those temporarily residing abroad.
The group of persons temporarily staying abroad for more than 12 months was also analysed, taking into account the results of the 2002, 2011 and 2021 censuses as well as current surveys on emigration for permanent residence conducted by Statistics Poland. Foreign data on the number of residents with Polish citizenship were also used, drawn from the 2001, 2011 and 2021 population and housing census rounds in European countries and from the Eurostat database.
This analysis also used the results of Oxford University’s scientific research on the length and nature of immigrants’ stay in the UK. Thanks to the use of multiple data sources, it was possible to estimate the coverage error for the subpopulation of temporary emigrants and, as a result, for the target population in the 2021 census.
The relevant results for the assessment of coverage for the resident population are as follows:
A. Census population
absolute value – 37 019.3 thousand
percentage of the estimated target population – 100.57 %
Estimated target population: absolute value – 36 811.0 thousand
B. Under-coverage (estimated)
absolute value – 0
percentage of the census population – 0 %
C. Over-coverage (estimated)
absolute value – 208.4 thousand
percentage of the census population – 0.57 %
In order to assess the quality of the housing census, information on the number of dwellings compiled for the years 2011-2020 as part of the annual housing stock balances, based on the results of the 2011 Census, was compared with information calculated by retrospective estimation, for which the starting point was the results of the 2021 Census.
A. Census population of dwellings
absolute value – 15 227.9 thousand
percentage of the estimated target population – 100.35 %
Estimated target population: absolute value – 15 174.4 thousand
B. Under-coverage (estimated)
absolute value – 0
percentage of the census population – 0 %
C. Over-coverage (estimated)
absolute value – 53.5 thousand
percentage of the census population – 0.35 %
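The arithmetic behind these indicators can be reproduced with a short sketch. It assumes that over- or under-coverage is simply the net difference between the census count and the estimated target population, which is a simplification and not necessarily the estimation method used by Statistics Poland. Because the published inputs are rounded, a result can differ from the published value in the last digit. The function name is hypothetical.

```python
def coverage_summary(census: float, target: float) -> dict:
    """Net coverage indicators from census and target counts (in thousands)."""
    diff = census - target
    return {
        # Census count expressed as a percentage of the estimated target.
        "census_as_pct_of_target": round(100 * census / target, 2),
        # A positive net difference is treated as over-coverage,
        # a negative one as under-coverage (simplifying assumption).
        "over_coverage": round(max(diff, 0.0), 1),
        "under_coverage": round(max(-diff, 0.0), 1),
    }


# Figures quoted in this section, in thousands.
population = coverage_summary(37019.3, 36811.0)
dwellings = coverage_summary(15227.9, 15174.4)
```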
11.2.2. Post-enumeration survey(s)
The control survey in the 2021 Population and Housing Census consisted of two phases.
The first phase of control activities was carried out during the conduct of the main census and consisted of routine control of the correctness of the work of a randomly or purposively selected 2% group of interviewers, different each month. Due to the implementation of the census by means of ICT methods, the ongoing control of this phase took place with regard to the data entered into the census forms by respondents and census interviewers in the main census, based on the assumptions implemented into the system, which did not allow the census to continue if an error was detected while filling in the questionnaire.
The primary objective of the second phase of the control survey (conducted between 12 and 24 November 2021) was to assess the quality of the population and housing census, in particular to assess content errors for the characteristics surveyed. The evaluation used data collected from a specially designed sample survey. As a result of the analysis, a number of statistical indicators were developed and conclusions were drawn. Due to the existing constraints of COVID-19, a number of decisions were made when planning the control survey that significantly limited the surveyed population of dwellings, to approximately 64% of occupied dwellings (the availability of a telephone number was mainly taken into account). For this reason, the conclusions derived from the control survey could only apply to this limited population. The extension of these conclusions to the entire population covered by the core census should be treated with great caution; in particular, there is no methodological justification for determining accurate numerical assessments of coverage errors (i.e., among other things, determining an estimate of the number of persons who were not included in the core census).
According to the literature on the subject and historical experience from many countries conducting control censuses, in order to fully assess coverage and content errors, an additional independent control census (usually using specially drawn samples) has to be carried out, which has to meet a number of methodological requirements (the key theoretical assumption is the independence of the work activities of the main and control censuses). The main limiting factors in the Polish conditions are the excessive costs and limited resources of staff carrying out the work within the framework of the system of official statistics, as well as the prevailing sanitary conditions associated with COVID-19. Therefore, the approach to quality control within the Census 2021 described below is the result of a compromise that is necessary from a practical point of view. The simplified approach used (also used in previous censuses) allows the quality of the census to be assessed only in terms of content errors.
Detailed tables, graphs illustrating the analyses performed and conclusions of the control study are included in the Polish publication, available on the Statistics Poland website: National Population and Housing Census 2021. Assessment of the data quality, Warsaw 2023, Statistics Poland.
Summarising the collected empirical material, the following general comments can be made on the assessment of the quality of the National Population and Housing Census 2021 results:
For most of the key characteristics analyzed, a good level of agreement (i.e. minimum measurement error) was obtained at country level and within the basic groupings.
The control survey showed the existence of content (measurement) errors for some specific analyzed characteristics, in particular related to the difficulty of interpretation (or sensitive nature) of the questions from the census form by respondents and related to the influence of additional factors on the level of this error (e.g. contact channel, kind of respondent, geographical distribution).
The material obtained provides a basis for possible additional analysis on the problems identified within the main census datasets for specific characteristics.
When analyzing the results obtained, attention should be paid to a possible additional distortion factor related to the "respondent memory" effect, which resulted from the long duration of the main census itself (6 months) and the additional period between the end of the main census and the start of the control survey (more than one month).
The organizational and methodological assumptions adopted in the control survey carried out did not assume obtaining information to assess the value of coverage errors.
12.1. Relevance - User Needs
The main beneficiaries of the statistics include:
• Government and local government and EU institutions,
• Economic entities and employers,
• Employment offices, employment agencies,
• Scientific and research centers,
• Journalists.
The thematic scope of the census as defined in the EU Regulations largely meets both national and international needs. Nevertheless, during the preparatory work for the 2021 Census, the thematic scope was put to public consultation with key data recipients. This was aimed at involving the public in identifying the information needs to be met by the census and at building a broader and more informed understanding of the census activities undertaken.
The results of the consultation are available on the Statistics Poland Information Portal, while information on successively published data from the 2021 Census is made available as part of the ongoing information service for various stakeholder groups.
12.2. Relevance - User Satisfaction
Feedback from data users on the results of the 2021 CENSUS is obtained on an ongoing basis as part of the handling of statistical data queries. Users can use the Data Request Form available on the Statistics Poland Information Portal and the online contact details of the Department of Education and Communication to express their opinions on the available information resources. The information collected in this way is forwarded to the units responsible for the data (authoring units) so that they can take the users' expectations and comments into account.
In addition, as part of communication with users, the most frequently asked questions regarding the availability of data or methodological issues are analyzed on an ongoing basis. To handle them, response standards are developed based on the opinions of the authoring units. Information in this regard is also shared among the coordinators of the Statistical Helpline in statistical offices, so that they can substantively address users' questions.
Analysis of user comments shows that users pay particular attention to the time between the survey and the availability of data. In addition, they expect data at the lowest possible territorial level. Issues related to meta-information are also crucial, as it presents the methodology of the study and the interpretation of its results in a detailed and understandable way. There is also an interest in issues related to the security of individual census data and in the legal regulations on the mandatory nature of the survey.
Users' opinions were also taken into account during the survey itself: because of the high demand for support in fulfilling the census obligation, automatic responses to recurring questions submitted via the Contact Form were prepared. Statistics Poland also set up self-enumeration stations where online census forms could be filled in with the help of designated employees. In line with users' expectations, all the details of the study implementation were published on the website dedicated to the study. The website was successively expanded in line with the needs of respondents and users. To meet information needs, the Statistics Poland Information Portal has a separate area dedicated to the 2021 Census, which directs users to the census results and the schedule for their availability.
12.3. Completeness
Poland has prepared and transmitted all themes with breakdowns in accordance with Regulation (EC) No 763/2008 of the European Parliament and of the Council of 9 July 2008 and Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. The completeness for statistical units i.e. persons, families, households, housing, CLQ was 100%.
13.1. Accuracy - overall
Information on the accuracy of individual topics is provided in accordance with the requirements of Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. See the sub-concepts 13.1.1 - 13.1.35.
13.1.1. Overall accuracy - Usual residence
Data processed under the theme: Place of usual residence are consistent with the definition in Implementing Regulation (EU) 2017/543.
Place of residence means the place where a person usually spends his or her spare time from work (learning), including the night.
It does not matter whether or not the person is registered for permanent or temporary residence at a given address.
Resident population (residents) in Poland was derived on the basis of information on the nature of residence and the actual or intended duration of stay of individual persons.
This means that all persons present at the time of the Census who had lived or intended to live in Poland (in a given gmina) for at least 12 months were recognised as residents of the country (gmina). Persons staying away from their place of permanent residence (gmina) were recognised as its residents only if the duration of their intended absence was less than 12 months; if it was a year or more, they were considered residents of the place (gmina) of their temporary residence and, if they were abroad, they were not considered residents of Poland.
Similarly, in the case of foreigners temporarily residing in Poland, only those whose intended stay was at least 12 months were recognised as residents.
Students were recognised as residents of the place (gmina) of study; those studying abroad were not considered residents of Poland. Foreigners studying in Poland were recognised as Polish residents.
Youth studying in secondary schools have been counted as residents of the gmina of permanent residence (family home) regardless of the location of the school.
Homeless people were counted as residents of the gmina where they were enumerated.
Permanent residents of Poland staying in collective living quarters (CLQ) were enumerated as residents of the gmina in which the CLQ was located. Foreigners residing in CLQ were considered residents if their actual or intended duration of residence was at least 12 months; if it was less than 12 months, they were not recognised as residents of Poland.
Soldiers and persons on military missions as well as diplomats and their families remained residents of Poland.
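The general 12-month rule described above can be sketched as a small decision function. This is a simplified illustration only: it ignores the special cases listed above (students, secondary-school youth, homeless people, CLQ, soldiers and diplomats), and the function and parameter names are hypothetical.

```python
from typing import Optional

THRESHOLD_MONTHS = 12  # residence threshold stated in the text


def gmina_of_residence(permanent_gmina: Optional[str],
                       current_gmina: Optional[str],
                       months_away: int) -> Optional[str]:
    """Return the gmina of usual residence, or None if not resident in Poland.

    permanent_gmina: gmina of permanent residence in Poland
                     (None for a foreigner without one).
    current_gmina:   gmina of the current stay (None if the person is abroad).
    months_away:     actual or intended duration, in months, of the stay
                     away from the permanent place (for a foreigner: of the
                     stay in Poland).
    """
    at_home = permanent_gmina is not None and current_gmina == permanent_gmina
    short_absence = permanent_gmina is not None and months_away < THRESHOLD_MONTHS
    if at_home or short_absence:
        return permanent_gmina
    if current_gmina is None:
        return None  # long-term stay abroad: not a resident of Poland
    if months_away >= THRESHOLD_MONTHS:
        return current_gmina  # long-term stay: resident of the current gmina
    return None  # foreigner staying less than 12 months
```

For example, a person with permanent residence in gmina "A" who intends to stay in gmina "B" for 14 months would be counted in "B", while the same person staying abroad for 6 months would still be counted in "A".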
13.1.2. Overall accuracy - Sex
Data processed as defined in Implementing Regulation (EU) 2017/543.
13.1.3. Overall accuracy - Age
Data processed as defined in Implementing Regulation (EU) 2017/543.
13.1.4. Overall accuracy - Marital status
Data processed under the topic LMS, definition and the breakdown categories are in accordance with the definition in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
Legal marital status was determined for persons aged 15 or more and was defined as marital status according to the law in force in Poland (the Law on Civil Status Acts). Polish law allows women from the age of 16 to marry and men from the age of 18.
Polish law does not provide for same-sex marriages. Therefore, the 2021 Population and Housing Census did not collect data on persons in same-sex marriages. Polish legislation also does not provide for registered partnerships. Therefore, data on persons in registered partnerships were not collected.
13.1.5. Overall accuracy - Family status
Data processed under the topic FST, definition and the breakdown categories are in accordance with the definition in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
Polish law does not allow same-sex marriages. Therefore, the 2021 Population and Housing Census did not collect data on persons in same-sex marriages. Polish legislation also does not provide for registered partnerships. Therefore, data on persons in registered partnerships were not collected.
13.1.6. Overall accuracy - Household status
Data processed under the topic HST, definition and the breakdown categories are in accordance with the definition in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
In the 2021 Population and Housing Census there are no persons in the category ‘Persons living in a private household, but category not stated’ (there are only persons in the category ‘Persons living in a private household in a family nucleus’ or ‘Persons living in a private household not in a family nucleus’).
13.1.7. Overall accuracy - Current activity status
There are no particular reasons for data unreliability for this topic.
Information on ‘Current activity status’ was prepared on the basis of a mixed-method survey, i.e. administrative sources and data collected from respondents (census form), which were collected as part of the National Population and Housing Census conducted from 1 April to 30 September 2021 in the Republic of Poland.
The definition of ‘Current activity status’ (CAS) was applied in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. According to this definition, the population was divided into three main categories: ‘labour force’ (employed and unemployed), ‘outside of the labour force’ (persons below the national minimum age for economic activity; pension or capital income recipients; students; others) and ‘not stated’, which covers persons who could not be classified into either of the first two categories.
Employed persons are aged 15 years or more. This is the minimum age for work in Poland under the Labour Code of 26 June 1974 (Journal of Laws 1974, No. 24, item 141 as amended).
The criterion for the inclusion of a person in one of these categories was the fact of performing/having a job during the reference week (in Poland it was the week from 25 to 31 March 2021) or searching for a job and readiness to take it.
In ascribing a single activity status to each person, priority was given to the status of ‘Employed’ in preference to ‘Unemployed’, and to the status of ‘Unemployed’ in preference to ‘Outside of the labour force’. In ascribing a single activity status to each person currently outside of the labour force, priority was given to the status of ‘Persons below the national minimum age for economic activity’ in preference to ‘Pension or capital income recipients’, to the status of ‘Pension or capital income recipients’ in preference to ‘Students’, and of ‘Students’ in preference to ‘Others’.
The order in which individual subpopulations were selected ensures that each person was classified only into one of three categories differentiated from the point of view of their status on the labour market.
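The priority rules described above can be sketched as a short function. This is only an illustration of the ordering, not the procedure actually used in the census: the function name, the argument names and the `info_available` flag are assumptions introduced for the example.

```python
MIN_WORKING_AGE = 15  # national minimum age for economic activity stated in the text


def current_activity_status(age, employed=False, unemployed=False,
                            pension_or_capital_income=False, student=False,
                            info_available=True):
    """Return exactly one activity status, applying the priority order in the text.

    `info_available` is a hypothetical flag: False means that neither a labour
    force nor an outside-the-labour-force status could be established.
    """
    # Labour force first: 'Employed' takes priority over 'Unemployed'.
    if age >= MIN_WORKING_AGE and employed:
        return "Employed"
    if age >= MIN_WORKING_AGE and unemployed:
        return "Unemployed"
    # Outside of the labour force, in the stated order of priority.
    if age < MIN_WORKING_AGE:
        return "Below minimum age"
    if not info_available:
        return "Not stated"
    if pension_or_capital_income:
        return "Pension or capital income recipient"
    if student:
        return "Student"
    return "Other"
```

Because the checks run in a fixed order and each one returns immediately, a person who is, for example, both a student and employed receives only the status ‘Employed’.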
Under the topic ‘Current activity status’ (CAS), the levels and categories of breakdowns of groups and sub-groups relating to persons are in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017 at the required levels of detail. There are no derogations from this classification.
13.1.8. Overall accuracy - Occupation
There are no particular reasons for data unreliability for this topic.
Information on ‘Occupation’ (OCC) was prepared on the basis of a mixed-method survey, i.e. administrative sources and data collected from respondents (census form), which were collected as part of the National Population and Housing Census conducted from 1 April to 30 September 2021 in the Republic of Poland.
The current definition of ‘Occupation’ (OCC) was applied in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. ‘Occupation’ refers to the type of work done in a job. ‘Type of work’ is described by the main tasks and duties of the work.
The allocation of a person within the breakdowns of the topics ‘Occupation’, ‘Industry’ and ‘Status in employment’ is based on the same job. Persons doing more than one job are allocated an occupation based on their main job. The main job is the job that usually takes more time. If the jobs take the same amount of time, the main job is the one with the higher income.
Persons aged 15 or over who were employed (i.e. had the ‘Current activity status’ (CAS) of ‘Employed’ (CAS.L. and CAS.H. 1.1)) during the reference week were classified under only one category of OCC.1. to OCC.11.
The category OCC.11 ‘Not stated’ covers persons who had the status of ‘employed’ but for whom no occupation was established. The category OCC.12 ‘Not applicable’ covers the other persons participating in the census (CAS.1.2., CAS.2. and CAS.3.). Under the topic ‘Occupation’ (OCC), there are categories of group breakdowns relating to persons in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. There are no derogations from the ISCO-08 (COM) classification at the required level of detail.
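The main-job rule can be illustrated with a minimal sketch. The `Job` record and its field names are hypothetical and serve only to show the tie-break: more usual working time first, then higher income.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    occupation: str     # illustrative label, not a real classification code
    usual_hours: float  # time the job usually takes
    income: float


def main_job(jobs):
    """Pick the job that usually takes more time; on a tie, the higher income wins."""
    # Tuples compare element by element, so income only matters when hours are equal.
    return max(jobs, key=lambda job: (job.usual_hours, job.income))


jobs = [Job("teacher", 20, 3000.0), Job("translator", 20, 3500.0), Job("courier", 5, 4000.0)]
selected = main_job(jobs)
```

In this example the translator job is selected: it takes as much time as the teaching job but brings the higher income, while the courier job takes less time and is therefore not considered, despite its income.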
13.1.9. Overall accuracy - Industry
There are no particular reasons for data unreliability for this topic.
Information on ‘Industry’ was prepared on the basis of a mixed-method survey, i.e. administrative sources and data collected from respondents (census form), which were collected as part of the National Population and Housing Census conducted from 1 April to 30 September 2021 in the Republic of Poland. The current definition of ‘Industry’ (IND) was applied in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
For persons who are recruited and employed by one enterprise but who actually have their place of work in another enterprise (‘agency workers’, ‘seconded workers’), the industry (branch of economic activity) of the establishment or similar unit where the place of work actually is located is reported.
The allocation of a person within the breakdowns of the topics ‘Occupation’, ‘Industry’ and ‘Status in employment’ is based on the same job.
For persons working in more than one place of work, the ‘Industry’ (IND) is determined on the basis of the main place of work. The main job is the job that usually takes more time. If the jobs take the same amount of time, the main job is the one with the higher income.
Persons aged 15 or over who were employed (i.e. had the ‘Current activity status’ (CAS) of ‘employed’ (CAS.L. and CAS.H. 1.1)) during the reference week were classified under only one category of IND.1. to IND.11.
The category IND.11 ‘Not stated’ covers persons who had the status of ‘employed’ but for whom no industry was established. The category IND.12 ‘Not applicable’ covers the other persons participating in the census (CAS.1.2., CAS.2. and CAS.3.).
Under the topic ‘Industry’ (IND), there are categories of group breakdowns relating to persons in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. There are no derogations from the NACE Rev. 2 classification at the required level of detail.
The categories IND.H.1. to IND.H.10.4. of the breakdown "Industry (branch of economic activity)" correspond to the 21 sections of the NACE Rev.2 classification.
13.1.10. Overall accuracy - Status in employment
There are no particular reasons for data unreliability for this topic.
Information on ‘Status in employment’ (SIE) was prepared on the basis of a mixed-method survey, i.e. administrative sources and data collected from respondents (census form), which were collected as part of the National Population and Housing Census conducted from 1 April to 30 September 2021 in the Republic of Poland. The current definition of ‘Status in employment’ (SIE) was applied in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
The allocation of a person within the breakdowns of the topics ‘Occupation’, ‘Industry’ and ‘Status in employment’ was based on the same job. Persons doing more than one job were allocated 'Status in employment' based on their main job. The main job is the job that usually takes more time. If the jobs take the same amount of time, the main job is the one with the higher income.
Persons aged 15 or over who had the status of ‘employed’ (CAS.L. and CAS.H. 1.1.) during the reference week were classified under only one category of SIE.1. to SIE.5. The category SIE.5 ‘Not stated’ covers persons who had the status of ‘employed’ but for whom no status in employment was established. The category SIE.6. ‘Not applicable’ covers the other persons participating in the census (CAS.1.2., CAS.2. and CAS.3.).
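The topics ‘Occupation’, ‘Industry’ and ‘Status in employment’ share the same pattern for the residual categories ‘Not stated’ and ‘Not applicable’. The sketch below shows that pattern under stated assumptions: the function and dictionary names are hypothetical, and the substantive code (for example an occupation group) is assumed to have been established beforehand.

```python
# (not stated, not applicable) codes per topic, as given in the text.
SPECIAL_CODES = {
    "OCC": ("OCC.11", "OCC.12"),
    "IND": ("IND.11", "IND.12"),
    "SIE": ("SIE.5", "SIE.6"),
}


def topic_category(topic, employed, established_code=None):
    """Return the single category a person receives for the given topic."""
    not_stated, not_applicable = SPECIAL_CODES[topic]
    if not employed:
        # Unemployed persons, persons outside the labour force and 'not stated' CAS.
        return not_applicable
    if established_code is None:
        # Employed, but the characteristic could not be established.
        return not_stated
    return established_code
```

For instance, `topic_category("IND", employed=True)` gives `"IND.11"`, while any person who is not employed receives the ‘Not applicable’ code of the topic.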
Under the topic ‘Status in employment’ (SIE), there are categories of group breakdowns relating to persons in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. There are no derogations from this classification.
13.1.11. Overall accuracy - Place of work
There are no particular reasons for data unreliability for this topic.
Information on ‘Location of place of work’ (LPW) was prepared on the basis of a mixed-method survey, i.e. administrative sources and data collected from respondents (census form), which were collected as part of the National Population and Housing Census conducted from 1 April to 30 September 2021 in the Republic of Poland. The current definition of ‘Location of place of work’ (LPW) was applied in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
The location of the place of work is the geographical area in which a currently employed person does his/her job. The term ‘working’ refers to work done as an ‘employed person’ (CAS.L and CAS.H. 1.1) as defined under the topic ‘Current activity status’ (CAS). Persons who do not have a fixed place of work but who report to a fixed address at the beginning of their work period (for example bus drivers, airline pilots and stewards, operators of street market stalls which are not removed at the end of the workday) should provide information on this address.
The breakdowns for the topic 'Location of place of work' (LPW) have been made for all work places and any subtotals in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. For the purposes of the breakdown under the topic 'Location of place of work' (LPW), the version of the classification of territorial units for statistics (NUTS) in force on 1 January 2021 has been used. There are no deviations from this classification.
13.1.12. Overall accuracy - Educational attainment
The data were processed in accordance with the categories of Implementing Regulation (EU) 2017/543 and the ISCED 2011 classification.
Information on the level of education was compiled for all persons aged 15 and over.
13.1.13. Overall accuracy - Size of the locality
The data comply with the definition and classification in Implementing Regulation (EU) 2017/543.
The size of a locality is determined by the number of people who live in it.
Depending on the population of a given locality, it is assigned an appropriate size symbol (LOC).
13.1.14. Overall accuracy - Place of birth
Both the definition of POB and the developed country dictionary comply with Implementing Regulation (EU) 2017/543 of 22 March 2017, which means that the Polish census collected information on the country in which the person was born, within its current boundaries.
13.1.15. Overall accuracy - Country of citizenship
Both the definition of COC and the developed dictionary of countries of citizenship comply with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
It should be emphasised that according to the current legal norms, a foreigner in Poland is a person residing on the territory of Poland and without Polish citizenship. According to this definition, a person holding Polish citizenship and citizenship of other countries is not a foreigner.
13.1.16. Overall accuracy - Year of arrival in the country
Both the definition of YAE and the breakdown categories are consistent with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
The YAE variable applies to all people who have ever lived abroad.
The year of arrival in the country is the calendar year when a person settled in the country.
If a person has been abroad several times (has been a resident of another country), the year of arrival is the year of arrival after the last stay abroad.
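As an illustration of the rule for persons with several stays abroad, the following sketch assumes that the calendar years in which a person settled in the country after living abroad are available as a list. The function name and the use of `None` for persons who never lived abroad are assumptions made for the example.

```python
def year_of_arrival(arrival_years):
    """Return the year of arrival after the last stay abroad, or None if not applicable."""
    if not arrival_years:
        # The person has never lived abroad, so the variable does not apply.
        return None
    # The arrival after the last stay abroad is the most recent arrival year.
    return max(arrival_years)
```

A person who returned to the country in 1995, 2008 and 2019 would thus be assigned 2019.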
13.1.17. Overall accuracy - Residence one year before
Both the definition of ROY and the breakdown categories are consistent with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
13.1.18. Overall accuracy - Housing arrangements
The data compiled under the HAR topic, definitions and the breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
The topic "Housing arrangements" covers the whole population and refers to the type of housing in which a person resided at the time of the census. This covers all persons who are usual residents in different types of living quarters, or who do not have a usual residence and stay temporarily in some type of living quarters, or who are roofless, sleeping rough or in shelters, when the census is taken.
13.1.19. Overall accuracy - Type of family nucleus
Data processed under the topic TFN, definition and the breakdown categories are in accordance with the definition in Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
Polish law does not allow same-sex marriages. Therefore, the Polish census did not collect data on same-sex marriages. Polish legislation also does not provide for registered partnerships. Therefore, data on families of couples in registered partnerships were not collected.
13.1.20. Overall accuracy - Size of family nucleus
Both the definition of SFN and the breakdown categories are consistent with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
13.1.21. Overall accuracy - Type of private household
Both the definition of TPH and the breakdown categories are consistent with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
13.1.22. Overall accuracy - Size of private household
Both the definition of SPH and the breakdown categories are consistent with Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
13.1.23. Overall accuracy - Tenure status of households
The data compiled under the TSH topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
Legal title to occupy the dwelling by a household refers to the legal title to a dwelling held at the time of the census by one or more persons comprising the household residing in the dwelling. Regarding the right to occupy a dwelling, households are classified into those residing by virtue of: ownership of a dwelling or house, cooperative right to a dwelling, lease, sublease (subtenants), relationship to the owner or the so-called main tenant of a dwelling, and other.
13.1.24. Overall accuracy - Type of living quarter
The data developed under the TLQ topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
A living quarter is housing which is the usual residence of one or more persons. The terms "Conventional dwellings", "Other housing units" and "Collective living quarters" are defined as under the topic "Housing arrangements".
13.1.25. Overall accuracy - Occupancy status
The data developed under the OCS theme, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
Information on the occupancy status of a dwelling was obtained by establishing relationships between the different data sets, i.e. the data sets for people, buildings and dwellings. For unoccupied conventional dwellings, the optional breakdown into dwellings intended for seasonal use or as so-called "second dwellings" and permanently unoccupied dwellings was not applied.
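Deriving occupancy status by linking data sets can be pictured as follows. This is a simplified sketch, not the production procedure: the identifiers and table layouts are invented, and a dwelling is treated as occupied when at least one person is linked to it.

```python
from collections import Counter

# Hypothetical extracts of the dwelling and person data sets.
dwelling_ids = ["D1", "D2", "D3"]
person_to_dwelling = {"P1": "D1", "P2": "D1", "P3": "D3"}


def occupants_per_dwelling(dwellings, person_links):
    """Count the persons linked to each dwelling (zero for dwellings with no link)."""
    counts = Counter(person_links.values())
    return {dwelling: counts.get(dwelling, 0) for dwelling in dwellings}


def occupancy_status(number_of_occupants):
    """Classify a conventional dwelling as occupied or unoccupied."""
    return "occupied" if number_of_occupants > 0 else "unoccupied"


occupants = occupants_per_dwelling(dwelling_ids, person_to_dwelling)
status = {dwelling: occupancy_status(n) for dwelling, n in occupants.items()}
```

In this example dwelling D2 has no linked person and is therefore classified as unoccupied, without any further split by reason for vacancy.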
13.1.26. Overall accuracy - Type of ownership
The data compiled under the OWS topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
The topic "Type of ownership" refers to the ownership of the dwelling, not to the land on which the dwelling is located.
13.1.27. Overall accuracy - Number of occupants
The data compiled under the NOC topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
The number of people living in a dwelling is the number of people for whom the dwelling is a residence.
13.1.28. Overall accuracy - Useful floor space
The data compiled under the topic UFS, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
The breakdown levels and categories under the topic "Useful floor space" are used to break down the group "conventional housing" and any subgroups.
13.1.29. Overall accuracy - Number of rooms
The data compiled under the NOR topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
The breakdown levels and categories under the topic "Number of rooms" are used to break down the group "conventional housing" and any subgroups.
13.1.30. Overall accuracy - Density standard (floor space)
The data developed under the DFS topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
13.1.31. Overall accuracy - Density standard (number of rooms)
The data compiled under the DRM theme, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
In order to determine density standard, the number of residential rooms (DRM) was taken into account, i.e. the number of rooms per occupant, excluding those used exclusively for business purposes.
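The calculation described here amounts to a simple ratio. The sketch below is illustrative only; the function and parameter names are assumptions, and returning `None` for dwellings without occupants is a choice made for the example.

```python
def rooms_per_occupant(total_rooms, business_only_rooms, occupants):
    """Number of residential rooms per occupant, or None when there are no occupants."""
    residential_rooms = total_rooms - business_only_rooms
    if residential_rooms < 0:
        raise ValueError("business-only rooms cannot exceed the total number of rooms")
    if occupants <= 0:
        # The ratio is undefined for dwellings with no occupants.
        return None
    return residential_rooms / occupants
```

A dwelling with four rooms, one of which is used exclusively for business, and two occupants has three residential rooms, i.e. 1.5 rooms per occupant.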
13.1.32. Overall accuracy - Water supply system
The data compiled under the WSS theme, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
A housing unit considered as having a piped water installation is a housing unit inside which there is a tap with running water.
13.1.33. Overall accuracy - Toilet facilities
The data compiled under the TOI topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
A toilet facility is defined as a facility equipped with a flush toilet. The census only enumerated installations within the housing units from which the wastes are flushed by water from a flushing cistern, connected to a water supply system, regardless of whether this installation was located in a separate room (WC), or in a bathroom.
13.1.34. Overall accuracy - Bathing facilities
The data developed under the BAT topic, definitions and breakdown categories, are in accordance with Commission Implementing Regulation (EU) 2017/543 of March 22, 2017.
A bathroom is a space in a dwelling in which a bathtub or a shower cabin, or both, are installed, together with equipment evacuating wastewater outside the building.
13.1.35. Impact of the COVID pandemic on data accuracy
The census conducted in the 2021 round was carried out under particularly difficult conditions related to the pandemic caused by the SARS-CoV-2 virus.
Despite many concerns, organisational and technical solutions were successfully implemented to reduce the health risk for respondents and census enumerators to an absolute minimum. Due to their innovative nature, they also contributed to analyses and discussions on introducing them permanently into the implementation of other statistical surveys.
In order to mitigate the negative impact of the COVID-19 pandemic, key changes/solutions were made to adapt the technology and organisation of data collection to pandemic conditions. They included:
Amendment of the Census Act 2021: this enabled the census to be extended by three months and census methods to be managed flexibly, with a particular focus on COVID-19 risk levels and the areas with the highest incidence of disease, so that the census could be carried out during a period of much lower epidemic threat than the second quarter of 2021.
The list of entities obliged to provide data from the information systems was extended to include providers of publicly available telecommunications services, in order to feed the list with up-to-date telephone numbers of respondents and minimise face-to-face interviews in favour of telephone interviews.
Flexibility in the use of data collection methods was provided by allowing them to be combined or substituted (as required).
Census applications and systems were customised and equipped with additional functionality, allowing census enumerators to conduct both telephone and face-to-face interviews using the same mobile device.
Appropriate organisation of the work of census enumerators, including remote or face-to-face work (depending on the pandemic situation); expansion of their areas of operation from municipality to province.
Most of the work was carried out remotely, including the recruitment of census enumerators, the training of enumerators and other members of the census apparatus.
Self-enumeration stations worked under a sanitary regime - personal protective equipment appropriate to the threat of a pandemic was provided, appointments for a specific day and time were mandatory.
A number of problems were encountered during the census work, including: staff absence due to illness, quarantine or holidays during the summer season (due to the extended duration of the census) or respondents' reluctance to have direct contact for fear of contagion. In spite of the difficulties, all the people involved in the census work carried out their duties on time (according to the schedule) and reliably, so that the entire census apparatus could operate efficiently.
In the end, all these activities contributed to efficient data collection, a very high census completion rate and the collection of complete, good-quality data.
13.2. Sampling error
Not applicable, full survey
13.3. Non-sampling error
Not applicable, full survey
14.1. Timeliness
The census reference date is March 31, 2021.
Population by grid was published in December 2022.
Final census data for dwellings, population, households and families, in the layout and breakdowns required by the EU implementing regulations, were made available by March 31, 2024.
14.2. Punctuality
Data provided on time.
15.1. Comparability - geographical
The definitions adopted and the breakdowns for themes ensure comparability of results at EU level.
Geographical area (GEO)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Sex (SEX)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Age (AGE)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Legal marital status (LMS)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level. Marital status was defined for persons aged 15 and over and was defined as marital status according to Polish law (the Law on Civil Status Acts). Polish law allows women from the age of 16 and men from the age of 18 to marry.
Polish law does not allow same-sex marriages. As a result, the Polish census did not provide information on same-sex unions, whether legal or de facto. Polish legislation does not provide for registered partnerships.
Household status (HST)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level. No information was compiled for the categories of this topic specified as optional for same-sex marriages/partnerships.
Persons living in a private household, category not stated: in the Polish census there are no persons classified in this category (there are only persons living in a private household in a family nucleus or not in a family nucleus).
Persons not living in a private household, category not stated: in the Polish census there are no persons classified in this category (persons not living in a private household are either in collective living quarters or homeless).
Family status (FST)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level. No information has been developed in the classification of the topic specified as optional.
According to the definition only first-degree relationships between children and adults (between parents and children) are taken into account to determine families.
Polish law does not allow the registration of partnerships. Therefore, the Polish census did not compile information on partners in registered partnerships. There was also no information on same-sex couples.
Item “undetermined”: in the Polish census there are no persons classified in this category (there are only persons with an established position in the family and persons who are not family members).
‘Not applicable’ includes persons who do not belong to a family nucleus.
Size of family nucleus (SFN)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Type of private household (TPH)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Size of private household (SPH)
The accepted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Educational attainment (EDU)
The accepted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Size of the locality (LOC)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Place of birth (POB)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Country of citizenship (COC)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Year of arrival in the country (YAE)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Place of usual residence one year prior to the census (ROY)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Current activity status (CAS)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Status in employment (SIE)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Occupation (OCC)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Industry (IND)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Location of place of work (LPW)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Tenure status of households (TSH)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Housing arrangements (HAR)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Type of living quarter (TLQ)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Occupancy status of conventional dwelling (OCS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Number of occupants (NOC)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Type of ownership (OWS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Useful floor space (UFS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Number of rooms (NOR)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Density standard (floor space) (DFS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Density standard (number of rooms) (DRM)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Water supply system (WSS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Toilet facilities (TOI)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Bathing facilities (BAT)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Type of heating (TOH)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Dwellings by type of building (TOB)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Dwellings by period of construction (POC)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
15.1.1. Geographic information - data quality
Statistics Poland maintains Spatial Address Databases (PBA) containing statistical address points with the geographic location of each residential building or building with at least one dwelling. Spatial Address Databases are updated and archived quarterly. The Population and Housing Census 2021 survey frame was built based on the population record (a dataset maintained at Statistics Poland). While a table with persons is the main table of the survey frame, there are also tables for dwellings and buildings. Statistical address points hold locations for buildings with dwellings.
Grid IDs were calculated for all statistical address points before they were matched with buildings in the census survey frame. The grid reference was written to the building table, from which it was easily transferred to persons.
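The chain from address point to building to person can be sketched as below. Everything in the example is an assumption made for illustration: the coordinates are taken to be in metres in a projected coordinate system, the cell size is 1 km, the ID format is hypothetical, and the tables are reduced to small dictionaries.

```python
def grid_id(x_m, y_m, cell_m=1000):
    """Build an illustrative grid cell ID from projected coordinates in metres."""
    # Floor division gives the index of the cell that contains the point.
    column = int(x_m // cell_m)
    row = int(y_m // cell_m)
    return f"{cell_m // 1000}kmN{row}E{column}"


# Hypothetical tables: address point -> coordinates, building -> address point,
# person -> building.
address_points = {
    "AP1": (5071234.0, 3301987.0),
    "AP2": (5071999.0, 3301001.0),
    "AP3": (5080010.0, 3310500.0),
}
buildings = {"B1": "AP1", "B2": "AP3"}
persons = {"P1": "B1", "P2": "B1", "P3": "B2"}

# 1. Grid IDs are calculated for all address points.
point_grid = {ap: grid_id(x, y) for ap, (x, y) in address_points.items()}
# 2. The grid reference is written to the building table.
building_grid = {b: point_grid[ap] for b, ap in buildings.items()}
# 3. From buildings it is transferred to persons.
person_grid = {p: building_grid[b] for p, b in persons.items()}
```

Computing the grid reference once per address point and propagating it through the building table means that persons never need coordinates of their own; they inherit the cell of the building in which they live.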
The data is fully comparable between regions through the use of a uniform census form throughout the country.
15.2. Comparability - over time
Not applicable.
15.3. Coherence - cross domain
In terms of demographic and social characteristics, the data are coherent. However, in terms of economic characteristics the data presented may differ from those reported in other statistical domains due to differences between domains in the definitions and methods used. Data were compared with data from the Labour Force Survey (LFS), the survey of enterprises and the budgetary sphere units in terms of employed persons and the survey ‘The unemployed and persons seeking a job registered in poviat labour offices’. The census results and data coming from the above-mentioned surveys do not differ significantly.
The analysis of the consistency of housing characteristics focuses on the analysis of the results on the number of housing units compiled using the results obtained from the successive censuses, i.e., the 2011 Census and the 2021 Census. The first stage compared information on the number of housing units for Poland in general as of December 31 for each year of the inter-census period. The second stage of comparisons, on the other hand, included analysis of data on the number of housing units at the provincial level as of December 31, 2020.
15.4. Coherence - internal
Internal consistency is ensured by the regulations setting out the breakdowns and definitions of topics (Commission Regulation (EU) 2017/712 of 20 April 2017, Commission Implementing Regulation (EU) 2017/543 of 22 March 2017, Commission Implementing Regulation (EU) 2017/881 of 23 May 2017).
As mentioned in earlier chapters, the 2020/21 population and housing census round took place in Poland (as in the rest of the world) during the COVID-19 pandemic. Regardless of the pandemic, however, preparatory work for the census had been under way since 2017. As part of this work, a concept was developed for modernisation changes, drawing on the 2011 census, aimed at making it easier for respondents to take part in the census. As regards costs, it should also be emphasised that the census was carried out on schedule, the safety of all those taking part was ensured and, in addition, significant savings were achieved against the planned census budget, benefiting the country (particularly important during the crisis caused by the pandemic).
The census was carried out without the use of paper forms, exclusively using an interactive form application installed on mobile devices with internet access or using an ICT system. The census was carried out using:
online self-enumeration (CAWI),
telephone interview (CATI),
"census on demand" (CATI2) - telephone interview at the request of the respondent available on the census helpline,
face-to-face interview (CAPI).
As in chapter 13.1.35, the changes that had a significant impact on the cost of the census are recalled here.
The mandatory form of participation in the census was online self-enumeration. Complementary methods were telephone or face-to-face interviews, carried out by census enumerators. The online self-enumeration could be completed by respondents on their own electronic device (computer, tablet, smartphone), either by themselves or with the help of, for example, a family member or a census helpline consultant (by calling the Census 2021 helpline). If respondents lacked technical capabilities or computer skills, they were able to complete the self-enumeration at free census points equipped with computers with dedicated software and Internet access, provided by municipalities, regional statistical offices (US) or Statistics Poland.
Thanks to the amendment of the Census Act, it was possible to manage census methods flexibly, taking into account, in particular, the COVID-19 risk level and the areas of highest incidence. Health security was a priority in the census implementation strategy. Accordingly, the list of information system providers delivering data in the census was expanded to include providers of publicly available telecommunication services, in order to feed the list with up-to-date telephone numbers of respondents. This change minimised face-to-face interviews in favour of telephone interviews. The technology and organisation of data collection were also adapted to the pandemic conditions, with the aim of reducing or avoiding face-to-face contacts as much as possible. Census applications and systems were equipped with additional functionality, allowing census enumerators to conduct both telephone and face-to-face interviews using the same mobile device. This provided flexibility in the use of data collection methods, by allowing methods to be combined or substituted for each other as required.
Most of the work was carried out remotely, including the recruitment of census enumerators and all training of members of the census apparatus. A large number of staff also carried out their duties remotely, including those on duty as dispatchers.
An important aspect worth highlighting is the high percentage of people enumerated through an online self-enumeration (CAWI) - about 60% of all enumerated people. It can be seen that this method is becoming more and more trusted among respondents as being safe and at the same time convenient and not limiting them in terms of time or place of taking the census. On the other hand, the use of telephone and face-to-face interviews interchangeably showed that as much as 25% of the information obtained from respondents came from the telephone channel (CATI) and only 15% from the face-to-face method (CAPI), while the quality of the results obtained by both methods was the same, given the design of the census application that continuously monitors the reliability and completeness of the answers provided.
Costs of the 2021 population and housing census.
Conducting the census involved a huge undertaking to modernise the IT infrastructure, including work to prepare an ICT network to enable data transmission from information systems, construction of a platform for data collection and editing, preparation of address and housing lists, development of assumptions for the electronic form application, and work leading to data acquisition from public administration information systems and non-administrative information systems.
Expenditure in 2019-2022 under the National Population and Housing Census 2021 (2021 CENSUS) totalled PLN 273,172,000 (USD 62,059,157). By comparison, census expenditure in 2011 amounted to PLN 395,284,000. The average cost of the census per capita was PLN 7.23 (USD 1.65).
Within the individual expenditure groups, costs were as follows:
property expenditures (modernisation and expansion of ICT infrastructure, investments, design and programming work) - it should be noted that most of the work had already been carried out before the 2020 agricultural census (PSR 2020) - under the Census 2021, the costs amounted to PLN 1,333,000 (USD 302,831), while under the PSR 2020 they amounted to PLN 44,438,000 (USD 10,095,416);
subsidies to Commune Census Offices to conduct two sample censuses and the 2021 Census - PLN 77,530,000 (USD 17,613,249);
non-personal salaries (costs of census enumerators from external recruitment) - PLN 74,470,000 (USD 16,918,079);
staff salaries - PLN 45,577,000 (USD 10,354,173), including PLN 10,031,000 cost of internal enumerators (USD 2,278,840);
so-called "material and other" expenditures, related to the current activities of the units of public statistical services - PLN 74,015,000 (USD 16,814,712), including, among others, expenditures for:
purchase of mobile equipment for census enumerators - PLN 17,922,206 (USD 4,071,563),
promotion of the census - PLN 15,776,711 (USD 3,584,150),
mailing of letters from the President of Statistics Poland - PLN 3,369,376 (USD 765,454);
training - PLN 247,000 (USD 56,113).
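The expenditure groups listed above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that uses only the amounts quoted in this section. The variable names and shortened group labels are illustrative, and treating training as a separate top-level group is an assumption (supported by the fact that the six amounts then add up to the reported total). The PLN/USD rate is only the one implied by the published pairs; it is not a rate quoted in the report.

```python
# Expenditure groups quoted in this section as (PLN, USD) pairs.
# Labels are shortened for illustration only.
groups = {
    "property expenditures": (1_333_000, 302_831),
    "subsidies to Commune Census Offices": (77_530_000, 17_613_249),
    "non-personal salaries": (74_470_000, 16_918_079),
    "staff salaries": (45_577_000, 10_354_173),
    "material and other": (74_015_000, 16_814_712),
    "training": (247_000, 56_113),
}

reported_total_pln = 273_172_000
reported_total_usd = 62_059_157

# Sum each currency column across the six groups.
total_pln = sum(pln for pln, _ in groups.values())
total_usd = sum(usd for _, usd in groups.values())

# Exchange rate implied by each published pair (an inference, not a quoted rate).
implied_rates = {name: pln / usd for name, (pln, usd) in groups.items()}

print(f"Sum of groups: PLN {total_pln:,} / USD {total_usd:,}")
print(f"PLN sum matches reported total: {total_pln == reported_total_pln}")
print(f"USD sum matches reported total: {total_usd == reported_total_usd}")
for name, rate in implied_rates.items():
    print(f"{name}: implied rate {rate:.4f} PLN/USD")
```

A check of this kind only confirms that the published breakdown is internally consistent; it says nothing about how the individual amounts were derived.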
Size of census organisation
The three-level structure of the census organisation, i.e. census bureaus at Statistics Poland, in the regions and in municipalities, was used for both the 2011 census and the 2021 Census. Approximately 28,300 people were involved in census work.
Within the Central Census Bureau at Statistics Poland, census work was carried out by approximately 250 people.
There were 16 Regional Census Bureaus (WBS) operating in the country, within which work was carried out by approximately 4,200 people (including 1,799 telephone enumerators). In addition, 15,910 census enumerators were employed within the WBS, conducting interviews using the CAPI method.
During the 2021 Census, municipal census bureaus were established in 2,477 municipalities, with approximately 8,000 people working in them.
17.1. Data revision - policy
Statistics Poland has prepared a document describing the principles of revision of statistical data in many areas of research. The document is available on the website.
It describes the basic concepts of revisions, their types, as well as ways of publishing and notifying users about introduced changes.
17.2. Data revision - practice
Not applicable.
18.1. Source data
For the purposes of the 2021 Census, data obtained from respondents through an electronic application and data from registers and administrative systems were used. A detailed description of the sources is presented in the sub-concepts 18.1.1 - 18.1.4.
18.1.1. List of data sources
The primary data sources were data collected directly on the census form, supplemented by data from administrative and non-administrative sources. Additionally, a complementary survey was conducted involving individuals residing in collective living quarters.
The electronic census form is an interactive application consisting of sections: personal data (for whom the census was conducted or who completed the self-enumeration), determining the residential address of the person, persons in the dwelling/room not being a dwelling/collective accommodation facility, persons residing abroad, family relationships, personal questionnaires containing questions related to demographic-social, migration, and economic activity characteristics of individuals, and a housing questionnaire containing questions describing the dwelling and building.
Four language versions of the census application were prepared: in addition to Polish, the application was available in English, Russian, and Ukrainian.
Information about persons residing in collective living quarters, including homeless persons, was collected using a dedicated electronic application for collective living quarters. The application contained basic information about the type of facility and the persons residing therein, which was provided by the managers of these facilities.
List of information systems of public administration and official registers:
Universal Electronic System for Registration of the Population (PESEL);
National Official Register of the Territorial Division of the Country TERYT – a system for identifying streets, real estates, buildings, and dwellings (NOBC);
Central Register of Entities – National Taxpayers Register (CRP-KEP);
Central Register of Insured Persons – National Health Fund (NFZ);
new Social Insurance Information System (nSIU) – Agricultural Social Insurance Fund;
Database of personal income taxpayers (PIT);
National Family Benefits Monitoring System (KSMSR);
Central register of recipients of maintenance fund benefits (ŚFA);
Big Family Card Information System (KDR);
Central Application (AC);
Central Database of Unemployed Persons (CZDOB);
Integrated Information System on Higher Education and Science POL-on (ZSIoSWiN);
Educational Information System (SIO);
National collection of registers, records, and lists concerning foreigners (KZREW);
Integrated Administration and Control System (ZSZiK);
Comprehensive Social Insurance Institution IT System (KSI): Central Contributions Payer Register (CRPS); Central Register of the Insured (CRU); Central Register of Family Members of Insured Persons Entitled to Health Insurance (RCR);
Pension and Disability Insurance System RENTIER (RENTIER);
Registry of EU Affairs (ROSU);
Comprehensive Pension and Disability Benefits Service System (FARMER);
Central Insured Persons Register (CWU);
Grants and Refunds Service System (SODiR);
Electronic National Disability Assessment Monitoring System (EKSMOoN);
Social Welfare System (SPS);
Register of localities, streets, and addresses (EMUiA);
Address data from the National Register of Boundaries and Area of Administrative Units of Poland (PRG);
Spatial Address Database (PBA);
Building and Dwelling Database (BDD);
Reports on residential buildings and residential units in non-residential buildings (put into use) (B-07);
Register of residential and non-residential buildings and buildings for collective accommodation (put into use);
data obtained from companies providing municipal services to households;
data from previous censuses;
telecommunications operators.
Non-public information systems – information systems of companies engaged in economic activity in the field of electricity sales (File Energy ZE).
Statistical operation for units conducting economic activity, i.e., the Statistical Unit Database (BJS).
18.1.1.1. List of data sources - Data on persons
In the 2021 Census, for the elaboration of information on the state and structure of the population (in basic divisions, i.e., sex, age, place of residence), data were compiled for 100% of the resident population. The data sources included information gathered through electronic applications obtained via CAxI channels, as well as administrative data sources, including:
Universal Electronic System for Registration of the Population (PESEL);
Data sets from the Social Insurance Institution (ZUS) regarding contributors, insured persons, retirees, pensioners, social pension recipients, pre-retirement benefits, social security benefits, and independently paid cash benefits;
Database of personal income taxpayers (PIT);
Central Register of Insured Persons (NFZ) – regarding data on individuals covered by the health insurance system;
from the State Fund for Rehabilitation of Disabled Persons;
from the information systems of county labour offices regarding unemployed individuals and those seeking employment;
National collection of registers, records, and lists regarding foreigners (KZREW) from UDSC;
Integrated Administration and Control System (ZSZiK) from ARIMR.
18.1.1.2. List of data sources - Data on households
Data on households were compiled for 100% of the resident population. The data sources were information gathered through electronic applications obtained via CAxI channels and from available files and administrative registers, including:
Universal Electronic System for Registration of the Population (PESEL);
Data sets from the Social Insurance Institution (ZUS) regarding contributors, insured persons, retirees, pensioners, social pension recipients, pre-retirement benefits, social security benefits, and independently paid cash benefits;
Database of personal income taxpayers (PIT);
Central Register of Insured Persons (NFZ) – regarding data on individuals covered by the health insurance system;
from the State Fund for Rehabilitation of Disabled Persons.
18.1.1.3. List of data sources - Data on family nuclei
Data on biological families were compiled for 100% of the resident family population. The data sources were information gathered through electronic applications obtained via CAxI channels and from available files and administrative registers, including:
Universal Electronic System for Registration of the Population (PESEL);
Data sets from the Social Insurance Institution (ZUS) regarding contributors, insured persons, retirees, pensioners, social pension recipients, pre-retirement benefits, social security benefits, and independently paid cash benefits;
Database of personal income taxpayers (PIT);
Central Register of Insured Persons (NFZ) – regarding data on individuals covered by the health insurance system;
from the State Fund for Rehabilitation of Disabled Persons;
from the information systems of county labour offices on unemployed individuals and individuals seeking employment.
18.1.1.4. List of data sources - Data on living quarters
Data on living quarters were compiled for 100% of the residential population. The data sources included information gathered during verification work in the Address and Housing Register (15,070,130 records), and subsequently in electronic applications obtained via CAxI channels, as well as from available sets and administrative registers, including:
National Official Register of the Territorial Division of the Country (TERYT) – a system for the identification of streets, real estates, buildings, and dwellings (NOBC),
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by Register of Localities, Streets, and Addresses (EMUiA),
non-public information systems – information systems of enterprises engaged in economic activities related to the sale of electrical energy (ZE Energy Collection),
statistical survey for business entities, namely Business Register (BJS) – used to prepare a list of collective accommodation facilities,
The Building and Dwelling Database (BBM),
data obtained from enterprises providing municipal services to households,
data from previous censuses.
18.1.1.5. List of data sources - Data on conventional dwellings
Data on conventional dwellings were compiled for 100% of the conventional dwelling population. The source of the data was information gathered during verification work in the Address and Housing Register (15,070,130 records), and then in the electronic application acquired via CAxI channels and based on available sets and administrative registers including:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, real estates, buildings, and dwellings (NOBC),
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA),
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE),
statistical register for economic units, i.e., Business Register (BJS) – used to prepare a list of collective accommodation facilities,
The Building and Dwelling Database (BBM),
data obtained from enterprises providing municipal services for households,
data from previous censuses.
18.1.2. Classification of data sources
The classifications of data sources for topics are consistent with the provisions of Commission Implementing Regulation (EU) 2017/543 of 22 March 2017.
18.1.2.1. Classification of data sources - Data on persons
05.Combination of register-based censuses and conventional censuses
18.1.2.2. Classification of data sources - Data on households
05.Combination of register-based censuses and conventional censuses
18.1.2.3. Classification of data sources - Data on family nuclei
05.Combination of register-based censuses and conventional censuses
18.1.2.4. Classification of data sources - Data on living quarters
05.Combination of register-based censuses and conventional censuses
18.1.2.5. Classification of data sources - Data on conventional dwellings
05.Combination of register-based censuses and conventional censuses
18.1.3. List of data sources per topic
Geographical area (GEO)
The primary source of data for determining the place of residence was information collected in the electronic application acquired via CAxI channels (conventional census 2021). Additionally, data on the place/address of residence from the following registers were used:
Universal Electronic System for Registration of the Population (PESEL);
data collections from the Social Insurance Institution (ZUS) system;
Central Register of Insured Persons (NFZ);
Database of personal income taxpayers (PIT);
National collection of registers, records, and listings in matters of foreigners (KZREW) from UDSC;
Integrated Administration and Control System (ZSZiK) from ARIMR.
Sex (SEX)
The primary source of data was administrative registers which include:
Universal Electronic System for Registration of the Population (PESEL);
data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT);
Central Register of Insured Persons (NFZ);
National collection of registers, records, and listings in matters of foreigners (KZREW) from UDSC;
Integrated Administration and Control System (ZSZiK) from ARIMR;
Integrated System of Information on Higher Education and Science POL-on (ZSIoSWiN);
Educational Information System (SIO).
Additionally, information collected in the electronic application acquired via CAxI channels (conventional census 2021) was used.
Age
The primary source of data was administrative registers which include:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT);
Central Register of Insured Persons (NFZ);
National collection of registers, records, and listings in matters of foreigners (KZREW) from UDSC;
Integrated Administration and Control System (ZSZiK) from ARIMR;
Integrated System of Information on Higher Education and Science POL-on (ZSIoSWiN);
Educational Information System (SIO).
Additionally, information collected in the electronic application acquired via CAxI channels (conventional census 2021) was used.
Legal marital status (LMS)
The primary source of data was administrative registers which include:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Central Register of Insured Persons (NFZ);
Database of personal income taxpayers (PIT);
National collection of registers, records, and listings in matters of foreigners (KZREW) from UDSC;
Integrated Administration and Control System (ZSZiK) from ARIMR;
Integrated System of Information on Higher Education and Science POL-on (ZSIoSWiN);
Educational Information System (SIO);
Central Database of Unemployed Persons (CZDOB).
Additionally, information collected in the electronic application acquired via CAxI channels (conventional census 2021) was used.
Family status (FST)
The primary source of data for determining an individual's family status was information collected in the electronic application acquired via CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT);
Central Database of Unemployed Persons (CZDOB).
Household status (HST)
The primary source of data for determining an individual's household status was information collected in the electronic application acquired via CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT);
Central Database of Unemployed Persons (CZDOB).
Current Economic Activity (CAS)
The primary source of data in the 2021 National Population and Housing Census for determining current economic activity was information collected through the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Social insurance registers: Social Insurance Institution (ZUS) and Agricultural Social Insurance Fund (KRUS);
Unemployment register – Central Database of Unemployed Persons (CZDOB);
National Family Benefits Monitoring System – Ministry of Family, Labour and Social Policy (MRPiPS);
Educational Information System (SIO);
Integrated System of Information on Higher Education and Science POL-on (ZSIoSWiN).
Occupation (OCC)
The source of data in the 2021 National Population and Housing Census for determining occupation was information collected in the electronic application acquired via CAxI channels (conventional census 2021).
Industry (IND)
The primary source of data in the 2021 National Population and Housing Census for determining industry was information collected in the electronic application acquired via CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Social insurance registers: Social Insurance Institution (ZUS) and Agricultural Social Insurance Fund (KRUS);
Database of personal income taxpayers (PIT);
Central Register of Entities – National Taxpayers Register (CRP-KEP).
Status in Employment (SIE)
The primary source of data in the 2021 National Population and Housing Census for determining status in employment was information collected in the electronic application acquired via CAxI channels (conventional census 2021). Additionally, data from the social insurance registers were used: Social Insurance Institution (ZUS) and Agricultural Social Insurance Fund (KRUS).
Location of Place of Work (LPW)
The primary source of data in the 2021 National Population and Housing Census for determining the location of work was information collected in the electronic application acquired via CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Social insurance registers: Social Insurance Institution (ZUS) and Agricultural Social Insurance Fund (KRUS);
Database of personal income taxpayers (PIT).
Educational attainment (EDU)
The primary source of data for determining educational attainment was information collected through the electronic application obtained via CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Data collections from the Social Insurance Institution (ZUS) system;
Central Database of Unemployed Persons (CZDOB);
Electronic National System for Disability Assessment Monitoring (EKSMOoN);
Integrated System of Information on Higher Education and Science POL-on (ZSIoSWiN);
Educational Information System (SIO).
Place of Birth (POB)
The primary source of data was administrative register data, which includes:
Universal Electronic System for Registration of the Population (PESEL);
Database of personal income taxpayers (PIT);
National collection of registers, records, and listings in matters of foreigners (KZREW) from UDSC;
Integrated System of Information on Higher Education and Science POL-on (ZSIoSWiN).
Additionally, information collected through the electronic application obtained via CAxI channels (conventional census 2021) was used.
Country of citizenship (COC)
The primary source of data was administrative register data, which includes:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Central Register of Insured Persons (NFZ);
Central Database of Unemployed Persons (CZDOB);
Database of personal income taxpayers (PIT);
National collection of registers, records, and listings in matters of foreigners (KZREW) from UDSC;
Integrated System of Information on Higher Education and Science POL-on (ZSIoSWiN);
Educational Information System (SIO).
Additionally, information collected through the electronic application obtained via CAxI channels (conventional census 2021) was used.
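The topic listings above follow a recurring pattern: each topic has one primary source (either the electronic census application collected via CAxI channels or administrative registers), usually supplemented by the other. A minimal sketch of how this could be captured as a lookup structure is given below, in Python. The names `PRIMARY_SOURCE` and `topics_by_primary` are hypothetical and introduced only for illustration; the topic codes and primary source types are taken from the text, except that the key "AGE" is an assumption, since the text lists Age without a code.

```python
from collections import defaultdict

# Primary source per topic, as described in the listings above.
# "form"      = electronic census application (CAxI channels)
# "registers" = administrative registers
# "AGE" is an assumed code; the text gives the topic name only.
PRIMARY_SOURCE = {
    "GEO": "form", "SEX": "registers", "AGE": "registers", "LMS": "registers",
    "FST": "form", "HST": "form", "CAS": "form", "OCC": "form",
    "IND": "form", "SIE": "form", "LPW": "form", "EDU": "form",
    "POB": "registers", "COC": "registers",
}


def topics_by_primary(mapping):
    """Group topic codes by the type of their primary data source."""
    grouped = defaultdict(list)
    for topic, source in mapping.items():
        grouped[source].append(topic)
    return dict(grouped)


grouped = topics_by_primary(PRIMARY_SOURCE)
print("Register-based primary source:", grouped["registers"])
print("Census form primary source:", grouped["form"])
```

Grouping the topics this way makes it easy to see that the basic demographic topics rely primarily on registers, while the remaining topics listed so far rely primarily on the census form.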
Year of arrival in the country (YAE)
The primary source of data for determining a person's year of arrival in the country was the information collected through the electronic application via CAxI channels (conventional census 2021). Additionally, data from the following register were used:
Universal Electronic System for Registration of the Population (PESEL).
Place of usual residence one year prior to the census (ROY)
The primary source of data for determining a person's place of usual residence one year prior to the census was the information collected through the electronic application via CAxI channels (conventional census 2021). Additionally, data from the following register were used:
Universal Electronic System for Registration of the Population (PESEL).
Housing arrangements (HAR)
The primary source of data was the information collected through the electronic application via CAxI channels (conventional census 2021). Additionally, data from registers regarding individuals and housing were used.
Type of family nucleus (TFN)
The primary source of data for determining the type of family nucleus was the information collected through the electronic application via CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT);
Central Database of Unemployed Persons (CZDOB).
Size of family nucleus (SFN)
The primary source of data for determining the size of the family nucleus was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT);
Central Database of Unemployed Persons (CZDOB).
Type of private household (TPH)
The primary source of data for determining the type of private household was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT);
Central Database of Unemployed Persons (CZDOB).
Size of private household (SPH)
The primary source of data for determining the size of the private household was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
Universal Electronic System for Registration of the Population (PESEL);
Data collections from the Social Insurance Institution (ZUS) system;
Database of personal income taxpayers (PIT).
Tenure status of households (TSH)
The primary source of data was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the registers regarding persons and dwellings were used.
Type of living quarters (TLQ)
The primary source of data was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
Occupancy status of conventional dwellings (OCS)
The primary source of data was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the registers regarding persons and dwellings were used.
Type of ownership (OWS)
The primary source of data was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
Number of occupants (NOC)
The primary source of data was the details gathered via the electronic application using CAxI channels (conventional census 2021). Additionally, data from the registers regarding persons and dwellings were used.
Useful floor space (UFS)
The primary source of data was information collected through the electronic application using the CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
Number of rooms (NOR)
The primary source of data was information collected through the electronic application using the CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
Density standard (floor space) (DFS)
The primary source of data was information collected in the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
Water supply system (WSS)
The primary source of data was information collected in the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
Toilet facilities (TOI)
The primary source of data was information collected in the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
Bathing facilities (BAT)
The primary source of data was information collected in the electronic application using CAxI channels (conventional census 2021). Additionally, data from the following registers were used:
National Official Register of the Territorial Division of the Country (TERYT) – system for the identification of addresses for streets, properties, buildings, and dwellings (NOBC);
National Register of Boundaries and Area of Administrative Units of Poland (PRG), supplemented by the Register of Localities, Streets, and Addresses (EMUiA);
private information systems – information systems of enterprises conducting economic activity in the field of electricity sales (File Energy ZE);
the statistical register for economic units, i.e., the Base of Statistical Units (BJS) – used to prepare a list of collective accommodation facilities;
The Buildings and Dwellings Database (BDD);
data obtained from companies providing municipal services for households;
data from previous censuses.
18.1.4. Adequacy of data sources
The data sources used in the 2021 Census fulfil the essential adequacy characteristics in accordance with Article 4 point 4 of Regulation 763/2008, as required by Regulation 2017/881, Annex point 2.4.
18.1.4.1. Adequacy of data sources - Individual enumeration
The Resulting Census Data Set (WZDS) is an individual data collection consisting of relational tables covering: BUILDINGS, DWELLINGS, CLQs, HOUSEHOLDS, FAMILIES, PERSONS. Each of the above tables contains keys for linking with other tables in the set. The resulting set stores the characteristics of each statistical unit in separate records, which allows them to be combined in any way.
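As an illustration of how key-linked tables allow characteristics of different unit types to be combined, a minimal sketch follows. The table layouts, field names and key names (`dwelling_id`, `building_id`) are hypothetical; the actual WZDS structure and identifiers are not described in this report.

```python
# Hypothetical miniature of key-linked tables; all field names are assumptions.
BUILDINGS = {"B1": {"building_type": "residential"}}
DWELLINGS = {"D1": {"building_id": "B1", "rooms": 3}}
PERSONS = [
    {"person_id": "P1", "dwelling_id": "D1", "age": 34},
    {"person_id": "P2", "dwelling_id": "D1", "age": 7},
]


def person_with_housing(person):
    """Follow the keys from a person record to its dwelling and building."""
    dwelling = DWELLINGS[person["dwelling_id"]]
    building = BUILDINGS[dwelling["building_id"]]
    return {
        **person,
        "rooms": dwelling["rooms"],
        "building_type": building["building_type"],
    }


linked = [person_with_housing(p) for p in PERSONS]
```

Each element of `linked` carries person, dwelling and building characteristics in one record, which is the kind of combination the keys make possible.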
18.1.4.2. Adequacy of data sources - Simultaneity
The data collected in the census referred to the reference moment of the census, i.e., the status as of 31 March 2021, 24:00. This applied both to the data collected in the electronic census application from respondents and to data from administrative registers. In the section on Economic Activity, in the part concerning current activity, the reference period was additionally the week ending on the reference day of the survey (i.e., the period from March 25 to 31, 2021).
18.1.4.3. Adequacy of data sources - Universality within the defined territory
The resulting data from the 2021 National Census (2021 CENSUS) are available for all statistical units within a given territorial area, i.e., for buildings, dwellings and persons. The coherence and adequacy of the data were ensured at the data collection stage by using standardised questions in the census application and appropriate logical control mechanisms, and then at the results processing and derivation stages.
18.1.4.4. Adequacy of data sources - Availability of small-area data
Census data concerning the population in terms of basic demographic and social characteristics (such as sex, age, marital status, educational attainment, current activity status) as well as households and families are available at the lowest level of the country's territorial division, i.e., for communes (gmina) and statistical divisions. The subject granularity for selected features characterising the population depends on the level of both territorial and statistical disaggregation. Census data concerning buildings and dwellings are available at the lowest level of the country's territorial division, i.e., for communes (gmina) and statistical divisions. The subject granularity for selected features characterising dwellings and buildings depends on the level of both territorial and statistical disaggregation.
18.1.4.5. Adequacy of data sources - Defined periodicity
In order to ensure the conduct of the census under the difficult conditions of the pandemic, an amendment to the Act on the 2021 Census extended the duration of the census. Therefore, the main census was carried out for 6 months (originally planned for 3 months) from April 1 to September 30, 2021, as of the reference day of March 31, 2021, at 24:00.
18.2. Frequency of data collection
Every 10 years.
18.3. Data collection
The required information on the data collection process in the 2021 Census through electronic applications, the use of data from registers and administrative systems, and the compilation process is described in the relevant paragraphs.
18.3.1. Data collection - Questionnaire based data
As part of the preparatory work for the 2021 census, the information scope was defined on the basis of national and international information needs resulting from the above-mentioned legal acts, as well as an internal assessment of interest in the results of the 2011 population and housing census. The scope was then consulted with key data users who use statistical data to fulfil their statutory tasks (in particular units of public administration and scientific research centres).
The analysis of the results of public consultations showed that both institutional and individual respondents generally did not report extensive needs regarding the proposed information scope but placed a strong emphasis on the availability of data at the local level. Additionally, for the formulated information scope of the 2021 census, the possibility of developing it based on data from registers and administrative systems was determined. Based on the formulated information scope, considering recommendations and comments from consultations, an electronic census form containing the range of questions necessary to develop the required information was prepared. In order to test the adopted methodological, organisational, technical, and promotional solutions before the census proper, two trial censuses were conducted:
The first, in September 2019, in two municipalities, with the objective of testing the functionality of electronic data collection channels, the census form (cognitive study), and the effectiveness of promotional solutions; it was conducted using CAWI, CATI, and CAPI methods.
The second, in April 2020, in 16 municipalities, to test all census solutions; it was also conducted using CAWI, CATI, and CAPI methods.
The electronic census form
The electronic census form is an interactive application comprising parts for: personal data (of the person interviewed or self-enumerating), determining the person's residential address, persons in the dwelling/non-dwelling premises/collective accommodation facility, persons abroad, family relations, personal questionnaires with questions on demographic-social characteristics, migration, economic activity of persons, and a housing questionnaire on the description of the dwelling and building. The principle was adopted that the first adult to log into the census application provided information about the dwelling and all persons residing in it, established the composition of the household, and defined the family relationships among the individuals. Subsequently, personal questionnaires of the residents at the given address were filled out.
The application contained range and logical controls that allowed progression to subsequent questions depending on the response given. It was logically divided into thematic sections, which visually took the form of successive screens with questions according to previously defined paths. The interactive application displayed visual hints (in the form of help messages), containing explanations of definitions and descriptions to facilitate the understanding of questions, in order to provide correct answers.
The form application was supported by dictionaries (including addresses, types of collective living quarters, countries of citizenship, birth, residence, and stay, nationalities, languages, religions, types of fuel and energy sources, professions, and types of economic activities (NACE codes)). Due to their volume, the use of dictionaries was simplified by allowing the entry of a phrase/set of letters from the name (e.g., country of citizenship, profession, etc.), which narrowed down the number of possible positions and then enforced alphabetical sorting of terms depending on the position of the phrase in the first or subsequent word from the name.
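The dictionary lookup described above can be sketched as follows. This is one possible reading, not the application's actual logic: it assumes that the typed phrase is matched against the beginning of a word, that entries matching in the first word are listed before entries matching in a later word, and that each group is sorted alphabetically. The function name and the sample entries are hypothetical.

```python
def search_dictionary(entries, phrase):
    """Return entries with a word starting with `phrase`, ranked by word position."""
    needle = phrase.casefold()
    matches = []
    for entry in entries:
        words = entry.casefold().split()
        positions = [i for i, word in enumerate(words) if word.startswith(needle)]
        if positions:
            # Sort key: position of the first matching word, then the name itself.
            matches.append((positions[0], entry.casefold(), entry))
    return [entry for _, _, entry in sorted(matches)]


countries = ["United Kingdom", "Kingdom of Denmark", "Kiribati", "Poland"]
result = search_dictionary(countries, "ki")
```

Here `result` lists the two entries whose first word starts with "ki" before "United Kingdom", where the match occurs in the second word.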
The application was also adapted for persons with disabilities. Deaf and hard of hearing individuals, as well as those with speech disabilities, could use the online self-reporting application that met WCAG 2.0 standards.
Additionally, data about individuals staying in collective living quarters were provided by the managers of these facilities (to the extent of their available documentation) using a dedicated electronic application for collective living quarters. This application contained basic information about the type of facility and the persons staying in it. Persons staying in collective living quarters, who were enumerated by the manager, could independently submit data about themselves through an internet self-enumeration or, if requested, by a telephone interview.
Methods of logging into the self-enumeration application:
National Electronic Identification Node – logging in through the national electronic identification node triggered the login.gov.pl service, which offers the possibility of authentication in the 2021 Census application using electronic identification means issued by other entities, as part of electronic identification systems (Trusted Profile, online banking). Due to the nature of this method (ensuring complete confidentiality of the information entered), after its first successful use, it was permanently assigned to the respondent and subsequent login attempts required its reuse.
Entering the PESEL number and individually defined access password – the first login using the PESEL number required providing the maiden name of the mother and defining a password (at which point a user account was created). The requirement to provide the maiden name increased the level of trust during the respondent's identity verification and also strengthened protection against attacks on the system using an external database or a PESEL number generator. Thanks to the individual access password, data entered into the electronic form application by the user were later available only to that user (if the form was filled out in multiple work sessions).
Entering an email address and an individually defined access password – this method was intended for foreigners without a PESEL number, who on March 31, 2021, were residing permanently or temporarily on the territory of Poland. In the process of creating an access account for the self-enumeration application, an activation link was sent to the email address, which had to be opened to activate the account. Individuals with such an account could log in multiple times by providing an email and password.
Data persistence and availability of the self-enumeration application - data entered in the self-enumeration application were remembered and made available to the authorised user in subsequent work sessions, i.e., after each successful login. Access to the application was disabled at the moment of the correct completion of self-enumeration (confirmed by the "Finish self-enumeration" button) or when 14 days had passed since the first successful login.
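The access rule above can be expressed as a small check. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the 14-day limit is treated here as elapsed time since the first successful login, whereas the application may have counted calendar days.

```python
from datetime import datetime, timedelta

ACCESS_WINDOW = timedelta(days=14)  # limit stated in the text


def access_allowed(first_login, now, finished):
    """Return True while the respondent may still open the application."""
    if finished:  # self-enumeration was confirmed as complete
        return False
    if first_login is None:  # no successful login yet, window not started
        return True
    return now - first_login < ACCESS_WINDOW


first = datetime(2021, 4, 1, 10, 0)
still_open = access_allowed(first, datetime(2021, 4, 10, 10, 0), finished=False)
expired = access_allowed(first, datetime(2021, 4, 15, 10, 0), finished=False)
```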
Register of Personal and Residential Addresses
An essential element of conducting censuses is the preparation of a list of study units, i.e., buildings, dwellings, and persons containing address information, and for persons, data allowing for the identification of a given unit. In the 2021 National Census, the personal and residential address list was prepared based on address data and data about persons from available administrative and statistical registers. The compilation of the list was carried out in four main stages, namely the preparation of the address and housing list with x, y coordinates, the municipal update of addresses, the preparation of the list of individuals based on administrative data, integration of the prepared lists of units, and the addition of telephone numbers to persons.
The personal and residential address list was the main source of information about both persons and the buildings and dwellings subject to enumeration. Additionally, the list served as a fundamental tool for managing the census, controlling the completeness of enumerated persons, buildings and dwellings, authenticating self-enumeration, and the flow of information about the progress of the census between data collection channels.
The most important phase of constructing the list was integrating the list of persons with address data from the updated address and housing list. This stage was crucial to check the quality and accuracy of the established addresses, as well as the addresses attached to persons.
The created personal and residential address list, along with assigned phone numbers and spatial location information, was the basis for carrying out the 2021 Census. This made it possible to feed the multi-channel data collection system in the census (CAxI) and monitor census progress, as well as manage the logistics of census enumerators' work.
In addition, the list was supplemented with information significant from the point of view of managing the data collection process from respondents in the form of special markings of the surveyed units, known as flags. Flags indicated potential occurrences of phenomena that could complicate data acquisition using the various data collection channels (CAxI) anticipated in the census, including particularly information on the possibility of obtaining data from an online self-enumeration, communication limitations, or difficulties in accessing individuals in a dwelling.
For the purpose of gathering data on persons staying in collective living quarters, a list of collective living quarters (CLQs) was created. The work on preparing this list aimed to isolate the population of collective living quarters with a strictly defined typology and scope of activity. This list was created in multiple stages to achieve the most complete population of CLQs possible, based on data from:
The address identification system of streets, properties, buildings, and dwellings (NOBC), part of the National Official Register of the Territorial Division of the Country (TERYT),
Numerous dispersed registers and lists of CLQs published on websites (including registers and lists kept by provincial offices),
Lists of CLQs from the resources of public statistics, including the Base of Statistical Units (BJS).
The CLQs list, containing information about the types of facilities and their address characteristics, was used to contact the managing units, provide information about the census, and monitor the completeness of the study's implementation in CLQs.
Methods of conducting the census
The census was carried out as a full survey using the following methods:
a) online self-enumeration (CAWI) - conducted from April 1 to September 30, 2021, via an interactive application available on the website of Statistics Poland (GUS);
b) telephone interview (CATI) - conducted from May 4 to September 30, 2021, by a census enumerator using software installed on a computer, dedicated to conducting the census, or a mobile device;
c) face-to-face interview (CAPI) - conducted from June 21 to September 30, 2021, by a field census enumerator with respondents for whom it was not possible to collect data by CAWI, CATI, and "Census on Demand", using a mobile device equipped with dedicated software for the census;
d) telephone interview "Census on Demand" on the census helpline - conducted from April 1 to September 30, 2021, by a telephone enumerator with a respondent who called the helpline and chose the "Census by Phone" channel or was switched to this channel from the 2021 CENSUS information channel, after expressing a willingness to be enumerated by this method, using software installed on a computer dedicated to conducting the census.
The various data collection methods for the census were launched successively, with the aim of enabling respondents to fulfil their obligation to complete the online self-enumeration first. In all the aforementioned methods of data collection, an interactive form-based application was exclusively used.
Due to the state of the pandemic and periods of intensified illness, all enumerators were provided with personal protective equipment (masks, disinfectants) by the WBS. The schedules for enumerators to go into the field were regulated by recommendations from the Central Census Office (CBS), preceded by decisions of the Chief Census Commissioner and decisions of the ZWKS. These decisions were conditional on the enumerators confirming and accepting the principles for conducting the census through direct interviews.
For individuals who did not have the technical and material conditions to fulfil the obligation of online self-enumeration, free access to rooms equipped with computer equipment was provided by municipal offices, statistical offices, and by Statistics Poland. This equipment had installed software enabling the self-enumeration via the Internet, adapted to the needs of people with disabilities, along with protective measures appropriate to the current pandemic threat. Moreover, upon request, necessary assistance with the use of the interactive application was provided.
Legal obligation to collect data
Under the provisions of the Act on the National Population and Housing Census 2021, participation in the census was mandatory, and respondents were obliged to provide accurate, comprehensive, and truthful answers. Adults living together with minors or with persons who were absent were required to respond on their behalf. The compulsory form of participation in the census was internet self-enumeration, while supplementary methods were telephone (CATI) or direct interviews (CAPI), conducted by census enumerators. If a census enumerator contacted a respondent who had not completed the internet self-enumeration by the time of this contact, giving an interview to the enumerator became mandatory. Providing answers that were inconsistent with the facts or refusing to respond could lead to legal consequences set out in the provisions of Articles 56 and 57 of the Act on Public Statistics.
Census Promotional Activities
The main goal of the planned promotional activities was to convey fundamental information about the National Population and Housing Census 2021 to recipients and to encourage respondents to be open and actively participate in the census, especially in internet self-enumeration. Supplementary objectives were also defined, such as acquainting the public with the value of the census as a common public interest, communicating the benefits of using census results for making significant decisions in private and professional life, engaging public institutions in informing about the census obligation, and building awareness of the census obligation among respondents.
The campaign promoting the 2021 Census was carried out under the slogan "We Count for Poland," and the hashtag and slogan #EveryCountCounts was also active on social media. Various resources were developed to popularise the census, including: radio and television spots, a self-enumeration instructional video, animations and educational materials, infographics, leaflets, posters, and colouring books. Promotional materials were also produced, which served as prizes in numerous competitions and were also given to respondents who completed the census at census sites organised by Statistics Poland and statistical offices. The leading information channel in the campaign was the website.
At various stages of launching a given method of census implementation, promotional activities included:
campaigns in public and commercial media, as well as on social media (radio, press, television, and the Internet);
sending out letters (including letters from the President of Statistics Poland to respondents) and mailings;
ongoing handling of inquiries from 2021 Census respondents and users of census data (by phone, email, through Statistics Poland's social media channels);
setting up self-enumeration stations in offices and public spaces and supporting respondents in fulfilling their census obligation in this formula;
engagement of census ambassadors;
preparation and purchase of promotional materials, including posters, leaflets, and contest prizes;
an internal information campaign directed through internal communication channels to public statistics employees.
The promotion of the census was also carried out in collaboration with employees of Municipal Census Offices, representatives of national and ethnic minorities, and communities using regional languages, representatives of churches and religious associations, organisations and associations working for people with special needs and seniors, public administration, uniformed services, curators, and non-governmental organisations. The Central Census Office paid particular attention to people with special needs in its promotional activities. These included a census form fully translated into Polish Sign Language, as well as commercials and instructional videos that were also equipped with audio description and subtitles. Moreover, flyers were prepared in a version for the visually impaired and in Braille. A video chat was also launched for sign language users. At Statistics Poland, the Customer Service Department held on-call hours to support respondents in fulfilling their census obligation and to reduce accessibility and information-technical barriers. In addition, a special census helpline was operated, designed to optimize customer service and satisfaction.
18.3.2. Data collection - Register based data
In the 2021 population and housing census, Poland continued to use administrative data sources to develop the census's informational scope. As part of the preparatory work for the 2021 National Population and Housing Census, an analysis was carried out of the informational scope of 47 registers and information systems of public administration and non-public information systems that could be potential data sources for the 2021 CENSUS, resulting in the selection of 35 systems.
The decision on the usefulness of information systems for the census was preceded by a quality assessment according to the adopted standard. The preparatory work for the 2021 Census applied the results of previous research and projects carried out by public statistics, regarding methodologies in the field of, among others:
a) identification of new data sources – administrative registers for statistical studies;
b) assessment of the utility of administrative registers, considering the requirements of public statistics;
c) analysis of methodological compliance of administrative registers with the public statistics system;
d) determination of the degree of compliance of informational features of administrative registers with the public statistics system, presentation of the consistency of public administration systems with the public statistics system;
e) cooperation with administrators of administrative registers to achieve consistency of informational features of objects – mutually beneficial and agreed informational standards that allow for sharing information;
f) improvement of the quality of the state's information system by increasing the interoperability of systems and reducing the redundancy of data collected in the systems.
The designed work adopted the following priorities:
a) ensuring full coverage of basic variables in the 2021 Census by data from public administration information systems;
b) ensuring, as fully as possible, coverage of additional variables in the 2021 Census by data from public administration information systems;
c) acquiring data from public administration information systems with a PESEL number and address;
d) linking administrative data from various systems at the unit level – micro-integration of various sources;
e) preparing complete and up-to-date lists for the 2021 Census.
The quality assessment of administrative registers planned for use in the 2021 population and housing census was also based on the expert knowledge of statistics employees. All features in the registers and systems were assessed in terms of the possibility of obtaining information about the population, dwellings, and buildings. The analysis was based on legal acts concerning the conducted registers and metadata obtained from managers. The quality assessment of the register covered three areas:
a) information about the quality of the register,
b) information about the quality of data from the register,
c) general assessment of the administrative register.
Administrative collections were transferred through TRANSGUS, i.e., a web application allowing data transfer to Statistics Poland resource servers, where it underwent a preliminary check based on a metadata sheet provided by the manager. In case of lack of qualitative or quantitative completeness, cooperation with the manager was undertaken to resend the correct set.
An important element in preparing data sets is improving the quality of input collections. To achieve this goal, transformation of external systems into statistical data sets takes place. Data cleaning consists of a series of procedures such as data verification, standardisation, deduplication, and data supplementation. These methods are part of ensuring data quality improvement, making the data useful for public statistics, i.e., current, accurate, and complete. One of the more complex processes is address standardisation. In the process of standardising address data, reference data from the TERYT register closest to the study period is used.
Actions performed on the collections aim to unify the way names are recorded, according to the norms prevailing in public statistics, and to harmonise records so that all address fields form a correct logical sequence. The prepared sets are used in the process of calculating the resulting data.
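As a rough illustration of address standardisation followed by deduplication, consider the sketch below. It is not the procedure used in the census: the reference dictionary is a hypothetical stand-in for TERYT reference data, and the cleaning rules (collapsing whitespace, dropping a leading "ul." or "ulica" prefix, matching on upper-cased names) are assumptions chosen for the example.

```python
import re

# Hypothetical stand-in for reference street names; not actual TERYT content.
REFERENCE_STREETS = {"DLUGA": "Długa", "DŁUGA": "Długa", "POLNA": "Polna"}


def standardise_street(raw):
    """Collapse whitespace, drop a street prefix, and apply the reference spelling."""
    cleaned = re.sub(r"\s+", " ", raw).strip()
    cleaned = re.sub(r"^(ul\.?|ulica)\s+", "", cleaned, flags=re.IGNORECASE)
    # Fall back to the cleaned input when the reference data has no match.
    return REFERENCE_STREETS.get(cleaned.upper(), cleaned)


def deduplicate(records):
    """Keep the first record for each standardised (street, house number) pair."""
    seen, unique = set(), []
    for record in records:
        key = (
            standardise_street(record["street"]),
            record["house_number"].strip().upper(),
        )
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


records = [
    {"street": "ul.  Długa ", "house_number": "5a"},
    {"street": "ULICA DLUGA", "house_number": "5A "},
    {"street": "Polna", "house_number": "1"},
]
unique_records = deduplicate(records)
```

The first two records differ only in spelling and spacing, so after standardisation they share one key and only the first is kept.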
18.3.3. Data collection - Sample survey based data
Not applicable, full survey.
18.3.4. Data collection - Data from combined methods
The 2021 population and housing census was carried out using a mixed method, i.e., using data from administrative sources and data collected from respondents. It should be emphasised that the decision to use a mixed method in the implementation of the 2021 population and housing census is part of a Europe-wide trend to utilize informational resources from existing administrative registers in censuses.
For most census topics, the source of data was information obtained from the electronic census application. Data from registers and administrative systems were used supplementally in the absence of information from respondents. However, for selected topics, such as sex, age, marital status, country of birth, and country of citizenship, the priority data source was information from administrative registers due to their legal confirmation (detailed information is described in section 18.1.3).
It is worth noting that the combination of data from multiple administrative sources for the purpose of determining values for a topic was cascading, i.e., data for the subject area under investigation were first obtained from one data source (considered as a reference due to the inclusiveness of the population and their timeliness), and then information from subsequent sources was added. The combination of characteristics from different data sources occurred at the record level thanks to the possibility of combining through UNS identifiers (at the level of individual data identifiable in the registers and collections obtained from the census application). In the case of information about dwellings and buildings, the combination took place through assigned identifiers for apartments and buildings.
For topics for which it was not possible to establish values based on any of the above sources, positional imputation methods were used (described in section 18.5). It should be added that for 91.6% of the residing population, the source of data was information from the census application.
Information about the source of origin of individual topic values was saved in a special metadata "origin" file, created in parallel with the Resulting Census Data Set (WZDS). This file indicates the source of origin of each topic value, i.e., data from the census application, registers, imputation, or corrections.
18.4. Data validation
The data processed throughout the entire census operation were monitored and validated at every stage, from the collection of source data to the development of resulting information. Source data coming from the electronic application were checked as the respondent entered them, using predefined validation rules. These rules required the respondent to answer mandatory questions and checked the consistency of the answers given with the responses to previous questions.
For data collected in the electronic application, a completeness indicator was determined for filling in information for a given person, and based on this indicator, a record was selected when a person was enumerated more than once, i.e., they completed a self-enumeration and additionally were enumerated by another respondent.
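The selection described above can be sketched as follows. This is a minimal illustration, not the production procedure: the field list, the record layout and the definition of the indicator as a simple share of filled fields are all assumptions, since the source does not specify how the completeness indicator was calculated.

```python
# Hypothetical questionnaire fields counted towards completeness.
FIELDS = ["education", "occupation", "place_of_work", "marital_status"]


def completeness(record: dict) -> float:
    """Share of FIELDS holding a non-empty answer (0.0 to 1.0)."""
    filled = sum(1 for f in FIELDS if record.get(f) not in (None, ""))
    return filled / len(FIELDS)


def select_best(records: list) -> dict:
    """Keep, for each person_id, the record with the highest completeness.

    Ties are resolved in favour of the record seen first (an assumption).
    """
    best = {}
    for rec in records:
        pid = rec["person_id"]
        if pid not in best or completeness(rec) > completeness(best[pid]):
            best[pid] = rec
    return best


# Illustrative data: the same person enumerated twice.
records = [
    {"person_id": "A1", "channel": "proxy", "education": "tertiary"},
    {"person_id": "A1", "channel": "self", "education": "tertiary",
     "occupation": "nurse", "marital_status": "married"},
]
chosen = select_best(records)  # the "self" record has more fields filled
```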
In the case of data from administrative sources, after their transfer by the manager, the compliance of the data range with the scope described in the Act on the National Census of Population and Housing in 2021 was checked. The range of values for the transferred variables was also checked. A completeness indicator for filling in each of the transferred variables was also determined. The results of the checks in the form of post-control reports were the basis for assessing whether the set could be further processed or should be transferred by the manager again after providing additional explanations from their side.
In the case of developing resulting variables, validation elements were included in the variable derivation algorithm. One example is the link between age and the economic activity variables: deriving each of these variables first required checking whether the person was aged 15 or over. Some variables, such as variables from the area of families, required checking relationships between person records to correctly identify families, assign them the appropriate family type, and determine the appropriate position in the family.
After the development of the resulting variables, data validation was also performed, which involved checking the range of calculated values and their consistency with the values of other variables, e.g., whether a person with the marital status of single does not form a family of the marriage type.
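A minimal sketch of such range and consistency rules is given below. The variable names, the category codes and the upper age bound of 120 are illustrative assumptions; the actual rule set is not listed in this report.

```python
def check_person(person: dict) -> list:
    """Return a list of rule violations for one person record."""
    errors = []
    age = person.get("age")

    # Range check (the bounds are an assumption).
    if age is None or not 0 <= age <= 120:
        errors.append("age out of range")

    # Cross-variable consistency: a single person cannot be a member
    # of a family of the married-couple type.
    if (person.get("marital_status") == "single"
            and person.get("family_type") == "married_couple"):
        errors.append("single person in a married-couple family")

    # Economic activity variables are derived only for persons aged 15+.
    if (age is not None and age < 15
            and person.get("economic_activity") is not None):
        errors.append("economic activity set for person under 15")

    return errors
```

Records with a non-empty result would be listed in a validation report and passed to the correction procedures.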
Data control also took place after the calculation of the resulting tables intended for publication. It concerned the control of values that were repeated in individual tables, most often in the form of aggregated data, e.g., in the columns and rows "Total".
Additionally, data from the area of economic activity were assessed through comparisons conducted both with information from available register databases and with the results of the Labour Force Survey (BAEL), surveys of enterprises and budget sphere units regarding employees, and the "Unemployed and job seekers registered at employment offices" study.
18.5. Data compilation
Stages of Processing Census Data
For the tasks associated with storing, processing, and analysing population and housing census data, two main IT environments were used:
Operational Microdata Base (OBM) – an environment with very limited and strictly protected and controlled access, where identifiable data are processed. Part of the processes requiring identifiable data took place in this environment.
Analytical Microdata Base (ABM) – an environment where anonymised data are stored and processed, characterised by less restrictive access.
In addition to the two mentioned, there is also a computer environment for managing and controlling the course of the questionnaire-based census survey (CORstat-census), used until the end of the implementation phase of the questionnaire-based census survey.
In the entire process of developing the results of the 2021 Census, 7 main stages (groups of processes) of processing can be identified; however, their order (numbering) should be considered conventional, as some of them took place simultaneously, and many were repeated multiple times (e.g., as data sources were updated):
1. Acquisition of sets from administrative registers and their adaptation to statistical needs – i.e., transforming them into statistical data sets. The first action at this stage was the very acquisition of the register sets and the metadata describing them from the register managers and their preliminary checking in terms of the conformity of structures and contents of the sets with the provisions of the documentation regarding the transfer of the sets.
In the next step of this stage, raw register sets, which varied in terms of technical, formal, and substantive aspects, were adapted to statistical needs. In particular, these were actions such as the identification of study units (recognition of units and assigning artificial unique identifiers), unification of the format and structure of sets, and standardisation of variables, i.e., recoding raw variable values into codes compliant with classifications in force in statistics, including, for example, the conversion of various address records into appropriate codes of the nomenclature of the territorial division used in public statistics - TERYT.
The process of acquiring and adapting registry collections took place in the OBM and was repeatable as updated registry collections were delivered.
2. Creation of a (pre)census list based on register data. To be precise, the List of persons, addresses and dwellings is a set of relational tables, each related to a different type of unit under observation (i.e., persons, dwellings, and buildings). This list was the main source of information about the units that were to be enumerated and, as such, served as the operational register of the census and a key tool for managing the census, controlling the completeness of the enumerated persons, buildings, and dwellings, as well as being the basis of the authentication procedure for respondents in the online self-enumeration. Moreover, the list, before the census survey, was an approximation of the target populations of statistical units determined based on existing register data (predefined target population). The process was carried out within the OBM system. This process was repeatedly carried out (even after the commencement of the census survey) as updated registry collections were acquired (until the data was updated to the reference moment of the census).
After the census survey, based on the data obtained in the census, the (pre)census list was updated – mainly in terms of new statistical units identified in the census survey - providing the basis for creating a post-census list, which constituted the subject core of the census result collections.
3. Creation of Domain Data Sets (DZD). This process, carried out in the OBM, involved the full integration of multi-source resources acquired from registries, dispersed in many diverse collections, into data table structures that meet the methodological and substantive requirements of the census survey.
As a result of this process, data from registers were organised into sets whose structures were created according to a model that considered both the types of statistical units being studied (buildings, dwellings, persons, etc.) and thematic groups (domains) within a given type of units. For example, in relation to persons, separate sets were created for thematic domains such as demographic and social characteristics, socio-economic characteristics, disability, migration, foreigners, and economic activity.
The systematic organisation of registry data facilitated the analysis of data within a given domain and ensured the optimisation of processing procedures since some topics only concerned parts of the population (optimal set sizes).
The algorithms used at this stage for generating variables and their validation and correction constituted a further step towards improving the quality of registry data and their optimal adaptation to the needs of developing resulting census variables.
4. The census survey process (questionnaire survey phase) – involved collecting data from respondents, carried out through three main survey techniques: Internet self-enumeration (CAWI), personal interview (CAPI), and telephone interview (CATI). (In the GUS nomenclature, survey data collected by all techniques are collectively referred to as data from “CAxI channels”). This process was executed within the CORstat-census system, which successively and continuously transmitted sets of electronic records from completed surveys to the OBM system.
5. Creation of Data Tables from CAxI Channels – This involved the integration, verification, and cleaning of data from CAxI channels (questionnaire survey). This process took place in the OBM and included transforming the questionnaire data streams received from CORstat-census into elementary data sets corresponding to individual study units and each of the various thematic modules and types of questionnaires for a given type of unit (for example, data about persons were in several modules, and some of them were filled out only for one respondent in a dwelling, while some types of questionnaires were dedicated to special categories of persons, e.g., those temporarily abroad).
The elementary sets were then appropriately combined to form main sets corresponding to the units under study, i.e., persons, dwellings, and buildings, or special parts of the study, i.e., the set of collective living quarters (CLQs) and the set of persons enumerated in them (individuals in OZZ). The next step involved the verification and cleaning of the sets. In this phase of the process, the identification of study units was carried out, consisting of the recognition of units – including, for example, persons without an official identifier (PESEL number) or dwellings for which the respondent provided incorrect or imprecise address data – and then assigning them artificial unique identifiers.
In the subsequent step, the optimal data selection (deduplication) was carried out, i.e., selecting the best version of the same survey (in some cases the same survey was edited several times by respondents, and each version was saved in the system) as well as selecting the most adequate and qualitatively best data record for the unit (e.g., by design, persons could have been enumerated multiple times – in different and independent surveys – enumerated by household members and enumerated independently). As a result of this stage, datasets were created (according to the type of units surveyed) containing optimal quality (most adequate and reliable) data from the questionnaire-based census survey (Data Tables from CAxI channels), intended for processing census results.
6. Post-enumeration control survey process - this was a representative survey on a sample from the enumerated population, whose main goal was to analyse the quality of the basic census survey in terms of assessing errors in the measurement of characteristics, i.e., the content of responses obtained in the main survey. At this stage, the following main groups of processing activities can be distinguished:
Selection of the sample for the survey: Based on the data developed from CAxI from the main survey (CAxI data), a sampling frame was created for the control survey according to the guidelines/concept of the control survey, and then a sample of individuals for the survey was drawn and transferred to the CORstat-census system.
The implementation phase of the control survey in the field, serviced by the CORstat-census system, ended with the transfer of collected data to the OBM.
Creation of the Set for Analysis: this set was built from data from the control survey and the corresponding data (appropriate records and variables) from the main survey. It formed the basis for conducting comparative analyses and ultimately developing reports on quality assessment.
7. Creation of the Resulting Census Data Sets (WZDS). The WZDS is the proper and essential database from the perspective of the objectives of the population census, comprising several tables of data interconnected by relationships, corresponding to all types of statistical units that were to be characterised based on the 2021 population census. The WZDS is the basis for deriving all final results of the 2021 population census. Therefore, the WZDS compiled the appropriate data records representing (or containing) all the target population units and the appropriate variables representing all the census topics. In the WZDS, appropriate indications (flagging) of units (records) belonging to appropriately defined target populations surveyed in the census were also made (see the "Estimation" section).
The Resulting Census Data Sets (WZDS) were created based on developed and implemented structures (lists of needed variables) and algorithms for deriving variable values. In terms of data records, the core of the WZDS became the post-census list, which is the List of persons, addresses and dwellings updated after the questionnaire survey, defining the set of subjects (observations) of the individual data tables to be described with variable values.
The WZDS model provided for appropriate data tables for all types of statistical units studied, including new entities generated only at the stage of WZDS, such as households and families. Correspondingly, for each data table, a corresponding metadata table (known as the "source origin file") was created, matching in structure and the number of records, which recorded the source of origin for each individual variable value in the data table.
In the process of processing the WZDS, two essential (sub)stages can be distinguished: 1) processing in the OBM environment (protected) and 2) processing in the ABM environment.
In the protected environment of the OBM, all operations for processing in the WZDS that required access to source data (CAxI and DZD data) and the full set of identifying features (such as surnames, complete addresses, etc.) were performed. Calculating the values of the WZDS variables in the OBM environment was done by executing algorithms that referred to data prepared in earlier processing stages, encompassing two main sources of data: data from the questionnaire survey (CAxI data) and registry data organised in domain tables (DZD). Depending on the nature of the variable – for instance, whether official information (such as birth date) or information declared by the respondent is more reliable – the algorithms assumed the priority of CAxI or DZD respectively. The decisive criterion was the availability of information in the priority data source, and for example, in the absence of preferred information from the respondent, data from the DZD (registry data) was used. In some situations, it was necessary to use complex algorithms, such as those distinguishing original types of sources (specific types of registers), containing conditional selection instructions, comparing values from different sources, or referring to other auxiliary variables.
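The source-priority logic described above, together with the recording of each value's origin, can be sketched as follows. The variable names and the priority table are illustrative assumptions; the real algorithms were more complex (for example, distinguishing individual registers and comparing values across sources).

```python
# Hypothetical priority rules: the order in which sources are consulted.
PRIORITY = {
    "birth_date": ["DZD", "CAxI"],  # official register value preferred
    "education": ["CAxI", "DZD"],   # respondent's declaration preferred
}


def derive(variable: str, sources: dict) -> tuple:
    """Return (value, origin) from the first source in priority order that
    holds a non-missing value, or (None, "missing") if none does."""
    for name in PRIORITY[variable]:
        value = sources.get(name, {}).get(variable)
        if value is not None:
            return value, name
    return None, "missing"


# One person's data as held in the two source groups (made-up values).
sources = {
    "CAxI": {"birth_date": "1980-05-02", "education": None},
    "DZD": {"birth_date": "1980-05-01", "education": "secondary"},
}

data_row, origin_row = {}, {}
for var in PRIORITY:
    data_row[var], origin_row[var] = derive(var, sources)
```

Here `origin_row` plays the role of one record of the "origin" metadata table: it has the same keys as `data_row` and states where each value came from.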
In the OBM environment, where information identifying persons was available, new objects (statistical units) such as households and families were generated, and separate tables within the WZDS were created for them (details in the "Generation of households and families" section).
After completing the necessary operations in the OBM, the WZDS tables underwent a data anonymisation procedure. This involved generating new versions of data tables stripped of variables that enable direct identification. These tables were then exported to the ABM, along with their corresponding metadata tables.
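As a rough illustration of this step, the sketch below drops directly identifying columns while keeping the artificial identifier that links records across tables. The column names and the list of direct identifiers are assumptions made for the example only.

```python
# Hypothetical set of directly identifying variables.
DIRECT_IDENTIFIERS = {"pesel", "first_name", "surname", "full_address"}


def anonymise(rows: list) -> list:
    """Return copies of the rows without directly identifying variables.

    Artificial identifiers (e.g. 'uns') are kept, so anonymised tables
    can still be related to each other.
    """
    return [
        {key: value for key, value in row.items()
         if key not in DIRECT_IDENTIFIERS}
        for row in rows
    ]
```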
In the ABM environment, the WZDS data tables transferred from the OBM were given a new data structure, expanded with new resulting variables. The algorithms for calculating the values of new variables in the ABM operated exclusively on data available in the WZDS. At this stage of WZDS processing, the algorithms generally included secondary transformation procedures, which consisted of generating new versions (new necessary representations) of existing variables, for example, by grouping (creating broader classes) of original variables, but also creating new derivative variables based on the values of two or more source variables. Most of the WZDS variables that required inter-object transformations, i.e., operations of retrieving, compiling, and transferring values between different types of statistical units (e.g., between persons, families, households, dwellings, etc.), were also created in the ABM environment.
The processing of the WZDS in both the OBM and the ABM was divided into multiple steps, and the WZDS processing stages in both environments (OBM and ABM) were iterative (repeatable). After each step or iteration, validation actions were carried out, based on rules for checking correctness, and generating and analysing reports that check the effect and quality of the transformations. Based on these, algorithms for calculating variables were corrected (refined), and, if necessary, additional data editing operations were carried out, i.e., corrections and filling in missing values (details in the "Record editing" section).
All processing algorithms, both assigning and modifying a given variable value, contained instructions for recording in the corresponding place of the metadata table (“origin file”), indicating the source of the value's origin.
In the ABM environment, where the WZDS data obtained their final form, appropriate meta-information (various types of dictionaries) was developed for them, which allowed for the proper interpretation of codes – variable values in the data tables.
Based on the WZDS data and metadata in the ABM environment, appropriate data needed for the analysis and dissemination of census results were generated, including reports and publication preparations, such as data (e.g., hypercubes) for reports and statements on quality.
Capturing
The census in Poland was conducted based on questionnaire surveys and data from registers, according to the assumptions (included in the census law). The data obtained from each source, including from the census survey (census questionnaire), were in electronic form. Paper questionnaires were not used.
Coding
Most questions in the census questionnaires were pre-coded, meaning that directly under the questions, there were proposed answers in the form of multiple-choice options (a closed list of responses) or (in the case of a larger number of possible answers) drop-down lists (answer dictionary). Selecting an answer resulted in the recording of the appropriate code in the electronic data sets (CAxI data sets). The questions, along with predefined answers, were consistent with accepted national and international definitions and classifications, or sufficiently detailed to be transformed (e.g., aggregated) into the expected classifications.
For a few questions in the census questionnaire, the possibility of free text entry was used, which required the application of procedures (generally automatic) for classifying and coding according to an algorithm developed by experts.
Regarding data from various registers, it was necessary to apply procedures to unify (standardise) the non-uniform records and adjust them to the definitions and classifications applicable in the census.
Identifying variable(s)
In the processes of processing census data, many different types of identifying variables were used to identify various kinds of statistical units. Among these, one should distinguish between primary (natural and raw) identifying variables, which exist and are used also outside of public statistics and the census itself, and variables – artificial identifiers, created solely within and for the needs of census data processing systems. The use of artificial identifiers was necessary to optimize the process of integrating data from various sources, including detecting duplicated data records for some units. Moreover, artificial identifiers allowed maintaining data integrity in the phases of processing and analysis carried out after the data anonymisation procedure (removing variables allowing direct identification of units), i.e., they ensured a relationship between records of different datasets while simultaneously preventing the direct identification of units. In relation to persons – before the assignment of artificial identifiers – the official identifier used in the population register in Poland – PESEL number, was primarily used for the identification of units, and in addition – especially in its absence – natural identifying characteristics such as first names, last names, parents' names, as well as secondary identifying characteristics, such as date of birth, sex, citizenship, etc. As part of the procedure for identifying and recognising units (detailed description of the procedure in the "record linkage" section) all data records about persons from each source set were assigned an artificial identifier – Unique Statistical Number (UNS), valid in all statistical sets and in all subsequent processing stages. 
For families and households, only artificial identifiers (id_household, id_family) were used, assigned at the time of creating sets (delineating) households and families (more detailed information in the section concerning the generation of households and families). In relation to dwellings and buildings, artificial identifiers were also used, necessary for optimising the process of data integration and ensuring relationships between records of different datasets.
Record editing
In the census questionnaire (census application), a set of rules was applied to enforce consistency and logical coherence of the responses given. Similarly, the algorithms for calculating (creating) variables in the result datasets assumed a certain scope and logical coherence of the values assigned (see section 18.4 "Data validation"). In addition to this, during the processing of the Resulting Census Data Set (WZDS), a series of actions were taken aimed at validating and then correcting the data. Rules were applied to check the content of variables and their mutual relations, as well as to generate reports presenting distributions and variable relations in stages. Depending on the situation and processing phase, corrections to the basic algorithms were made or data correcting procedures were implemented. Most corrections were automatic and resulted from the logical assumptions of relationships between variables; in rare cases, such as outliers or less credible values, ad hoc point corrections were applied. All types of data corrections were performed using computer instructions, meaning that at no stage was direct (manual) editing in data records applied.
Generally, most data for census topics were obtained from the questionnaire census survey, and in cases where data was not obtained in the survey, it was sourced from registry sets. This procedure ensured sufficient completeness in relation to most census topics (characteristics) concerning persons. However, there were a few cases of individual characteristics that were not represented at all in registry sets or were represented only fragmentarily, resulting in too large (unacceptable) a number of missing values.
Where a variable had an excessive share of missing data, positional imputation procedures were used. An example of such a characteristic was education. In the absence of data on the level of education from the basic sources of the census, i.e., data from respondents (questionnaire census survey) and data from registries, statistical imputation was applied. The hot-deck imputation method was used, i.e., replacing missing data with values from other complete (with known education variable values) census dataset records. The assignment of data was carried out within common imputation classes (groups), to which recipients (records with missing data) and donors (records with known values of the education characteristic) were assigned, according to the adopted criterion of proximity. The allocation of donors and recipients to common imputation groups was based on auxiliary characteristics (proximity criterion in the context of education) such as sex, age, and the unit of territorial division of the country. Imputed feature values were assigned to recipients randomly, with a probability proportional to the distribution of feature values among donors, within the common imputation class.
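The random hot-deck procedure for education can be sketched as below. This is a simplified illustration under stated assumptions: the record layout, the age grouping and the handling of classes without donors (left missing here) are not taken from the source. Drawing one donor uniformly at random from the class is equivalent to drawing a value with probability proportional to its frequency among the donors.

```python
import random
from collections import defaultdict


def hot_deck(records, target, class_keys, rng):
    """Fill missing `target` values from a random donor in the same
    imputation class. Returns the indexes of the imputed records."""
    # Collect donor values per imputation class before imputing anything,
    # so that imputed values never act as donors.
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[k] for k in class_keys)].append(rec[target])

    imputed = set()
    for i, rec in enumerate(records):
        if rec[target] is None:
            pool = donors.get(tuple(rec[k] for k in class_keys))
            if pool:  # classes without donors are left missing
                rec[target] = rng.choice(pool)
                imputed.add(i)
    return imputed


# Made-up records; the class is defined by sex, age group and region.
records = [
    {"sex": "F", "age_group": "30-39", "region": "14", "education": "tertiary"},
    {"sex": "F", "age_group": "30-39", "region": "14", "education": "secondary"},
    {"sex": "F", "age_group": "30-39", "region": "14", "education": None},
    {"sex": "M", "age_group": "60-69", "region": "02", "education": None},
]
imputed = hot_deck(records, "education",
                   ["sex", "age_group", "region"], random.Random(2021))
```

The returned indexes are what would be written to the origin file as "imputation" for the education variable.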
Imputation of variables related to dwellings and buildings was done by the deductive method. This concerned variables such as useful floor space, number of rooms in a dwelling, title of ownership, water supply system, flush toilet, type of heating and bathroom. In the absence of data for the aforementioned variables from basic census sources, i.e., data from respondents (questionnaire census survey) and data from registries, deductive imputation was applied: missing data were supplemented with values calculated from other complete census dataset records or deduced based on other variables completed for a specific dwelling.
Record imputation
No data records about persons were imputed.
Record deletion
In the Resulting Census Data Set (WZDS), no data records about persons were removed.
Estimation
All planned estimates – calculating the counts of the surveyed populations, distributions of characteristics, and other statistical operations needed to present the results – could be performed based on census data sets, which contained individual data records for all statistical units entering into the surveyed census populations (individual enumeration – data for the total population). Records of each set of statistical units contain assignment to territorial units and spatial location, which allows estimating the size of census populations and their characteristics for all necessary territorial arrangements (universality within the defined territory).
Determining the census population of persons, i.e., including persons in the usually resident population, as well as their spatial location, was based on appropriate algorithms, which primarily took into account respondents' answers to a sequence of questions regarding their place of residence, the nature of the stay (permanent vs temporary), and the time spent in a given place.
For persons who were ultimately not covered by the questionnaire survey, separate algorithms were applied, based on the content of registry data, including data on registration addresses and other information concerning signs of life (number of registries, types and specificity of registries). In the algorithms determining the place and nature of the stay of persons available exclusively in the registries, separate paths and hierarchies of source credibility were applied depending on the population segment (e.g., foreigners, children, people of mobile age, seniors).
For the usually resident population identified in this way, all required characterising features were individually determined, which allowed for any combination and cross-referencing of features. The value of each characteristic of persons was determined (calculated) by algorithms which, according to the specificity of the characteristic and the availability of a given type of source, drew on the results of the questionnaire survey or registry data (details in the description of stage no. 7, in the section "Stages of Processing Census Data").
Determining the population of dwellings was preceded by the verification of dwelling addresses conducted during the preparatory work for the census. Information collected during the census on the size of dwellings, the period of construction of the building in which the dwelling is located, and the equipment of dwellings with technical and sanitary installations allowed for assessing the standard of dwelling stocks. The analysis of information characterizing the dwellings, including in particular conventional dwellings, with information concerning the people who inhabit them, allowed for determining the housing arrangements of the population residing in the country as of March 31, 2021.
Record linkage including identifying variable(s) used for the record linkage
Generally, individual datasets for various statistical units (e.g., persons, families, dwellings, buildings) within the same processing stages were organised in the form of relational tables. Establishing relations between tables within the same processing stage (ensuring data set integrity), but also transferring relevant data between subsequent processing stages was made possible through the use of artificial (technical) unique identifiers (relationship keys), for each type of statistical unit. The assignment of artificial identifiers took place within the procedures of identifying (recognising) statistical units, which were implemented in relation to all source data (data from registers and later from the questionnaire census survey), entering the processing system (operational microdata base - OBM), especially at processing stages numbered: 1, 2, 3, and 5.
The recognition of a unit and the assignment of an artificial identifier took place through the pairing (linking) of raw data entering the processing system with the already existing reference tables of units in the system, which contained artificial identifiers.
The initial reference unit tables were built from the registry sets introduced earliest (e.g., the set of persons from the PESEL registry), which were deduplicated and assigned an artificial identifier (a consecutive integer). In the process of introducing subsequent registry sets, the reference unit tables were donors of artificial identifiers for records from new sets, and at the same time, they were expanded by additional records (units) identified in new sets. Units from the raw (input) set, after successful linkage, received an artificial identifier from the reference table. On the other hand, units that did not pair with the reference table were subject to verification (in terms of the credibility of the unit's existence), and then were added to the reference unit table, expanding it as new identified statistical units (new records) recognised in the processing system.
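The mechanism of the reference table acting as a donor of identifiers, and growing as new units appear, can be sketched as follows. The sketch assumes a single natural key and reduces the credibility verification to a caller-supplied function `is_credible`; both are simplifications introduced for illustration, and the key values are made up.

```python
def link_to_reference(reference, incoming, key="pesel",
                      is_credible=lambda rec: True):
    """Assign artificial identifiers to incoming records.

    reference: dict mapping natural key -> artificial id (updated in place).
    Returns the artificial id for each incoming record, or None for
    unmatched records that fail the credibility check.
    """
    next_id = max(reference.values(), default=0) + 1
    assigned = []
    for rec in incoming:
        natural_key = rec[key]
        if natural_key in reference:
            # Successful linkage: take the identifier from the reference.
            assigned.append(reference[natural_key])
        elif is_credible(rec):
            # New unit: extend the reference table with the next integer.
            reference[natural_key] = next_id
            assigned.append(next_id)
            next_id += 1
        else:
            assigned.append(None)
    return assigned


reference = {"111": 1, "222": 2}  # built from the first register set
new_set = [{"pesel": "222"}, {"pesel": "333"}, {"pesel": "333"}]
ids = link_to_reference(reference, new_set)
```

Repeating the call for each newly delivered or updated set reproduces the iterative growth of the reference table described above.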
These procedures were repeated for each newly acquired registry set and for updated versions of sets. As a result of identifying registry sets, it was possible to create a (pre)census personal-address-residence list, which – among other functions – served as reference tables for identifying statistical units within data acquired from the questionnaire survey (CAxI data).
The process of linking new sets entering the system with the reference unit tables was carried out based on natural identifying variables available in the raw (input) data sets. In the case of persons, identifying linkage took place – depending on availability – on the basis of such data as PESEL number, first names, surnames, date of birth, sex, mothers' names and surnames, address data, citizenship, country of birth, etc. The result of linking records concerning persons was the assignment of a Unique Statistical Number (UNS) to the data records.
Most records concerning persons were paired unambiguously (deterministically) by PESEL number. In the absence of a PESEL number, data were paired by other characteristics of the persons. Due to the varied availability of other identifying features, their incompleteness, diversity of records and errors in records (e.g., first names, surnames or even dates of birth), multilevel and multi-stage linkage methods were used, based on measures of similarity in the vector of identifying features and the probability of record match. Algorithms for using the measure (indicator) of similarity considered the variability of availability of identifying features and various functions of comparing strings of characters, patterns of numerical recording errors (e.g., dates of birth), as well as qualifying critical values – thresholds for acceptance/rejection of conformity.
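A much-simplified sketch of similarity-based linkage for records without a PESEL number is shown below. The field weights, the acceptance threshold and the use of Python's `difflib.SequenceMatcher` as the string comparison function are assumptions made for the example; the report does not disclose the actual comparison functions or critical values.

```python
from difflib import SequenceMatcher

# Hypothetical weights and acceptance threshold.
WEIGHTS = {"first_name": 0.3, "surname": 0.4, "birth_date": 0.3}
THRESHOLD = 0.85


def similarity(a: dict, b: dict) -> float:
    """Weighted string similarity over the fields present in both records.

    Fields missing in either record are skipped and the remaining weights
    are rescaled, reflecting the varying availability of identifiers.
    """
    total = weight_sum = 0.0
    for field, weight in WEIGHTS.items():
        if a.get(field) and b.get(field):
            ratio = SequenceMatcher(
                None, a[field].lower(), b[field].lower()).ratio()
            total += weight * ratio
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def best_match(record: dict, reference: list):
    """Return the UNS of the most similar reference record, or None if
    no candidate reaches THRESHOLD."""
    best_uns, best_score = None, 0.0
    for candidate in reference:
        score = similarity(record, candidate)
        if score > best_score:
            best_uns, best_score = candidate["uns"], score
    return best_uns if best_score >= THRESHOLD else None
```

In practice such a comparison would be run only within blocks of plausible candidates (for example, the same year of birth) rather than against the whole reference table; that step is omitted here.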
Generation of households and families
The delineation of households and families was carried out in the course of a secondary process, based on a previously prepared set of persons, i.e., the determined usual resident population. Essentially, the generation of new units was accomplished by combining (grouping) persons, i.e., assigning them to new units – households and families, giving these extracted units unique artificial identifiers, and creating records for them in new datasets (tables) provided respectively for households and families. In terms of households, the housing concept (definition) of a household was adopted. Consequently, the delineation of households was based on an algorithm that refers to the housing criterion, according to which all people living in one dwelling (common dwelling identifier) were included in one household.
The delineation of families – according to the adopted definition – was carried out within the people belonging to the same household (same dwelling), based on a complex algorithm, considering various data about persons. The basis of the family generation procedure was the processed and appropriately assigned data about the relations between persons, obtained from respondents in the questionnaire survey (CAxI data), in which respondents listed together – within the framework of so-called one housing survey – could indicate among themselves spouses/partners and parents. In the absence of appropriate data from CAxI, reference was made to relevant data on marital relations and parent-child relations, available in processed registry collections. In the absolute absence of appropriate data on relationships between persons, attempts were made to determine them on the basis of probability using various individual auxiliary characteristics of persons such as surnames, sex, and age (generational groups), as well as appropriate interpretation of data on the composition of the household.
For the created data records about households and families, variables characterising these units were generated, generally calculated by applying various aggregating functions operating on the groups of data records about persons belonging to the household/family, e.g., counting the number of members, the number of specific categories of members, such as children, etc.
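The grouping and aggregation described above can be illustrated with a minimal sketch. It is not the production algorithm: the record layout, the sample data, the function name build_households and the under-18 threshold for counting children are all assumptions made for the example. Only the principle (one household per common dwelling identifier, an artificial id_household, variables derived by aggregating over members) comes from the text.

```python
from collections import defaultdict

# Hypothetical person records: (person_id, dwelling_id, age).
persons = [
    ("P1", "D1", 41),
    ("P2", "D1", 39),
    ("P3", "D1", 7),
    ("P4", "D2", 68),
]


def build_households(persons):
    """Group persons by dwelling identifier (housing concept of a household)."""
    members_by_dwelling = defaultdict(list)
    for person_id, dwelling_id, age in persons:
        members_by_dwelling[dwelling_id].append((person_id, age))

    households = []
    # Assign consecutive artificial identifiers in a stable (sorted) order.
    for id_household, dwelling_id in enumerate(sorted(members_by_dwelling), start=1):
        members = members_by_dwelling[dwelling_id]
        households.append({
            "id_household": id_household,
            "dwelling_id": dwelling_id,
            "members": [pid for pid, _ in members],
            # Aggregated variables characterising the household.
            "size": len(members),
            # Assumption: a child is counted as a member under 18.
            "children_under_18": sum(1 for _, age in members if age < 18),
        })
    return households


households = build_households(persons)
```

Family delineation would need the relationship data (spouses/partners, parents) and is deliberately left out of this sketch.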
Measures to identify or limit unit-no-information
Factors conducive to minimising unit-level data deficiencies in the census (omissions of units belonging to the census populations) include the solutions adopted for the implementation of the census, primarily the use of a large amount of data resources from registers and information systems, and a wide range of methods for their exploration, as well as a high degree of implementation (completeness) of the questionnaire-based census study. Moreover, solutions and actions taken in identification (recognition) of statistical units within the entire resource of collected data seem to be significant in this matter.
Description of the main topics disseminated for population, households, families, living quarters and dwellings.
23 April 2025
The information is given separately for each census topic. See the sub-concepts 3.4.1 - 3.4.37.
In the 2021 National Population and Housing Census, the statistical units were: building, dwelling, person. Secondarily derived units (based on person data) were household and family.
"Target population" means the set of all statistical units in a defined geographical area at the reference date that are eligible for a survey on one or more specified topics. The target population includes each valid statistical unit exactly once.
In accordance with the provisions of the Census Act 2021, the Population and Housing Census was conducted on the territory of the Republic of Poland and covered the following population groups:
Polish citizens residing in Poland who have their place of residence (understood as the place of permanent or temporary registration or as a place of permanent or temporary residence) in dwellings, buildings, other premises other than a dwelling or in collective living quarters;
foreigners residing in Poland permanently and staying temporarily (whether registered or not) in dwellings, buildings, other premises other than a dwelling or in collective living quarters;
Polish citizens residing abroad (regardless of the period of residence) who had not deregistered from permanent residence in Poland;
homeless persons living in the streets without a shelter – Polish citizens and foreigners;
moreover:
dwellings, buildings, other occupied premises other than a dwelling.
A person who deregistered from permanent residence in Poland due to the permanent stay abroad is not obliged to participate in the National Population and Housing Census 2021.
The census did not include:
heads and foreign staff of diplomatic representations and consular offices of foreign countries, their family members and other persons enjoying privileges and immunities under the law, international agreements or generally recognized international customs;
apartments, buildings, facilities and premises owned by diplomatic representations and consular offices of foreign countries
Information is provided in the sub-concepts 5.1 - 5.3.
Information on the accuracy of individual topics in accordance with the requirements of Commission Implementing Regulation (EU) 2017/543 of 22 March 2017. See the sub-concepts 13.1.1 - 13.1.35.
Counts of statistical units should be expressed as numbers and, where needed, as rates per inhabitants enumerated in the country.
Stages of Processing Census Data
For the tasks associated with storing, processing, and analysing population and housing census data, two main IT environments were used:
Operational Microdata Base (OBM) – an environment with very limited and strictly protected and controlled access, where identifiable data are processed. Part of the processes requiring identifiable data took place in this environment.
Analytical Microdata Base (ABM) – an environment where anonymised data are stored and processed, characterised by less restrictive access.
In addition to the two mentioned, there is also a computer environment for managing and controlling the course of the questionnaire-based census survey (CORstat-census), used until the end of the implementation phase of the questionnaire-based census survey.
In the entire process of developing the results of the 2021 Census, 7 main stages (groups of processes) of processing can be identified; however, their order (numbering) should be considered conventional, as some of them took place simultaneously, and many were repeated multiple times (e.g., as data sources were updated):
1. Acquisition of sets from administrative registers and their adaptation to statistical needs – i.e., transforming them into statistical data sets. The first action at this stage was the very acquisition of the register sets and the metadata describing them from the register managers and their preliminary checking in terms of the conformity of structures and contents of the sets with the provisions of the documentation regarding the transfer of the sets.
In the next step of this stage, raw register sets, which varied in terms of technical, formal, and substantive aspects, were adapted to statistical needs. In particular, these were actions such as the identification of study units (recognition of units and assigning artificial unique identifiers), unification of the format and structure of sets, and standardisation of variables, i.e., recoding raw variable values into codes compliant with classifications in force in statistics, including, for example, the conversion of various address records into appropriate codes of the nomenclature of the territorial division used in public statistics - TERYT.
The process of acquiring and adapting registry collections took place in the OBM and was repeatable as updated registry collections were delivered.
2. Creation of a (pre)census list based on register data. To be precise, the List of persons, addresses and dwellings is a set of relational tables, each related to a different type of unit under observation (i.e., persons, dwellings, and buildings). This list was the main source of information about the units that were to be enumerated and, as such, served as the operational register of the census and a key tool for managing the census, controlling the completeness of the enumerated persons, buildings, and dwellings, as well as being the basis of the authentication procedure for respondents in the online self-enumeration. Moreover, the list, before the census survey, was an approximation of the target populations of statistical units determined based on existing register data (predefined target population). The process was carried out within the OBM system. This process was repeatedly carried out (even after the commencement of the census survey) as updated registry collections were acquired (until the data was updated to the reference moment of the census).
After the census survey, based on the data obtained in the census, the (pre)census list was updated – mainly in terms of new statistical units identified in the census survey - providing the basis for creating a post-census list, which constituted the subject core of the census result collections.
3. Creation of Domain Data Sets (DZD) – a process carried out in the OBM involving the full integration of multi-source resources acquired from registries, dispersed in many diverse collections, into data table structures that meet the methodological and substantive requirements of the census survey.
As a result of this process, data from registers were organised into sets whose structures were created according to a model that considered both the types of statistical units being studied (buildings, dwellings, persons, etc.) and thematic groups (domains) within a given type of units. For example, in relation to persons, separate sets were created for thematic domains such as demographic and social characteristics, socio-economic characteristics, disability, migration, foreigners, and economic activity.
The systematic organisation of registry data facilitated the analysis of data within a given domain and ensured the optimisation of processing procedures since some topics only concerned parts of the population (optimal set sizes).
The algorithms used at this stage for generating variables and their validation and correction constituted a further step towards improving the quality of registry data and their optimal adaptation to the needs of developing resulting census variables.
4. The census survey process (questionnaire survey phase) – involved collecting data from respondents, carried out through three main survey techniques: Internet self-enumeration (CAWI), personal interview (CAPI), and telephone interview (CATI). (In the GUS nomenclature, survey data collected by all techniques are collectively referred to as data from “CAxI channels”). This process was executed within the CORstat-census system, which successively and continuously transmitted sets of electronic records from completed surveys to the OBM system.
5. Creation of Data Tables from CAxI Channels – This involved the integration, verification, and cleaning of data from CAxI channels (questionnaire survey). This process took place in the OBM and included transforming the questionnaire data streams received from CORstat-census into elementary data sets corresponding to individual study units and each of the various thematic modules and types of questionnaires for a given type of unit (for example, data about persons were in several modules, and some of them were filled out only for one respondent in a dwelling, while some types of questionnaires were dedicated to special categories of persons, e.g., those temporarily abroad).
The elementary sets were then appropriately combined to form main sets corresponding to the units under study, i.e., persons, dwellings, and buildings, or special parts of the study, i.e., the set of collective living quarters (CLQs) and the set of persons enumerated in them (persons in CLQs, abbreviated OZZ in Polish). The next step involved the verification and cleaning of the sets. In this phase of the process, the identification of study units was carried out, consisting of the recognition of units – including, for example, persons without an official identifier (PESEL number) or dwellings for which the respondent provided incorrect or imprecise address data – and then assigning them artificial unique identifiers.
In the subsequent step, the optimal data selection (deduplication) was carried out, i.e., selecting the best version of the same survey (in some cases the same survey was edited several times by respondents, and each version was saved in the system) as well as selecting the most adequate and qualitatively best data record for the unit (e.g., by design, persons could have been enumerated multiple times – in different and independent surveys – enumerated by household members and enumerated independently). As a result of this stage, datasets were created (according to the type of units surveyed) containing optimal quality (most adequate and reliable) data from the questionnaire-based census survey (Data Tables from CAxI channels), intended for processing census results.
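The two selection steps in this paragraph (best version of a survey, then best record per unit) can be sketched as follows. The source does not state the actual selection criteria, so the ones used here are assumptions for illustration only: the highest version number wins within a survey, and across surveys a self-enumerated record is preferred, with the number of filled fields as a tie-breaker. All field names are hypothetical.

```python
# Hypothetical CAxI person records; "uns" is the artificial person identifier.
records = [
    {"uns": 1, "survey": "S1", "version": 1, "self_reported": False, "filled": 10},
    {"uns": 1, "survey": "S1", "version": 2, "self_reported": False, "filled": 12},
    {"uns": 1, "survey": "S2", "version": 1, "self_reported": True, "filled": 11},
    {"uns": 2, "survey": "S1", "version": 2, "self_reported": False, "filled": 9},
]


def latest_versions(records):
    """Keep only the most recent saved version of each survey for each person."""
    best = {}
    for rec in records:
        key = (rec["uns"], rec["survey"])
        if key not in best or rec["version"] > best[key]["version"]:
            best[key] = rec
    return list(best.values())


def quality_score(rec):
    """Assumed ranking: self-enumeration first, then completeness."""
    return (rec["self_reported"], rec["filled"])


def best_record_per_person(records):
    """Select one record per person from the latest survey versions."""
    best = {}
    for rec in latest_versions(records):
        current = best.get(rec["uns"])
        if current is None or quality_score(rec) > quality_score(current):
            best[rec["uns"]] = rec
    return best


selected = best_record_per_person(records)
```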
6. Post-enumeration control survey process - this was a representative survey on a sample from the enumerated population, whose main goal was to analyse the quality of the basic census survey in terms of assessing errors in the measurement of characteristics, i.e., the content of responses obtained in the main survey. At this stage, the following main groups of processing activities can be distinguished:
Selection of the sample for the survey: Based on the data from the main survey (CAxI data), a sampling frame was created for the control survey according to the guidelines/concept of the control survey, and then a sample of individuals for the survey was drawn and transferred to the CORstat-census system.
The implementation phase of the control survey in the field, serviced by the CORstat-census system, ended with the transfer of collected data to the OBM.
Creation of the Set for Analysis - Based on data from the control survey and the corresponding data (appropriate records and variables) from the main survey, a set was created that formed the basis for conducting comparative analyses and ultimately developing reports on quality assessment.
7. Creation of the Resulting Census Data Sets (WZDS). The WZDS is the proper and essential database from the perspective of the objectives of the population census, comprising several tables of data interconnected by relationships, corresponding to all types of statistical units that were to be characterised based on the 2021 population census. The WZDS is the basis for deriving all final results of the 2021 population census. Therefore, the WZDS compiled the appropriate data records representing (or containing) all the target population units and the appropriate variables representing all the census topics. In the WZDS, appropriate indications (flagging) of units (records) belonging to appropriately defined target populations surveyed in the census were also made (see the "Estimation" section).
The Resulting Census Data Sets (WZDS) were created based on developed and implemented structures (lists of needed variables) and algorithms for deriving variable values. In terms of data records, the core of the WZDS became the post-census list, which is the List of persons, addresses and dwellings updated after the questionnaire survey, defining the set of subjects (observations) of the individual data tables to be described with variable values.
The WZDS model provided for appropriate data tables for all types of statistical units studied, including new entities generated only at the stage of WZDS, such as households and families. Correspondingly, for each data table, a corresponding metadata table (known as the "source origin file") was created, matching in structure and the number of records, which recorded the source of origin for each individual variable value in the data table.
In the process of processing the WZDS, two essential (sub)stages can be distinguished: 1) processing in the OBM environment (protected) and 2) processing in the ABM environment.
In the protected environment of the OBM, all operations for processing in the WZDS that required access to source data (CAxI and DZD data) and the full set of identifying features (such as surnames, complete addresses, etc.) were performed. Calculating the values of the WZDS variables in the OBM environment was done by executing algorithms that referred to data prepared in earlier processing stages, encompassing two main sources of data: data from the questionnaire survey (CAxI data) and registry data organised in domain tables (DZD). Depending on the nature of the variable – for instance, whether official information (such as birth date) or information declared by the respondent is more reliable – the algorithms assumed the priority of CAxI or DZD respectively. The decisive criterion was the availability of information in the priority data source, and for example, in the absence of preferred information from the respondent, data from the DZD (registry data) was used. In some situations, it was necessary to use complex algorithms, such as those distinguishing original types of sources (specific types of registers), containing conditional selection instructions, comparing values from different sources, or referring to other auxiliary variables.
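The source-priority rule described above, together with the recording of each value's origin, can be sketched as below. This is a simplified illustration under assumptions: the variable names, the priority table and the function resolve are hypothetical, and the real algorithms also distinguished register types and used conditional comparisons that are not modelled here.

```python
# Assumed priority per variable: official data for birth date,
# respondent declarations for education and occupation.
PRIORITY = {"birth_date": "DZD", "education": "CAxI", "occupation": "CAxI"}


def resolve(variable, caxi, dzd):
    """Return (value, origin), trying the priority source first."""
    sources = {"CAxI": caxi, "DZD": dzd}
    first = PRIORITY[variable]
    order = [first] + [name for name in sources if name != first]
    for name in order:
        value = sources[name].get(variable)
        if value is not None:
            return value, name
    return None, None  # Missing in both sources.


# Hypothetical inputs for one person.
caxi = {"birth_date": "1980-05-02", "education": None, "occupation": "teacher"}
dzd = {"birth_date": "1980-05-12", "education": "tertiary"}

wzds_row, origin_row = {}, {}
for variable in PRIORITY:
    # The data table and the metadata ("origin") table are filled in parallel.
    wzds_row[variable], origin_row[variable] = resolve(variable, caxi, dzd)
```

Keeping origin_row in the same shape as wzds_row mirrors the idea of a metadata table that matches the data table in structure.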
In the OBM environment, where information identifying persons was available, new objects (statistical units) such as households and families were generated, and separate tables within the WZDS were created for them (details in the "Generation of households and families" section).
After completing the necessary operations in the OBM, the WZDS tables underwent a data anonymisation procedure. This involved generating new versions of data tables stripped of variables that enable direct identification. These tables were then exported to the ABM, along with their corresponding metadata tables.
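A minimal sketch of this step, assuming a hypothetical list of direct identifiers and hypothetical column names: the directly identifying variables are dropped, while the artificial identifiers (here uns and id_household) are kept so that the relations between tables survive the export. The PESEL values below are made up.

```python
# Assumed set of directly identifying variables (illustrative only).
DIRECT_IDENTIFIERS = {"pesel", "first_name", "surname", "address"}


def anonymise(table):
    """Return a new table without directly identifying variables."""
    return [
        {name: value for name, value in row.items() if name not in DIRECT_IDENTIFIERS}
        for row in table
    ]


# Hypothetical person table as it might look in the protected environment.
persons_obm = [
    {"uns": 1, "id_household": 10, "pesel": "00000000001",
     "first_name": "Anna", "surname": "Kowalska", "address": "ul. Przykladowa 1",
     "sex": "F", "age": 41},
    {"uns": 2, "id_household": 10, "pesel": "00000000002",
     "first_name": "Jan", "surname": "Kowalski", "address": "ul. Przykladowa 1",
     "sex": "M", "age": 43},
]

persons_abm = anonymise(persons_obm)
```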
In the ABM environment, the data tables and the WZDS transferred from the OBM were given a new data structure, expanded with new resulting variables. The algorithms for calculating the values of new variables in ABM operated exclusively on data available in the WZDS. At this stage of WZDS processing, the algorithms generally included secondary transformation procedures, which consisted of generating new versions (new necessary representations) of existing variables, for example, by grouping (creating broader classes) of original variables, but also creating new derivative variables based on the values of two or more source variables. In the ABM environment, the majority of such WZDS variables were created that required inter-object transformations, i.e., based on operations of retrieving, compiling, and transferring values between different types of statistical units (e.g., between persons, families, households, dwellings, etc.).
The processing of the WZDS in both the OBM and the ABM was divided into multiple steps, and the WZDS processing stages in both environments (OBM and ABM) were iterative (repeatable). After each step or iteration, validation actions were carried out, based on rules for checking correctness, and generating and analysing reports that check the effect and quality of the transformations. Based on these, algorithms for calculating variables were corrected (refined), and, if necessary, additional data editing operations were carried out, i.e., corrections and filling in missing values (details in the "Record editing" section).
All processing algorithms, both assigning and modifying a given variable value, contained instructions for recording in the corresponding place of the metadata table (“origin file”), indicating the source of the value's origin.
In the ABM environment, where the WZDS data obtained their final form, appropriate meta-information (various types of dictionaries) was developed for them, which allowed for the proper interpretation of codes – variable values in the data tables.
Based on the WZDS data and metadata in the ABM environment, appropriate data needed for the analysis and dissemination of census results were generated, including reports and publication preparations, such as data (e.g., hypercubes) for reports and statements on quality.
Capturing
The census in Poland was conducted based on questionnaire surveys and data from registers, according to the assumptions (included in the census law). The data obtained from each source, including from the census survey (census questionnaire), were in electronic form. Paper questionnaires were not used.
Coding
Most questions in the census questionnaires were pre-coded, meaning that directly under the questions, there were proposed answers in the form of multiple-choice options (cafeteria) or (in the case of a larger number of possible answers) drop-down lists (answer dictionary). Selecting an answer resulted in the recording of the appropriate code in the electronic data sets (CAxI data sets). The questions, along with predefined answers, were consistent with accepted national and international definitions and classifications, or sufficiently detailed to be transformed (e.g., aggregated) into the expected classifications.
For a few questions in the census questionnaire, the possibility of free text entry was used, which required the application of procedures (generally automatic) for classifying and coding according to an algorithm developed by experts.
Regarding data from various registers, it was necessary to apply procedures to unify (standardise) the non-uniform records and adjust them to the definitions and classifications applicable in the census.
Identifying variable(s)
In the processes of processing census data, many different types of identifying variables were used to identify various kinds of statistical units. Among these, one should distinguish between primary (natural and raw) identifying variables, which exist and are used also outside of public statistics and the census itself, and variables – artificial identifiers, created solely within and for the needs of census data processing systems. The use of artificial identifiers was necessary to optimize the process of integrating data from various sources, including detecting duplicated data records for some units. Moreover, artificial identifiers allowed maintaining data integrity in the phases of processing and analysis carried out after the data anonymisation procedure (removing variables allowing direct identification of units), i.e., they ensured a relationship between records of different datasets while simultaneously preventing the direct identification of units. In relation to persons – before the assignment of artificial identifiers – the official identifier used in the population register in Poland – PESEL number, was primarily used for the identification of units, and in addition – especially in its absence – natural identifying characteristics such as first names, last names, parents' names, as well as secondary identifying characteristics, such as date of birth, sex, citizenship, etc. As part of the procedure for identifying and recognising units (detailed description of the procedure in the "record linkage" section) all data records about persons from each source set were assigned an artificial identifier – Unique Statistical Number (UNS), valid in all statistical sets and in all subsequent processing stages. 
For family and households, only artificial identifiers (id_household, id_family) were used, assigned at the time of creating sets (delineating) households and families (more detailed information in the section concerning the generation of households and families). In relation to dwellings and buildings, artificial identifiers were also used, necessary for optimising the process of data integration and ensuring relationships between records of different datasets.
Record editing
In the census questionnaire (census application), a set of rules was applied to enforce consistency and logical coherence of the responses given. Similarly, the algorithms for calculating (creating) variables in the result datasets assumed a certain scope and logical coherence of the values assigned (see section 18.4 "Data validation"). In addition to this, during the processing of the Resulting Census Data Set (WZDS), a series of actions were taken aimed at validating and then correcting the data. Rules were applied to check the content of variables and their mutual relations, as well as to generate reports presenting distributions and variable relations in stages. Depending on the situation and processing phase, corrections to the basic algorithms were made or data correcting procedures were implemented. Most corrections were automatic and resulted from the logical assumptions of relationships between variables; in rare cases, such as outliers or less credible values, ad hoc point corrections were applied. All types of data corrections were performed using computer instructions, meaning that at no stage was direct (manual) editing in data records applied.
Generally, most data for census topics were obtained from the questionnaire census survey, and in cases where data was not obtained in the survey, it was sourced from registry sets. This procedure ensured sufficient completeness in relation to most census topics (characteristics) concerning persons. However, there were a few cases of individual characteristics that were not represented at all in registry sets or were represented only fragmentarily, resulting in too large (unacceptable) a number of missing values.
In cases of an excessive number of missing values for a variable, item imputation procedures were used. An example of such a characteristic was education. In the absence of data on the level of education from the basic sources of the census, i.e., data from respondents (questionnaire census survey) and data from registries, statistical imputation was applied. The hot-deck imputation method was used, i.e., replacing missing data with values from other complete (with known education variable values) census dataset records. The assignment of data was carried out within common imputation classes (groups), to which recipients (records with missing data) and donors (records with known values of the education characteristic) were assigned, according to the adopted criterion of proximity. The allocation of donors and recipients to common imputation groups was based on auxiliary characteristics (proximity criterion in the context of education) such as sex, age, and the unit of territorial division of the country. Assigning imputed feature values to recipients was done randomly – with a probability proportional to the distribution of feature values among donors – within the common imputation class.
Imputation of variables related to dwellings and buildings was done by the deductive method. This concerned variables such as useful floor space, number of rooms in a dwelling, title of ownership, water supply system, flush toilet, type of heating and bathroom. In the absence of data for the aforementioned variables from basic census sources, i.e., data from respondents (questionnaire census survey) and data from registries, deductive imputation was applied - missing data were supplemented with values calculated from other complete census dataset records or deduced based on other variables completed for a specific dwelling.
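The hot-deck procedure for education can be sketched as follows. The record layout, the class variables and the function name are assumptions for the example. Drawing one donor value uniformly at random from the pool of donor values in the class gives each education level a probability proportional to its share among the donors, which is the property described above.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, class_vars, rng):
    """Fill missing `target` values with a random donor value from the same class."""
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[v] for v in class_vars)].append(rec[target])

    result = []
    for rec in records:
        rec = dict(rec)  # Work on a copy; the input is left unchanged.
        rec["imputed"] = False
        if rec[target] is None:
            pool = donors.get(tuple(rec[v] for v in class_vars))
            if pool:  # Recipients in classes without donors stay missing.
                rec[target] = rng.choice(pool)
                rec["imputed"] = True
        result.append(rec)
    return result


# Hypothetical records; imputation classes are sex x age group x region.
records = [
    {"uns": 1, "sex": "F", "age_group": "30-39", "region": "14", "education": "tertiary"},
    {"uns": 2, "sex": "F", "age_group": "30-39", "region": "14", "education": "tertiary"},
    {"uns": 3, "sex": "F", "age_group": "30-39", "region": "14", "education": None},
    {"uns": 4, "sex": "M", "age_group": "60-69", "region": "02", "education": None},
]

completed = hot_deck_impute(
    records, "education", ["sex", "age_group", "region"], random.Random(2021)
)
```

The "imputed" flag plays the role of an origin marker, so imputed values remain distinguishable from collected ones.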
Record imputation
No data records about persons were imputed.
Record deletion
In the Resulting Census Data Set (WZDS), no data records about persons were removed.
Estimation
All planned estimates – calculating the counts of the surveyed populations, distributions of characteristics, and other statistical operations needed to present the results – could be performed based on census data sets, which contained individual data records for all statistical units entering into the surveyed census populations (individual enumeration – data for the total population). Records of each set of statistical units contain assignment to territorial units and spatial location, which allows estimating the size of census populations and their characteristics for all necessary territorial arrangements (universality within the defined territory).
Determining the census population of persons, i.e., including persons in the usually resident population, as well as their spatial location, was based on appropriate algorithms, which primarily took into account respondents' answers to a sequence of questions regarding their place of residence, the nature of the stay (permanent vs temporary), and the time spent in a given place.
For persons who were ultimately not covered by the questionnaire survey, separate algorithms were applied, based on the content of registry data, including data on registration addresses and other information concerning signs of life (number of registries, types and specificity of registries). In the algorithms determining the place and nature of the stay of persons available exclusively in the registries, separate paths and hierarchies of source credibility were applied depending on the population segment (e.g., foreigners, children, people of mobile age, seniors).
For the usually resident population identified in this way, all required characterising features were individually determined, which allowed for any combination and cross-referencing of features. Determining (calculating) the value of each characteristic of persons was carried out based on algorithms, which, according to the specificity of the characteristic and the availability of a given type of source, drew on the results of the questionnaire survey or registry data (details in the description of stage no. 7, in the section "Processing stages").
Determining the population of dwellings was preceded by the verification of dwelling addresses conducted during the preparatory work for the census. Information collected during the census on the size of dwellings, the period of construction of the building in which the dwelling is located, and the equipment of dwellings with technical and sanitary installations allowed for assessing the standard of dwelling stocks. The analysis of information characterizing the dwellings, including in particular conventional dwellings, with information concerning the people who inhabit them, allowed for determining the housing arrangements of the population residing in the country as of March 31, 2021.
Record linkage including identifying variable(s) used for the record linkage
Generally, individual datasets for various statistical units (e.g., persons, families, dwellings, buildings) within the same processing stages were organised in the form of relational tables. Establishing relations between tables within the same processing stage (ensuring data set integrity), but also transferring relevant data between subsequent processing stages, was made possible through the use of artificial (technical) unique identifiers (relationship keys) for each type of statistical unit. The assignment of artificial identifiers took place within the procedures of identifying (recognising) statistical units, which were implemented in relation to all source data (data from registers and later from the questionnaire census survey) entering the processing system (Operational Microdata Base, OBM), especially at processing stages numbered 1, 2, 3, and 5.
The recognition of a unit and the assignment of an artificial identifier took place through the pairing (linking) of raw data entering the processing system with the already existing reference tables of units in the system, which contained artificial identifiers.
The initial reference unit tables were created from the registry sets introduced earliest (e.g., the set of persons from the PESEL registry), which were deduplicated and assigned an artificial identifier (a consecutive integer). As subsequent registry sets were introduced, the reference unit tables acted as donors of artificial identifiers for records from the new sets and, at the same time, were expanded with additional records (units) identified in the new sets. Units from the raw (input) set that were linked successfully received an artificial identifier from the reference table. Units that did not pair with the reference table were verified (in terms of the credibility of the unit's existence) and then added to the reference unit table, expanding it as newly identified statistical units (new records) recognised in the processing system.
These procedures were repeated for each newly acquired registry set and for updated versions of sets. As a result of identifying registry sets, it was possible to create a (pre)census personal-address-residence list, which – among other functions – served as reference tables for identifying statistical units within data acquired from the questionnaire survey (CAxI data).
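The identifier-assignment loop described above can be illustrated with a minimal sketch. It is not the actual OBM implementation. The function name, the record layout and the single-field linkage key are assumptions made for illustration, and the credibility check applied to unmatched units is omitted.

```python
from itertools import count


def assign_identifiers(reference, next_id, incoming, key):
    """Link incoming records to the reference table.

    reference: dict mapping a linkage key to an artificial identifier
    next_id:   iterator yielding consecutive integers for new units
    incoming:  list of raw records (dicts) from a newly introduced set
    key:       function extracting the linkage key from a record
    """
    linked = []
    for record in incoming:
        k = key(record)
        if k not in reference:
            # Unmatched unit: in the census this unit would first be verified
            # for credibility; here it is simply added to the reference table.
            reference[k] = next(next_id)
        # Matched (or newly added) unit receives the identifier from the table.
        linked.append({**record, "unit_id": reference[k]})
    return linked


# Hypothetical data: the "pesel" values are placeholders, not real numbers.
reference = {}
ids = count(1)
first_set = assign_identifiers(
    reference, ids,
    [{"pesel": "A"}, {"pesel": "B"}, {"pesel": "A"}],
    key=lambda r: r["pesel"],
)
second_set = assign_identifiers(
    reference, ids,
    [{"pesel": "B"}, {"pesel": "C"}],
    key=lambda r: r["pesel"],
)
```

In this sketch the first set both seeds the reference table and is deduplicated by it, while the second set reuses the identifier of an already known unit and adds one new unit.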
Linking new sets entering the system with the reference unit tables was carried out on the basis of natural identifying variables available in the raw (input) data sets. In the case of persons, identifying linkage used, depending on availability, such data as PESEL number, first names, surnames, date of birth, sex, mothers' names and surnames, address data, citizenship, country of birth, etc. As a result of linking records concerning persons, each data record was assigned a Unique Statistical Number (UNS).
Most records concerning persons were paired unambiguously (deterministically) by PESEL number. In the absence of a PESEL number, data were paired by other characteristics of the persons. Due to the varied availability of other identifying features, their incompleteness, the diversity of records and errors in records (e.g., in first names, surnames or even dates of birth), multilevel and multi-stage linkage methods were used, based on measures of similarity in the vector of identifying features and the probability of a record match. The algorithms applying the similarity measure (indicator) took into account the varying availability of identifying features, various functions for comparing character strings, patterns of errors in numerical records (e.g., dates of birth), and critical values, i.e. thresholds for accepting or rejecting a match.
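The two-stage approach (deterministic linkage by PESEL number, then similarity-based linkage with an acceptance threshold) might be sketched as follows. This is only an illustration under stated assumptions: the field names, the use of Python's difflib.SequenceMatcher as the string comparison function, the unweighted mean over available fields and the threshold value of 0.85 are all hypothetical and are not taken from the census methodology.

```python
from difflib import SequenceMatcher

FIELDS = ("first_name", "surname", "birth_date")  # hypothetical field names
ACCEPT_THRESHOLD = 0.85  # hypothetical critical value


def similarity(a, b):
    """Mean string similarity over the fields available in both records."""
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in FIELDS
        if a.get(f) and b.get(f)
    ]
    return sum(scores) / len(scores) if scores else 0.0


def link(record, reference):
    """Return the identifier of the matching reference record, or None."""
    # Stage 1: deterministic match on the PESEL number, when available.
    if record.get("pesel"):
        for ref in reference:
            if ref.get("pesel") == record["pesel"]:
                return ref["uid"]
    # Stage 2: best candidate by similarity, accepted only above the threshold.
    best = max(reference, key=lambda ref: similarity(record, ref), default=None)
    if best is not None and similarity(record, best) >= ACCEPT_THRESHOLD:
        return best["uid"]
    return None


# Hypothetical reference table; all values are placeholders.
reference = [
    {"uid": 1, "pesel": "X1", "first_name": "Anna",
     "surname": "Kowalska", "birth_date": "1980-05-17"},
    {"uid": 2, "pesel": "X2", "first_name": "Jan",
     "surname": "Nowak", "birth_date": "1975-01-02"},
]
by_number = link({"pesel": "X2"}, reference)
by_similarity = link(
    {"first_name": "Ana", "surname": "Kowalska", "birth_date": "1980-05-17"},
    reference,
)
no_match = link(
    {"first_name": "Piotr", "surname": "Zielinski", "birth_date": "1990-12-30"},
    reference,
)
```

Computing the mean only over fields present in both records is one simple way to reflect the varying availability of identifying features. A production procedure would typically also weight fields and handle ambiguous candidates.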
Generation of households and families
The delineation of households and families was carried out in the course of a secondary process, based on a previously prepared set of persons, i.e., the determined usual resident population. Essentially, the generation of new units was accomplished by combining (grouping) persons, i.e., assigning them to new units – households and families, giving these extracted units unique artificial identifiers, and creating records for them in new datasets (tables) provided respectively for households and families. In terms of households, the housing concept (definition) of a household was adopted. Consequently, the delineation of households was based on an algorithm that refers to the housing criterion, according to which all people living in one dwelling (common dwelling identifier) were included in one household.
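The housing criterion can be illustrated with a short sketch in which every occupied dwelling yields exactly one household. The record layout (person_id, dwelling_id) and the way household identifiers are numbered are assumptions made for illustration only.

```python
from collections import defaultdict


def delineate_households(persons):
    """Group persons by dwelling identifier: one household per dwelling."""
    members_by_dwelling = defaultdict(list)
    for person in persons:
        members_by_dwelling[person["dwelling_id"]].append(person["person_id"])

    households = []
    # Sorting by dwelling identifier makes the numbering reproducible.
    for household_id, (dwelling_id, members) in enumerate(
        sorted(members_by_dwelling.items()), start=1
    ):
        households.append(
            {
                "household_id": household_id,  # artificial identifier
                "dwelling_id": dwelling_id,
                "members": members,
            }
        )
    return households


# Hypothetical set of usually resident persons.
persons = [
    {"person_id": 1, "dwelling_id": "D2"},
    {"person_id": 2, "dwelling_id": "D1"},
    {"person_id": 3, "dwelling_id": "D2"},
]
households = delineate_households(persons)
```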
The delineation of families, according to the adopted definition, was carried out among the people belonging to the same household (same dwelling), based on a complex algorithm that considered various data about persons. The basis of the family generation procedure was the processed and appropriately assigned data on relations between persons, obtained from respondents in the questionnaire survey (CAxI data), in which respondents listed together within a single so-called housing survey could indicate spouses/partners and parents among themselves. In the absence of appropriate data from CAxI, reference was made to relevant data on marital relations and parent-child relations available in processed registry collections. Where no data on relationships between persons were available at all, attempts were made to determine them on the basis of probability, using various individual auxiliary characteristics of persons such as surnames, sex, and age (generational groups), as well as an appropriate interpretation of data on the composition of the household.
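The order of precedence among the sources of relationship data (questionnaire data first, then registers, then a probability-based estimate) can be shown with a minimal sketch. The function and parameter names are hypothetical, and the estimation step is represented only by a placeholder callable, since the actual algorithm is not described in detail here.

```python
def resolve_relation(person_id, caxi, register, estimate):
    """Return (value, source) from the first source that covers the person.

    caxi, register: dicts mapping a person identifier to relationship data
    estimate:       callable used only when neither source has data
    """
    if person_id in caxi:
        return caxi[person_id], "questionnaire"
    if person_id in register:
        return register[person_id], "register"
    return estimate(person_id), "estimated"


# Hypothetical data: each value is the identifier of the person's spouse.
caxi = {1: 2}
register = {1: 9, 3: 4}
from_caxi = resolve_relation(1, caxi, register, lambda pid: None)
from_register = resolve_relation(3, caxi, register, lambda pid: None)
from_estimate = resolve_relation(5, caxi, register, lambda pid: None)
```

Returning the source alongside the value keeps track of how each relationship was established, which can be useful when assessing data quality.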
For the created data records about households and families, variables characterising these units were generated, generally calculated by applying various aggregating functions operating on the groups of data records about persons belonging to the household/family, e.g., counting the number of members, the number of specific categories of members, such as children, etc.
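As an illustration of such aggregating functions, the sketch below derives two household-level variables from person records. The field names and the "role" category used to count children are assumptions made for this example, not the census variable definitions.

```python
from collections import Counter


def household_variables(persons):
    """Derive household size and number of children from person records."""
    size = Counter()
    children = Counter()
    for person in persons:
        size[person["household_id"]] += 1
        if person["role"] == "child":  # hypothetical category label
            children[person["household_id"]] += 1
    # A Counter returns 0 for households without children.
    return {h: {"size": size[h], "children": children[h]} for h in size}


# Hypothetical person records already assigned to households.
persons = [
    {"household_id": 1, "role": "parent"},
    {"household_id": 1, "role": "child"},
    {"household_id": 1, "role": "child"},
    {"household_id": 2, "role": "single"},
]
variables = household_variables(persons)
```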
Measures to identify or limit unit-no-information
Factors conducive to minimising unit-level data deficiencies in the census (omissions of units belonging to the census populations) include the solutions adopted for the implementation of the census, primarily the use of a large amount of data resources from registers and information systems, and a wide range of methods for their exploration, as well as a high degree of implementation (completeness) of the questionnaire-based census study. Moreover, solutions and actions taken in identification (recognition) of statistical units within the entire resource of collected data seem to be significant in this matter.
For the purposes of the 2021 Census, data obtained from respondents through an electronic application and data from registers and administrative systems were used. A detailed description of the sources is presented in sub-concepts 18.1.1 - 18.1.4.
Data on population and housing censuses are disseminated every decade.
The census reference date is March 31, 2021.
Population by grid was published in December 2022.
Final census data for dwellings, population, households and families, by layout and breakdowns in accordance with the EU implementing regulations, are made available by March 31, 2024.
The definitions adopted and the breakdowns for themes ensure comparability of results at EU level.
Geographical area (GEO)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Sex (SEX)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Age (AGE)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Legal marital status (LMS)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level. Marital status was determined for persons aged 15 and over, in accordance with Polish law (the Law on Civil Status Acts). Polish law allows women from the age of 16 and men from the age of 18 to marry.
Polish law does not allow same-sex marriages. As a result, the Polish census did not provide information on same-sex couples in marriage, whether de jure or de facto. Polish legislation does not provide for registered partnerships.
Household status (HST)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level. No information was compiled for the breakdown categories specified as optional for same-sex marriages/partnerships.
Persons living in a household, undetermined category – in the Polish census there are no persons classified in this category (there are only people living in the household: in the biological family or outside the biological family).
Persons not living in a household, undetermined category – in the Polish census there are no persons classified in this category (there are only people living in a private household and not living in a private household: in collective living quarters and homeless persons).
Family status (FST)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level. No information was compiled for the breakdown categories specified as optional.
According to the definition, only first-degree relationships between children and adults (between parents and children) are taken into account to determine families.
Polish law does not allow the registration of partnerships. Therefore, the Polish census did not compile information on partners in registered partnerships. There was also no information on same-sex couples.
Item “undetermined” – in the Polish census there are no persons classified in this category (there are only persons of established position in the family and non-family members).
‘Not applicable’ – includes persons who do not form a biological family.
Size of family nucleus (SFN)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Type of household (TPH)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Size of private household (SPH)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Educational attainment (EDU)
The adopted definition of the concept and the levels and categories of breakdown ensure comparability of results at EU level.
Size of the locality (LOC)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Place of birth (POB)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Country of citizenship (COC)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Year of arrival in the country (YAE)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Place of usual residence one year prior to the census (ROY)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Current activity status (CAS)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Status in employment (SIE)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Occupation (OCC)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Industry (IND)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Location of place of work (LPW)
The adopted definition of the concept, as well as the breakdowns, ensure the comparability of results at the EU level.
Tenure status of households (TSH)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Housing arrangements (HAR)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Type of living quarter (TLQ)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Occupancy status of conventional dwelling (OWS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Number of occupants (NOC)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Type of ownership (OCS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Useful floor space (UFS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Number of rooms (NOR)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Density standard (floor space) (DFS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Density standard (number of rooms) (DRM)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Water supply system (WSS)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Toilet facilities (TOI)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Bathing facilities (BAT)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Type of heating (TOH)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Dwellings by type of building (TOB)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.
Dwellings by period of construction (POC)
The adopted definition of the concept and the levels and categories of division ensure the comparability of results at the EU level.