National reference metadata

Germany

Reference metadata describe statistical concepts and methodologies used for the collection and generation of data. They provide information on data quality and, since they are strongly content-oriented, assist users in interpreting the data. Reference metadata, unlike structural metadata, can be decoupled from the data.


Census 2011 round (cens_11r)

National Reference Metadata in Euro SDMX Metadata Structure (ESMS)

Compiling agency: Statistisches Bundesamt, Gustav-Stresemann-Ring 11, 65189 Wiesbaden, Germany. Telephone: +49 611 75 1. Fax: +49 611 72 4000. E-mail: poststelle@destatis.de. Contact: https://www.destatis.de/kontakt/



16 September 2014

The EU programme for the 2011 population and housing censuses includes data on persons, private households, family nuclei, conventional dwellings and living quarters.

Persons enumerated in the 2011 census are those who were usually resident in the territory of the reporting country at the census reference date. Usual residence means the place where a person normally spends the daily period of rest, regardless of temporary absences for purposes of recreation, holidays, visits to friends and relatives, business, medical treatment or religious pilgrimage.

Data are available at different levels of geographical detail: national, NUTS2, NUTS3 and local administrative units (LAU2).

9 May 2011

Counts of statistical units

Part of survey: household survey (sample survey)

First, the questionnaires were digitalised. Once the resulting images (bit images) were available, they were read in, interpreted and, where necessary, edited manually. Free text entries were then digitally signed, with the exception of the ‘occupation’ characteristic. Next, the personal data from the electronic survey list (the list of all individuals whose existence at a sample address had been confirmed) were collated with the read-in documents. Questionnaires were excluded if no one in the electronic survey list had been registered at the relevant address during the confirmation-of-existence phase. The remaining data sets were then checked for missing or implausible entries, and any errors identified were corrected using deterministic imputation or a donor imputation method. At every processing step, corresponding quality marks were set to enable the analysis of case numbers and the like.
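The two correction methods named above can be illustrated with a minimal sketch. It is not the procedure used in the census: the field names, the age rule and the "closest age" donor criterion are assumptions chosen only to show the difference between a deterministic fill and a donor-based fill, and how a quality mark might be set alongside each.

```python
from typing import Optional

# Hypothetical record layout: field names and rules are illustrative only.
Record = dict[str, Optional[object]]


def deterministic_impute(rec: Record) -> Record:
    """Fill a value that follows unambiguously from other fields."""
    out = dict(rec)
    # Assumed rule for illustration: a child under 15 is not in employment.
    if out.get("age") is not None and out["age"] < 15 and out.get("employed") is None:
        out["employed"] = False
        out["quality_mark"] = "deterministic"
    return out


def donor_impute(rec: Record, donors: list[Record], field: str) -> Record:
    """Copy a missing field from the donor record closest in age."""
    out = dict(rec)
    if out.get(field) is not None or out.get("age") is None:
        return out  # nothing to impute, or no basis for choosing a donor
    candidates = [
        d for d in donors
        if d.get(field) is not None and d.get("age") is not None
    ]
    if not candidates:
        return out  # no usable donor: leave the gap
    donor = min(candidates, key=lambda d: abs(d["age"] - out["age"]))
    out[field] = donor[field]
    out["quality_mark"] = "donor"
    return out
```

The quality mark records which method produced a value, so imputed cases can be counted separately afterwards.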

When processing questionnaires from the online survey, the initial digitalisation step was not performed.

Coding the ‘occupation’ characteristic:

The household survey of the 2011 census, conducted on a sample basis, served as the data source for information on occupation. Occupation was entered as free text.

First, the free text entries were coded automatically by a software-based comparison of the occupation information against an alphabetical index. Occupation information that could not be coded automatically was then coded manually with computer assistance.
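The two-stage coding described above (automatic lookup first, manual coding for the remainder) might look roughly like the following sketch. The index entries and codes are invented, and exact matching after simple normalisation is an assumption; the actual comparison rules are not described in this report.

```python
import bisect

# Hypothetical alphabetical index: (normalised title, code). Codes are made up.
INDEX = sorted([
    ("baker", "B01"),
    ("electrician", "E07"),
    ("nurse", "N12"),
])
TITLES = [title for title, _ in INDEX]


def normalise(text: str) -> str:
    """Lower-case the entry and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def code_occupation(free_text: str):
    """Return the code for an entry found in the index, otherwise None."""
    key = normalise(free_text)
    i = bisect.bisect_left(TITLES, key)  # binary search in the sorted index
    if i < len(TITLES) and TITLES[i] == key:
        return INDEX[i][1]
    return None


def code_batch(entries):
    """Split entries into automatically coded ones and a manual-coding queue."""
    coded, manual_queue = {}, []
    for entry in entries:
        code = code_occupation(entry)
        if code is None:
            manual_queue.append(entry)
        else:
            coded[entry] = code
    return coded, manual_queue
```

Entries that end up in `manual_queue` correspond to the cases that would be passed on for manual coding.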

Part of survey: special areas (full census)

The paper questionnaires were scanned and the images saved. Part of the documentation was not intended for scanning from the very start (e.g. questionnaires for sensitive special areas), while others could not be fully processed using the scanning method. In these cases the information was entered manually by the Länder statistical offices.

The data processing itself involved the following steps: collating questionnaires and electronic survey lists, comparing with and linking to the population registers, checking for duplication, determining residential status, checking plausibility/imputation and finally transmitting the data to the reference data set. The procedures differed slightly according to the collection method (sensitive or non-sensitive special areas, barracks).

The plausibility of the data was checked in two stages: first the data sets were run through the test program for the information on housing conditions and for demographic characteristics and then – if the special area address was also a sample address – the test program for the additional characteristics from the characteristics catalogue for the household sample.

In the IDEV online procedure, an initial plausibility test for formal correctness was already applied during data entry. The Länder statistical offices thus received data already tested for plausibility, but which nevertheless still underwent the additional standard plausibility testing during the data processing phase.

In the plausibility tests performed by the Länder statistical offices, missing characteristic categories were first filled by simple imputation using the data from the electronic survey list. If this was not possible, the information was taken from the population register. As a result, the only cases left for the actual imputation step were those personal data sets that represented missing entries (undercoverage). For these, a national frequency distribution was created for each type of area as soon as a sufficiently high number of returns had been received, and this distribution was then used to impute the data.
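Imputing from a frequency distribution per type of area can be sketched as follows. This is an illustration under assumptions: the threshold `min_returns` stands in for the unspecified "sufficiently high number of returns", and drawing a category at random in proportion to its observed frequency is one plausible reading of how such a distribution is used.

```python
import random
from collections import Counter, defaultdict


def build_distributions(returns):
    """Count categories per area type from (area_type, category) pairs."""
    dist = defaultdict(Counter)
    for area_type, category in returns:
        dist[area_type][category] += 1
    return dist


def impute_category(area_type, dist, rng, min_returns=3):
    """Draw a category in proportion to its frequency for this area type.

    Returns None while too few returns are available (assumed threshold).
    """
    counts = dist.get(area_type)
    if counts is None or sum(counts.values()) < min_returns:
        return None
    categories = list(counts)
    weights = [counts[c] for c in categories]
    return rng.choices(categories, weights=weights, k=1)[0]
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which matters when imputation results need to be audited.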

Household generation

Household and family characteristics were obtained using an automated procedure: information from the population registers and the comprehensive building and housing census carried out as a postal survey of owners was automatically collated to comprehensively generate households. This ‘household generation’ has the benefit of making household and family information – including for very small areas – available without any additional collection of characteristics. It is therefore the ideal tool for obtaining statistical information on households as part of the German census.

Generating households entails two things: first, collecting individuals into families and households by address, and second, linking households with actual dwellings at the address in question. As a result, the German census represents residential households. This is consistent with the EU guidelines, but does result in lower comparability at national level, for example with Germany’s annual Mikrozensus, as this looks at economic households. For the EU, in line with the guidelines, household generation only takes into account individuals with a sole or main dwelling, whereas in the German definition individuals with secondary residences are also included (i.e. under the German definition of a household, an individual may belong to several households at the same time).

A statistical revision (correction) of the register for over- and under-coverage, the scale and structure of which was estimated from the sample survey, is also carried out as part of household generation. In doing so, care was taken that the deletions and imputations in the register data for individuals did not distort household structures.

Household generation takes place in a number of steps as described below:

Step 1: Formation of first household relationships from register information - pointers

The population register contains links (pointers) between individuals. These links provide unambiguous information about specific relationships between two registered individuals. The link can be a marriage between two individuals of the opposite sex or a registered partnership between two individuals of the same sex. Alternatively, the link can be a parent-child relationship or another form of legal agency, whereby children are normally only linked until they reach their 18th birthday.

These links from the population register are used to generate the first multi-person households. After this step, each person on the register is allocated to a provisional household (if no links exist they are allocated to a single-person household), which can change at any stage of the household generation process.
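Grouping people by register links is essentially a connected-components problem. The sketch below uses a union-find structure for it; the input format (person identifiers and pairs of linked identifiers, all at one address) is an assumption made for illustration and not the register's actual layout.

```python
def generate_provisional_households(person_ids, pointers):
    """Group persons into provisional households via register links.

    person_ids: identifiers of persons registered at one address.
    pointers:   (person_a, person_b) pairs, e.g. spouses or parent and child.
    Persons without any link end up in a single-person household.
    """
    parent = {p: p for p in person_ids}

    def find(p):
        # Follow parent links to the root, shortening the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in pointers:
        if a in parent and b in parent:  # ignore links leaving the address
            parent[find(a)] = find(b)

    households = {}
    for p in person_ids:
        households.setdefault(find(p), set()).add(p)
    return list(households.values())
```

Because the result is only provisional, later steps could merge or split these groups without changing this function.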

Step 2: Evaluation of dwelling occupant information and first links between households and dwellings

To create links between individuals and the dwellings recorded in the building and dwelling census, the name fields from the two parts of the survey are compared automatically (the ‘automated name comparison’). To this end, the building and dwelling census requested the names of two occupants for each residential unit.
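A name comparison of this kind could be sketched as below. The normalisation rules (case folding, stripping diacritics and non-letters) and the requirement of a unique exact match are assumptions for illustration; the matching rules actually applied are not given in this report.

```python
import unicodedata


def normalise_name(name: str) -> str:
    """Case-fold, strip diacritics and drop everything that is not a letter."""
    decomposed = unicodedata.normalize("NFKD", name.casefold())
    return "".join(
        ch for ch in decomposed
        if ch.isalpha() and not unicodedata.combining(ch)
    )


def link_dwellings(dwellings, persons):
    """Link registered persons to dwellings at one address by occupant name.

    dwellings: {dwelling_id: [occupant names from the dwelling census]}
    persons:   {person_id: name from the population register}
    Returns {person_id: dwelling_id} for unambiguous matches only.
    """
    by_name = {}
    for pid, name in persons.items():
        by_name.setdefault(normalise_name(name), []).append(pid)

    links = {}
    for dwelling_id, occupants in dwellings.items():
        for occupant in occupants:
            matches = by_name.get(normalise_name(occupant), [])
            if len(matches) == 1:  # skip ambiguous or missing matches
                links[matches[0]] = dwelling_id
    return links
```

Persons left without a link here would be candidates for the later, statistically driven allocation steps.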

Step 3: Formation of additional household relationships from register information - references

Register information on individuals was used to obtain evidence of further household bonds. This leads to the generation of non-registered partnerships or grandparent-child relationships in addition to traditional household structures. For example, non-registered partnerships are recognised using information on marital status, moving-in date and residential address.
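The text names three pieces of register information used to recognise non-registered partnerships: marital status, moving-in date and residential address. The sketch below combines them into a deliberately crude rule. The class, the status labels and the 30-day window are all hypothetical, and a real procedure would need further criteria to avoid pairing flatmates or siblings.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    pid: int
    address: str
    marital_status: str  # assumed labels: "single", "married", "divorced", ...
    moved_in: date


def likely_partners(a: Person, b: Person, max_days_apart: int = 30) -> bool:
    """Heuristic: same address, neither married, moved in at about the same time."""
    if a.pid == b.pid:
        return False
    same_address = a.address == b.address
    neither_married = "married" not in (a.marital_status, b.marital_status)
    moved_together = abs((a.moved_in - b.moved_in).days) <= max_days_apart
    return same_address and neither_married and moved_together
```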

Step 4: Allocation of households to dwellings or existing households according to statistical generation criteria

The remaining unlinked occupied dwellings at an address are allocated to the households at this address not yet linked to a dwelling. This is done using statistical criteria, including on the basis of the household structures extrapolated from the sample.

Step 5: Classifying the households generated

All the households and families created during household generation, and the individuals within them, are each given three classifications. For households these are type of household, size of household and individual status within the household; for families they are type of family nucleus, size of family nucleus and individual status within the family.

Part of survey: building and dwelling census (full census)

Data plausibility testing was carried out primarily through an automated procedure. For each building, it was checked whether the data transmitted were complete and consistent (i.e. plausible). If this was not the case, any errors had to be corrected and missing information added (imputation).

Missing and incorrect characteristics (item non-response) were corrected by:

deterministic imputation, where correction is carried out using unambiguous relationships between plausible and missing/incorrect characteristics.

imputation according to the nearest-neighbour principle using CANCEIS (Canadian Census Edit and Imputation System), an imputation software developed by Statistics Canada.

A number of dwellings missing entirely in buildings were imputed with CANCEIS.
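The nearest-neighbour principle can be shown with a small generic sketch. It is not CANCEIS and does not reproduce its distance measures or edit rules; it simply picks the donor that agrees with the recipient on the most known fields and copies the missing values from it. Field names are hypothetical.

```python
def impute_nearest_neighbour(recipient, donors):
    """Fill missing (None) fields from the most similar complete donor.

    Similarity is the number of the recipient's known fields on which a
    donor agrees; ties go to the first such donor in the list.
    """
    known = [f for f, v in recipient.items() if v is not None]
    missing = [f for f, v in recipient.items() if v is None]
    usable = [d for d in donors if all(d.get(f) is not None for f in missing)]
    if not missing or not usable:
        return dict(recipient)  # nothing to do, or no donor available

    def distance(donor):
        # Count the known fields on which donor and recipient disagree.
        return sum(donor.get(f) != recipient[f] for f in known)

    donor = min(usable, key=distance)
    result = dict(recipient)
    for f in missing:
        result[f] = donor[f]
    return result
```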

If there was no data for an entire building (unit non-response), it was checked whether a neighbouring building could be used for imputation. The condition for imputation of this type was that the missing building had to be in an area where the buildings are fairly uniform. In these areas (e.g. in detached housing areas), it was assumed that the missing building had similar or identical building and dwelling characteristics to the buildings in the immediate vicinity. In these cases the imputation was based on a neighbouring building.

If the area around the missing building was not uniform, imputation was not possible. In this case the property had to be visited by a municipal interviewer to collect the most important building characteristics.
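The decision between imputing a whole building from its neighbours and sending an interviewer can be sketched as a uniformity check. The 80% share used here as the definition of a "fairly uniform" area is an assumption, as are the field names; the report does not state how uniformity was measured.

```python
from collections import Counter


def impute_missing_building(neighbours, key_fields, min_share=0.8):
    """Impute a missing building from its neighbours if the area is uniform.

    neighbours: records of buildings in the immediate vicinity.
    key_fields: characteristics that must agree for the area to count as uniform.
    Returns the imputed record, or None if a field visit would be needed.
    """
    if not neighbours:
        return None
    profiles = Counter(tuple(n[f] for f in key_fields) for n in neighbours)
    profile, count = profiles.most_common(1)[0]
    if count / len(neighbours) < min_share:
        return None  # area not uniform enough: no imputation
    return dict(zip(key_fields, profile))
```

A `None` result corresponds to the case where the property has to be visited to collect the most important building characteristics.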

Data on population and housing censuses are disseminated every decade.

All hypercubes were made available to Eurostat on 31 March 2014.

see 3.4