Data Collection - Main Module (Theme)


Data collection is a “systematic process of gathering data for official statistics” (SDMX, 2009).

It is a complex process that spans several steps of the survey process, from the design of the data collection methodology through to the finalisation of the collected information (GSBPM, 2009). Data for statistical purposes can be collected with many different techniques, which may or may not be computer-assisted and may or may not require the support of interviewers. The main ones are CAPI, CATI, web, PAPI, mail questionnaires and direct observation.

The choice of technique depends on many factors (survey theme, timing of data delivery, difficulty in finding the required information, type of respondents involved, budget, etc.). It is generally made during the design phase of the process, since the technique influences both the way the data collection is carried out and the design of the survey questionnaire.

Mixed-mode collection, that is, the combination of different data collection techniques within the same survey, can overcome the limitations specific to each technique and, if correctly designed, can reduce the unit non-response rate.

A general trend among national statistical institutes (NSIs) is to gather the information they need from administrative data, in order to reduce both respondent burden and costs. NSIs can take advantage of existing data, stored in archives held by other public organisations that have already performed their own “data collection” phase according to needs and purposes that may differ from statistical ones. This trend is supported by rapid IT developments that have produced tools to facilitate access to administrative data. Such tools, from the older EDI to the newer XBRL, represent another way of collecting data from public institutions as well as from enterprises, since they rely on the exchange of information between the data provider and the NSI on the basis of a common, agreed and structured data model.

The data collection process is not only a matter of interviewing techniques, but also of contact strategies and monitoring activities. Contact strategies are needed to get in touch with respondents and may vary according to the type of respondent unit (large or small enterprise, new enterprise, etc.). Monitoring activities keep the data collection under control while it is in progress, so that proper action can be taken to correct any factor that may harm data quality.
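As an illustration of monitoring, the following minimal sketch computes a unit response rate per collection mode from a fieldwork log and flags the modes that fall below a threshold. The log layout, the outcome codes, the function names and the 0.5 threshold are all assumptions made for this example and are not taken from the handbook.

```python
from collections import Counter

# Hypothetical fieldwork log: (unit_id, mode, outcome).
# Outcome codes are illustrative, not a standard classification.
FIELDWORK_LOG = [
    ("E001", "CATI", "complete"),
    ("E002", "CATI", "refusal"),
    ("E003", "WEB", "complete"),
    ("E004", "WEB", "complete"),
    ("E005", "WEB", "no_contact"),
    ("E006", "PAPI", "refusal"),
]


def response_rate_by_mode(log):
    """Return the share of completed units among attempted units, per mode."""
    attempted = Counter(mode for _, mode, _ in log)
    completed = Counter(
        mode for _, mode, outcome in log if outcome == "complete"
    )
    return {mode: completed[mode] / attempted[mode] for mode in attempted}


def modes_needing_action(rates, threshold=0.5):
    """List the modes whose response rate is strictly below the threshold."""
    return sorted(mode for mode, rate in rates.items() if rate < threshold)


rates = response_rate_by_mode(FIELDWORK_LOG)
flagged = modes_needing_action(rates)
print(rates)
print(flagged)
```

In practice such a check would be run repeatedly during fieldwork, and the breakdown could equally be made by type of respondent unit rather than by mode.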

At the end of the data collection phase, the information is ready to enter the next phase of the survey process, “Phase 5. Process” of the GSBPM, in which data records are cleaned and prepared for analysis. How the subsequent steps are carried out depends on how data collection is finalised, which in turn depends on the mode(s) used to collect the information.


To read the entire document, please access the PDF file (link under "Related Documents" on the right-hand side of this page).


Your feedback is appreciated. Please send your remarks, suggestions for improvement, etc. to memobust@cbs.nl.