
Automatic Editing (Method)

Summary

The goal of automatic editing is to detect and treat errors and missing values in a data file accurately and in a fully automated manner, i.e., without human intervention. Methods for automatic editing have been investigated at statistical institutes since the 1960s. In practice, automatic editing usually means that the data are made consistent with a set of predefined constraints, the so-called edit rules or edits. The data file is checked record by record. If a record fails one or more edit rules, the method produces a list of fields that can be imputed so that all rules are satisfied.
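As an illustration of record-by-record checking, the following minimal sketch evaluates a few edit rules against two records. The field names (`turnover`, `costs`, `profit`), the rules, and the data are hypothetical and chosen only for this example; they are not taken from the module.

```python
# Hypothetical edit rules: each maps a record to True when the rule is satisfied.
EDITS = {
    "balance": lambda r: r["turnover"] - r["costs"] == r["profit"],
    "turnover_nonneg": lambda r: r["turnover"] >= 0,
    "costs_nonneg": lambda r: r["costs"] >= 0,
}


def failed_edits(record):
    """Return the names of all edit rules that the record violates."""
    return [name for name, rule in EDITS.items() if not rule(record)]


# Hypothetical data file with one consistent and one inconsistent record.
records = [
    {"turnover": 100, "costs": 60, "profit": 40},
    {"turnover": 100, "costs": 60, "profit": 400},
]

# The data file is checked record by record.
for rec in records:
    print(rec, "fails:", failed_edits(rec))
```

Knowing which edits fail does not yet tell us which fields are wrong: the failed balance rule in the second record involves all three fields. Deciding which of them to impute is the error localisation problem.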

In this module, we focus on automatic editing based on the (generalised) Fellegi-Holt paradigm. Under this paradigm, one determines the smallest (weighted) set of fields whose imputation allows the record to satisfy all edit rules. Designating the fields to be imputed is called error localisation. In practice, error localisation under the Fellegi-Holt paradigm often requires dedicated software, due to the computational complexity of the problem.
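The paradigm can be sketched for a small categorical record by brute force: enumerate every subset of fields, test whether some assignment of values to that subset satisfies all edits, and keep the subset with the lowest total weight. The domains, edits, weights, and function names below are hypothetical, and exhaustive enumeration is used only for illustration; it is not the algorithm used by dedicated software, and it grows exponentially with the number of fields.

```python
from itertools import combinations, product

# Hypothetical value domains for three categorical fields.
DOMAINS = {
    "age": ["child", "adult"],
    "marital": ["single", "married"],
    "job": ["none", "employed"],
}

# Hypothetical edits: each returns True when the record is consistent.
EDITS = [
    lambda r: not (r["age"] == "child" and r["marital"] == "married"),
    lambda r: not (r["age"] == "child" and r["job"] == "employed"),
]


def consistent(record):
    """True if the record satisfies every edit."""
    return all(edit(record) for edit in EDITS)


def can_repair(record, fields):
    """True if some assignment of values to `fields` satisfies all edits."""
    for values in product(*(DOMAINS[f] for f in fields)):
        candidate = {**record, **dict(zip(fields, values))}
        if consistent(candidate):
            return True
    return False


def localise_errors(record, weights):
    """Return (total_weight, fields) for a minimum-weight repairable field set.

    Ties are resolved in favour of the first set found.
    """
    best = None
    names = list(DOMAINS)
    for size in range(len(names) + 1):
        for fields in combinations(names, size):
            cost = sum(weights[f] for f in fields)
            if (best is None or cost < best[0]) and can_repair(record, fields):
                best = (cost, fields)
    return best


record = {"age": "child", "marital": "married", "job": "employed"}

# Equal weights: changing the single field "age" resolves both failed edits.
print(localise_errors(record, {"age": 1, "marital": 1, "job": 1}))

# If "age" is considered more reliable (higher weight), the other two fields are chosen.
print(localise_errors(record, {"age": 3, "marital": 1, "job": 1}))
```

The weights express how reliable each field is considered to be: a higher weight makes the field less likely to be designated as erroneous. The sketch only designates fields; choosing the new values is left to imputation.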

Although the imputation of new values for erroneous fields is often seen as a part of automatic editing, we do not discuss this here, because the topic of imputation is broad and interesting enough to merit a separate description. We refer to the theme module “Imputation” and its associated method modules for a treatment of imputation in general and various imputation methods.

 

To read the entire document, please access the pdf file (link under "Related Documents" on the right-hand-side of this page).

 

Your feedback is appreciated. Please send your remarks, suggestions for improvement, etc. to memobust@cbs.nl.