Automatic Coding Based on Pre-coded Datasets (Method)


For a number of variables in questionnaires, one wants the answer in closed (coded) form. For a variable such as "city", this is a relatively simple classification task. For other variables it is much harder, e.g. when trying to obtain a code for occupation. One approach is to ask an open question ("What is your occupation?") and then code the answer text at the statistical office. For the sake of efficiency, that coding process starts with an automatic step.

Here we will describe the coding of open text answers based on existing sets of correctly coded answers. We will briefly look at some existing techniques and then focus on one method in more detail as an example.
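
As a rough illustration of coding against a pre-coded dataset, the sketch below lets each word of a new answer vote for the codes it was seen with in earlier, correctly coded answers. This is a simple word-overlap scheme chosen for brevity and is not necessarily the method described in the full document. The example answers, the codes ("T01", "D02", "N03"), the function names and the `min_share` threshold are all assumptions made for this example. Answers that cannot be coded with enough agreement return `None`, standing in for referral to a later, non-automatic step.

```python
from collections import Counter, defaultdict

# Hypothetical pre-coded answers: (answer text, code). The codes are made up.
PRECODED = [
    ("primary school teacher", "T01"),
    ("teacher secondary school", "T01"),
    ("truck driver", "D02"),
    ("bus driver", "D02"),
    ("registered nurse", "N03"),
]


def tokenize(text):
    """Lower-case the answer and split it on whitespace."""
    return text.lower().split()


def build_index(precoded):
    """Map each word to a count of the codes it appeared with."""
    index = defaultdict(Counter)
    for text, code in precoded:
        for word in set(tokenize(text)):
            index[word][code] += 1
    return index


def auto_code(text, index, min_share=0.6):
    """Return the best-supported code, or None if support is too weak.

    min_share is an assumed threshold: the winning code must collect
    at least this share of all votes cast by the words in the answer.
    """
    votes = Counter()
    for word in set(tokenize(text)):
        votes.update(index.get(word, {}))
    if not votes:
        return None  # no known word: leave for a later coding step
    code, count = votes.most_common(1)[0]
    if count / sum(votes.values()) < min_share:
        return None  # ambiguous: leave for a later coding step
    return code


INDEX = build_index(PRECODED)
print(auto_code("school teacher", INDEX))
print(auto_code("astronaut", INDEX))
```

In this sketch the threshold controls the trade-off between the share of answers coded automatically and the risk of assigning a wrong code; a real system would tune it on held-out coded answers.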


To read the entire document, please access the PDF file (link under "Related Documents" on the right-hand side of this page).

