
MERLIN Corpus Project

The MERLIN corpus project seeks to provide language teachers, textbook authors, and testers with empirical data for exploring second language competence levels.


The MERLIN project (2012-2014), co-financed in the framework of Key Activity 2 Languages of the Lifelong Learning Programme, is developing a didactically motivated online platform to enable users of the Common European Framework of Reference for Languages (CEFR) to explore authentic written learner productions for Italian, Czech, and German.

The project addresses the growing need to illustrate the CEFR and its levels with empirical data. The CEFR has become the most important European reference instrument for language teaching and certification. Nonetheless, there is considerable uncertainty about what the CEFR levels, which everybody is expected to use, actually mean.

The core of the platform is a thoroughly annotated corpus of rated learner texts. The annotation reflects current developments in second language acquisition research as well as indicators and features of the learner language that testers, textbook authors, and teachers reported to the project team in a questionnaire study.

The MERLIN team, composed of linguists, computational linguists, language testers, and language teaching institutions, is currently working on the annotation of roughly 2,500 texts.

Once the MERLIN platform is launched at the end of 2014, all MERLIN data will be freely accessible online. Users will be able to view full texts and the standardized test tasks they respond to. For each text, an analytic profile of aspects of second language (L2) competence (e.g., grammatical accuracy, vocabulary control, sociolinguistic appropriateness) will be displayed. Furthermore, users will be able to search for specific language characteristics that pose challenges in the process of language learning and find out how learners’ mastery of those features relates to the CEFR level system.

For more information, please visit the MERLIN website.