Tutorial:ESS glossary
The ESS glossary is a pilot project in Statistics Explained for building a fully integrated multilingual glossary serving the European statistical system (ESS). ESS members (Eurostat and NSIs) can contribute voluntarily and all members, whether or not participating, can freely make use of results in whatever way they see fit (linking to them, integration in their website, extraction for a publication, insertion in a database, ...), in line with the Mediawiki and Statistics Explained open-source philosophy.
This tutorial gives an overview of the basic principles, the concrete arrangements and procedures, and practical information for contributors.
Introduction
Statistics Explained serves three functions for the ESS glossary:
- common working platform for all contributors in Eurostat, NSIs and elsewhere, with precise access control, discussion (talk) pages and excellent versioning facilities;
- central repository of the 'mother version' in English and all derived versions in other languages, page-by-page connected;
- a public user interface, totally web-adapted, Google-findable, multilingual and user-friendly, showing all output immediately following upon a fast and efficient double validation.
Basic principles
- The ESS glossary is user-oriented, not producer-oriented or legal-formal; although agreement with definitions in manuals or legal acts is obviously preferable, this is not the objective nor the main consideration; understandability is! The glossary is the electronic equivalent of a paper publication's list of terms at the end or text box with definitions, serving people reading articles, NOT a collection of 'official' definitions.
- Participation in the project and all contributions are voluntary. There is no timing and no deadline, no imposed order of creating new items in a language, no statistical domain or topic with a higher priority, no language version taking precedence over or having to wait for another one (with the one exception of the English 'pivotal' mother version, see below). What happens in one language has no implications for what should happen in others. This does not mean, of course, that a random order for creating items is best; an obvious way to rank glossary items is according to usefulness, as measured by the frequency of consultation of the English version (this is not hard to do: the cumulative number of visits is displayed at the bottom centre of each page).
- New language versions of a glossary page must be faithful and complete translations of the English version - this is necessary to maintain coherence over languages. If no English version exists as yet, it should be created first. If the English version is felt to be incomplete or even wrong, it can be improved and published according to the general double validation rules, together with the new language version. For the next question, whether other non-English versions which already exist then should be aligned to the new English version as well, two cases should be distinguished:
- if an error was corrected, this should also be done in the other language versions (but not by the NSI, just alert the Statistics Explained administrators);
- if it was simply an improvement without the original being wrong, the other-language versions can stay as they were for the time being, awaiting some future alignment of versions (in which case the history of the English original will clearly indicate whether this is necessary). Glossary pages in any language should respect the basic standard structure (see Model:Glossary page).
- Every other-language glossary page or (preferably) set of glossary pages is 'owned' by and the responsibility of the NSI which has created it (in the case of several NSIs sharing a language and jointly creating items this needs an agreement). It becomes a stable page and publicly visible by a double validation procedure: of content (translation from English) by the NSI and of format by Eurostat's Dissemination unit. This is quite similar to the procedure for validating articles, with Eurostat production units validating content and the Dissemination unit validating format and respect for standards.
- In line with open-source philosophy, all new glossary pages can be freely used by all, either by linking to them in Statistics Explained or by extracting and re-using them in another environment or format. Logically, the first one who will probably have a use for a new language version of a glossary item is the NSI which has created it.
Concrete arrangements and procedures
Statistics Explained glossary: basic characteristics
The structure of the English version of the glossary, containing some 1350 items and 200 abbreviations, can be found here.
A multilingual Mediawiki plugin made it technically possible to insert the French and German version of the Eurostat Yearbook glossary, resulting in an additional 450 French and German items. The different language versions are connected, just like in Wikipedia, via links under the heading 'In other languages'/'In andere Sprachen'/'En d'autres langues' etc. in the left column. The name of the page is simply the English name with the addition of slash and language code, for instance 'Main Page/de', 'Main Page/el' etc.
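The naming rule described above (English page name, slash, language code) can be illustrated with a minimal Python sketch. The function name `language_page_name` is hypothetical and not part of Statistics Explained or MediaWiki; the check on the language code is an assumption added only for illustration.

```python
def language_page_name(english_name: str, lang_code: str) -> str:
    """Return the page name of a language version: English name + '/' + code."""
    code = lang_code.strip().lower()
    # Assumption: language codes consist of letters only (e.g. 'de', 'el').
    if not code.isalpha():
        raise ValueError(f"unexpected language code: {lang_code!r}")
    return f"{english_name.strip()}/{code}"


print(language_page_name("Main Page", "de"))  # Main Page/de
```

Because every language version derives its name from the English page, the English name acts as the common key that connects all versions of an item.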
The navigation on the top and in the left column is in the chosen language when logged out, but for all languages in English when logged in. This makes sense, because the first view is for users, while the second one is strictly for editors - who are not supposed to be familiar with 22 different languages ...
• NSIs are invited, on a strictly voluntary basis, to use the English version of Statistics Explained glossary items as a starting point for creating glossary pages in their national language(s), using the many functionalities of its wiki platform and storing them in a publicly accessible integrated multilingual ESS glossary in Statistics Explained.
• NSIs can freely opt to contribute glossary pages, logically in their national language(s), but contributions in English or other languages are also possible; the NSI provides a list of persons (for instance a principal contact plus a backup) who for this purpose will be given access and rights to edit pages in Statistics Explained (usernames will have a common structure, e.g. NSI-NL-X-Xxxx).
• Newly-created non-English glossary items should be a faithful translation of the English version. If no English version exists as yet, it should be created first; if the English version is felt to be incomplete or even wrong, it can be improved and published according to the general double validation rules. Glossary pages in any language should respect the basic standard structure.
Page ownership and validation
- Every glossary page created by an NSI is 'owned' by and under the responsibility of the NSI which created it; a draft version needs to be 'sighted' for approval by the NSI and then validated by Eurostat-Dissemination to make it publicly visible. To simplify matters, a whole glossary in a particular language can be assigned to an NSI.
- The assignment of glossary pages in languages shared by several NSIs and/or Eurostat will need an agreement on who will assume responsibility (this is similar to the present practice in Statistics Explained for statistical articles co-authored by several units: one has to coordinate and be responsible); examples of languages used by more than one ESS member include:
  - fr => Eurostat, FR, BE, LU;
  - de => Eurostat, AT, DE;
  - nl => BE, NL;
  - el => CY, EL.
Resolution of disagreements
- Disagreements about a glossary page created by an NSI will be resolved in a discussion between the NSI and Eurostat. As long as no consensus is reached, the glossary page will not become publicly visible. This means in practice that both the NSI and Eurostat have a veto right.
Quality assurance, updating, language version coherence
- Quality: Eurostat cannot verify content in over 20 languages, nor should it: content is the responsibility of the NSI owning the glossary page; this too is analogous to current practice for statistical articles, where author units guarantee content and Eurostat guarantees format and respect for standards;
- Updating/versions 'out of phase': non-English versions must be faithful to the English original, but if this original is amended it is not feasible to simultaneously change all existing versions in other languages; this is a real problem but hopefully infrequent for glossary pages supposed to be rather stable; also there is no real problem as long as other-language versions remain correct if not perfect; real errors, however, should also be corrected in all other-language versions.
- Integration of definitions across institutions and languages: harmonising of terminology and wording, identifying and resolving inconsistencies, disambiguating similar concepts in different statistical contexts (e.g. household, turnover, employment, …);
- Integration of workflows on a common work platform specifically designed for easy versioning and assignment of tasks, roles and responsibilities;
- Saving resources by eliminating double work (why should every NSI create English versions separately?);
- Added value for ESS members, NSIs as well as Eurostat, participants or not: more contextual information can be offered, in the own language (produced cheaply as a translation of English items) and in additional languages which could otherwise not be considered, at no extra cost;
- and, last but not least, added value for users, who have more glossary information available in their own language, normally not feasible for smaller NSIs, plus the possibility for anyone to switch to any preferred language and to compare across languages. It is a great help even when reading an English-language article, to be able to check out definitions in one's own language.
Practical arrangements
This section provides an overview of the practical arrangements to log in and obtain editing rights, how to work in the ESS glossary and how to get pages validated and online.
All procedures are extensively documented in tutorials within Statistics Explained. For an overview of all tutorials, go to Tutorial:Contents (or, from any page, click 'Help' in 'Navigation' in left column, and then 'Editorial help' in second line of Help page opening). Whenever necessary, links to specific further information are added below.
Getting started: login and editing rights
The governance system of Statistics Explained allows controlled access and editing rights for Eurostat staff and selected outsiders (such as external contractors, other Commission DGs or NSIs).
Further information: Statistics Explained tutorials
Creating a new language version
- log in first!
- go to the English version (preferably, although another existing language version might also do);
- click edit, copy the whole markup version;
- create new page: type in search window Glossary:English pagename/target language code (for instance Glossary:Euro area/es for Spanish version of euro area item) and click 'go';
- paste in the English version, save (this results in copy of English glossary page, with headings and lay-out and links all right or easily adaptable);
- insert level-1 heading in target language: =Glossary:Pagename translated=;
- adapt categories at the bottom, replace English pagename with target language one, this time without 'Glossary';
- replace English with translated content;
- adapt links if version in target language exists (glossary links, external links)
Note: see existing French and German versions as examples (also in markup - but do not change!), for instance Glossary:Gross domestic product (GDP)/de = Glossar:Bruttoinlandsprodukt (BIP).
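The first steps of the procedure above (choosing the target page title and adding the level-1 heading in the target language) can be sketched in Python. This is only an illustration under stated assumptions: the function names `target_page_title` and `start_draft` are hypothetical, the markup string is a placeholder rather than real page content, and placing the heading at the very top of the page is an assumption. The actual editing is done by hand in the wiki.

```python
def target_page_title(english_pagename: str, lang_code: str) -> str:
    """Title to type in the search window, e.g. 'Glossary:Euro area/es'."""
    return f"Glossary:{english_pagename}/{lang_code}"


def start_draft(english_markup: str, translated_heading: str) -> str:
    """Put the level-1 heading in the target language above the copied markup.

    Assumption: the heading goes at the very top of the page.
    """
    return f"={translated_heading}=\n{english_markup}"


# Placeholder standing in for the markup copied from the English page.
english_markup = "English definition text to be replaced by the translation."

title = target_page_title("Gross domestic product (GDP)", "de")
draft = start_draft(english_markup, "Glossar:Bruttoinlandsprodukt (BIP)")
print(title)                  # Glossary:Gross domestic product (GDP)/de
print(draft.splitlines()[0])  # =Glossar:Bruttoinlandsprodukt (BIP)=
```

The remaining steps (adapting categories, translating the content, adapting links) depend on the individual page and are left out of the sketch.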
Publishing new or updated pages
The governance system of Statistics Explained also regulates how to publish pages (making them visible for outsiders) following a double validation procedure: content by the author unit, other (formal) aspects by the Dissemination unit.