Seven metadata elements are compulsory on EUROPA websites: "Content-Language", "description", "reference", "creator", "classification", "keywords", "date".
Information resources must be made visible in a way that allows people to tell whether the resources are likely to be useful to them. Metadata is a systematic method for describing such resources and thereby improving access to them. In other words, it is data about data. If a resource is worth making available, it is also worth describing with relevant metadata so as to maximise the ability of information seekers to locate it. This makes metadata extremely important on the World Wide Web. While the primary aim of metadata is to improve resource discovery, metadata sets are also being developed for other reasons, including:
- administrative control
- personal information
- management information
- content rating
- rights management
Whether in the traditional context or in the Internet context, the key purpose of metadata is to facilitate and improve the retrieval of information.
A search engine is a software package that crawls the Web, extracts data and organises it in a database. People can then submit a query using a Web browser. The search engine locates the appropriate data in the database and displays it via the browser. Search engines usually have three major elements:
- The spider, also called the crawler, harvester, robot or gatherer. The spider visits a web page, reads it and then follows links to other pages within the site. The spider returns to the site on a regular basis, such as every month or two, to look for changes or updates.
- The index. Everything the spider finds goes into the index. The index is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.
- Search engine software: this is the program that sifts through millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant.
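The three elements above can be illustrated with a minimal sketch. It is a toy model only: the `PAGES` dictionary stands in for the Web, and the function names `crawl`, `build_index` and `search` are hypothetical, chosen for this illustration rather than taken from any real search engine.

```python
from collections import defaultdict

# Hypothetical in-memory "web": page name -> (page text, links to other pages).
PAGES = {
    "home": ("european union metadata guide", ["about"]),
    "about": ("metadata describes resources", ["home", "contact"]),
    "contact": ("contact the team", []),
}

def crawl(start):
    """Spider: visit a page, read it, then follow its links to other pages."""
    seen, queue = [], [start]
    while queue:
        page = queue.pop(0)
        if page in seen or page not in PAGES:
            continue
        seen.append(page)
        queue.extend(PAGES[page][1])
    return seen

def build_index(pages):
    """Index: map each word to the pages containing it, with occurrence counts."""
    index = defaultdict(dict)
    for page in pages:
        for word in PAGES[page][0].split():
            index[word][page] = index[word].get(page, 0) + 1
    return index

def search(index, query):
    """Search software: rank pages by total occurrences of the query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for page, count in index.get(word, {}).items():
            scores[page] += count
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda p: (-scores[p], p))
```

Calling `search(build_index(crawl("home")), "metadata guide")` ranks "home" ahead of "about", because "home" contains both query words. Real engines use far more elaborate ranking, which is where structured metadata can help.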
At the global level, Internet search engines were developed to search across multiple websites. Unfortunately, the results they return are often numerous but not very relevant. This is what information scientists call “high recall” but “low precision”. Low precision means that the most relevant and useful documents are hard to pick out among the many results returned. The introduction of the <META> element as part of HTML coding was in part an attempt to encourage search engines to extract and index better structured data such as description and keywords.
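As a worked illustration of "high recall" and "low precision", the following sketch computes both measures using their standard definitions: precision is the share of returned results that are relevant, and recall is the share of relevant documents that were returned. The query and page names are invented for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a list of results and a list of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 100 results are returned, and they include all 5 relevant pages.
retrieved = [f"page{i}" for i in range(100)]
relevant = [f"page{i}" for i in range(5)]
precision, recall = precision_recall(retrieved, relevant)
```

Here recall is 1.0 (every relevant page was found) while precision is only 0.05 (5 useful results out of 100), which is the situation described above.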
The metadata are also very important for the EUROPA search engine. They are used to categorise the results of a query (Content-Language and Classification) and to influence their ranking in the result list (Keywords, Description and Date).
Because of its prime importance in assisting information retrieval, metadata should be among the first things to consider when creating a site.
Defining metadata is one of the first steps to take when creating a new site, and metadata must be present on all pages.
Developing a website without metadata is like stocking a library without providing an index system.
Descriptive META information has many benefits. It can make information easier to locate by providing search tools with more detailed indexing information, and it can be used to rate information so as to protect minors from viewing certain content, among other things. It is also related to the management of a website in that it helps give meaning to a document's role in a global or local information space.
Metadata provide information for:
- specification of the character set to be used
- identifying documents (reference, title, language, etc.)
- management and administration purposes (expiry date, author, etc.)
- classification (description, etc.)
On EUROPA, only 7 metadata are compulsory: "Content-Language", "description", "reference", "creator", "classification", "keywords", "date".
Besides these compulsory metadata, information providers may introduce other metadata if they deem them necessary for management purposes (e.g. "DateAlarm", "WritePermission", "Version").
For the full list of metadata as identified by the "Dublin Metadata Core Element Set", see http://dublincore.org/documents/dces/.
However, it is strictly forbidden to include in the metadata any information relating to the firms involved in designing, producing and updating web pages for EUROPA.
Detailed information on metadata is given in the "" section.
EUROPA templates contain 7 compulsory metadata to fill in.
- <meta http-equiv="Content-Language" content="en">
- <meta name="description" content="Content should be a sentence that describes the content of the page">
- <meta name="reference" content="SITE_NAME">
- <meta name="creator" content="COMM/DG/UNIT">
- <meta name="classification" content="Numeric code from the alphabetical classification list common to all the institutions">
- <meta name="keywords" content="One or more of the commission specific keywords + European Commission, European Union, EU">
- <meta name="date" content="Date of creation of the page">
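A page can be checked for the seven compulsory metadata with a short script. The following is a minimal sketch using Python's standard `html.parser` module; the names `MetaCollector` and `missing_metadata` are hypothetical, and the sketch only checks that each element is present with non-empty content, not that the values comply with the IPG recommendations.

```python
from html.parser import HTMLParser

# The seven compulsory metadata named in this guide (compared in lower case).
COMPULSORY = {"content-language", "description", "reference", "creator",
              "classification", "keywords", "date"}

class MetaCollector(HTMLParser):
    """Collect <meta> elements that have a name or http-equiv and non-empty content."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("name") or attrs.get("http-equiv")
        content = attrs.get("content")
        if key and content and content.strip():
            self.found[key.lower()] = content.strip()

def missing_metadata(html):
    """Return the sorted list of compulsory metadata absent from the page."""
    collector = MetaCollector()
    collector.feed(html)
    return sorted(COMPULSORY - set(collector.found))

# Sample page: "classification" is empty, "keywords" and "date" are absent.
page = """<head>
<meta http-equiv="Content-Language" content="en">
<meta name="description" content="A sample page">
<meta name="reference" content="SITE_NAME">
<meta name="creator" content="COMM/DG/UNIT">
<meta name="classification" content="">
</head>"""
missing = missing_metadata(page)
```

For the sample page, `missing` is `['classification', 'date', 'keywords']`: an element left empty is treated the same as one that is absent.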
Translation of relevant metadata should be requested in all languages in which one intends to publish a site. Translation availability and translation delays should therefore be kept in mind when planning a new site available in various languages.
Metadata should be compliant with the IPG recommendations.
Further evaluation could be done by running relevant tests of the page(s) concerned in different search engines and measuring the accuracy and effectiveness of the information retrieved.
Work Guidelines and references
General information on metadata:
Further information on multilingualism issues can be requested by email to the EUROPA team.