The purpose of this phase is to build and integrate all the components and elements of the site in accordance with all applicable requirements and constraints.
This section provides detailed information on the actual work to be done to build a website during the lifecycle and process of site development and construction. This implementation-related information is broken down into various technical steps and activities, not all of which may be applicable for all the different types of websites that are covered by the IPG. The different standards that apply and other guidelines to be followed in order to prepare an IPG-compliant website are important inputs to these various activities.
We explain here how to publish/update website contents on the web hosting infrastructure of DIGIT/C1 ISHS.
DIGIT/C offers a high-availability, high-performance hosting infrastructure comprising, among other elements, back-end web server instances and application servers for hosting and serving both static and dynamic sites.
The dynamic sites supported by the standard Apache web servers are mainly sites based on Coldfusion and, in some cases, sites using CGI scripts.
Dynamic sites based on particular technologies (e.g. Weblogic) are hosted on individual application servers and are integrated with the other related sites using reverse proxy mappings.
Direct HTTP access to the back-end web servers hosting the static sites is denied by the standard web server configuration.
Thus, under the standard configuration, the hosted sites can be accessed only through a reverse proxy URL (e.g. http://europa.eu/epso).
Purely static sites (sites without Coldfusion or CGI elements) are hosted, depending on their user audience (internal or external), on one of the two core shared hosting infrastructures, namely the europa.eu (DOTEU) and Intracomm hosting infrastructures.
Each static web hosting infrastructure comprises two web hosting environments.
Namely, an acceptance (staging) environment and a production environment constitute the minimum infrastructure necessary for website publishing operations.
The underlying static site hosting infrastructure can fail over completely between two remote data centres in Luxembourg.
Dynamic sites currently associated with the standard Apache web servers can fall under one of the following categories:
For CGI sites, in addition to the shared acceptance and production hosting environments, a dedicated development hosting environment is also available, mainly to facilitate the maintenance of existing CGI scripts.
Development web server hosting environments are also offered for the implementation of Coldfusion solutions, which are preferable to CGI solutions even in simple cases, such as contact web forms, where no database layer is involved.
Apart from the acceptance and production hosting environment types, other hosting environment types (e.g. test or training) can also be requested. For each Coldfusion server instance, a peer Apache web server instance is configured as the entry point to the Coldfusion applications hosted by that Coldfusion instance.
The web servers are thus configured to function as a front end to the Coldfusion servers.
Finally, dynamic sites can also be developed for hosting on application server technologies not related to the standard Apache web servers (e.g. Weblogic or Oracle Application Server).
For ease of management and maintenance, we strongly recommend following a set of rules. These rules are not necessarily of any significance to the normal user browsing the web pages. However, easier management and maintenance reduce errors, which in the end benefits the user.
These recommendations should benefit all parties involved (users, webmasters and the hosting infrastructure services staff).
Keep it simple
Dynamic contents VS static contents
Static HTML pages will always be served faster than the same HTML information generated through a programmed interface. This is especially true for pages that are requested frequently and can be cached.
We strongly recommend using applications only to serve information that is really dynamic. An HTML page that changes once a day cannot be considered dynamic information.
The DIGIT/C1 hosting services run on UNIX based servers. File names are consistently case sensitive on UNIX systems, unlike on some proprietary systems. Creating hyperlinks with UPPERcase or mixed case might create problems when the pages are transferred from a Windows system to the web server. We strongly recommend using only lowercase in filenames and hyperlinks, even when generating data on a UNIX system, because this data might later be maintained from a Windows system.
Application.cfm is an important exception. This file is the base of every ColdFusion application, and this filename is most definitely case sensitive. In addition, special characters other than the underscore ( _ ) and the dot ( . ) characters should not be used and the length of filenames should be kept as short as possible and never be longer than 255 characters. Furthermore the length of URLs should be also kept as short as possible and never exceed 2000 characters.
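The naming rules above can be checked automatically before files are transferred. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes that digits are allowed alongside lowercase letters, the underscore and the dot.

```python
import re

# Assumption: lowercase letters, digits, underscore and dot are the only
# characters accepted in a file name.
ALLOWED_NAME = re.compile(r"[a-z0-9_.]+")
# Application.cfm must keep its exact mixed-case spelling.
CASE_SENSITIVE_EXCEPTIONS = {"Application.cfm"}
MAX_FILENAME_LENGTH = 255
MAX_URL_LENGTH = 2000


def filename_problems(name):
    """Return a list of rule violations for a single file name."""
    if name in CASE_SENSITIVE_EXCEPTIONS:
        return []
    problems = []
    if name != name.lower():
        problems.append("contains uppercase characters")
    if not ALLOWED_NAME.fullmatch(name.lower()):
        problems.append("contains characters other than letters, digits, '_' and '.'")
    if len(name) > MAX_FILENAME_LENGTH:
        problems.append("longer than 255 characters")
    return problems


def url_too_long(url):
    """Return True when a URL exceeds the 2000-character limit."""
    return len(url) > MAX_URL_LENGTH
```

For example, filename_problems("Index.HTML") reports the uppercase characters, while "index_en.htm" and "Application.cfm" produce no findings.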
Each static site should have a file called "index.html" or "index.htm" as the first entry point into it. This allows a user to return to the entry page of a site by simply truncating the URL. Or, the user can access a site using a shorter, truncated, URL. It also makes the entry point to the document visible for the maintainer of the data. For example, the web server will respond to the URL "http://europa.eu/mysite/" with the data in file "/ec/prod/app/web/euroots/europa.eu/htdocs/mysite/index.html".
If there were no "index.html" file in that subdirectory, the user would receive a listing of all files within the "mysite" directory, a 'Not Found' (404) page or a 'Forbidden' (403) page, depending on the configuration of the underlying web server.
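The mapping from a directory URL to an index file can be sketched as follows. This is an illustration only: the document root is taken from the example above, the function name is hypothetical, and the sketch assumes that only "index.html" and "index.htm" are tried, in that order. The actual behaviour depends on the web server configuration.

```python
import posixpath
from urllib.parse import urlsplit

# Document root from the example above; other sites may use a different one.
DOCUMENT_ROOT = "/ec/prod/app/web/euroots/europa.eu/htdocs"
# Assumption: only these two index names are tried, in this order.
INDEX_NAMES = ("index.html", "index.htm")


def candidate_files(url):
    """Return the file paths a server might try for a URL, in order."""
    path = urlsplit(url).path or "/"
    relative = path.lstrip("/")
    if path.endswith("/"):
        # Directory URL: try each default index file name in turn.
        return [posixpath.join(DOCUMENT_ROOT, relative, name) for name in INDEX_NAMES]
    # File URL: exactly one candidate.
    return [posixpath.join(DOCUMENT_ROOT, relative)]
```

If none of the candidate files exists, the server falls back to the directory listing or error page described above.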
For multilingual sites, index.html should be a splash page with links to the individual language index pages (index_en.htm, index_fr.htm, index_el.htm, etc.)
The full list of default index file names is:
"Incomplete" URLs pointing to directories instead of to files should have an "ending slash" (e.g. "/publishing/" instead of "/publishing"). Upon reception of an "incomplete" URL without "ending slash", the web server will respond with a "redirect", telling the browser to request the URL with the "ending slash". In other words, an "incomplete" URL will trigger an extra request to the server causing longer response times for the end user.
Use relative links (intra-domain links)
Relative links should be used instead of absolute links when linking between pages within the same site domain.
Do not include "http://www.cc.cec/", "http://europa.eu", "http://ec.europa.eu/" and the likes in the 'href' links referring to pages located within the same site.
The consistent use of relative links makes sites easier to maintain, avoids broken links and requires no adaptation when the site domain or URL context changes.
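A relative link can be derived from the site paths of the two pages. The sketch below uses Python's posixpath module; the function name and the example paths are hypothetical.

```python
import posixpath


def relative_link(from_page, to_page):
    """Return the relative href from one page of a site to another.

    Both arguments are site-absolute paths such as "/mysite/news/index_en.htm".
    """
    start_dir = posixpath.dirname(from_page)
    return posixpath.relpath(to_page, start=start_dir)
```

For instance, a link from "/mysite/news/index_en.htm" to "/mysite/contact/form_en.htm" becomes "../contact/form_en.htm". Note that relpath drops a trailing slash, so links that point to a directory need the ending slash added again.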
When to use absolute links (inter-domain links)
When linking to pages hosted outside your own site domain, an absolute link has to be used.
The front-end reverse proxy domain names should always be used, never the back-end web server names.
Also make sure that the target domain address in the URL exists and can be resolved by the DNS servers throughout the Commission's network as well as on the Internet (for Europa sites).
Although the absolute link might work internally, it will not work for users on the Internet, since "wlseures.cc.cec.eu.int" cannot be resolved by the DNS servers.
The caching policy currently implemented on the DOTEU (europa.eu, ec.europa.eu) reverse proxies is the following:
Webmasters who cannot wait for the 8-hour maximum expiration time to elapse before a new image is refreshed in the proxy cache can force the cache refresh by issuing a Shift+Refresh request for the image in Firefox.
Reverse proxy servers: BlueCoat proxy SG 8100-C or 8100-20 (managed by the DIGIT/C2 SNET team)
If your site contents are not suitable to be indexed by search engines (e.g. frequently changing dynamic contents or outdated contents), then a robots.txt file should exist at the document root of your site (e.g. http://ec.europa.eu/robots.txt).
Access to dynamic sites by the agents of search engines (e.g. Google), known as robots, increases the number of requests received and, by extension, the workload of the back-end web server, which can become overloaded, affecting site accessibility.
Here are the contents of robots.txt under the document root of ec.europa.eu:
Although some non-standard robots may ignore the robots.txt file, most will not, so the number of requests received from web robots should be kept under control.
Further information about robots.txt and usage examples can be found at the following page http://www.robotstxt.org/orig.html.
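The effect of a set of robots.txt rules can be verified before publication with the robot parser in Python's standard library. The rules and paths below are hypothetical examples for illustration; they are not the contents of the ec.europa.eu file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /mysite/application/
Disallow: /mysite/archive/
"""

parser = RobotFileParser()
parser.parse(EXAMPLE_ROBOTS_TXT.splitlines())


def is_allowed(url, user_agent="*"):
    """Return True when a compliant robot may request the given URL."""
    return parser.can_fetch(user_agent, url)
```

With these rules, a compliant robot would skip everything under /mysite/application/ and /mysite/archive/ and still index the rest of the site.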
RSS files phase-out
Whenever an RSS file is moved, renamed or simply withdrawn from production, the original RSS file name should not disappear, because RSS clients will continue polling for the missing feed indefinitely, which affects the web server where the RSS file used to exist.
Separating static pages from applications
In the past, it was common practice for EUROPA site webmasters to mix their dynamic sites (e.g. Coldfusion) with the associated static sites.
This site management practice results in the distribution of static contents over numerous application server environments, which makes efficient centralised management of the static sites impossible.
To make this separation possible, a site should be structured so that it will be easy to map the dynamic part as a subsite of the static site.
For example, if the static site xyz, hosted on the doteu infrastructure, is accessible at the URL http://europa.eu/xyz, the associated dynamic part, hosted on a separate application server, should be accessible at http://europa.eu/xyz/application. An individual reverse proxy mapping to the application server then seamlessly integrates the static and dynamic parts of the site.
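The principle of such a mapping can be sketched as a longest-prefix lookup. This is a simplified model, not the actual proxy configuration: the back-end host names are hypothetical, and real mappings are managed by the hosting service.

```python
# Hypothetical back-end names for illustration only.
PROXY_MAPPINGS = {
    "/xyz/application": "http://appserver.example.internal:8080",
    "/": "http://static.example.internal",
}


def resolve_backend(path):
    """Pick the back-end for a request path using the longest matching prefix.

    The path is expected to start with "/".
    """
    best = ""
    for prefix in PROXY_MAPPINGS:
        # Match whole path segments only, so "/xyz/applications" is not
        # mistaken for the "/xyz/application" subsite.
        matches = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if matches and len(prefix) > len(best):
            best = prefix
    return PROXY_MAPPINGS[best]
```

Requests under /xyz/application go to the application server, while every other path of the site is served from the static infrastructure.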
With regard to IT architecture, infrastructure and the technical aspects (platforms, environments …), the EUROPA site is hosted on the servers of the Commission Data Centre, managed by the Commission’s Directorate-General for Informatics (DIGIT).
EUROPA websites are developed and managed using different technologies: manually encoded static sites, sites managed by means of the Corporate Web Content Management System (CWCMS) and dynamic sites.
A static website is composed of finished HTML pages that are completely replaced with each change of any element within the page. It means that a page retrieved by different users at different times is always the same.
Static HTML pages will always appear faster on the user's browser than the same HTML information generated through a dynamic application. This is especially true for pages that are requested frequently and can be cached.
These static sites are constructed and/or managed manually by means of the HTML editors Webexpression (ex-FrontPage) or Dreamweaver. The static pages (HTML file format) are stored on the static web server in a set of conventional directories.
Access to the testing environment, and the right to trigger transfers onto the production server, can be granted following the request procedure for access for transfer to EUROPA.
The Corporate Web Content Management System (CWCMS) is the recommended system for producing websites on EUROPA. It allows the web publishing process to be decentralised by
whilst at the same time offering a centralised approach to website construction including:
The CWCMS is based on Documentum and its web publisher interface. Content providers use this environment to manage the content of their pages and publish them on the static web server. The end result is static pages on the same environment as the one used for the manually encoded static pages. In this way, pages produced by means of the CWCMS are well integrated in the classical EUROPA environment and take advantage of its high performance.
Access to the CWCMS for triggering transfers onto the production server can be granted following the request procedure for access to CWCMS.
A dynamic web page is a page that can be generated dynamically, i.e. "on the fly" (the moment the page is called by the visitor) by an application server. It often contains content from various sources (relational or object-oriented databases, XML files ...)
Dynamic sites are not the preferred technology on EUROPA. They should only be used if the needs imposed on a website cannot be satisfied by the recommended CWCMS technology. Although it is clear that some parts of every site may require some dynamic treatment, overzealous use of dynamic content should absolutely be avoided in order to avoid an unreasonable burden on the server infrastructure, causing the site to become slow, heavy and undependable. Whenever possible, the main part of the site should use the CWCMS whereas specific sections may call upon off the shelf interactive services made available through the "Flexible Platform" environment (FPFIS).
Dynamic sites are managed by means of web applications based on either Coldfusion or Weblogic (J2EE) technology and accessing Oracle databases.
A static website can be created by means of the Corporate Web Content Management System (CWCMS) or by using one of the recommended classical HTML editors Microsoft Webexpression (old Frontpage) or Dreamweaver.
The Corporate Web Content Management System (CWCMS) should be the first choice because it is compliant with our long term strategy to create the content in XML format, allowing easy reuse of the same information in multiple formats and different presentations and because it automates the multilingual management. If you still use a classical HTML editor, it is important to take advantage of the structural features of the tools, like Frontpage includes or Dreamweaver libraries in order to allow reuse of information.
When creating a site, the available templates must be used in order to respect the corporate identity of Commission sites. The templates can of course be adapted within the limits of the rules described in the template page. In the CWCMS, the standard templates are integrated in the central XSLs and thus constitute the default presentation of any site created with this tool.
There are two kinds of static HTML templates:
In the CWCMS, content is published to static HTML pages by means of the XSL transformation process. The end product is first put on the EUROPA staging server and eventually promoted to the production server. If needed (e.g. to ensure multilingual coherence), the system will regenerate pages automatically.
In the classical HTML editing environment, pages are first created on your local equipment and can be uploaded to the staging server when they are ready for testing. Publishing the pages to the production server can be done by means of an internal tool called “Staging manager”.
A dynamic web page is a page that can be generated dynamically, i.e. "on the fly" (the moment the page is called) by an application server. It often contains content from various sources (relational or object-oriented databases, XML files …).
Dynamic generation is useful or even necessary in the following cases:
A static website is composed of finished HTML pages that are completely replaced with each change of any element within the page. The update of the page is done offline and a new upload of the complete page is required to reflect the change on the production server. The upload can be done manually or automatically (for example regeneration by the CWCMS) and is always done in asynchronous mode, i.e. forced by the producer, not by the visitor.
In a dynamic website, the finished HTML page is only generated at the time of request by the visitor. In general, the layout of the page is fixed beforehand, but the content is filled dynamically from various sources (XML files, databases …). Update of dynamic sites is much easier because it concentrates on the update of the actual content itself, not on its presentation.
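The difference between the two models can be illustrated with a deliberately small sketch. The layout, the function names and the content dictionary are hypothetical. In both cases the layout is fixed beforehand, and only the moment of generation differs.

```python
from string import Template

# Layout fixed beforehand, shared by both approaches.
PAGE_LAYOUT = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def publish_static(content):
    """Producer side: render once, offline; the result is uploaded as a finished page."""
    return PAGE_LAYOUT.substitute(content)


def serve_dynamic(content_source):
    """Visitor side: fetch the current content and render it on every request."""
    return PAGE_LAYOUT.substitute(content_source())
```

A page produced by publish_static keeps its content until the producer regenerates and uploads it again, whereas serve_dynamic reflects a content change on the next request, at the cost of rendering the page each time.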
A dynamic website typically costs more in terms of development because of its complexity and in terms of machine resources because of frequent regeneration. It can however be a more cost effective solution in the long run and is certainly the most effective solution for sites with content requiring frequent updates.
A website is called “dynamic” when its content is generated dynamically.
When elements of a website move, or when some pages interact with the user, that alone does not make the website “dynamic”. Such a site is an animated or an interactive website.
These sites interact with the user usually through either a text-based or graphical user interface. For example, a page with a contact form is interactive but can be static.
Animated websites are often developed in Flash or DHTML. These technologies do not require an application server. Flash needs a plug-in in the browser, while DHTML can be rendered by any browser.
Use and integrate the whole range of available services like:
Usability testing is a method by which users are asked to perform certain tasks in an effort to measure the website's ease-of-use, task time, and the user's perception of the experience.
Usability testing can be done formally, in a usability lab with video cameras, or informally, with paper mock-ups of the website. Changes are made to the website based on the findings of the usability tests. Whether the test is formal or informal, usability test participants are encouraged to think aloud and voice their every opinion. Usability testing is best used in conjunction with user-centered design, a method by which a product is designed according to the needs and specifications of users.
Usability testing allows you to measure the quality of a user's experience when they interact with your website. It’s one of the best ways to find out what is or isn't working on your site.
Make sure the scenarios are clearly written and not too much of a challenge for the allotted test time.
Unit A.5 - EUROPA site