Merging statistics and geospatial information, 2014 projects - Croatia
Compliance with Regulation (EC) No 177/2008 requires each unit in the statistical business register to be georeferenced to a precise point on a map. However, the register in Croatia only included basic parts of addresses for local units and enterprises and was missing information on geographical location.
The aim of this project was to add information on geographical locations (geo-referencing) to the statistical business register (SBR) held by the Croatian statistical office. In particular, the objectives were to:
- upgrade the SBR with geocodes for legal and local units, and indirectly for enterprises;
- connect the SBR with the spatial statistical register (SSR) through a geo-reference code and the address for each legal and local unit;
- establish internal processes required for a continuous automated update (at least once a year) of geo-referenced data;
- publish a geographical presentation of selected SBR data on the statistical office’s website.
Action 1: development of the application for coding streets, upgrading the SBR with new attributes and creating links with the SSR application, retrieving geocodes from the SSR database and assigning them to addresses.
The key part of this project was related to information technology. As in-house resources were not available for the necessary development work, it was outsourced. The main elements were as follows.
- The development of an application for coding streets based on a thesaurus of street names as well as maintaining this thesaurus to update it with new variations of street names. An initial analysis of the use of non-standardised street names was made and then a set of standardised street names was developed as well as street codes. Through automatic or manual (see below) procedures, these standardised street names and street codes were implemented in the register.
- The addition of new attributes within the address data in the SBR database and within the SBR user interface. An extension of the existing interface and underlying database made it possible to include more information (notably for the street code, NUTS region and geographical location, as well as more address information) from the SSR in the SBR. These structural changes to the database and interface were implemented not only for the live version of the SBR but also for tables containing historical information on addresses.
- Linking the new application with a live version of the SBR database and with the new SSR database, and creating procedures for the automatic coding of streets and the assignment of geocodes to addresses. As a result, information can be linked in real time, with matched data transferred from the SSR to the SBR.
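The street coding and geocode assignment described above can be illustrated with a minimal Python sketch. It is not the application that was developed for the project: the thesaurus and SSR lookup are reduced to in-memory dictionaries, and all names, codes and coordinates (`THESAURUS`, `SSR_GEOCODES`, `S001`, `ST0002`, and so on) are hypothetical.

```python
import unicodedata


def normalise(name: str) -> str:
    """Lower-case, strip combining diacritics and punctuation dots, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.lower().replace(".", " ").split())


# Hypothetical thesaurus: (settlement code, normalised variant) -> street code.
# Several spelling variants can point to the same standardised street.
THESAURUS = {
    ("S001", normalise("Ilica")): "ST0001",
    ("S001", normalise("Ul. kralja Zvonimira")): "ST0002",
    ("S001", normalise("Ulica kralja Zvonimira")): "ST0002",
}

# Hypothetical SSR extract: (street code, house number) -> (x, y) coordinates.
SSR_GEOCODES = {
    ("ST0001", "10"): (458000.0, 5073000.0),
    ("ST0002", "5"): (460500.0, 5073500.0),
}


def code_address(settlement: str, street: str, house_no: str) -> dict:
    """Assign a street code and geocode, or flag the address for manual matching."""
    street_code = THESAURUS.get((settlement, normalise(street)))
    if street_code is None:
        # No known variant in the thesaurus: leave for manual work.
        return {"street_code": None, "geocode": None, "status": "manual"}
    geocode = SSR_GEOCODES.get((street_code, house_no))
    status = "coded" if geocode is not None else "street_only"
    return {"street_code": street_code, "geocode": geocode, "status": status}
```

In this sketch, adding a newly confirmed spelling variant to `THESAURUS` is all that is needed for later runs to code that variant automatically, which mirrors the role of the thesaurus described above.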
Action 2: manual matching of street names from the SBR with official street names in the SSR and populating the thesaurus.
The automatic coding of streets resulted in a very low percentage of coded streets (only 40 %), so a lot of manual work was needed to match street names. The problems encountered were:
- a street name did not belong to the settlement where the unit was registered, but was instead located in a neighbouring (usually larger) town or village;
- a street name was ‘old’ — after the registration of the unit in the SBR, local authorities changed the name of the street and this was not reflected in the SBR;
- the administrative source had misclassified a settlement because there are places with the same name in different municipalities or counties;
- the street did not exist in the SSR because the register of territorial units had not been updated or was not consistent with the real situation, for example when local authorities made changes but did not report them (in time) to the state geodetic administration (SGA).
Because an exhaustive historical list of changes to street names is not available from the SGA, several cadastral offices in larger cities were contacted to obtain lists of street names that had changed, so that codes could be assigned to streets that were not found in registers. Given the difficulties encountered, manual matching was not carried out for dead units in the SBR.
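One way to support this kind of manual matching is to propose candidate official names for each unmatched street, looking in the registered settlement first and then in neighbouring settlements. The Python sketch below uses the standard library's `difflib.get_close_matches` for this; it is an illustration only, not the procedure used in the project, and the settlement codes, street lists, neighbour table and cutoff value are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical official street names per settlement code (already normalised).
OFFICIAL = {
    "S001": ["ulica kralja zvonimira", "ilica", "savska cesta"],
    "S002": ["vukovarska ulica", "ulica ante starcevica"],
}

# Hypothetical neighbour table: settlements worth checking when the street
# is not found in the settlement where the unit is registered.
NEIGHBOURS = {"S002": ["S001"]}


def suggest(settlement: str, street: str, n: int = 3, cutoff: float = 0.75):
    """Return (settlement code, official name) candidates for manual review."""
    suggestions = []
    for code in [settlement, *NEIGHBOURS.get(settlement, [])]:
        candidates = OFFICIAL.get(code, [])
        for match in get_close_matches(street.lower(), candidates, n=n, cutoff=cutoff):
            suggestions.append((code, match))
    return suggestions
```

A confirmed suggestion would then be stored as a new variant in the thesaurus. Renamed streets cannot be found this way, since the old and new names are usually unrelated, which is why lists of changed names had to be obtained from the cadastral offices.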
Action 3: selection and preparation of SBR data for publishing on the Croatian Bureau of Statistics (CBS) geoportal.
The only statistical output from the SBR was business demography data and these were used for geographical presentation. In 2015, a GeoSTAT portal was developed and this provides the possibility to publish data down to the level of 1 km² grids. GeoSTAT was also used for the presentation of SBR business demography data at NUTS level 3.
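To illustrate how geo-referenced units can be aggregated to 1 km² grid cells, the following Python sketch assigns each point to a cell and counts units per cell. It assumes coordinates in a projected reference system measured in metres; the cell identifier format, built from the grid size and the south-west corner in kilometres, is an assumption made for illustration and is not taken from the GeoSTAT portal.

```python
from collections import Counter

CELL_SIZE_M = 1000  # 1 km grid; coordinates are assumed to be in metres


def grid_cell(easting_m: float, northing_m: float) -> str:
    """Return an illustrative identifier for the 1 km cell containing the point."""
    col = int(easting_m // CELL_SIZE_M)
    row = int(northing_m // CELL_SIZE_M)
    return f"1kmN{row}E{col}"


def count_per_cell(points) -> Counter:
    """Count units per grid cell from an iterable of (easting, northing) pairs."""
    return Counter(grid_cell(e, n) for e, n in points)
```

Cells containing very few units would normally need confidentiality treatment before publication; that step is outside the scope of this sketch.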
Prior to the work on this project, the SBR contained address data (postal information) and codes for three administrative divisions (at the level of counties, municipalities and settlements). As a result of the project, this was extended to include information on geographic location, a street code and codes for statistical divisions (NUTS levels 2 and 3). Furthermore, the quality of address data in the SBR was greatly improved, as many mistakes were corrected. A system was established that should help maintain the improved quality of address data by identifying problematic data at the moment of data entry.
The published business demography data concerned the number of active enterprises, enterprise births and enterprise deaths, for two years, for each section of the Croatian activity classification (NKD2007); these data are available through the GeoSTAT portal.