Reference metadata describe statistical concepts and methodologies used for the collection and generation of data. They provide information on data quality and, since they are strongly content-oriented, assist users in interpreting the data. Reference metadata, unlike structural metadata, can be decoupled from the data.
Structural business statistics (SBS) describes the structure, conduct and performance of economic activities, down to the most detailed activity level (several hundred economic sectors).
SBS covers all activities of the non-financial business economy with the exception of agricultural activities and personal services. Limited information is available on banking, insurance and pension funds.
Main characteristics (variables) of the SBS data category:
Business demographic variables (e.g. Number of enterprises)
"Output related" variables (e.g. Turnover, Value added)
"Input related" variables: labour input (e.g. Employment, Hours worked); goods and services input (e.g. Total of purchases); capital input (e.g. Material investments)
3.2. Classification system
Statistical Classification of Economic Activities in the European Community (NACE): NACE Rev.1 was used until 2001, NACE Rev.1.1 from 2002, and NACE Rev.2 from 2008 onwards. Key data for 2008 were reported in both NACE Rev.1.1 and NACE Rev.2. From 2009 onwards, only NACE Rev.2 data are available.
The regional breakdown of the EU Member States is based on the Nomenclature of Territorial Units for Statistics (NUTS). Detailed information about the consecutive NUTS Regulations can be found on Eurostat's website.
The SBS coverage was limited to Sections C to K of NACE Rev.1.1 until 2007. Starting from reference year 2008, data are available for Sections B to N and Division S95 of NACE Rev.2. Since reference year 2013, information has also been published for NACE codes K6411, K6419 and K65 and its breakdown.
In Estonia, most legal units are equivalent to an enterprise. These units produce goods or services, benefit from a certain degree of autonomy in decision-making, have a complete set of accounts, and carry out one or more activities at one or more locations.
In economically significant cases (e.g. all the employment is recorded in a legal unit serving other legal units of a group), the enterprise is implemented in the business register, and one unit of the group (enterprise) reports consolidated characteristics, including the SBS characteristics, to Statistics Estonia (SE).
3.6. Statistical population
The statistical population is created on the basis of the national business register. The SBS results are in accordance with the activities on the economic territory of Estonia. Branches of foreign enterprises with more than 19 persons employed are included in the target population.
3.7. Reference area
Estonia
3.8. Coverage - Time
1995-2020
3.9. Base period
Not applicable.
Number of enterprises and number of local units are expressed in units.
Monetary data are expressed in millions of €.
Employment variables are expressed in units.
Per head values are expressed in thousands of € per head.
Ratios are expressed in percentages.
2020
Data refers to the calendar year.
6.1. Institutional Mandate - legal acts and other agreements
The dissemination of data collected for the purpose of producing official statistics is guided by the requirements provided in § 34 and § 35 of the Official Statistics Act.
7.2. Confidentiality - data treatment
According to the Official Statistics Act and the regulation of the Government of the Republic of Estonia, data are published and transmitted without characteristics that permit identification of the respondents. Data must be aggregated into groups of at least three persons (primary confidentiality: too few enterprises), and the share of any single person in the aggregate must not exceed x% (primary confidentiality: one enterprise dominates the data; x denotes the dominance limit used by Statistics Estonia, and its value is confidential). The dominance criterion was applied to turnover, gross investment in tangible goods and personnel costs.
To protect the primary confidential cells, secondary confidential cells were determined using the R package sdcTable. Where confidentiality precludes publication of the data, only the number of units is published.
Confidential cells in the database result from the detailed breakdown by activity and employment size class. As Estonia is a small country, the rate of confidential cells is considerable.
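The two primary confidentiality rules described above (minimum number of units and dominance) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Statistics Estonia's implementation: the function name `is_primary_confidential`, the input layout and the dominance limit of 85% are hypothetical (the real limit is confidential), and secondary suppression, which the source performs with sdcTable, is not shown.

```python
from typing import Sequence


def is_primary_confidential(
    contributions: Sequence[float],
    min_units: int = 3,
    dominance_limit: float = 0.85,  # hypothetical value; the real limit is confidential
) -> bool:
    """Return True if a cell must be suppressed under the two primary rules.

    contributions: the value (e.g. turnover) of each unit in the cell.
    """
    # Rule 1: too few enterprises in the cell.
    if len(contributions) < min_units:
        return True
    total = sum(contributions)
    if total <= 0:
        # No positive aggregate to dominate; only the frequency rule applies.
        return False
    # Rule 2: one enterprise dominates the aggregate.
    return max(contributions) / total > dominance_limit


# Illustrative cells with made-up values.
too_few = is_primary_confidential([120.0, 80.0])            # only two units
dominated = is_primary_confidential([950.0, 30.0, 20.0])    # largest unit holds 95%
publishable = is_primary_confidential([400.0, 350.0, 250.0])
print(too_few, dominated, publishable)
```

A cell flagged by either rule would then be passed to the secondary suppression step so that it cannot be recovered from row or column totals.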
7.2 The rate of confidential cells, Annexes I to IV, 2020 (%)
Country: EE
Dataset    TOTAL  Section  Division  Group  Class
SBSCP_1A   13,0   0,0      13,3      16,4   11,6
SBSCP_1B   23,6   0,0      21,8      26,2   NA
SBSCP_1C   11,8   0,0      14,3      NA     NA
SBSCP_1E   38,9   NA       NA        NA     NA
SBSCP_1F   NA     NA       NA        NA     NA
SBSCP_1G   0,0    0,0      0,0       0,0    0,0
SBSCP_2A   28,9   0,0      0,0       20,3   37,0
SBSCP_2B   30,2   0,0      20,7      34,0   0,0
SBSCP_2C   5,3    0,0      5,9       NA     NA
SBSCP_2F   30,7   0,0      5,9       24,6   37,0
SBSCP_2H   24,7   20,0     25,2      NA     NA
SBSCP_2I   37,1   54,2     35,5      NA     NA
SBSCP_3A   4,9    0,0      0,0       0,0    6,3
SBSCP_3B   18,0   0,0      14,3      19,4   NA
SBSCP_3C   0,0    0,0      0,0       0,0    NA
SBSCP_3D   0,0    0,0      0,0       0,0    NA
SBSCP_3H   0,0    NA       0,0       0,0    0,0
SBSCP_4A   0,0    0,0      0,0       0,0    0,0
SBSCP_4B   12,8   0,0      0,0       18,5   NA
SBSCP_4C   0,0    0,0      0,0       NA     NA
SBSCP_4F   0,0    0,0      0,0       0,0    0,0
SBSCP_4G   0,0    0,0      0,0       NA     NA
SBSCP_4H   14,1   0,0      0,0       20,4   NA
8.1. Release calendar
According to the Official Statistics Act, a producer of official statistics shall disseminate official statistics pursuant to the release calendar published on its website.
To assure the quality of processes and products, Statistics Estonia applies the EFQM Excellence Model, the EU Statistics Code of Practice and the ESS Quality Assurance Framework (QAF). Statistics Estonia is also guided by the requirements provided for in § 7 ("Principles and quality criteria of producing official statistics") of the Official Statistics Act.
ESMS (Euro-SDMX Metadata Structure) metadata based on the SDMX Cross-Domain Concepts are published on the Statistics Estonia website in Estonian and English. This includes the metadata for the statistical activity "Financial statistics of enterprises (annual)" (statistical activity code 20300), on which SBS is based.
Statistics Estonia performs all statistical activities according to an international model (Generic Statistical Business Process Model – GSBPM). According to the GSBPM, the final phase of statistical activities is overall evaluation using information gathered in each phase or sub-process (this information includes, among other things, feedback from users, process metadata, system metrics and suggestions from employees). This information is used to prepare the evaluation report which outlines all the quality problems related to the specific statistical activity and serves as input for improvement actions.
The overall assessment of the quality of SBS data is good. Data quality is in accordance with the principles of accuracy and reliability, timeliness and punctuality, and coherence and comparability.
12.1. Relevance - User Needs
Each year, before the list of statistical activities is compiled, the needs of internal users (national accounts, environmental statistics, price statistics) and external users (ministries, municipalities, unions, universities) are examined.
Internal users (NA, price statistics) need more detailed breakdowns of costs and turnover by activities.
Governmental and municipal institutions are interested in statistics by counties (NUTS 4).
12.2. Relevance - User Satisfaction
No regular surveys of user satisfaction with the availability of SBS data for Annexes I to IV of the SBS Regulation are organised.
12.3. Completeness
12.3.a. Data availability rate Annexes I-IV 2020
Data availability rate calculated by Eurostat — 100%
12.3.b. Specification of missing details_series_Annexes_I to IV_2020
No missing dataseries for 2020.
12.3.c. Use of the quality flag 'Contribution to European totals only' (CETO-flag) foreseen in the SBS regulation_Annexes_I to IV_2020 (%)
CETO-flags are used in the following series: 1A, 2A, 3A and 4A.
The highest share is in series 2A "Annual enterprise statistics for industry", where 95,6% of cells are flagged at the NACE class level.
13.1. Accuracy - overall
The overall accuracy of the results can be assessed as good. The survey is a sample survey combined with administrative information. For characteristics missing from the administrative information, a model-based estimate is used; donor imputation and mean value imputation are also applied. The most important sources of error are non-response and modelling errors when using administrative information.
13.2. Sampling error
Coefficients of Variation.
Starting from 2019
All non-responding enterprises and enterprises not in the sample are imputed (e.g. using data from annual reports or mean value imputation). Because data are therefore available for all active enterprises, the coefficient of variation equals 0 (zero).
Before 2019
Methodology: Sample survey combined with administrative information
Coefficients of variation (CV) were calculated using the R package survey.
CV is negligible for the number of enterprises and small on average for other variables. The largest CVs occur for investments in tangible fixed assets, because such investments are not stable from year to year or from enterprise to enterprise.
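To illustrate what a coefficient of variation measures in a stratified design, the sketch below computes the CV of an estimated total using the textbook variance formula for stratified simple random sampling without replacement. It is an assumption-laden illustration in Python, not the computation performed with the R package survey; the function name and the sample figures are hypothetical.

```python
import math
from statistics import variance


def stratified_total_cv(strata):
    """CV of the estimated total under stratified simple random sampling.

    strata: list of (N_h, sample_values) pairs, where N_h is the number of
    population units in stratum h and sample_values are the observed values.
    """
    total = 0.0
    var_total = 0.0
    for N_h, values in strata:
        n_h = len(values)
        total += N_h * sum(values) / n_h  # stratum mean expanded to the population
        if n_h < N_h:
            if n_h < 2:
                raise ValueError("a sampled stratum needs at least two observations")
            s2_h = variance(values)  # sample variance with n_h - 1 in the denominator
            var_total += N_h ** 2 * (1 - n_h / N_h) * s2_h / n_h
        # A fully enumerated stratum (n_h == N_h) adds no sampling variance.
    return math.sqrt(var_total) / total


# Made-up example: one sampled stratum and one fully enumerated stratum.
example = [
    (100, [10.0, 12.0, 14.0, 16.0]),
    (5, [200.0, 220.0, 180.0, 210.0, 190.0]),
]
cv = stratified_total_cv(example)
print(f"CV = {cv:.2%}")
```

When every stratum is fully enumerated, or every missing unit is replaced by an imputed value and treated as observed, the sampling variance term vanishes and the CV is zero, which matches the situation described for reference years from 2019.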
13.3. Non-sampling error
Starting from 2019
The influence of non-sampling error is small.
Data is available for all active enterprises:
collected with the statistical questionnaire or
received from administrative source or
imputed (mean value imputation, donor imputation)
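The two imputation methods named in the list above can be sketched as follows. This is a minimal Python illustration, not the production procedure: the record layout, the field names (`stratum`, `persons_employed`, `turnover`) and the choice of the donor as the responding unit closest in number of persons employed are all assumptions made for the example.

```python
from statistics import mean


def impute_mean(records, field, group="stratum"):
    """Fill missing values with the mean of observed values in the same group."""
    observed = {}
    for r in records:
        if r[field] is not None:
            observed.setdefault(r[group], []).append(r[field])
    result = []
    for r in records:
        filled = dict(r)  # copy so the input records stay untouched
        if filled[field] is None and filled[group] in observed:
            filled[field] = mean(observed[filled[group]])
            filled["imputed"] = True
        result.append(filled)
    return result


def impute_donor(records, field, group="stratum", match="persons_employed"):
    """Fill missing values from the most similar responding unit in the same group.

    Similarity by `match` is an assumption of this sketch.
    """
    result = []
    for r in records:
        filled = dict(r)
        if filled[field] is None:
            donors = [d for d in records if d[group] == r[group] and d[field] is not None]
            if donors:
                donor = min(donors, key=lambda d: abs(d[match] - r[match]))
                filled[field] = donor[field]
                filled["imputed"] = True
        result.append(filled)
    return result


# Made-up records; unit 3 did not report turnover.
records = [
    {"id": 1, "stratum": "F_2-9", "persons_employed": 3, "turnover": 100.0},
    {"id": 2, "stratum": "F_2-9", "persons_employed": 8, "turnover": 300.0},
    {"id": 3, "stratum": "F_2-9", "persons_employed": 7, "turnover": None},
]
by_mean = impute_mean(records, "turnover")
by_donor = impute_donor(records, "turnover")
print(by_mean[2]["turnover"], by_donor[2]["turnover"])
```

Mean value imputation preserves the group mean but reduces variability, while donor imputation copies a real observed value; flagging imputed records, as done here, keeps the two kinds of values distinguishable later.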
Before 2019
13.3.1.1 Description of estimation methods for taking into account the unit non-responses
Imputation and reweighting are used.
All non-responding enterprises with more than 19 persons employed, and influential smaller enterprises, were imputed using data from annual reports; unit-level estimates were also made on the basis of short-term statistics, fiscal data from the Estonian Tax and Customs Board or other sources.
Reweighting was used for enterprises with fewer than 20 persons employed, assuming that the percentage of active enterprises in the non-response group was the same as in the response group. Post-stratification was used for reweighting. Strata were formed based on economic activity (up to 3-digit level), employment size class and form of ownership. The weight for stratum h was calculated as follows:
$w_h = N_h / m_h$,
where $N_h$ is the number of population units and $m_h$ is the number of responding units in stratum h.
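The weight formula above can be sketched in a few lines of Python. The function name, the stratum labels and the counts are hypothetical; the only part taken from the text is the ratio of population units to responding units per stratum.

```python
from collections import Counter


def poststratification_weights(population, respondents):
    """Compute the weight N_h / m_h for each stratum h.

    population: iterable of stratum labels, one per population unit.
    respondents: iterable of stratum labels, one per responding unit.
    """
    N = Counter(population)
    m = Counter(respondents)
    weights = {}
    for h, N_h in N.items():
        if m[h] == 0:
            # The ratio is undefined; in practice such strata would be merged.
            raise ValueError(f"stratum {h!r} has no respondents")
        weights[h] = N_h / m[h]
    return weights


# Made-up strata: (activity, employment size class, form of ownership).
small = ("412", "2-9", "private")
medium = ("412", "10-19", "private")
population = [small] * 40 + [medium] * 10
respondents = [small] * 5 + [medium] * 4
weights = poststratification_weights(population, respondents)
print(weights[small], weights[medium])
```

Each responding unit in a stratum then represents that many population units when totals are estimated.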
13.3.1.2 Weighted unit non-response rate
The data are collected through the same survey; therefore the file includes unit non-response rates weighted by using number of persons employed.
Overall, the recorded unit non-response rate is small.
13.3.2 Bias
The Horvitz-Thompson estimator, which is unbiased, was used for estimation. A small bias may occur in size classes with fewer than 20 persons employed, owing to the assumption that respondents and non-respondents are similar. In general, SBS estimates are unbiased because non-respondents with more than 19 persons employed and other influential units are imputed from data on the same unit from other sources.
The bias is small and has a limited effect on the overall accuracy of the estimates.
13.3.3 Evaluation of the impact of imputation
Small
Imputation was used for units that were selected into the sample. The impact of imputation on CVs is very small.
14.1. Timeliness
Action Deadline
a) data-collection 08/07/2021
b) post-collection phase 07/12/2021
c) dissemination 14/01/2022
14.2. Punctuality
SBS 2020 final dataseries were transmitted by the deadline.
Punctuality Annexes I-IV 2020
Annexes I-IV   Punctuality
1A             0
1B             0
1C             0
1E             0
1F             0
1G             0
2A             0
2B             0
2C             0
2F             0
2H             0
2I             0
3A             0
3B             0
3C             0
3D             0
3H             0
4A             0
4B             0
4C             0
4F             0
4G             0
4H             0
15.1. Comparability - geographical
No inconsistencies
15.2. Comparability - over time
Length of comparable time series: 2005-2020.
The data for the reference years 1995–1999 were mainly available at NACE 2-digit level. Also the availability of variables was incomplete. From the reference year 2000 the SBS data production is in accordance with the requirements of SBS regulation.
From 2000 to 2007, the activity classification NACE Rev.1.1 was in use. NACE Rev.2 was implemented starting from 2008 (data for 2005-2007 were back-cast).
The statistical unit enterprise was implemented starting from reference year 2016:
In 2016, 2 enterprises were created and included in SBS 2016.
In 2017, 3 enterprises were created and included in SBS 2017.
In 2018, 1 enterprise was created; SBS 2018 data thus include 6 enterprises (2 from 2016 + 3 from 2017 + 1 from 2018).
SBS 2019 and 2020 data include the 6 enterprises created in 2016-2019.
15.3. Coherence - cross domain
The SBS data collection is based on active enterprises in the Business Register for statistical purposes compiled by November of year t. During the SBS data production process, the activity status of enterprises has to be adjusted: non-active enterprises are excluded, and some out-of-scope units that are influential in terms of total assets or turnover are added.
In the data editing process of the SBS survey, strict data check rules are applied to guarantee that the value of turnover from industrial activities is the same in SBS and in the Prodcom survey at unit level.
Comparisons with STS data (turnover, investments, number of persons employed) are carried out at unit and aggregate level. Differences in aggregated data are caused by the use of different sampling frames: the updated frame compiled by November of year t is used for producing annual business statistics of year t and short-term statistics of year t+1.
The difference between SBS and Business Demography (BD) is mainly caused by the inclusion of sole proprietors in BD. SBS is in accordance with National Accounts (NA) and is the main data source for the non-financial corporations sector.
The same methodology is used for compiling preliminary as well as final data. Regular revisions are not foreseen. Irregular revisions are unplanned and are made to correct significant errors.
18.1. Source data
18.1.1 Concepts and sources
The data source is a statistical survey combined with administrative information.
18.1.1.1 Description of source
a) Survey
Type of sample design: Stratified
Stratification criteria: activity, employment size class
Selection schemes (sampling rates) 2020:
Sampling rates by activity and employment size class (number of persons employed):
Activity            0-9             0-1             2-9             10-19   20+
NACE B-E            0%              not applicable  not applicable  60,9%   100%
NACE F              not applicable  0%              12,5%           41,4%   100%
NACE G              not applicable  0%              14,5%           50,7%   100%
NACE H-S (excl K)   not applicable  0%              13,3%           43,2%   100%
All enterprises with at least 20 persons employed are fully enumerated. For companies included in the population but not surveyed, micro-data from an administrative source were used or, in their absence, imputation was used.
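The selection scheme above can be illustrated with a short Python sketch that draws a simple random sample within each stratum at a given rate. The rates for NACE F are taken from the table; everything else (the function name, the frame layout, the unit identifiers, rounding the stratum sample size to the nearest integer and the fixed random seed) is an assumption made for the example and not a description of the actual selection procedure.

```python
import random

# Sampling rates for NACE F from the table above, as fractions.
SAMPLING_RATES = {
    ("F", "0-1"): 0.0,
    ("F", "2-9"): 0.125,
    ("F", "10-19"): 0.414,
    ("F", "20+"): 1.0,
}


def draw_stratified_sample(frame, rates, seed=0):
    """Draw a simple random sample without replacement in each stratum.

    frame: list of (unit_id, activity, size_class) tuples.
    rates: mapping from (activity, size_class) to the sampling fraction.
    """
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    by_stratum = {}
    for unit_id, activity, size_class in frame:
        by_stratum.setdefault((activity, size_class), []).append(unit_id)
    sample = []
    for stratum, units in by_stratum.items():
        n_h = round(rates[stratum] * len(units))  # rounding rule is an assumption
        sample.extend(rng.sample(units, n_h))
    return sample


# Made-up frame: 100, 40, 50 and 6 construction enterprises per size class.
sizes = {"0-1": 100, "2-9": 40, "10-19": 50, "20+": 6}
frame = [
    (f"F-{size_class}-{i}", "F", size_class)
    for size_class, count in sizes.items()
    for i in range(count)
]
sample = draw_stratified_sample(frame, SAMPLING_RATES)
print(len(sample))
```

With a rate of 100% the largest size class is a census, and with a rate of 0% the smallest class is covered entirely by administrative data or imputation, as the text describes.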
Any possible threshold values: No
The effective sample size:
6 780 (95 988 together with enterprises from administrative source) for the activities NACE B-S except K.
b) Administrative source
Administrative sources used: companies' annual reports from the Commercial Register (under the Ministry of Justice)
The characteristics directly available or with a good proxy in the administrative source:
Number of employees, number of employees in full-time equivalent units, turnover, turnover from industrial, construction, service, trading activities, personnel costs, wages and salaries, social security costs, changes in stocks of goods and services, changes in stocks of finished products and work in progress, changes in stocks of goods purchased for resale in the same conditions as received, balance sheet data.
The extent to which the administrative source is used:
Basic data source for some characteristics, and data source for imputation in case of non-response
Data source for imputation for strata not covered by the survey (i.e. enterprises with fewer than 20 persons employed)
The type of administrative data: Companies' micro data
The frequency with which the administrative data sources used are updated:
Very good: within an hour of data submission. Annual reports generally do not undergo several revisions with an increasing degree of completeness.
18.1.1.2. Relation between reporting units and the legal units / enterprise (statistical unit)
The relation between the reporting unit for the survey/administrative data and the enterprise:
The reporting unit for the survey/administrative data is the legal unit; in some exceptional cases, the reporting unit is the head of a group of legal units (enterprise).
18.2. Frequency of data collection
Annual data collection
18.3. Data collection
To obtain SBS data, several EKOMAR questionnaires are used, each slightly modified according to the enterprise's principal activity. For example, the questionnaire and instructions for construction enterprises are available in PDF form at https://www.stat.ee/en/submit-data/questionnaires/10392020
Statistics Estonia uses the web-based electronic data collection system eSTAT. Statistical questionnaires are prefilled with data from annual financial statements. Variables not available in the administrative source are added by data providers. Before submitting a questionnaire, eSTAT allows the data provider to check the correctness of the data and correct any errors. The checks are mostly arithmetical: completeness, internal consistency and plausibility checks.
18.4. Data validation
Statistics Estonia uses the web-based electronic data collection system eSTAT, which allows the data provider to check the correctness of the data and correct any errors before submitting the report. The checks are mostly arithmetical.
In the data processing phase (using the data processing information system VAIS), data editing continues with a large number of arithmetical checks (completeness, internal consistency, plausibility). The data are also compared with similar data from annual reports, short-term statistics, Prodcom and VAT declarations at unit and aggregate level.
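The three kinds of checks mentioned above can be sketched as follows. This is an illustrative Python example only: the function name, the field names, the rounding allowance of 0.5 and the 5% tolerance are hypothetical and do not reproduce the rules implemented in eSTAT or VAIS. The consistency rule uses the SBS definition of personnel costs as the sum of wages and salaries and social security costs.

```python
REQUIRED = ("turnover", "personnel_costs", "wages_and_salaries", "social_security_costs")


def validate(record, annual_report_turnover=None, tolerance=0.05):
    """Return a list of check failures for one questionnaire record."""
    errors = []

    # Completeness: every required variable must be reported.
    missing = [name for name in REQUIRED if record.get(name) is None]
    if missing:
        errors.append("completeness: missing " + ", ".join(missing))
        return errors  # the remaining checks need every value

    # Internal consistency: personnel costs equal their two components.
    components = record["wages_and_salaries"] + record["social_security_costs"]
    if abs(record["personnel_costs"] - components) > 0.5:  # allowance for rounding
        errors.append("consistency: personnel costs differ from the sum of components")

    # Plausibility: turnover should be close to the annual report figure.
    if annual_report_turnover:
        deviation = abs(record["turnover"] - annual_report_turnover) / annual_report_turnover
        if deviation > tolerance:
            errors.append(f"plausibility: turnover deviates by more than {tolerance:.0%}")

    return errors


# Made-up records.
ok = {
    "turnover": 1000.0,
    "personnel_costs": 300.0,
    "wages_and_salaries": 225.0,
    "social_security_costs": 75.0,
}
bad = dict(ok, personnel_costs=350.0, turnover=2000.0)
incomplete = dict(ok, turnover=None)

for rec in (ok, bad, incomplete):
    print(validate(rec, annual_report_turnover=1010.0))
```

Returning a list of messages rather than raising on the first failure lets a data provider see and correct all problems in one pass before submission.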
18.5. Data compilation
Starting from reference year 2019, SBS data are available for all active enterprises at individual level. The data are obtained from a sample survey combined with administrative information.
For imputation, the information available from the administrative source (i.e. annual reports) is used. For characteristics missing from annual reports, a model-based estimate is used.
Where annual reports are absent, donor imputation and mean value imputation are also applied.
Before 2019, the Horvitz-Thompson estimator was used for grossing-up, i.e. for estimating population totals.
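For readers unfamiliar with the Horvitz-Thompson estimator, the sketch below shows the general technique: each sampled value is divided by the unit's inclusion probability and the results are summed. The function name and the figures are hypothetical; under stratified simple random sampling the inclusion probability of a unit in stratum h is the stratum sample size divided by the stratum population size.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total from sampled values.

    values: observed values of the sampled units.
    inclusion_probs: probability that each of those units was selected.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        if not 0.0 < pi <= 1.0:
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        total += y / pi  # each unit represents 1 / pi population units
    return total


# Made-up sample: two units drawn with probability 0.5, one fully enumerated.
estimate = horvitz_thompson_total([10.0, 20.0, 30.0], [0.5, 0.5, 1.0])
print(estimate)
```

Units selected with certainty contribute their own value only, while units from sampled strata are scaled up by their design weight.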
18.6. Adjustment
SBS data refer to a calendar year (or in some exceptional cases, a 12-month period beginning or ending in the reporting year).
No further comments.