|
|
For any question on data and metadata, please contact: Eurostat user support |
|
|||
1.1. Contact organisation | CSO-Poland |
||
1.2. Contact organisation unit | Demographic and Labour Market Surveys Department, CSO - Poland in cooperation with experts from CSO and Central Statistical Computing Centre. |
||
1.5. Contact mail address |
|
|||
2.1. Data description | |||
The aim of this report is the evaluation of the quality components in the SES 2010 such as:
The SES, which is the subject of this quality report, is a sample survey that has been conducted every 2 years since 1999 (the previous surveys were conducted for October 2006 and October 2008). The reference month is October. All data for 2010 refer to full-time and part-time employees who worked the whole reference month.
The outline of this report is based on the content of Commission Regulation No 698/2006 as regards quality evaluation of structural statistics on earnings. |
|||
2.2. Classification system | |||
Not available. |
|||
2.3. Coverage - sector | |||
Not available. |
|||
2.4. Statistical concepts and definitions | |||
Not available. |
|||
2.5. Statistical unit | |||
Not available. |
|||
2.6. Statistical population | |||
Not available. |
|||
2.7. Reference area | |||
Not available. |
|||
2.8. Coverage - Time | |||
Not available. |
|||
2.9. Base period | |||
Not available. |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3.1. Source data | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Two-stage sampling was applied in the SES 2010, as in the previous SES, with stratification at the first stage. The first-stage sampling units are local units; at the second stage, employees who meet the survey requirements (i.e. worked the whole month of October) are drawn. All eligible persons from sampled units with up to 40 employed persons take part in the survey without sampling, while persons from units employing more than 40 persons are sampled.
a) Size of sample
The 13.0 percent sample for the SES 2010 included about 27.2 thousand units. Responses were received from about 17.3 thousand units. The average responding unit comprised 161 employed persons, of which 138 were employees. In these units the survey covered about 688.4 thousand sampled employees who worked for the whole month (October 2010). After grossing up, the results are representative of a population of about 8.0 million employees (full- and part-time paid employees, without converting part-time paid employees into full-time paid employees; coverage of NACE A-S). The data on earnings are presented after recalculation into full-time work.
b) Sampling scheme
Two-stage sampling was applied. The purpose was to gather information on the structure of wages and salaries by occupation for specified populations of the employed. The term "population" should be understood as a population of local units and employees defined by: 1) NACE Rev.2 section, 2) ownership sector (public or private).
The following exceptions to this rule applied: 1) in section C, populations are the individual divisions within the sector: 10, 11, …, 32, 33; 2) in sections J and K, populations are the individual divisions within the sector, provided that their numbers are above zero: 58, 59, …, 65, 66; 3) in sections P and Q, populations are the individual NACE Rev.2 groups in a given ownership sector: 85.1, 85.2, ..., 88.1, 88.9.
There were 123 populations overall in the survey.
The first-stage sampling frame was created on the basis of the Business Register (the REGON system). All information on entities needed for the SES was transferred from the REGON system into the sampling frame. Information on the number of employees in a particular local unit was taken from the results of the survey on the number of employees. Where that survey held no information about a particular local unit, the number of the employed recorded in REGON was used. Sampling of local units was carried out separately in each population. Within every population, local units were divided into strata according to the number of the employed. The table below presents the outline of strata in the SES 2010.
STRATA IN THE SES 2010
The value Nih, i.e. the number of reporting units (local units) in the h-th stratum of the i-th population, was determined for every stratum of a given population. According to the assumptions, the sample in every population should cover about 10 percent of the employed persons fulfilling the survey conditions. Thus, the sampling fractions at the first and second stages were determined so that:
(1)
Values f1ih and M2ih = 1/f2ih. However, in strata 6 to 14, i.e. in local units with over 1100 employed persons, different sampling fractions were allowed, i.e. less or more than 10 percent of the employed. In these strata no first-stage sampling was carried out; all local units from these strata were included in the sample. Moreover, where the actual number of employees differed from the figure recorded in the sampling frame, equation (1) was not fulfilled. To sample local units in strata 1 to 5, the value nih was calculated, i.e. the number of units to be drawn from the h-th stratum of the i-th population.
(2)
The value eih in equation (2) is a random zero-one variable. It was introduced to round the value nih randomly to an integer. Then nih whole numbers from the interval [1; Nih] were sampled without replacement. They corresponded to the conventional ranking numbers of units in the sampling frame. Local units in all strata of the subsequent populations were sampled in the same way.
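The random rounding and the draw of ranking numbers described above can be sketched as follows. This is only an illustration under stated assumptions: equation (2) is not reproduced in this report, so the sketch assumes that nih is the integer part of f1ih × Nih plus eih, and that eih equals 1 with probability equal to the fractional part. The function name `draw_local_units` and the example values are hypothetical.

```python
import math
import random


def draw_local_units(N_ih, f1_ih, rng):
    """Draw ranking numbers of local units from one stratum (hypothetical helper)."""
    target = f1_ih * N_ih          # expected number of units, usually not an integer
    base = math.floor(target)
    # Assumption: e_ih is 1 with probability equal to the fractional part of target,
    # so that the expected value of n_ih equals target.
    e_ih = 1 if rng.random() < (target - base) else 0
    n_ih = min(N_ih, base + e_ih)
    # n_ih distinct ranking numbers from the interval [1; N_ih], without replacement
    return sorted(rng.sample(range(1, N_ih + 1), n_ih))


rng = random.Random(2010)          # fixed seed only to make the example repeatable
sample = draw_local_units(N_ih=57, f1_ih=0.25, rng=rng)
```

With these illustrative inputs the target is 14.25 units, so the function returns either 14 or 15 distinct ranking numbers between 1 and 57.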
In the previous surveys on wages by occupation, employees were sampled centrally, i.e. in the Central Statistical Office. Random numbers were sent to the previously sampled local units. These numbers were the ranking numbers of employees in specially prepared registers of employees. The registers served as the second-stage sampling frame. This approach was not effective because:
This problem was therefore solved for the SES 1999 and subsequent surveys by giving sampled local units access via the Internet to a special program for sampling employees. The input parameter of this program was the value Pj, i.e. the number of employees in the j-th local unit who worked the whole of October. The program performs the following tasks:
1) reads in data on employees and sorts the records by the 6-digit ISCO’08 code of the occupation performed, 2) depending on the given value Pj, assigns the value M2, i.e. the inverse of the sampling fraction, and then calculates the value mpj, i.e. the number of employees to be sampled in the j-th local unit:
(3)
The relation between Pj and M2 for the SES 2010 is given in the table below:
3) generates a sequence of random numbers {No.} according to the following rule:
(4)
where: ajk is a random integer from the interval [1; M2].
The values No.jk are the consecutive numbers (ranking numbers) of the employees drawn into the sample in the j-th local unit. Sampling with the above program is stratified random sampling in which one element is drawn from each stratum of size M2. A specially prepared instruction for sampling employees was provided for local units that could not use the program. Sampling according to this instruction was systematic sampling with interval M2 and a random start.
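The two ways of drawing employees described above can be sketched as follows. This is only an illustration under stated assumptions: equations (3) and (4) are not reproduced in this report, so the sketch assumes that mpj is Pj / M2 rounded up, that the k-th ranking number is (k - 1) × M2 + ajk, and that the last, possibly shorter, block is sampled from its actual size. The function names and example values are hypothetical.

```python
import math
import random


def sample_employees(P_j, M2, rng):
    """Stratified draw: one ranking number from each consecutive block of M2 employees."""
    m_pj = math.ceil(P_j / M2)               # assumed rounding rule for m_pj
    numbers = []
    for k in range(1, m_pj + 1):
        start = (k - 1) * M2
        block_size = min(M2, P_j - start)    # assumption: the last block may be shorter
        a_jk = rng.randint(1, block_size)    # random integer from [1; block_size]
        numbers.append(start + a_jk)
    return numbers


def sample_systematic(P_j, M2, rng):
    """Systematic draw: random start in [1; M2], then every M2-th employee."""
    start = rng.randint(1, M2)
    return list(range(start, P_j + 1, M2))


rng = random.Random(2010)                    # fixed seed only for repeatability
stratified = sample_employees(P_j=95, M2=10, rng=rng)
systematic = sample_systematic(P_j=95, M2=10, rng=rng)
```

Both functions return ranking numbers that refer to the employee list after it has been sorted by occupation code, so either method spreads the sample across occupations within the unit.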
The main parameter estimated on the basis of the survey results is the number of employees with a specified characteristic. This parameter is estimated according to the formula:
(1)
where: xkhi - the number of the employed with the specified characteristic in the i-th unit of the h-th stratum of the k-th population, Wkhi - the weight assigned to the employee selected for the survey in the i-th establishment of the h-th stratum of the k-th population.
If the survey were a full survey (covering the whole universe), the weight in formula (1) would be given by the formula: (2)
Because, for various reasons, the survey is not a full survey, the weight Wkhi was given by the formula:
(3)
whereas:
(4)
where: P1kh – estimated number of the employed in the units surveyed in the h-th stratum of the k-th population, P2kh – estimated number of the employed in the units not surveyed because of refusal in the h-th stratum of the k-th population, P3kh – estimated number of the employed in the units not surveyed because of lack of contact in the h-th stratum of the k-th population, P4kh – estimated number of the employed in the units not surveyed because of lack of activity, liquidation of the establishment, bankruptcy, etc. in the h-th stratum of the k-th population.
The above values were estimated on the basis of the data on the number of the employed included in the sampling frame. Equations (3) and (4) indicate that the weight correction accounted for non-response caused by refusals and, in proportion to the number of surveyed units and refusals, for non-response caused by lack of contact with the selected unit. |
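The verbal description of the weight correction can be illustrated with a short sketch. The weighting formulas themselves are not reproduced in this report, so the code below is only one possible reading and not the formula used by the CSO: it assumes that the employment of no-contact units (P3) is split between in-scope units (surveyed plus refusals) and inactive units (P4) in proportion to their employment, and that the resulting in-scope total is spread over the surveyed units (P1). The function names and all figures are hypothetical.

```python
def nonresponse_factor(p1, p2, p3, p4):
    """Correction factor applied to the design weight in one stratum (assumed form)."""
    # Share of contacted employment that is in scope (surveyed units and refusals)
    in_scope_share = (p1 + p2) / (p1 + p2 + p4)
    # Refusals count fully; no-contact units count in proportion to the in-scope share
    in_scope_total = p1 + p2 + p3 * in_scope_share
    return in_scope_total / p1


def estimate_total(records):
    """Weighted total: sum of weight times the 0/1 indicator (or count) per record."""
    return sum(weight * x for weight, x in records)


# Hypothetical stratum: 700 employed in surveyed units, 200 in refusing units,
# 100 in units with no contact, 100 in inactive or liquidated units.
factor = nonresponse_factor(p1=700, p2=200, p3=100, p4=100)

# Hypothetical design weight of 10 for three sampled employees, two of whom
# have the characteristic of interest.
design_weight = 10.0
total = estimate_total([(design_weight * factor, 1),
                        (design_weight * factor, 0),
                        (design_weight * factor, 1)])
```

Under this reading the factor equals 1 when there are no refusals and no units without contact, so the weights reduce to the design weights.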
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3.2. Frequency of data collection | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[Not requested] |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3.3. Data collection | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[Not requested] |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3.4. Data validation | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[Not requested] |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3.5. Data compilation | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[Not requested] |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3.6. Adjustment | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[Not requested] |
|
|||
4.1. Quality assurance | |||
Not available. |
|||
4.2. Quality management - assessment | |||
[Not requested] |
|
||||||||||||||||||||||||||||||||||||||||||||||||
5.1. Relevance - User Needs | ||||||||||||||||||||||||||||||||||||||||||||||||
Description of the users
Users of the SES data can be divided into the following groups:
a) national users:
b) international users:
Description of users’ needs
Generally, internal users are interested in the level of earnings by occupation and by socio-demographic characteristics such as age, sex, level of education and length of service, and in their impact on the situation of different occupational groups on the labour market, e.g.
|
||||||||||||||||||||||||||||||||||||||||||||||||
5.2. Relevance - User Satisfaction | ||||||||||||||||||||||||||||||||||||||||||||||||
The needs of users are met in a professional way. Users obtain all the information on the structure of earnings that is indispensable for carrying out adequate calculations, analyses or policy. Information is available mainly in electronic form by e-mail and in the form of a methodological publication and yearbooks, including labour yearbooks.
|
||||||||||||||||||||||||||||||||||||||||||||||||
5.3. Completeness | ||||||||||||||||||||||||||||||||||||||||||||||||
1. All mandatory variables are available from Polish SES 2010.
2. Optional variables which are missing in records A describing local units (they are not available from the Polish SES 2010):
3. Optional variables which are missing in records B describing sampled employees in local units (they are not available from the Polish SES 2010):
4. Occupation code ISCO’08 0 (armed forces) is not covered by the SES 2010. |
||||||||||||||||||||||||||||||||||||||||||||||||
5.3.1. Data completeness - rate | ||||||||||||||||||||||||||||||||||||||||||||||||
[Not requested] |
|
||||||||
- |
||||||||
6.1. Accuracy - overall | ||||||||
[Not requested] |
||||||||
6.2. Sampling error | ||||||||
Data obtained from sample surveys, such as the structure of earnings survey by occupations, are subject to sampling and non-sampling errors, which determine the accuracy of the survey. Limiting and reducing these errors therefore significantly improves data quality and supports the correct interpretation of survey results.
Sampling errors are related to the sample size and the sampling scheme. They arise because incomplete information about a phenomenon leads to uncertainty about the estimates obtained from a sample survey. Thus, the results of a sample survey should be treated only as approximate estimates of the value of an unknown population parameter. On the one hand, we should be aware that the results are not fully reliable (i.e. there are differences between values obtained from a sample and actual values in the population, which could be obtained only from a full survey); on the other hand, we should try to obtain maximum credibility of the data through adequate sampling.
Generally, sampling errors can be limited by increasing the sample size or by applying more effective sampling frames. Because the sampling design has an important impact on sampling errors, a detailed description of the sampling method in the SES is presented in section 3.1. |
||||||||
6.2.1. Sampling error - indicators | ||||||||
Sampling errors in the SES 2010 are evaluated on the basis of the relative standard error. The standard error describes the range of variation of the sample mean estimator around the true population mean (the square of the standard error is the variance of the estimated mean). The standard error is a measure of data precision: the lower the standard error, the higher the precision, and vice versa. The standard error in the SES 2010 is in line with Commission Regulation No 698/2006 and amounts to less than 3% for most variables (for more detailed information please see Annex 2 and Annex 3).
Probability sampling
Variance
The coefficient of variation is defined as the ratio of the square root of the variance of the estimator to the expected value. It is estimated by the ratio of the square root of the estimate of the sampling variance to the estimated value. Both numerator and denominator are provided, together with the resulting coefficient of variation. The estimation of the sampling variance takes into account the sampling design. According to Commission Regulation No 698/2006, coefficients of variation refer to monthly and hourly earnings broken down by:
Detailed variance analyses are presented in Annex 3 (with NACE A and without NACE A) attached to this report. Annex 1 (with NACE A and without NACE A) presents paid employment and average gross wages and salaries for October 2010. Annex 2 (with NACE A and without NACE A) presents the standard deviation of gross earnings for October 2010. The relative indicators of wage differentiation were calculated using individual data, and not on the basis of the frequency distributions of total gross wages for October 2010.
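The definition of the coefficient of variation given above can be illustrated with a short sketch. The function name and the input figures are hypothetical and are not taken from the annexes; the sketch only shows how the estimated value and the estimated sampling variance combine into a percentage.

```python
import math


def coefficient_of_variation(estimate, sampling_variance):
    """Coefficient of variation in percent: sqrt(variance) divided by the estimate."""
    return 100.0 * math.sqrt(sampling_variance) / estimate


# Hypothetical figures: an estimated mean of 3500 with a sampling variance of 4900,
# i.e. a standard error of 70.
cv = coefficient_of_variation(estimate=3500.0, sampling_variance=4900.0)
below_threshold = cv < 3.0   # comparison with the 3% level mentioned in the text
```

In this made-up example the standard error is 70, so the coefficient of variation is 2% of the estimate.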
Generally, the sampling errors for the basic indicators of earnings for October 2010 from the SES 2010 were as follows (coefficients of variation of gross earnings in %):
- monthly gross earnings (aggregation A-S → CSO): 85.6%
- monthly gross earnings (aggregation B-S → EUROSTAT): 74.1%
- hourly gross earnings (aggregation A-S → CSO): 88.1%
- hourly gross earnings (aggregation B-S → EUROSTAT): 81.7%
The highest coefficients of variation of average monthly gross earnings by type of activity (NACE Rev.2 B-S → EUROSTAT) were in:
- Financial and insurance activities (K): 102.3%
- Administrative and support service activities (N): 102.0%
- Professional, scientific and technical activities (M): 100.6%
The smallest sampling errors of monthly gross earnings by type of activity were in:
- Education (P): 38.2%
- Mining and quarrying (B): 40.4%
- Electricity, gas, steam and air conditioning supply (D): 46.9%
- Public administration and defence; compulsory social security (O): 47.8%
The highest coefficients of variation of average monthly gross earnings by type of activity (NACE Rev.2 A-S → CSO) were in:
- Professional, scientific and technical activities (M): 114.0%
- Financial and insurance activities (K): 111.4%
- Administrative and support service activities (N): 108.4%
- Information and communication (J): 102.1%
The smallest sampling errors of monthly gross earnings by type of activity were in:
- Education (P): 43.1%
- Mining and quarrying (B): 43.6%
Non-probability sampling
Non-probability sampling is not used. No data from registers have been used, except for setting up the frame population. |
||||||||
6.3. Non-sampling error | ||||||||
Non-sampling errors are divided into coverage errors, measurements and processing errors, non-response errors and model assumption errors. They are described below. |
||||||||
6.3.1. Coverage error | ||||||||
Generally, coverage errors are divided into overcoverage and undercoverage errors. Overcoverage errors relate to units that are present in the frame but in fact do not belong to the target population, or to units that do not exist in practice (e.g. units that have not been contacted at all, units that are in scope but classified in the wrong sampling strata, duplicates in the sampling frame, dead and inactive units). In the SES 2010 inactive units constituted 1.9% of the selected sample (516 inactive units out of the selected sample of 27,209 units). Undercoverage errors refer to units that are not included in the frame but should be (e.g. because of delays in birth registration or lost registration applications). No information is obtained for these units. As for methods of limiting and reducing coverage errors, errors due to the lack of answers from a whole unit are eliminated mainly by updating addresses in the sampling frame and by the weighting methods described in detail in item 6.3.3. Errors deriving from item non-response are limited through grossing-up correction.
The employees covered by the sample survey amount to about 9% of the number of employees in the units in scope: the survey covered about 688.4 thousand sampled employees who worked for the whole month (October 2010). After grossing up, the results are representative of a population of about 8.0 million employees (for aggregation NACE Rev.2 A-S).
Generally, errors due to unclear or illegible questions and explanations were reduced during the data collection period.
Respondent errors are connected with misunderstanding or misinterpretation of the methodological note. Respondents sometimes give incomplete answers to time-consuming questions. These types of errors are eliminated during the control phase. If errors are caused by a reluctant attitude of respondents, the objective of the survey is explained once again, together with the respondents’ role in it, and any doubts concerning the survey are cleared up.
In the case of paper forms sent by post, all doubts regarding variables and addresses of reporting units were clarified in phone calls between respondents and the staff of the statistical office and by e-mail. |
||||||||
6.3.1.1. Over-coverage - rate | ||||||||
Overall sampling rate (including those units exhaustively covered): 13.0% |
||||||||
6.3.1.2. Common units - proportion | ||||||||
[Not requested] |
||||||||
6.3.2. Measurement error | ||||||||
Measurement errors are divided into survey instrument (questionnaire) errors, respondent errors, information system errors and errors related to the mode of data collection. As for survey instrument errors, the SES questionnaire is designed to eliminate these types of errors: detailed explanatory notes are attached to the questionnaire to increase its clarity.
Variables that are corrected very often are the following:
The variables below were often corrected for the following reasons:
The variables that were corrected very seldom are: work seniority bonuses, the year of birth, sex, level of education. |
||||||||
6.3.3. Non response error | ||||||||
The detailed classification of non-responding units is as follows: there were 9897 non-responding units, about 36% of the selected sample, for the reasons given below:
The methods used for re-weighting for non-response are closely connected with the creation of the grossing-up ratios in the SES 2010. Ex-post stratification is used here. |
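The rates quoted in this report can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text (27,209 selected units, 516 inactive units, 9897 non-responding units); the variable names are illustrative.

```python
selected_units = 27_209        # selected sample of local units
inactive_units = 516           # units found to be inactive
nonresponding_units = 9_897    # non-responding units, all reasons

inactive_rate = 100 * inactive_units / selected_units          # percent of selected sample
nonresponse_rate = 100 * nonresponding_units / selected_units  # percent of selected sample
responding_units = selected_units - nonresponding_units
```

The results round to 1.9% and 36.4%, and the difference of 17,312 units is consistent with the figure of about 17.3 thousand responding units given in section 3.1.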
||||||||
6.3.3.1. Unit non-response - rate | ||||||||
[Not requested] |
||||||||
6.3.3.2. Item non-response - rate | ||||||||
[Not requested] |
||||||||
6.3.4. Processing error | ||||||||
Processing errors are errors in post-data-collection processes such as data entry, coding, keying, editing, weighting and tabulating. As for errors deriving from data compilation and processing, there are some problems with coding. The occupation code is given by reporting units on the basis of the name of the occupation performed. The code is checked against the corresponding nomenclature, but in some cases the descriptions given by reporting units are not clear enough to establish the right code. In these cases additional explanations are required. There are no problems with keying, editing, weighting or tabulating, because wrong control assumptions in the computer program or wrong interpretations of the results are removed immediately during the control phase, and the data are tested again in the surveys conducted. It is very difficult to check an occupation against the adequate level of education. We apply a flexible approach to this matter because, in practice, people with long work seniority hold high positions that are not in accordance with their low educational level. Sometimes vocational and elementary education is linked with an occupation in ISCO 3 (technicians and associate professionals), or tertiary education is linked with occupations in ISCO 7 (craft and related trades workers), ISCO 8 (plant and machine operators and assemblers) or ISCO 9 (elementary occupations). All such cases are checked and explained by the staff of the Statistical Office in Bydgoszcz. |
||||||||
6.3.4.1. Imputation - rate | ||||||||
No imputation applied. |
||||||||
6.3.5. Model assumption error | ||||||||
|
||||||||
6.4. Seasonal adjustment | ||||||||
[Not requested] |
||||||||
6.5. Data revision - policy | ||||||||
[Not requested] |
||||||||
6.6. Data revision - practice | ||||||||
[Not requested] |
||||||||
6.6.1. Data revision - average size | ||||||||
[Not requested] |
|
|||
7.1. Timeliness | |||
a) key-data collection dates in the SES 2010
b) key dates for the post-collection phase in the SES 2010
c) key publication dates in the SES 2010
|
|||
7.1.1. Time lag - first result | |||
[Not requested] |
|||
7.1.2. Time lag - final result | |||
[Not requested] |
|||
7.2. Punctuality | |||
Data were transmitted 18 months from the end of the reference period according to Commission Regulation No 1738/2005. |
|||
7.2.1. Punctuality - delivery and publication | |||
[Not requested] |
|
|||
8.1. Comparability - geographical | |||
As for geographical comparability, the definitions of statistical units, reference population and variables are based on EUROSTAT recommendations, which is why the results of the SES are comparable on an international scale. It is worth stressing that the SES 2010 refers only to employees who worked the whole month of October. The local units are units conducting activity in sections A-S of NACE Rev.2 (of which B-S is the priority coverage for EUROSTAT) and employing more than 9 persons.
Applied classifications are in line with international classifications as follows:
|
|||
8.1.1. Asymmetry for mirror flow statistics - coefficient | |||
[Not requested] |
|||
8.2. Comparability - over time | |||
As for comparability over time, the size of units covered by the SES was changed, namely:
Taking these circumstances into account, we can state that the changes in the size of units have an impact on the number of employees covered, but no significant impact on the level of earnings by occupation or on their structure. Thus, data for October 1999, 2001, 2002, 2004, 2006, 2008 and 2010 can be compared with regard to the level of earnings by occupation and the earnings structure. |
|||
8.2.1. Length of comparable time series | |||
[Not requested] |
|||
8.3. Coherence - cross domain | |||
The comparison with National Accounts is not available. |
|||
8.4. Coherence - sub annual and annual statistics | |||
[Not requested] |
|||
8.5. Coherence - National Accounts | |||
[Not requested] |
|||
8.6. Coherence - internal | |||
[Not requested] |
|
|||
9.1. Dissemination format - News release | |||
[Not requested] |
|||
9.2. Dissemination format - Publications | |||
Data are well documented in the form of:
- publications – the SES publication consists of 240 pages and covers a methodological note, characteristics of basic measures on earnings by occupations and earnings structure, and information on the sampling scheme; the SES publication is disseminated every 2 years;
- the information service of the Information Department, which distributes data on earnings structure by occupations to internal and external users;
- chapters on the SES in Yearbooks and Labour Yearbooks, published every 2 years.
|
|||
9.3. Dissemination format - online database | |||
No Polish online database. |
|||
9.3.1. Data tables - consultations | |||
[Not requested] |
|||
9.4. Dissemination format - microdata access | |||
[Not requested] |
|||
9.5. Dissemination format - other | |||
Not available. |
|||
9.6. Documentation on methodology | |||
Methodological chapter included in publication "Structure of wages ..." described above. |
|||
9.7. Quality management - documentation | |||
[Not requested] |
|||
9.7.1. Metadata completeness - rate | |||
[Not requested] |
|||
9.7.2. Metadata - consultations | |||
[Not requested] |
|
|||
[Not requested] |
|
|||
11.1. Confidentiality - policy | |||
[Not requested] |
|||
11.2. Confidentiality - data treatment | |||
[Not requested] |
|
|||
This report covers the main information on data quality. It is worth stressing that the Polish SES 2010 includes all mandatory variables; thus the quality of the SES 2010 is in line with EUROSTAT’s requirements. |
|
|||
|
|||
Annex 1_Employees and Average of SES 2010
Annex 2_Standard deviation of SES 2010
Annex 3_Variance of SES 2010 |