Adult Education Survey 2022

National Reference Metadata in Single Integrated Metadata Structure (SIMS)

Compiling agency: Central Statistical Bureau of Latvia

For any question on data and metadata, please contact: Eurostat user support

1. Contact

1.1. Contact organisation

Central Statistical Bureau of Latvia

1.2. Contact organisation unit

Social Statistics Methodology Section

1.5. Contact mail address

Central Statistical Bureau of Latvia

Lāčplēša street, 1

Riga, LV-1301

2. Metadata update

2.1. Metadata last certified

18/12/2023

2.2. Metadata last posted

18/12/2023

2.3. Metadata last update

18/12/2023

3. Statistical presentation

3.1. Data description

The Adult Education Survey (AES) covers adults’ participation in education and training (formal - FED, non-formal - NFE and informal learning - INF). The 2022 AES focuses on people aged 18-69. The reference period for the participation in education and training is the twelve months prior to the interview.

Information available from the AES is grouped around the following topics:

Participation in formal education, non-formal education and training and informal learning
Volume of instruction hours
Characteristics of the learning activities
Reasons for participating
Obstacles to participation
Access to information on learning possibilities and guidance
Employer financing and costs of learning
Self-reported language skills

For further information see the 2022 AES legislation (http://ec.europa.eu/eurostat/web/education-and-training/legislation) and the 2022 AES implementation manual (http://ec.europa.eu/eurostat/web/education-and-training/methodology).

3.2. Classification system

- Classification of Learning Activities (CLA, 2016 edition)
- International Standard Classification of Education 2011 (ISCED 2011)
- International Standard Classification of Occupations 2008 (ISCO-08)
- Statistical classification of economic activities in the European Community, Rev. 2 (NACE Rev. 2)

3.3. Coverage - sector

AES covers all economic sectors.

3.4. Statistical concepts and definitions

Definitions as well as the list of variables covered are available in the 2022 AES implementation manual (http://ec.europa.eu/eurostat/web/education-and-training/methodology).

3.5. Statistical unit

Individuals, non-formal learning activities.

3.6. Statistical population

Individuals aged 18-69 living in private households.

3.7. Reference area

All the territory is covered.

3.8. Coverage - Time

Field work periods:

2011 AES 01.09.2011-15.12.2011

2016 AES 01.10.2016-15.02.2017

2022 AES 01.09.2022-15.01.2023

3.9. Base period

Not applicable.

4. Unit of measure

Number, EUR.

5. Reference Period

Fieldwork for the 2022 AES took place from 01.09.2022 to 15.01.2023.

6. Institutional Mandate

6.1. Institutional Mandate - legal acts and other agreements

At European level:
Basic legal act: Regulation (EU) 2019/1700
Implementing act: Commission Implementing Regulation (EU) 2021/861

At national level:
Statistics Law

6.2. Institutional Mandate - data sharing

Not applicable.

7. Confidentiality

7.1. Confidentiality - policy

Confidentiality of individual data is protected by the Statistics Law:

Section 7. Competence of the Statistical Institution in Production of Official Statistics

(2) The statistical institution shall:
8) ensure statistical confidentiality in accordance with the procedures laid down in this Law;

Section 17. Data Processing and Statistical Confidentiality

Section 19. Dissemination of Official Statistics

(1) The statistical institution shall disseminate official statistics in a way that does not allow either directly or indirectly identify a private individual or a State institution in cases other than those laid down in Section 25 of this Law.
(2) The statistical institution shall publish the official statistics which have been produced within the framework of the Official Statistics Programme in a publicly available form and by a predetermined deadline on the portal of official statistics. Until the moment of publication of official statistics this statistics shall not be published.

7.2. Confidentiality - data treatment

To protect individual data in social statistics, we first evaluate the information in a table (or any other type of summary information) to determine whether the information to be published contains cells that reveal confidential information. The risk assessment is based on several factors: the sensitivity of the indicators and the precision and freshness of the information revealed. The primary confidential cell values identified are then hidden.

When summary information is published, the secondary confidentiality principle is also followed to protect the primary confidential cells identified, i.e., additional cell values are hidden so that primary confidential cell values cannot be calculated by arithmetic operations.

8. Release policy

8.1. Release calendar

The release calendar is accessible for data releases one year in advance. All official statistics are published according to the data release calendar at 13:00. The release date for AES data is included in the release calendar as well.

8.2. Release calendar access

https://stat.gov.lv/en/calendar

8.3. Release policy - user access

Statistical release dates and times are pre-announced in the data dissemination calendar. This information is available for all data users.

9. Frequency of dissemination

Every 6 years.

10. Accessibility and clarity

10.1. Dissemination format - News release

A press release about the main results of the survey is planned to be published in October 2023. Individual requests from researchers and institutions/organisations to access the microdata will be considered on a case-by-case basis.

10.2. Dissemination format - Publications

The main results of the survey are planned to be published in October 2023 in the form of an 8-page leaflet. The publication, titled “Main Results of the Adult Education Survey”, will be published online by the Central Statistical Bureau of Latvia.

10.3. Dissemination format - online database

The results of the AES are published in online databases under the domain Education, culture, and science and under the subdomain Lifelong learning.

Link to the data: https://stat.gov.lv/en/statistics-themes/education/lifelong-learning/other/8465-adult-education?themeCode=II.

The data are planned to be published online in October 2023.

10.3.1. Data tables - consultations

Not applicable.

10.4. Dissemination format - microdata access

It is planned to develop an anonymized database for research purposes.

Remote access to anonymized individual data is available for research. Depending on the additional data processing methods applied, the datasets are available for use on the researcher's infrastructure (OffSite) or on the remote access system of the Central Statistical Bureau (OnSite). The data are made available once an application has been filled in and, following a positive decision by the Central Statistical Bureau, a contract has been concluded. Anonymized individual data may be used only for scientific or research purposes; moreover, the research results must benefit society as a whole.

10.5. Dissemination format - other

Not applicable.

10.5.1. Metadata - consultations

Not applicable.

10.6. Documentation on methodology

The main methodological document is the AES methodological manual developed by Eurostat: Adult Education Survey (AES) methodology.

10.6.1. Metadata completeness - rate

Not applicable.

10.7. Quality management - documentation

The quality of the survey is ensured through specific requirements (sampling and precision requirements) set in the Regulations for the 2022 AES wave and is also reflected through the use of harmonised definitions and concepts. The quality is discussed in Eurostat working groups.

11. Quality management

11.1. Quality assurance

CSB has introduced a Quality Management System (QMS). The system is directed towards providing high user satisfaction and ensuring compliance with regulatory enactments. Based on the structure of the Generic Statistical Business Process Model (GSBPM), the QMS defines and describes, at the level of procedures, the processes of statistical production and designates the persons responsible for monitoring the processes at all stages of statistical production. The QMS defines the sequence in which processes are implemented (i.e., the activities to be performed, including verifications of processes and statistics, their sequence and implementation requirements, and the persons responsible for implementation), the procedures used in the evaluation of processes and statistics, and any improvements needed.

Since 2018, the QMS of the CSB has been certified to the standard ISO 9001:2015 “Quality Management Systems. Requirements” (scope of certification: development, production and dissemination of official statistics).

11.2. Quality management - assessment

Quality of statistics is assessed in accordance with the existing requirements of external and internal regulatory enactments and in accordance with the established quality criteria.

Regulation (EC) No 223/2009 of the European Parliament and of the Council on European statistics states that European statistics shall be developed, produced and disseminated on the basis of uniform standards and of harmonised methods. In this respect, the following quality criteria shall apply: relevance, accuracy, timeliness, punctuality, accessibility, clarity, comparability and coherence.

12. Relevance

12.1. Relevance - User Needs

The main users of the AES data are policy makers and specifically the Ministry of Education. They are interested in:

Access to information on learning possibilities and guidance
Participation in education and training by type, characteristics of the activity (field, distance learning, etc.)
Share of job-related or employer-sponsored NFE
Volume of instruction hours for FED and NFE
Cost of learning for NFE
Obstacles to participation in education and training
Self-reported language skills

The media and the general public are mainly interested in language skills, as the population of Latvia speaks two main languages, Russian and Latvian, although the official language is Latvian.

12.2. Relevance - User Satisfaction

The mission of the CSB is to provide users of statistical information with independent high-quality official statistics for decision-making, research and discussions.

Comments on data quality may be sent to the e-mail address: pasts@csp.gov.lv

12.3. Completeness

The data sent to Eurostat are overall complete and match the requirements set out in the Commission Regulation.

12.3.1. Data completeness - rate

Not applicable.

13. Accuracy

13.1. Accuracy - overall

Although the main sources of error are non-response and over-coverage, there is no substantial evidence of their impact on the estimates obtained. During data collection, response levels and response representativeness were monitored using R-indicators as set out in Schouten, Cobben and Bethlehem (2009) and Shlomo, Skinner and Schouten (2012), which help to identify potential bias by measuring the degree of difference between responding and non-responding sample groups. Based on the monitoring and analysis of the R-indicators, active interventions were implemented during data collection to increase the chances of obtaining a representative set of final response units. Design weights were adjusted according to the fieldwork results. Sampling errors are provided for key indicators in various breakdowns.

13.2. Sampling error

The sampling frame was prepared according to the target population, taking into account the best available information at the time of sample creation and ensuring equal opportunities for all survey units to be included in the sample, in accordance with the survey plan. To distribute the sample over the territory and education levels, a systematic sampling technique was used, with units sorted within strata by survey polygon (small territories), classification of administrative and territorial units, and known education level.

Design weights were adjusted for non-response within homogeneity groups (strata) defined by sex, age group and region; the weights were then calibrated to population sizes by sex and age group, region and education level.

The variability of the main indicators was estimated and is available in table 13.2.1 “Sampling errors - indicators for 2022 AES key statistics” in annex “LV - QR tables 2022 AES (excel)”.

The standard deviations of estimates for variables of interest were checked for compliance with the Eurostat requirements in Regulation (EU) 2019/1700, Annex II. The results partially met those requirements: the standard deviation of the estimate of the participation rate in formal education and training (age 18-24) slightly exceeded the limit set in the Regulation.

13.2.1. Sampling error - indicators

Coefficients of variation, standard errors and confidence intervals were calculated in the free software R (R Core Team (2023), https://www.R-project.org/) using the function vardom from the package vardpoor, which estimates the variance of sample survey estimates in domains by the ultimate cluster method (Hansen, Hurwitz and Madow, 1953), with Taylor linearization for non-linear statistics and residual estimation from the regression model to take weight calibration into account.
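As an illustration only, the sketch below shows how a coefficient of variation for a ratio estimate (for example, a participation rate) could be obtained with Taylor linearization under a one-stage stratified design, where each sampled person is treated as an ultimate cluster. It is written in Python rather than R, it is not the vardom function, and it omits the calibration residual step; the function name and record layout are assumptions made for the example.

```python
import math
from collections import defaultdict


def ratio_with_cv(records):
    """Estimate a ratio and its precision from (stratum, weight, y, x) records.

    Hypothetical helper: y is the numerator variable (e.g. 1 if the person
    participated), x the denominator variable (e.g. 1 for every person).
    """
    total_y = sum(w * y for _, w, y, _ in records)
    total_x = sum(w * x for _, w, _, x in records)
    ratio = total_y / total_x

    # Taylor linearization of the ratio R = Y / X: z_i = (y_i - R * x_i) / X.
    weighted_z = defaultdict(list)
    for stratum, w, y, x in records:
        weighted_z[stratum].append(w * (y - ratio * x) / total_x)

    # Ultimate cluster variance: between-unit variation within each stratum.
    variance = 0.0
    for values in weighted_z.values():
        n_h = len(values)
        if n_h < 2:
            continue  # a single-unit stratum gives no estimable contribution
        mean_h = sum(values) / n_h
        variance += n_h / (n_h - 1) * sum((v - mean_h) ** 2 for v in values)

    standard_error = math.sqrt(variance)
    return ratio, standard_error, standard_error / ratio
```

The sketch ignores finite population corrections and assumes a non-zero ratio; a production calculation would also replace y by calibration residuals, as the text describes.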

See table 13.2.1 “Sampling errors - indicators for 2022 AES key statistics” in annex “LV - QR tables 2022 AES (excel)”.

13.3. Non-sampling error

All codes indicating ineligible sample elements are classified as out-of-scope, and none as other ineligible. This is why the category "Other ineligible" in table 13.3.1.1 contains 0 elements.

It is not possible to provide information about non-respondents by groups defined by education level and employment status because this information is not available for all non-respondents.

13.3.1. Coverage error

The sampling frame was built using available information from the Register of Natural Persons, the Statistical Register of Dwellings and the Register of Addresses and Buildings. The sampling frame was drawn from the register data available in April 2022 and covered all persons in the target population who were permanent residents of the Republic of Latvia at the start of the survey (01.09.2022).

Reference period: 01.09.2022

Frequency and timing of frame update: monthly

To reduce the risk of under-coverage and the associated bias, the sampling frame was prepared according to the target population, taking into account the best available information at the time of sample creation, as close as possible to the start of data collection. Age was taken into account so that all respondents were in the target population during data collection.

13.3.1.1. Over-coverage - rate

See table 13.3.1.1 “Over-coverage - rate” in annex “LV - QR tables 2022 AES (excel)”.

13.3.1.2. Common units - proportion

Not applicable.

13.3.2. Measurement error

The 2022 AES questionnaire design was improved in comparison with the 2016 AES: question numbering was added alongside variable names. The electronic questionnaire was improved as well: the routing was improved and additional logical controls were added.

Interviewer training was conducted before the start of the fieldwork. It consisted of three parts:

General information about the survey (aim, legal basis, population, sample size, duration of the fieldwork) and the key results of the previous survey were presented. This part also covered the questionnaire used (main sections, average duration of interview, terms used, incl. on different forms of education, and some real-life examples).
More detailed explanations of the questionnaire contents (the type of questions covered by each section, main filters and routing) were given, with emphasis on the most confusing and difficult parts of the questionnaire.
The table of random numbers was explained. However, interviewers were strongly advised to use the electronic questionnaire, as all routing and filtering is automatic and random activities are therefore also selected automatically.

13.3.3. Non response error

Non-response bias occurs when non-respondents systematically differ from respondents, leading to estimates deviating from actual values. To prevent non-response bias, as mentioned earlier in 13.1, the data collection process was monitored using R-indicators, and groups identified as underrepresented received increased attention. There is no substantial evidence of non-response's impact on the obtained estimates.

To boost the response rate during data collection, repeated calls were made. To mitigate the effect of non-response on the estimates, the weighting process adjusts for non-response and calibrates to known population quantities, as described in 13.2.

Item non-response is relatively low. Only a few variables had an item non-response rate higher than 10% (see 13.3.3.2).

Questions about household income, which were used to calculate the variable HHINCOME, also had higher non-response, but imputation was used to fill in missing values of the income variable.

13.3.3.1. Unit non-response - rate

See table 13.3.3.1 “Unit non-response - rate” in annex “LV - QR tables 2022 AES (excel)”.

13.3.3.2. Item non-response - rate

See table 13.3.3.2 “Item non-response rate” in annex “LV - QR tables 2022 AES (excel)”.

13.3.4. Processing error

Data processing and validation were carried out following the recommendations of Eurostat. Standard formatting and variable naming were used for submission of the data file.

Measurement errors were detected by logical checks and verification of received data.

13.3.5. Model assumption error

No specific models were used.

14. Timeliness and punctuality

14.1. Timeliness

The AES was conducted and the data were produced according to Eurostat’s implementation monitoring table for AES. The fieldwork ended on 15 January 2023, and production of microdata was completed by 15 July 2023. The data are planned to be published in October 2023.

14.1.1. Time lag - first result

10 months. Publication of the results is planned for October 2023.

14.1.2. Time lag - final result

10 months. Publication of the results is planned for October 2023.

14.2. Punctuality

The fieldwork for the AES ended on 15 January 2023. Publication of the data was planned within 10 months of the end of the fieldwork.

The pre-checked microdata were sent to Eurostat within six months of the end of the national data collection period.

See table 14.2 “Project phases - dates” in annex “LV - QR tables 2022 AES (excel)”.

14.2.1. Punctuality - delivery and publication

Not applicable.

15. Coherence and comparability

15.1. Comparability - geographical

See table 15.1 “Deviations from 2022 AES concepts and definitions” in annex “LV - QR tables 2022 AES (excel)”.

No additional variables related to COVID-19 were collected.

15.1.1. Asymmetry for mirror flow statistics - coefficient

Not applicable.

15.2. Comparability - over time

Several changes and improvements were made for the 2022 AES. The questionnaire and the electronic questionnaire were improved. The mix of data collection modes differed slightly: in the 2016 AES, CATI and CAPI were used more equally, whereas in the 2022 AES CATI was the main mode and CAPI was used for respondents whose phone numbers were incorrect or unavailable. Sampling was also changed and improved based on 2016 AES data, and a one-stage sample was used instead of a two-stage sample. The weighting schemes were improved as well.

See table 15.2 “Comparability - over time” in annex “LV - QR tables 2022 AES (excel)”.

15.2.1. Length of comparable time series

Not applicable.

15.3. Coherence - cross domain

See table 15.3 “Coherence - cross-domain” in annex “LV - QR tables 2022 AES (excel)”.

15.3.1. Coherence - sub annual and annual statistics

Not applicable.

15.3.2. Coherence - National Accounts

Not applicable.

15.4. Coherence - internal

AES results for a given data collection round are based on the same microdata and results are calculated using the same estimation methods, therefore the data are internally coherent.

16. Cost and Burden

In line with the strategic directions of the European Statistical System and the latest trends in statistical production, continuous use of information acquired in regular CSB surveys and proportionate reduction of the response burden are among the key CSB priorities.

In cooperation with holders of administrative data and in line with the competences provided for in the Statistics Law, CSB strives to resolve issues related to the use of administrative data sources, aiming to acquire administrative data that are as comprehensive and of as high quality as possible, so that the response burden on enterprises and households can be reduced.

CSB measures to improve use of administrative data and reduce response burden taken in 2020 (in Latvian only).

17. Data revision

17.1. Data revision - policy

Not applicable.

17.2. Data revision - practice

Not applicable.

17.2.1. Data revision - average size

Not applicable.

18. Statistical processing

18.1. Source data

The sampling frame was built from the register data available in April 2022, and it covered the whole target population, i.e., usual residents of Latvia who at the start of the survey (1 September 2022) were aged 18–69 and lived in private households. In total, the sampling frame included 1 035 816 people.

To implement the sampling procedures, a sampling frame in the form of a list of units (persons) was also prepared. CSB usually uses a multi-mode data collection method, but because COVID-19 led to the cancellation of face-to-face interviews, it was decided to shift to CATI in 2020 (as CSB has phone numbers for most sample units). CAPI has not yet returned to its previous level.

Sample allocation was calculated based on 2016 AES data using two variables: FED (formal education) and NFENUM (non-formal education). The 2022 AES uses one-stage stratified systematic sampling of persons; several stratification options were tested. Territorial stratification is based on the registered place of residence. The AES has 48 strata defined by statistical region (Rīga, Pierīga, Kurzeme, Vidzeme, Zemgale, Latgale), sex (male, female) and age group (18–24, 25–34, 35–54, 55–69). Owing to the scope differences between the 2016 AES and the 2022 AES, the stratum allocation in the 2022 AES was derived as follows: the allocation for age group 18–24 was copied from the estimate for age group 25–34, with a correction for the lower response rate, and the allocation for age group 55–69 was estimated from age group 55–64.

To distribute the sample evenly over the whole territory and among levels of education, systematic sampling was used, and units were sorted within strata, survey polygons (small territories), administrative and territorial units, educational attainment levels.
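The selection step described above can be sketched as follows. This is a minimal Python illustration, not the CSB production procedure: the field names (stratum, polygon, admin_unit, education), the sort order and the allocation dictionary are assumptions made for the example.

```python
import random


def systematic_sample(units, n, sort_key, rng):
    """Select n units from one stratum by systematic sampling after sorting."""
    ordered = sorted(units, key=sort_key)   # implicit stratification by sort order
    step = len(ordered) / n                 # sampling interval, may be fractional
    start = rng.random() * step             # random start in [0, step)
    last = len(ordered) - 1
    return [ordered[min(int(start + k * step), last)] for k in range(n)]


def stratified_systematic_sample(frame, allocation, sort_key, seed=0):
    """Apply systematic sampling independently in every stratum.

    allocation maps a stratum label to its sample size (hypothetical input).
    """
    rng = random.Random(seed)
    sample = []
    for stratum, n_h in allocation.items():
        units = [u for u in frame if u["stratum"] == stratum]
        sample.extend(systematic_sample(units, n_h, sort_key, rng))
    return sample
```

Sorting before taking every k-th unit spreads the sample across the sort variables, which is why the frame is ordered by territory and education level before selection.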

The sampling procedure resulted in a gross sample size of 8 764 persons.

See table 18.1 “Source data” in annex “LV - QR tables 2022 AES (excel)”.

18.2. Frequency of data collection

Every 6 years.

18.3. Data collection

The responsive adaptive survey design (ASD), which uses R-indicators as measures of representativeness, is being tested at CSB as a flexible approach to organising social surveys. R-indicators as set out in Schouten, Cobben and Bethlehem (2009) [1] and Shlomo, Skinner and Schouten (2012) [2] help to identify potential bias by measuring the degree of difference between responding and non-responding sample groups. Based on the monitoring and analysis of the R-indicators, active interventions are implemented during the data collection process to increase the chances of obtaining a representative set of final response units, thereby reducing the variance of the weights in the final survey data.

For monitoring needs an ASD dashboard (see the added picture), which was evaluated by the survey manager, was introduced in AES 2022. In the dashboard it was possible to monitor overall response level, status of the questionnaires (new, opened, non-response, response) and balance indicators in different respondent groups (age, sex, region).

The ASD approach focused on ensuring high-quality fieldwork, with particular emphasis on the representativeness of the set of responding sample units. To achieve this goal, several steps were taken during the fieldwork. In the first part of the data collection, R-indicators were used for monitoring; afterwards, the groups showing imbalances (see Picture 2) were identified and interviewer resources were redirected to collecting data from these groups.
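Following Schouten, Cobben and Bethlehem (2009), the R-indicator is defined as R = 1 - 2S(ρ), where S(ρ) is the standard deviation of the estimated response propensities; R = 1 means fully balanced response. The Python sketch below illustrates the calculation only; the function name and the optional design weights argument are assumptions, and the propensities are taken as given.

```python
import math


def r_indicator(propensities, weights=None):
    """Compute R = 1 - 2 * S(rho) from estimated response propensities.

    weights are design weights (hypothetical input); equal weights are
    assumed when none are supplied.
    """
    if weights is None:
        weights = [1.0] * len(propensities)
    total = sum(weights)
    mean = sum(w * p for w, p in zip(weights, propensities)) / total
    # Weighted variance of the propensities around their weighted mean.
    variance = sum(
        w * (p - mean) ** 2 for w, p in zip(weights, propensities)
    ) / (total - 1)
    return 1 - 2 * math.sqrt(variance)
```

Values closer to 1 indicate that response propensities vary little across sample units, so the responding set resembles the full sample on the auxiliary variables used in the propensity model.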

[1] Schouten B., Cobben F., Bethlehem J. (2009) Indicators for the representativeness of survey response, Survey Methodology, Volume 35, Issue 1, 101-113.

[2] Shlomo N., Skinner C., Schouten B. (2012) Estimation of an indicator of the representativeness of survey response, Journal of Statistical Planning and Inference, Volume 142, Issue 1, January 2012, 201-211.

Annexes:
ASD_dashboard

18.4. Data validation

Data processing and validation were carried out following the recommendations of Eurostat. Standard formatting and variable naming were used for submission of the data file.

Measurement errors were detected by logical checks and verification of received data.

18.5. Data compilation

Imputation was used to calculate the variable HHINCOME for respondents who did not provide any income or provided their income as an interval. Hot deck imputation was applied, a method for handling missing data in which each missing value is replaced with an observed response from a "similar" unit.
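A minimal Python sketch of random hot deck imputation within imputation classes is given below. How "similar" units were defined for HHINCOME is not stated here, so the class function passed in is purely an assumption, as are the function and field names.

```python
import random
from collections import defaultdict


def hot_deck(records, value_key, class_key, rng):
    """Fill missing values with a random donor value from the same class.

    class_key is a hypothetical function that maps a record to its
    imputation class (the definition of "similar" units).
    """
    donors = defaultdict(list)
    for record in records:
        if record[value_key] is not None:
            donors[class_key(record)].append(record[value_key])

    result = []
    for record in records:
        row = dict(record)              # leave the input records untouched
        row["imputed"] = False
        pool = donors[class_key(record)]
        if row[value_key] is None and pool:
            row[value_key] = rng.choice(pool)
            row["imputed"] = True       # flag used for the imputation rate
        result.append(row)
    return result
```

Records in a class with no donors stay missing in this sketch; in practice such classes would be merged with neighbouring ones.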

Grossing up of the net sample of individuals is done by means of a weighting procedure. The weighting procedure was carried out as follows: first, the design weights were adjusted for unit non-response, and then the adjusted design weights were calibrated taking into account demographic data.

To select the optimal weights, four different sets of weights were calculated. The final weights were compared using the estimated accuracy of the target variable estimators. The standard errors of the required variable estimators were compared with the standard error limits and with the 2016 AES results, to ensure that the quality of the sample had improved.

Four options for pre-calibration weights were calculated:

first pre-calibration weights: the sample design weights adjusted for unit non-response by homogeneity group (stratum);
second pre-calibration weights: the first pre-calibration weights divided by the estimated response propensities;
third pre-calibration weights: the second pre-calibration weights multiplied by a coefficient q, so that the weight sums equal those of the basic design weights in each stratum;
fourth pre-calibration weights: the design weights adjusted for unit non-response by dividing by the response propensities for the valid response set.
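One possible reading of the four options is sketched below in Python. The exact formulas are not given in this report, so the sketch rests on assumptions: the first weights scale design weights by the ratio of sampled to responding design-weight totals in the stratum, and q is taken as the stratum design-weight total divided by the stratum total of the second weights. All names are hypothetical.

```python
from collections import defaultdict


def pre_calibration_weights(sample):
    """Return respondents with four candidate pre-calibration weights.

    sample: list of dicts with keys "stratum", "d" (design weight),
    "responded" (bool) and "p" (estimated response propensity).
    """
    d_all = defaultdict(float)    # design-weight total of all sampled units
    d_resp = defaultdict(float)   # design-weight total of respondents
    for unit in sample:
        d_all[unit["stratum"]] += unit["d"]
        if unit["responded"]:
            d_resp[unit["stratum"]] += unit["d"]

    respondents = [dict(u) for u in sample if u["responded"]]
    w2_total = defaultdict(float)
    for unit in respondents:
        h = unit["stratum"]
        unit["w1"] = unit["d"] * d_all[h] / d_resp[h]   # stratum adjustment
        unit["w2"] = unit["w1"] / unit["p"]             # propensity division
        unit["w4"] = unit["d"] / unit["p"]              # propensity only
        w2_total[h] += unit["w2"]

    for unit in respondents:
        h = unit["stratum"]
        q = d_all[h] / w2_total[h]                      # assumed form of q
        unit["w3"] = unit["w2"] * q                     # restores stratum total
    return respondents
```

Under these assumptions the first and third weights both sum to the stratum design-weight total, while the second and fourth do not, which is one reason calibration is applied afterwards to every option.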

Response propensity was estimated with binomial regression. The response propensity model was developed in the same way as for the ASD. Variables were selected by minimizing the AIC and excluding insignificant regressors from the model. At the threshold of 0.51, which maximizes the number of correct predictions, the response predicted from the response propensities matches the actual response in 67.4% of all cases. The auxiliary variables used to estimate response propensities are:

sex (male, female) and age group (18–24, 25–34, 35–44, 45–54, 55–69);
statistical region (Rīga, Pierīga, Kurzeme, Vidzeme, Zemgale, Latgale);
nationality (1, 13, 17, 21, 45, 9999);
citizenship (Latvian, other);
level of income (20th percentile groups of income);
educational attainment level (groups of ISCED levels: 0–2, 3–4, 5–8).

Estimated response propensities were further used in weighting.
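The threshold mentioned above (0.51, chosen to maximize the number of correct predictions) can be illustrated with a simple grid search. The Python sketch below assumes the propensities have already been estimated; the function name and the grid of candidate thresholds are assumptions for the example.

```python
def best_threshold(propensities, responded, candidates=None):
    """Find the cut-off that maximizes the share of correct predictions.

    A unit is predicted to respond when its propensity is at or above the
    threshold. The default grid 0.01, 0.02, ..., 0.99 is an assumption.
    """
    if candidates is None:
        candidates = [k / 100 for k in range(1, 100)]
    best_t, best_accuracy = None, -1.0
    for t in candidates:
        correct = sum((p >= t) == r for p, r in zip(propensities, responded))
        accuracy = correct / len(responded)
        if accuracy > best_accuracy:    # keep the first threshold on ties
            best_t, best_accuracy = t, accuracy
    return best_t, best_accuracy
```

The accuracy returned for the best threshold corresponds to the kind of figure quoted in the text (the share of cases where the predicted and actual response agree).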

Calibration to population size was applied to each of the four pre-calibration weights in the following groups:

sex (male, female) and age group (18–24, 25–34, 35–44, 45–54, 55–69);
statistical region (Rīga, Pierīga, Kurzeme, Vidzeme, Zemgale, Latgale);
educational attainment level (groups of ISCED levels: 0–2, 3–4, 5–8).
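Calibration to several sets of population totals at once can be illustrated with raking (iterative proportional fitting). The calibration method actually used is not stated in this report, so raking is an assumption here, as are the function name and the structure of the margins argument.

```python
def rake(units, weight_key, margins, iterations=50):
    """Adjust weights so that weighted counts match known population totals.

    margins: {variable: {category: population_total}}, e.g. one entry for
    sex by age group, one for region and one for education level
    (hypothetical structure). Every category present in the data needs a total.
    """
    weights = [unit[weight_key] for unit in units]
    for _ in range(iterations):
        for variable, totals in margins.items():
            # Current weighted count in each category of this variable.
            current = {}
            for unit, w in zip(units, weights):
                current[unit[variable]] = current.get(unit[variable], 0.0) + w
            # Scale each unit so the category total matches the population.
            weights = [
                w * totals[unit[variable]] / current[unit[variable]]
                for unit, w in zip(units, weights)
            ]
    return weights
```

In this sketch the function would be called once for each of the four pre-calibration weights, with the same margins, so that the resulting final weights can be compared on equal terms.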

Sampling errors of the main indicators were also calculated in different breakdowns. The accuracy of the four final weights was compared across the target variables (Participation rate in formal education and training, Participation rate in non-formal education and training, and Everyday learning) to identify the best weights. Compared with the others, the estimates of Everyday learning and Participation rate in non-formal education and training are slightly higher when the first weights are used, whereas the estimate of Participation rate in formal education is slightly lower. The evaluation of the estimates for the variables of interest shows consistency among all weights; precision and confidence intervals are very close across methods.

18.5.1. Imputation - rate

See table 18.5.1 “Imputation - rate” in annex “LV - QR tables 2022 AES (excel)”.

18.6. Adjustment

Not applicable.

18.6.1. Seasonal adjustment

Not applicable.

19. Comment

None.

Related metadata

Annexes

LV - QR tables 2022 AES (excel)
Questionnaire in Latvian