Adult Education Survey 2022

National Reference Metadata in Single Integrated Metadata Structure (SIMS)

Compiling agency: Statistics Finland



1. Contact
1.1. Contact organisation

Statistics Finland

1.2. Contact organisation unit
FI-00022 Statistics Finland
Finland
1.5. Contact mail address
FI-00022 Statistics Finland
Finland


2. Metadata update
2.1. Metadata last certified 21/12/2023
2.2. Metadata last posted 21/12/2023
2.3. Metadata last update 21/12/2023


3. Statistical presentation
3.1. Data description

The Adult Education Survey (AES) covers adults’ participation in education and training (formal - FED, non-formal - NFE and informal learning - INF). The 2022 AES focuses on people aged 18-69. The reference period for the participation in education and training is the twelve months prior to the interview.

Information available from the AES is grouped around the following topics:

  • Participation in formal education, non-formal education and training and informal learning
  • Volume of instruction hours
  • Characteristics of the learning activities
  • Reasons for participating
  • Obstacles to participation
  • Access to information on learning possibilities and guidance
  • Employer financing and costs of learning
  • Self-reported language skills

For further information see the 2022 AES legislation (http://ec.europa.eu/eurostat/web/education-and-training/legislation) and the 2022 AES implementation manual (http://ec.europa.eu/eurostat/web/education-and-training/methodology).

3.2. Classification system

- Classification of Learning Activities (CLA, 2016 edition)
- International Standard Classification of Education 2011 (ISCED 2011)
- Classification of Occupations 2008 (ISCO 08)
- Classification of economic activities Rev. 2 (NACE Rev. 2)

3.3. Coverage - sector

AES covers all economic sectors.

3.4. Statistical concepts and definitions

Definitions as well as the list of variables covered are available in the 2022 AES implementation manual (http://ec.europa.eu/eurostat/web/education-and-training/methodology).

3.5. Statistical unit

Individuals, non-formal learning activities.

3.6. Statistical population

Individuals aged 18-69 living in private households.

3.7. Reference area

Finland

3.8. Coverage - Time
We have carried out the following national adult education surveys (reference year and fieldwork period):
AES 1980
AES 1990
AES 1995 01/09-31/12
AES 2000 01/02-30/06
AES 2006 06/03-16/06
AES 2012 01/09-31/12
AES 2017 05/01-30/06
AES 2022 04/09/22-04/03/23
3.9. Base period

Not applicable.


4. Unit of measure

Number, EUR.


5. Reference Period

Our fieldwork period was 04/09/2022-04/03/2023.

Reference period is 12 months prior to the interview.


6. Institutional Mandate
6.1. Institutional Mandate - legal acts and other agreements

At European level:

Basic legal act: Regulation (EU) 2019/1700

Implementing act: Commission Implementing Regulation (EU) 2021/861

At national level:

Statistics Act (Tilastolaki) 280/2004

6.2. Institutional Mandate - data sharing

Not applicable.


7. Confidentiality
7.1. Confidentiality - policy

The compilation of statistics is regulated by the Statistics Act. Data collected in other contexts must be primarily exploited for statistics. The vast majority of data are drawn from diverse registers. Only such data that cannot be obtained from elsewhere are collected from data suppliers. State authorities have a statutory obligation to supply data from the information in their possession. Enterprises, municipal organisations and non-profit institutions are obliged to supply data on matters separately prescribed in law.

Basic statistical data are confidential. Permission to use data can be granted for scientific studies and statistical surveys by means of the user licence procedure. Permission to use data can be given so that data enabling direct identification of the statistical unit have been removed. Exceptions to secrecy are public data in the Business Register and public data describing the activity of central and local government authorities.

7.2. Confidentiality - data treatment

The processing of personal data during the compilation of statistics is subject to the provisions of the Personal Data Act. Personal data refer to identifiable data on all natural persons, including natural persons practising a trade or profession. Observation of the Act in the compilation of statistics requires careful advance planning of the data processing process prior to the establishment of registers of persons. Data needs are investigated thoroughly and data suppliers are only asked for such data that are necessary for the compilation of statistics. The data are protected, stored and destroyed at different stages of processing in accordance with data protection guidelines.

The Act allows subsequent use of personal data derived from diverse administrative registers for statistical purposes. Sensitive personal data can be used for statistical purposes, but only subject to specific conditions. The personal identity number may be used in the compilation of statistics if it is important to identify an individual in a register. This is usually necessary when registers of persons are combined.

The Personal Data Act requires that register descriptions are produced and made publicly available for all registers of persons.


8. Release policy
8.1. Release calendar

We have a publicly accessible release calendar. It lists the releases scheduled for the remainder of the current calendar year.

8.2. Release calendar access

Participation in adult education (Aikuiskoulutukseen osallistuminen) - Tilastokeskus (stat.fi)

8.3. Release policy - user access

Anyone who wants to use the Finnish EU-AES data must fill in an application for a licence to use statistical data. The applicant may be an official body, an institution, a person in charge of a study or an individual researcher. In every case, the specific person or persons who will use the data must be named. The period during which the data will be used must also be stated.

The data may only be used for statistical and research purposes. The purpose for which the AES data will be used must be specified, in sufficient detail, in a research plan. The users must also sign a pledge of secrecy.

The EU-AES data are made available to researchers in a form from which individual units cannot be identified. This means that personal identity numbers, names, addresses and precise dates of birth are removed from the data. The level of educational attainment is presented in the form of two numbers.

Researchers at universities in Finland can obtain the AES data free of charge. This is based on a contract between the Ministry of Education and Statistics Finland.


9. Frequency of dissemination

Every 6 years.


10. Accessibility and clarity
10.1. Dissemination format - News release

Not applicable.

10.2. Dissemination format - Publications
10.3. Dissemination format - online database

Not applicable.

10.3.1. Data tables - consultations

Not applicable.

10.4. Dissemination format - microdata access

Anyone who wants to use the Finnish EU-AES data must fill in an application for a licence to use statistical data. The applicant may be an official body, an institution, a person in charge of a study or an individual researcher. In every case, the specific person or persons who will use the data must be named. The period during which the data will be used must also be stated.

The data may only be used for statistical and research purposes. The purpose for which the AES data will be used must be specified, in sufficient detail, in a research plan. The users must also sign a pledge of secrecy.

The EU-AES data are made available to researchers in a form from which individual units cannot be identified. This means that personal identity numbers, names, addresses and precise dates of birth are removed from the data. The level of educational attainment is presented in the form of two numbers.

Researchers at universities in Finland can obtain the AES data free of charge. This is based on a contract between the Ministry of Education and Statistics Finland.

10.5. Dissemination format - other

Not applicable.

10.5.1. Metadata - consultations

Not applicable.

10.6. Documentation on methodology

Our methodological documents will be available by the end of 2024.

10.6.1. Metadata completeness - rate

Not applicable.

10.7. Quality management - documentation

These documents will be available in May 2024.


11. Quality management
11.1. Quality assurance

The questionnaire was based on the face-to-face interview questionnaire used in earlier rounds of the AES. This time the questionnaire was adapted to the requirements of web data collection and CATI interviews, so some item structures and questions were revised. The mixed-mode questionnaire was carefully designed and pre-tested. Pre-testing focused mainly on cognitive and usability testing of the web questionnaire. In all, 10 test interviews were conducted and the results were taken into account in the questionnaire design. The themes of distance learning and informal learning were also of interest in the pre-testing, and one focus group was organised to discuss them. The test results concerning distance learning and informal learning can also be used in the next AES.

11.2. Quality management - assessment

Overall assessment:

The survey was conducted as a mixed-mode data collection. Web data collection was offered as the first option and, for the first time, CATI as the second. The change in data collection method made it necessary to revise the questionnaire. The overall quality of the survey is good, and the introduction of the new data collection method did not significantly affect data quality.

Weaknesses:

  • Recall problems 
  • High response burden

Strong points:

  • The questionnaire includes many features that support recollection (a list of education organisations, introductions and visual guidance in questions concerning details of learning activities, etc.).
  • Before data collection, the interviewers took part in a training course. The course covered themes such as the background and objectives of the AES, the questionnaire and concepts used in the AES, and fieldwork instructions.
  • The usability of the web questionnaire was tested and judged to be adequate.

Most problematic variables:

On the job training (NFEGUIDEDJT): The concept of on-the-job training is problematic and difficult to translate unambiguously into Finnish. Earlier pre-tests showed that respondents tend to interpret the concept too broadly. To prevent over-reporting, especially in the web data collection, additional check questions were included in the web questionnaire.


12. Relevance
12.1. Relevance - User Needs

See annex "User Needs".



Annexes:
User Needs
12.2. Relevance - User Satisfaction

Not available.

12.3. Completeness

Our dataset covers all variables as requested in the 2022 AES legislation.

12.3.1. Data completeness - rate

Not applicable.


13. Accuracy
13.1. Accuracy - overall

The stratification of the sample into three age strata (18-24, 25-64, 65-69) was taken into account in the design weights.

We tested the weighting strategy presented by Laaksonen and Hämäläinen (2018). The method combines a response propensity model with calibration. First, we calculated the basic weights for respondents. We assumed that the response mechanism is random within strata but not random between strata. The basic weights are thus the inverses of the inclusion probabilities in each stratum.

Next we created a logit model to estimate response probabilities for each respondent. We tested several models with different covariates and their interactions included. The model with gender, age (in 5-year categories), level of education (in 3 categories), urbanization of municipality (in 3 categories), language (in 3 categories) and family status (in 4 categories) as covariates without any interactions gave the best results.

The basic weights were then divided by the estimated response probabilities. Next, the resulting weights were calibrated so that their sum equals the sum of the basic weights in each stratum. This was done by multiplying the weights by the ratio

$$q_h = \frac{\sum_{k \in r_h} w_k}{\sum_{k \in r_h} w_k / \hat{p}_k},$$

where $h$ refers to the stratum, $r_h$ to the set of respondents in stratum $h$, $w_k$ to the basic weights and $\hat{p}_k$ to the estimated response probabilities for each respondent.
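The propensity adjustment and within-stratum rescaling described above can be sketched in a few lines of Python. This is a minimal illustration rather than the production code: the function name, variable names and example figures are hypothetical, and the estimated response probabilities are assumed to come from a previously fitted logit model.

```python
from collections import defaultdict


def propensity_adjusted_weights(strata, basic_weights, response_probs):
    """Divide basic weights by estimated response probabilities, then rescale
    within each stratum so the adjusted weights sum to the basic-weight total."""
    raw = [w / p for w, p in zip(basic_weights, response_probs)]
    basic_sum = defaultdict(float)
    raw_sum = defaultdict(float)
    for h, w, r in zip(strata, basic_weights, raw):
        basic_sum[h] += w
        raw_sum[h] += r
    # Stratum-level ratio: sum of basic weights / sum of propensity-adjusted weights
    return [r * basic_sum[h] / raw_sum[h] for h, r in zip(strata, raw)]


# Hypothetical respondents: stratum label, basic weight, estimated response probability
strata = ["18-24", "18-24", "25-64", "25-64", "25-64"]
basic = [100.0, 100.0, 50.0, 50.0, 50.0]
probs = [0.25, 0.5, 0.5, 0.5, 0.25]

adjusted = propensity_adjusted_weights(strata, basic, probs)
```

In this toy example the adjusted weights in each stratum still sum to the stratum's basic-weight total (200 and 150), while respondents with a lower estimated response probability receive a larger share of it.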

In the next stage these weights were used as starting weights for calibration. We calibrated the weights to correspond to the population distributions of age, education, region and gender.
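The source does not state which calibration technique was applied in this final stage. As one common possibility, the sketch below uses raking (iterative proportional fitting) to match weighted totals to known population margins. The function name, the two example variables and all figures are hypothetical.

```python
def rake(weights, categories, targets, iterations=50):
    """Iterative proportional fitting: repeatedly scale the weights so that the
    weighted total of each category of each variable matches its target.

    categories: dict mapping variable name -> list of labels, one per respondent
    targets:    dict mapping variable name -> dict of label -> population total
    """
    w = list(weights)
    for _ in range(iterations):
        for var, labels in categories.items():
            totals = {}
            for lab, wi in zip(labels, w):
                totals[lab] = totals.get(lab, 0.0) + wi
            w = [wi * targets[var][lab] / totals[lab] for lab, wi in zip(labels, w)]
    return w


# Hypothetical starting weights and population margins
start = [1.0, 1.0, 1.0, 1.0]
cats = {
    "gender": ["F", "F", "M", "M"],
    "age": ["young", "old", "young", "old"],
}
pop = {
    "gender": {"F": 60.0, "M": 40.0},
    "age": {"young": 70.0, "old": 30.0},
}

calibrated = rake(start, cats, pop)
```

After raking, the weighted totals reproduce both sets of margins at once. In practice the starting weights would be the propensity-adjusted weights, and the margins would be the population distributions of age, education, region and gender.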

Laaksonen S. & Hämäläinen A.: Joint response propensity and calibration method, Statistics in Transition, May 2018.

13.2. Sampling error

The sample corresponds to the population quite well. There are some slight differences: people aged 25-54 are slightly overrepresented in the sample, as are persons living in rural areas.

13.2.1. Sampling error - indicators
See table 13.2.1 “Sampling errors - indicators for 2022 AES key statistics” in annex “FI - QR tables 2022 AES (excel)”.
 
The standard error was calculated here using the formula for a simple random sample and assuming that non-response was negligible:
 
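The (1-α)·100% confidence interval was calculated using the formula $\Pr\{p \in (\hat{p} - t_\alpha d(\hat{p}),\ \hat{p} + t_\alpha d(\hat{p}))\} = 1-\alpha$, where $\hat{p}$ is the estimate and $d(\hat{p})$ its standard error.

The value $t_\alpha$ corresponding to a 95% confidence interval is 1.96.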
 
The coefficients of variation were calculated using the formula v=s/x, where s=standard deviation and x=mean. The coefficient of variation is meaningful only for variables measured on at least an interval scale.

All these calculations were made using SAS software.
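The quantities described above can be illustrated with a short Python sketch. It is not the SAS code used for the survey. It assumes the textbook simple-random-sample standard error of a proportion, sqrt(p(1-p)/n), without a finite population correction, and the sample standard deviation for the coefficient of variation. All function names and example figures are hypothetical.

```python
import math
import statistics


def proportion_standard_error(p, n):
    """Standard error of a proportion under simple random sampling
    (assumed form: no finite population correction)."""
    return math.sqrt(p * (1.0 - p) / n)


def confidence_interval(p, n, t=1.96):
    """Symmetric interval p -/+ t * d(p); t = 1.96 gives a 95% interval."""
    d = proportion_standard_error(p, n)
    return (p - t * d, p + t * d)


def coefficient_of_variation(values):
    """v = s / x, with s the sample standard deviation and x the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical example: estimated participation rate of 50% from 2500 respondents
low, high = confidence_interval(0.5, 2500)
cv = coefficient_of_variation([2, 4, 6])
```

With these made-up inputs the standard error is 0.01, so the 95% interval runs from 0.4804 to 0.5196.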
13.3. Non-sampling error

See items 13.3.1.-13.3.5.

13.3.1. Coverage error

The register used in sampling was the population database maintained by Statistics Finland (2022). The time lag between the last update of the register and the actual sampling was very short. Therefore, there are no known shortcomings.

13.3.1.1. Over-coverage - rate

See table 13.3.1.1 “Over-coverage - rate” in annex “FI - QR tables 2022 AES (excel)”.

13.3.1.2. Common units - proportion

Not applicable.

13.3.2. Measurement error

Main sources of measurement errors

Recall problems

The 12-month reference period is a long time over which to recall all the non-formal education and training a respondent has participated in. This may cause underreporting. In addition, difficulty in recalling the name of the organiser of a non-formal learning activity may lead to organisers being placed in the wrong category.

Because of the length of the questionnaire, respondents found it difficult to remember the names of the non-formal learning activities they had mentioned at the beginning when they later answered the course-specific questions. To ease the burden on respondents, the questionnaire was designed to repeat the previously mentioned course name more often.

Understanding of concepts and terms

The absence of an interviewer in the web data collection may have led to some misinterpretations of individual concepts or of the meaning of questions. For example, although the idea of deliberate learning is included in the INF questions, respondents' interpretations of the scope of INF may have been somewhat broader than intended. This problem was recognised in the pre-testing, but there was no evident way to prevent it. This source of measurement error is more pronounced when a self-administered data collection mode is used.

Response burden

The questionnaire is long and answering it requires a lot of effort from a respondent. Some respondents may have found the response burden too high and given cursory answers instead of considering each question thoroughly. This source of measurement error concerns both modes of data collection. Efforts to prevent the error were made in the questionnaire design (the web questionnaire was shorter than the interview, as the national questions were only included in the interview; the usability of the web questionnaire was tested to enable fluent answering) and in the interviewer training (discussion about how to motivate respondents).

13.3.3. Non response error

Non-response was higher among younger persons (aged 18-24 and 25-34) and among those with less than upper secondary education. All in all, however, the respondents were relatively representative of the population, and these distortions in terms of level of education and age could be corrected by means of weighting coefficients.

13.3.3.1. Unit non-response - rate

See table 13.3.3.1 “Unit non-response - rate” in annex “FI - QR tables 2022 AES (excel)”.

13.3.3.2. Item non-response - rate

See table 13.3.3.2 “Item non-response rate” in annex “FI - QR tables 2022 AES (excel)”.

13.3.4. Processing error

The data were checked by two researchers from March 2023 to May 2023. They checked and corrected two data sets (national AES and EU-AES) at the same time. Some variables differed between the two data sets, and the labels (in Finnish and in English) differed as well. The data were processed using SAS software.

Both data sets are large and quite complicated, and they have two statistical units: individuals and learning activities. The main difficulty is connected to the learning activity as a statistical unit. A respondent may be asked for more detailed information on two non-formal and one formal education activity. A small mistake in the interview, for example concerning the type of education, could lead to quite large corrections in the data and to higher item non-response.

Fields of training were first coded by the researchers.

The remaining open questions were post-coded by the researchers. The most difficult cases were discussed and resolved in the research group's meetings.

We do not foresee processing errors being any problem for us.

13.3.5. Model assumption error

Not applicable.


14. Timeliness and punctuality
14.1. Timeliness

We do not consider the time lag between the reference period and data availability to be too long. It could still be reduced by automating the process as far as possible.

14.1.1. Time lag - first result

216 days.

14.1.2. Time lag - final result

11 months.

14.2. Punctuality

90 days.

See table 14.2 “Project phases - dates” in annex “FI - QR tables 2022 AES (excel)”.

14.2.1. Punctuality - delivery and publication

Not applicable.


15. Coherence and comparability
15.1. Comparability - geographical

See table 15.1 “Deviations from 2022 AES concepts and definitions” in annex “FI - QR tables 2022 AES (excel)”.

Some additional variables/information related to COVID-19 were collected, see table 15.1.

15.1.1. Asymmetry for mirror flow statistics - coefficient

Not applicable.

15.2. Comparability - over time

There are no changes that might limit the use of the data for comparisons over time.

See table 15.2 “Comparability - over time” in annex “FI - QR tables 2022 AES (excel)”.

15.2.1. Length of comparable time series

Not applicable.

15.3. Coherence - cross domain

There are no major differences between the results of AES 2022 and other data sources (CVTS, AES 2017).

See table 15.3 “Coherence - cross-domain” in annex “FI - QR tables 2022 AES (excel)”. 

15.3.1. Coherence - sub annual and annual statistics

Not applicable.

15.3.2. Coherence - National Accounts

Not applicable.

15.4. Coherence - internal

AES results for a given data collection round are based on the same microdata and results are calculated using the same estimation methods, therefore the data are internally coherent.


16. Cost and Burden

Costs:

Staff involved in administering the survey (in full-time equivalent, without the interviewers): 2 researchers

Costs of the field work: Field work 96 000 €, planning and programming 137 000 €, total 233 000 €

Burden:

Average time for answering the questionnaire (in minutes): CAWI 29 minutes and CATI 43 minutes.

Because certain variables are mandatory under the regulation, the opportunities to minimise burden are quite limited. We did two things: we used register-based information as much as possible, and we reduced the number of questions of national interest (which are not mandatory). Because the Ministry of Education and Culture finances the survey, we also had to take its needs into account.

With these modifications, we succeeded in shortening the average answering time by approximately 6 minutes compared to AES 2016.


17. Data revision
17.1. Data revision - policy

Not applicable.

17.2. Data revision - practice

Not applicable.

17.2.1. Data revision - average size

Not applicable.


18. Statistical processing
18.1. Source data

Sampling frame: Population database maintained by Statistics Finland, 2022.

Sampling design: Stratified random sampling of elements.

The current sampling design for the Finnish Adult Education Survey is a stratified design in which the strata are age categories (18-24, 25-64 and 65-69). The precision requirements for the 2022 survey are set out in Annex 2 of Regulation (EU) 2019/1700. The youngest age group is included in the study only for national purposes, because Finland has a derogation concerning the precision requirement for the participation rate in formal education and training (age 18-24). We will get this information from a register source in the near future. The testing was therefore done in order to meet the precision requirement for the participation rate in non‐formal education and training (age 25‐69).

The principle of the testing was to use stratum-level variance estimates of the study variables from the previous data collection and to vary the number of respondents in the standard error formula for stratified sampling. The estimates for the precision requirement were calculated under different schemes. The required sample size for each stratum is obtained by multiplying the number of respondents under the specified scheme by the inverse of the response rate in that stratum. The overall sample size is the sum of these stratum sample sizes.
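The final step of this procedure, inflating the target number of respondents by the inverse of the stratum response rate, can be sketched as follows. The target respondent counts and response rates below are hypothetical illustration values, not the figures used by Statistics Finland, and the function name is an assumption.

```python
import math


def required_sample_sizes(target_respondents, response_rates):
    """Gross sample size per stratum: target respondents divided by the
    expected response rate, rounded up to a whole person."""
    return {
        stratum: math.ceil(target_respondents[stratum] / response_rates[stratum])
        for stratum in target_respondents
    }


# Hypothetical scheme: respondents needed and expected response rate per stratum
targets = {"18-24": 150, "25-64": 3000, "65-69": 240}
rates = {"18-24": 0.5, "25-64": 0.625, "65-69": 0.75}

sizes = required_sample_sizes(targets, rates)
total_sample_size = sum(sizes.values())
```

A lower expected response rate in a stratum directly increases the gross sample that has to be drawn for it, which is why a falling response rate leads to a larger overall sample.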

Table 1 (in the annex below) demonstrates this strategy by showing how the coefficient of variation would change if the number of respondents in the largest stratum increased by 574 and the response rate in each stratum stayed the same as in the previous survey.

The precision requirements for the 2016 AES stated that the estimated absolute margin of error should not exceed 1.7 percentage points for countries like Finland, that is, countries with a population aged 25 to 69 of between one million and three and a half million. The legislation for the Adult Education Survey has changed since the previous data collection to include the age group 65-69. With this new age group included, the absolute margin of error for the previous AES was 1.6 percentage points. Therefore, under the new legislation, the precision requirement would be met using the same allocation strategy as in the previous AES, where the sample sizes for the three strata were 350, 5400 and 400 respectively.

However, our testing was based on the assumption that the response rate would stay the same as in the previous survey. We know from other surveys that the response rate has decreased during the past few years. For example, in the labour force survey, the non-response rate was 34.4 percent in 2018 and 48.0 percent in 2021. We therefore decided to increase the sample size so that the number of respondents would be sufficient to meet the precision requirement in the upcoming survey. Based on the variance estimates from the previous survey, we decided to increase only the sample size of the age group 25-64. The total sample size is 7500, allocated to the three strata as follows: 350 for the age group 18-24, 6750 for the second stratum (25-64) and 400 for the oldest age group (65-69).

Stratification according to age categories:
18-24: n=350
25-64: n=6 750
65-69: n=400

Within each stratum, systematic random sampling was applied to the frame sorted by domicile code (i.e. unique address code), which results in implicit geographic stratification.
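Systematic sampling from a sorted frame can be illustrated as below. This is a generic sketch under stated assumptions, not the procedure actually run by Statistics Finland: it uses a random start and a fixed fractional interval, and the frame is a made-up list of (domicile code, person id) pairs for a single stratum.

```python
import random


def systematic_sample(frame, n, rng):
    """Select n units from an ordered frame: pick a random start within the
    first sampling interval, then step through the frame at a fixed interval."""
    interval = len(frame) / n
    start = rng.random() * interval
    last = len(frame) - 1
    # min() guards against a floating-point index landing just past the end
    return [frame[min(int(start + i * interval), last)] for i in range(n)]


rng = random.Random(2022)

# Hypothetical stratum frame: (domicile code, person id), sorted by domicile code
people = [(rng.randrange(1000), person_id) for person_id in range(100)]
frame = sorted(people)

sample = systematic_sample(frame, 10, rng)
```

Because the frame is sorted by domicile code before selection, the selected units are spread evenly across the code range, which is the implicit geographic stratification mentioned above.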

See also table 18.1 “Source data” in annex “FI - QR tables 2022 AES (excel)”.



Annexes:
Table 1 Change in the coefficient of variation
18.2. Frequency of data collection

Every 6 years.

18.3. Data collection

See also table 18.1 “Source data” in annex “FI - QR tables 2022 AES (excel)”.

The national questionnaires are attached.



Annexes:
Questionnaire in Finnish, Swedish and English
18.4. Data validation
The data were checked by two researchers from March 2023 to May 2023. They checked and corrected two data sets (national AES and EU-AES) at the same time. Some variables differed between the two data sets, and the labels (in Finnish and in English) differed as well. The data were processed using SAS software.

We have not found any variables with more errors than the others. We do not foresee this being a problem for us.
18.5. Data compilation

No imputation was made.

18.5.1. Imputation - rate

None.

See table 18.5.1 “Imputation - rate” in annex “FI - QR tables 2022 AES (excel)”.

18.6. Adjustment

Not applicable.

18.6.1. Seasonal adjustment

Not applicable.


19. Comment

None.


Related metadata


Annexes
FI - QR tables 2022 AES (excel)