3. Statistical processing |
|
1.Survey process and timetable |
Preparation work
FSS preparatory work began in the first semester of 2015 and extended until the third quarter of 2016. The main tasks of the teams responsible for organising the FSS were: consultation of users; definition, design and composition of the questionnaire and its instruction manual; update of the list of producers; and recruitment and training of the interviewers. It was also necessary to prepare web-based data collection. This collection mode was used on a sub-sample and entailed a set of new preparatory tasks, namely adapting the questionnaire and the technical definitions, and developing and testing the web registration application. For the face-to-face collection mode, the questionnaire was not tested because it had remained mostly unchanged since 2013.
Chronogram of the main operation activities
The main activities of this statistical operation, from preparation to the dissemination of results, are recorded in the chronogram shown in 3-1. Calendar (overview of work progress) (in annex).
The FSS data collection was split in two periods: September-October 2016 for the web data collection; from November 2016 to May 2017 for the face-to-face data collection. Data collection also included critical appraisal, recording, validation and analysis. Collected data was subject to a critical appraisal according to the guidelines defined in the control manual (document provided to field chain and containing, for each question, the procedures to be adopted for a preliminary control of data collection, especially identifying compulsory questions, relationships among variables, etc.). In turn, the consistency of collected data should be analysed in line with the provisions of the Instruction Manual and its alignment with local circumstances. |
2. The bodies involved and the share of responsibilities among bodies |
Statistics Portugal (INE) was the entity responsible for conducting the FSS, in cooperation with the Regional Statistical Office of the Azores and the Regional Directorate of Statistics of Madeira. The Agriculture and Environment Statistics Unit of the Economic Statistics Department and the Data Collection Department were the two units responsible for the operation at national level, in charge of organising and conducting all tasks from data collection to data validation and dissemination. This statistical operation involved over 200 staff across the whole country (Mainland and Islands) and relied on compliance with pre-defined procedures for organising, managing, monitoring and controlling data collection. The FSS face-to-face data collection model centred on the collection services, with the coordination and technical support structure upstream and the field teams downstream. The organisational structure of the statistical operation is shown in 3-2. FSS – Organisational structure (in annex).
The responsibilities are allocated as follows:
Interviewer: data recording, validation, critical appraisal, analysis and confirmation/correction in computer-readable format.
Local technical staff member (Portuguese acronym: TL):
- guiding and monitoring data collection, recording, validation and analysis,
- ensuring the organisational logistics and administrative management at their section level,
- ensuring information sessions for interviewers,
- ensuring, in cooperation with the regional coordinating body, the allocation of the different intervening parties to the SAGR chain and the distribution of work to the interviewers,
- managing the transfer of questionnaires, notably by reallocating questionnaires within the collection section and between this section and other sections of the same region or of other regions,
- helping interviewers to overcome difficulties and assessing the quality of the information they provide, with the option of returning certified questionnaires to them,
- preparing meetings and drafting periodical monitoring reports on the work.
Regional coordinating body (Portuguese acronym: CR):
- recruiting, selecting and training human resources, effectively establishing shared intervention in terms of operational management,
- coordinating the operation in each region, being responsible for compliance with the respective budget (upon final validation of collection structure expenses),
- guiding and monitoring data collection, recording, validation and analysis at regional level,
- ensuring information sessions for the interviewers, and resolving difficulties that the interviewers considered insurmountable,
- preparing periodical monitoring reports on the work at regional level, as well as a regional report on the operation.
National coordinating body (Portuguese acronym: CN), composed of representatives of the Economic Statistics Department and the Data Collection Department, both organisational units of Statistics Portugal:
- defining the organisational and logistical structure of the statistical operation,
- monitoring the work and gauging the need to intervene in order to solve critical situations.
It also assumed responsibility for the project’s budget control.
Structure of the field chain
The collection structure was initially sized based on the number of agricultural holdings in the farm register, the size of the geographical area covered by each team and, at the upper levels of the chain of collection, the profile and availability of human resources.
Geographical distribution/organisation
The field structure was composed of 7 services distributed across the country, with 32 staff supervising a team of 216 interviewers. As the collection stage progressed, the national coordinating body sent the regional coordinating bodies the current status of the collection on a weekly basis. For each collection service, this included the number of questionnaires collected and still to be collected, together with their status and certification. |
3. Serious deviations from the established timetable (if any) |
There were no deviations from the established timetable. |
Annexes: 3-1. Calendar (overview of work progress) 3-2. FSS organisational structure |
3.1. Source data |
1. Source of data |
The FSS is a sample survey. |
2. (Sampling) frame |
The source of the frame was BAA (farm register). To update the list of holders, and with reference to the agricultural farm register/agriculture sample base (an agricultural holdings base to support agricultural surveys), ad hoc cross-checks were made with statistical files (statistical units file and specific surveys) and with data from administrative sources, namely the Financing Institute for Agriculture and Fisheries (Portuguese acronym: IFAP).
The statistical files are:
SOURCE: Agricultural surveys (vegetable, orchard, floriculture, etc.)
- Responsibility: Statistics Portugal
- Coverage: All agricultural holders with specific productions
- Geographical scope: Mainland, Autonomous Region of the Azores, Autonomous Region of Madeira
- Reference point: 2009 to 2015
SOURCE: Statistical units file
- Responsibility: Statistics Portugal
- Coverage: All companies and self-employed people
- Geographical scope: Mainland, Autonomous Region of the Azores, Autonomous Region of Madeira
- Reference point: online (2016)
For administrative sources, see item 4.1 below.
In addition, use was made of other sources:
- files with specific information from the Autonomous Region of Madeira,
- on an ad hoc basis, information scattered in files from other statistical surveys of the Economic Statistics Department, notably the survey populations of statistical operations targeted at poultry farms and nurseries.
The type of frame is a list frame.
The list of producers became available in April 2016. To update the different available sources and make them compatible, Statistics Portugal used a tool developed in QualityStage, a data quality management application. This application supported the standardisation and consolidation of names and addresses. Using the QualityStage tool, a sequential process was established to compare sources two by two. Once survival criteria had been defined, each comparison gave rise to a provisional list of producers. The provisional result was then compared with a new source, and so on until a final consolidated list of producers was obtained. Survival rules were needed to select the producer record that is retained from a group of potentially identical producers; essentially, the rules relate to the quality/timeliness associated with each source. In the 2016 process the list was updated with the latest data obtained from the Financing Institute for Agriculture and Fisheries and the Autonomous Region of Madeira, and with data available in the Economic Statistics Department. After the QualityStage software had been run and the survival rules for producers applied, a provisional list was obtained, which underwent various analyses to enhance, improve and update the information:
(a) Spelling correction of names, addresses and cities not detected in the standardisation made by the software application;
(b) Construction of Access queries to detect and eliminate any remaining double counting: same names/District/Municipality/Commune (DT/MN/FREG)/Tax identification number; names/telephone/DT/MN/FREG/address; etc.
(c) Comparison with data from the statistical units file to complement registers with missing information, in particular names, incomplete or unknown addresses, DT/MN/FREG/4 and 7-digit postal_code;
(d) Update of the DT/MN/FREG code of the holding and the producer with current territorial referencing (Portuguese acronym: REFTER) codes, ensuring completion of all fields;
(e) Assignment of cities of the Portuguese Mail Services (CTT) when there is a common 7-digit DT/MN/FREG/postal_code key between the two bases of comparison;
(f) Completion of invalid postal codes through the DT/MN/FREG link with the CTT table where there is a direct match;
(g) Replacement of legal person identification numbers beginning with 8xx by new numbers beginning with 1xx or 2xx;
(h) Filtering of telephone characters and elimination of telephone numbers without 9 digits;
(i) Removal of the content of the address field whenever the address and city fields were identical. |
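The sequential comparison with survival rules, together with clean-up step (h), can be illustrated with a minimal sketch. It is not the QualityStage implementation: the matching key (tax identification number only), the `Producer` fields, the source names and their priority order are all assumptions made for illustration. The report states only that the rules relate to the quality/timeliness of each source.

```python
from dataclasses import dataclass

# Hypothetical source priorities (lower value = preferred record). The actual
# survival rules are not given in the report.
SOURCE_PRIORITY = {"IFAP": 0, "statistical_units_file": 1, "agricultural_surveys": 2}


@dataclass(frozen=True)
class Producer:
    tax_id: str   # assumed matching key; the real process also matched names/addresses
    name: str
    phone: str
    source: str


def clean_phone(phone: str) -> str:
    """Step (h): keep digits only and drop numbers that do not have 9 digits."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    return digits if len(digits) == 9 else ""


def consolidate(current: list[Producer], new_source: list[Producer]) -> list[Producer]:
    """Compare two lists and keep one surviving record per tax_id."""
    survivors: dict[str, Producer] = {}
    for rec in current + new_source:
        kept = survivors.get(rec.tax_id)
        if kept is None or SOURCE_PRIORITY[rec.source] < SOURCE_PRIORITY[kept.source]:
            survivors[rec.tax_id] = rec
    return list(survivors.values())


def build_list(sources: list[list[Producer]]) -> list[Producer]:
    """Fold the sources in one at a time, then clean the telephone field."""
    result: list[Producer] = []
    for src in sources:
        result = consolidate(result, src)
    return [Producer(p.tax_id, p.name, clean_phone(p.phone), p.source) for p in result]
```

Folding the sources in one at a time mirrors the "two by two" comparison: each intermediate result plays the role of the provisional list.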
3. Sampling design |
3.1 The sampling design |
Single-stage stratified random sampling of holdings. The sample was selected independently in each stratum by sequential simple random selection without replacement. That is, within each stratum the holdings were sorted by the random number associated with them, and the first holdings in that order, up to the stratum sample size, were selected for the sample. |
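The selection procedure described above can be sketched as follows. This is an illustrative Python version only (the actual selection was programmed in SAS); the function names, the seed and the input structures are assumptions.

```python
import random


def select_stratum(holding_ids, n, rng):
    """Attach a random number to each holding, sort by it, keep the first n."""
    keyed = [(rng.random(), h) for h in holding_ids]
    keyed.sort()
    return [h for _, h in keyed[:n]]


def select_sample(strata, sizes, seed=2016):
    """Select independently in each stratum.

    strata: dict mapping stratum label -> list of holding identifiers
    sizes:  dict mapping stratum label -> sample size for that stratum
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return {s: select_stratum(ids, sizes[s], rng) for s, ids in strata.items()}
```

Sorting on an independent uniform random number and taking the first n units gives a simple random sample without replacement. When the stratum sample size equals the stratum size, the stratum is fully covered.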
3.2 The stratification variables |
The sample was stratified by agrarian region, geographical region (NUTS level II), groups of general farm type, and a variable (ST) which characterises the holding by livestock numbers or by size classes of UAA. To obtain good results for some variables that are significant at national level but concentrated in a relatively small number of holdings, a stratification “in cascade” was adopted. Some strata were built to contain all the holdings of the region with a non-zero value, or a value above a certain threshold, for those variables. The stratification is called “in cascade” because the holdings with values of the variables concerned above certain limits were progressively isolated. All the remaining holdings, not belonging to these special strata, were stratified by size classes of UAA. The stratification and the variables used can be found in annex: 3.1-3.2. The stratification variables and the sampling stage. |
3.3 The full coverage strata |
Strata with fewer than 10 holdings were fully covered. |
3.4 The method for the determination of the overall sample size |
The size of the sample was calculated by NUTS II in order to meet the accuracy requirements defined in the EU Regulation. The total sample size was 27575 holdings, corresponding to a sampling fraction of approximately 9.1%. This sample size was considered adequate to obtain, for each region (NUTS II), sufficient precision for the most important variables. The sample size was later adjusted to improve the accuracy of some indicators considered relevant. |
3.5 The method for the allocation of the overall sample size |
For the non-exhaustive strata, Neyman allocation was used to calculate the optimal sample size for each stratum, based on the number of holdings. See annex: 3.1-3.5 Neyman allocation. |
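For orientation, the textbook form of Neyman allocation distributes a total sample size n across strata in proportion to N_h × S_h, where N_h is the number of holdings in stratum h and S_h the standard deviation of the allocation variable in that stratum. The sketch below implements that general formula. The choice of allocation variable, the standard deviations and the rounding rule are assumptions; the method actually applied is the one documented in the annex.

```python
def neyman_allocation(n, sizes, std_devs):
    """Textbook Neyman allocation: n_h = n * N_h * S_h / sum(N_k * S_k).

    n:        total sample size for the non-exhaustive strata
    sizes:    dict stratum -> number of holdings N_h
    std_devs: dict stratum -> standard deviation S_h (assumed input)
    Each n_h is rounded and capped at N_h, so the total can differ slightly from n.
    """
    weights = {h: sizes[h] * std_devs[h] for h in sizes}
    total = sum(weights.values())
    return {h: min(sizes[h], round(n * weights[h] / total)) for h in sizes}
```

A small stratum with high variability receives as large a sample as a big homogeneous one when the products N_h × S_h are equal. A stratum whose allocation reaches N_h is in effect fully covered.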
3.6 Sampling across time |
Not applicable. |
3.7 The software tool used in the sample selection |
The SAS package was used to study and select the sample, with purpose-written programs. |
3.8 Other relevant information, if any |
Nothing relevant. |
4. Use of administrative data sources |
4.1 Name, time reference and updating |
The farm register was updated by cross-checking the agricultural sample base (based on the 2009 General Agricultural Census and updated on the basis of agricultural surveys and other sources) with data from the following administrative sources:
SOURCE: Payments under PAC policy
- Responsibility: Financing Institute for Agriculture and Fisheries (IFAP) - IACS
- Coverage: Agricultural holders that actually received aid in the reference year
- Geographical scope: Mainland, Autonomous Region of the Azores, Autonomous Region of Madeira
- Reference point: Crop year 2015/2016
- Legal Basis: Regulation (EU) No 1307/2013 of the European Parliament and of the Council of 17 December 2013 establishing rules for direct payments to farmers under support schemes within the framework of the common agricultural policy and repealing Council Regulation (EC) No 637/2008 and Council Regulation (EC) No 73/2009
- Updating frequency: yearly
SOURCE: SNIRA (Animal register)
- Responsibility: Financing Institute for Agriculture and Fisheries (IFAP) - IACS
- Coverage: livestock keepers at national level
- Geographical scope: Mainland, Autonomous Region of the Azores, Autonomous Region of Madeira
- Reference point: September 2016
- Legal Basis: Council Directive 92/102/EEC of 27 November 1992 on the identification and registration of animals; Commission Regulation (EC) No 1678/98 of 29 July 1998 amending Regulation (EEC) No 3887/92 laying down detailed rules for applying the integrated administration and control system for certain Community aid schemes; Council Regulation (EC) No 21/2004 of 17 December 2003 establishing a system for the identification and registration of ovine and caprine animals and amending Regulation (EC) No 1782/2003 and Directives 92/102/EEC and 64/432/EEC; Commission Regulation (EC) No. 759/2009 amending the Annex to Council Regulation (EC) No. 21/2004 establishing a system for the identification and registration of ovine and caprine animals; Commission Decision 2010/280/EU amending Decision 2006/968/EC implementing Council Regulation (EC) No 21/2004 as regards guidelines and procedures for the electronic identification of ovine and caprine animals.
- Updating frequency: online (cattle); three times a year (pigs); yearly (sheep and goats) |
4.2 Organisational setting on the use of administrative sources |
The national legislation provides for access to administrative records (Article 4 (2)) which in the case of the FSS are used to validate the information provided by respondents. Statistics Portugal does not participate in the conceptual design and subsequent related revisions of the administrative sources. |
4.3 The purpose of the use of administrative sources - link to the file |
Please access the information in the file at the link: (link available as soon as possible) |
4.4 Quality assessment of the administrative sources |
Payments under PAC policy IFAP (IACS) | Method | Shortcoming detected | Measure taken |
- coherence of the reporting unit (holding) | | There are no significant differences between the definitions of holding. However, several beneficiaries may apply for different payments in the same holding. | To improve the comparability between the FSS data and this source, one question about the beneficiaries associated with the holding was included (see, in item 3.3-5, the annex Questionnaire FSS 2016 Mainland, question number 22). |
- coherence of definitions of characteristics | Most characteristics have the same definition in the FSS and in this source, after an effort to harmonise them. | | |
- coverage: | | | |
over-coverage | | | |
under-coverage | | Not all units of the farm register are present in this source, since it only includes the ones that actually received aid in the reference year. | Taking this shortcoming into account, values from this source can only be used to validate microdata (individual data) or as the minimum allowable value for a given aggregate characteristic. |
misclassification | | | |
multiple listings | | | |
- missing data | | | |
- errors in data | | In this source, oats harvested green are not infrequently misclassified as oats for grain production, compared with the FSS. | |
- processing errors | | | |
- comparability | | See item coverage: under-coverage | See item coverage: under-coverage |
- other (if any) | | | |
SNIRA (Animal register) IFAP (IACS) | Method | Shortcoming detected | Measure taken |
- coherence of the reporting unit (holding) | | There are differences between the definitions of holding in SNIRA and the FSS. It is not uncommon for one holding in the FSS to correspond to two or more holdings in SNIRA. | To improve the comparability between the FSS data and this source, one question about the beneficiaries associated with the holding was included (see, in item 3.3-5, the annex Questionnaire FSS 2016 Mainland, question number 22). |
- coherence of definitions of characteristics | All characteristics have the same definition in the FSS and in this source. | | |
- coverage: | | | |
over-coverage | | | |
under-coverage | | Not all units of the farm register are present in this source. The herd database (namely for sheep, goats and pigs) is not yet exhaustive, since the registration of all holders in the database is still ongoing. The missing units are necessarily small holders, namely those under the thresholds of the aid scheme. | Taking this shortcoming into account, values from this source can only be used to validate microdata (individual data) or as the minimum allowable value for a given aggregate characteristic. |
misclassification | | | |
multiple listings | | | |
- missing data | | | |
- errors in data | | | |
- processing errors | | | |
- comparability | | See item coverage: under-coverage | See item coverage: under-coverage |
- other (if any) | | | |
4.5 Management of metadata |
Both sources are managed by IFAP (IACS). The metadata describing both administrative sources are systematically stored and maintained over time by Statistics Portugal in dedicated databases. |
4.6 Reporting units and matching procedures |
External source | File | Correspondence between the INE’s definition of holding and the one from the external source |
Financing Institute for Agriculture and Fisheries (IFAP) – IACS | Holders that received payments under the common agricultural policy from IFAP in 2016 | Theoretically there are no significant differences between the concepts of INE and IFAP. However, the beneficiaries of IFAP and the holders often do not match perfectly (e.g. one holding may correspond to two or more beneficiaries of IFAP, when/if different household members apply for aid). |
Financing Institute for Agriculture and Fisheries (IFAP) – IACS | SNIRA (Animal Register) – livestock keepers at national level | There are differences between the "holding" of SNIRA and the "holding" of INE (e.g. one holding may correspond to two or more holdings of SNIRA). |
4.7 Difficulties using additional administrative sources not currently used |
Organic farming: The data produced in Portugal under Regulation (EC) No 834/2007 present problems of quality and timeliness. There are also differences in concepts, particularly because insects can be included in animal production and wild plant products in crop production. Moreover, certification companies often report to the Ministry of Agriculture only aggregate information by control body rather than individual data. |
Annexes: 3.1-3.5. Neyman allocation 3.1-3.2. The stratification variables and the sampling stage |
3.2. Frequency of data collection |
Frequency of data collection |
Since 1989, FSS data collection has taken place in the following years:
- 1989 - Census
- 1993
- 1995
- 1997
- 1999 - Census
- 2003
- 2005
- 2007
- 2009 - Census
- 2013
- 2016
|
|
3.3. Data collection |
1. Data collection modes |
The survey was conducted using two data collection modes: i) Internet, using questionnaires completed through the web component WEBINQ; ii) face-to-face interviews, with collection based on paper questionnaires. The WEBINQ questionnaire was available to a sub-sample during September and October 2016. After that period, the units that had not answered through this mode were transferred to the face-to-face mode, which was also used for the remaining units of the sample. In the latter mode, interviewers were also responsible for recording the data on laptops. The type of data recording may be characterised as “heads up”, given that SAGR, the tailor-made software application supporting the agricultural survey system of Statistics Portugal, gave the staff member recording data on a laptop instantaneous feedback on the information being recorded. |
2. Data entry modes |
Internet data collection mode:
Data capture and editing in the Internet collection mode were supported by WEBINQ, a component of SIGINQ, Statistics Portugal's integrated approach to survey management systems. In this component (WEBINQ, questionnaires on the web) respondents submit their answers online. Acknowledging the relevance of data capture and editing to final data quality, WEBINQ follows the "Data Version Stack" concept, which has four main rules: i) there is a single response for each tuple (survey, data reference period, statistical unit); ii) data editing creates a new version of the response; iii) all versions of the responses are saved; iv) the last version goes to the data warehouse. The Data Version Stack allows the survey manager to track changes and ensure data quality. For more detailed information, see annex 3.3-2 Data collection - Statistics Portugal Survey Management System Architecture (also available at https://ec.europa.eu/eurostat/cros/system/files/NTTS2013fullPaper_146.pdf).
Face-to-face data collection mode:
A generic management and recording application, customisable for each survey (set of validation items and rules), was developed to support the agricultural direct collection statistical operations. The application is composed of the following modules:
• Management of the survey. Import of validation items and rules;
• Management of agricultural holdings. Import and consultation of the sample. Formulation of monitoring lists;
• Management of the chain of collection. Assignment of user profiles and allocation of agricultural holdings to interviewers;
• Management of questionnaires. Includes the recording module;
• Payments. Introduction of generic and specific variables to prepare payment slips;
• Data analysis. Totalisers, ad-hoc selections and comparison with external sources;
• Maps;
• Synchronisation (between the interviewers’ laptops and the central database).
A web application was developed with a central environment for survey management and analysis, and a local environment on laptops for questionnaire recording and validation by interviewers.
Hardware
Hardware used:
• 1 web server/application server (virtual machine, 4 CPU, 4 GB RAM)
• 1 database server (16 GB RAM)
Software
Software used:
• Java
• Oracle 10 and Oracle Express
Software architecture: see 3.3-2. Physical model of the application in annex.
Strengths of the collection and recording application
• Solution that may be used in other statistical operations;
• Recording by interviewers. Correction of errors by interviewers;
• Validation rules editor. Time saving in the programming and testing of rules;
• Selection editor (ad-hoc queries). Research by users with no need for programming;
• Update of the local online application;
• Advantages inherent in a web application (broadly-based access, central application update, centralised database, online output). |
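The four Data Version Stack rules stated for WEBINQ in this section can be illustrated with a small in-memory sketch. It is not WEBINQ code: the class, method names and data layout are hypothetical, and only the four rules themselves come from the description above.

```python
from dataclasses import dataclass, field

# Rule i): one response per (survey, data reference period, statistical unit).
Key = tuple[str, str, str]


@dataclass
class VersionStack:
    _versions: dict[Key, list[dict]] = field(default_factory=dict)

    def submit(self, key: Key, answers: dict) -> int:
        """Store the answers as a new version of the single response for this key."""
        stack = self._versions.setdefault(key, [])
        stack.append(dict(answers))
        return len(stack)  # version number just created

    def edit(self, key: Key, changes: dict) -> int:
        """Rule ii): editing never overwrites, it pushes a new version."""
        latest = self._versions[key][-1]
        self._versions[key].append({**latest, **changes})
        return len(self._versions[key])

    def history(self, key: Key) -> list[dict]:
        """Rule iii): every version is kept, so changes can be traced."""
        return list(self._versions[key])

    def for_warehouse(self, key: Key) -> dict:
        """Rule iv): only the last version goes to the data warehouse."""
        return self._versions[key][-1]
```

Keeping every version rather than updating in place is what lets a survey manager see what was changed during editing.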
3. Measures taken to increase response rates |
Internet data collection mode:
Before collection via the Internet began, agricultural holders from the sub-sample selected for this collection mode were sent circular letters informing them of the statistical operation, its purposes and the importance of their cooperation. The letter also stated that they had been selected to answer via the Internet and set out the procedures for answering through Statistics Portugal’s WEBINQ. Agricultural holders were then contacted by phone and asked about their intention to answer via the Internet (a valid email address was collected at that point to allow login to the service and further contacts). During the collection period, agricultural holders who had not answered the questionnaire were contacted by email, reminding them of the need to do so and stressing that, otherwise, they would later be contacted by an interviewer who would visit the holding to obtain the required data.
Face-to-face data collection mode:
Promoting and advertising the statistical operation
Before the interviews, circular letters were sent to agricultural holders informing them of the statistical operation, its purposes, the importance of their cooperation, and the date of the interviews.
Priority of data collection
• Holdings with a location other than the address of the holder – prior to the interviews, interviewers identified the agricultural holdings that had been allocated to them and which were located in a commune other than that of the holder's address. Priority was given to these interviews, especially those located in different agricultural areas. It was thus ensured that the questionnaire would be transferred accordingly and the interview made in due time. This procedure also avoided the possibility of the holder returning to the agricultural holding, which would lead to a new transfer of the questionnaire. Priority contact with these holders could also, in the first instance, avoid the transfer of the respective questionnaires.
• Large holdings and/or holdings with significant activity in their location area – the size of the holdings and the importance of their activity in the respective geographical area were two crucial factors for interview priority.
Prior scheduling of the interview
To improve the chances of a successful interview, in particular the availability of the agricultural holder to complete it on a single occasion, the interviewer made an appointment with the holder where possible. This helped to avoid, for instance, incomplete collection of data, additional visits, and the holder having to make further time available to conclude the interview.
“I have been here” message
Reminders were used to insist on the need for the interview and to obtain the necessary information. Where interviewers visited the address of the holder or the agricultural holding but could not get in touch with the holder, they left the message “I have been here”, i.e. the indication that they had visited the holding/address of the holder, together with a date for a new contact. This was intended to speed up the process by allowing the interviewer to establish a future contact with the holder in order to obtain a response.
Interview techniques
To raise the awareness of interviewees and lead them to cooperate and supply the required information, interviewees were always informed about the purpose of the survey during the interviews. Interviewers sought to persuade and motivate them, and explained the importance of their cooperation. Where necessary, interviewers provided the required explanations, showing dependability and availability. In order to ensure the confidentiality and reliability of data, no third persons were allowed during the interviews, except where that was required by the person responsible for the information supplied.
Reminder that a response must be given
Whenever an interview could not be made, irrespective of the reason (impossibility to locate the holding or to contact the holder, absence of the interviewee, refusal by the holder to answer the interview), every effort was made to reverse the situation. The interviewer could resort to the local technical staff member. The interview would be considered non-achieved only after that conclusion had been drawn at a higher level of the chain of collection and the decision communicated to the local technical staff member.
Obtaining alternative contacts
In addition, obtaining alternative contacts whenever it was impossible to contact the holder proved to be an important asset in recovering missing interviews.
Payment of non-achieved interviews
The actual payment of non-achieved interviews was a further incentive for the interviewer to take all necessary steps to obtain the interview.
Treatment of refusals – Reminders until interview was considered non-achieved
Where the reason for a non-achieved interview was refusal, the interviewer tried to reverse the situation by insisting, in person and accompanied by the local technical staff member, on the need for the agricultural holder, or the responsible person, to supply the information required. When it was not possible to reverse the situation at this level, the section manager was informed and indicated the subsequent step. If the situation remained unchanged (non-achieved interview), the section manager would follow the procedures in force and request guidance from the regional coordinating body. This body would be responsible for any decision on the impossibility of conducting the interview. Only this decision made it possible to record the questionnaire of an agricultural holding as a non-achieved interview.
Treatment of refusals – Circular letter
In the cases of refusal confirmed by the regional coordinating body, a circular letter was sent to the holder/person responsible for supplying the information, informing them of the mandatory nature of the response and the fines to which they were subject in case of non-compliance with the legal obligation (according to Article 4 (1) of the Law No 22/2008 and Article 4 of the Decree-Law 136/2012). This made it possible to reverse a number of refusals. The final number of non-achieved interviews due to refusal was rather low at national level (see item 6.3.3 Non response error - item 1). |
4. Monitoring of response and non-response |
1 | Number of holdings in the survey frame plus possible (new) holdings added afterwards. In case of a census 1=3+4+5 | 302783 |
2 | Number of holdings in the gross sample plus possible (new) holdings added to the sample. Only for sample survey, in which case 2=3+4+5 | 28003 |
3 | Number of ineligible holdings | 2925 |
3.1 | Number of ineligible holdings with ceased activities. This item is a subset of 3. | This information wasn't collected |
4 | Number of holdings with unknown eligibility status. 4>4.1+4.2 | 550 |
4.1 | Number of holdings with unknown eligibility status – re-weighted | 550 |
4.2 | Number of holdings with unknown eligibility status – imputed | 0 |
5 | Number of eligible holdings. 5=5.1+5.2 | 24528 |
5.1 | Number of eligible non-responding holdings. 5.1>=5.1.1+5.1.2 | This information wasn't collected |
5.1.1 | Number of eligible non-responding holdings – re-weighted | - |
5.1.2 | Number of eligible non-responding holdings – imputed | - |
5.2 | Number of eligible responding holdings | 24528 |
6 | Number of the records in the dataset. 6=5.2+5.1.2+4.2 | 24528 |
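The accounting identities quoted in the monitoring items (2=3+4+5 for a sample survey, 6=5.2+5.1.2+4.2) can be checked mechanically against the reported counts. The following is a minimal sketch in Python; the dictionary layout and variable names are illustrative, and item 5.1.2, which is reported as "-", is assumed to be 0.

```python
# Counts copied from the monitoring items (item numbers as keys).
counts = {
    "2": 28003,    # gross sample
    "3": 2925,     # ineligible holdings
    "4": 550,      # unknown eligibility status
    "4.1": 550,    # unknown eligibility, re-weighted
    "4.2": 0,      # unknown eligibility, imputed
    "5": 24528,    # eligible holdings
    "5.2": 24528,  # eligible responding holdings
    "6": 24528,    # records in the dataset
}

# Assumption: item 5.1.2 is reported as "-" and is treated as 0 here.
IMPUTED_NON_RESPONDENTS = 0

checks = {
    "2 = 3 + 4 + 5": counts["2"] == counts["3"] + counts["4"] + counts["5"],
    "4.1 + 4.2 does not exceed 4": counts["4.1"] + counts["4.2"] <= counts["4"],
    "6 = 5.2 + 5.1.2 + 4.2": counts["6"]
    == counts["5.2"] + IMPUTED_NON_RESPONDENTS + counts["4.2"],
}

# Share of the gross sample (in percent) falling into items 3, 4 and 5.
shares = {item: round(100 * counts[item] / counts["2"], 1) for item in ("3", "4", "5")}

for name, ok in checks.items():
    print(f"{name}: {'holds' if ok else 'FAILS'}")
print(shares)
```

With the reported figures, all three checks hold, and eligible holdings account for roughly 88% of the gross sample.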
5. Questionnaire(s) - in annex |
See annexes. |
Annexes: 3.3-2. Physical model of the application 3.3-5. Azores questionnaire 2016 (Portuguese only) 3.3-5. Mainland questionnaire 2016 (Portuguese only) 3.3-5. Madeira questionnaire 2016 (Portuguese only) 3.3-5. Web questionnaire (Portuguese only) 3.3-2. Statistics Portugal Survey Management System |
3.4. Data validation |
Data validation |
Internet data collection mode:
Data validation on the electronic questionnaire available on WebInq was primarily performed by 659 online validation rules, the large majority of which (653) were classified as fatal errors, which prevent the conclusion and submission of the questionnaire. These errors were of different types, such as completeness checks (non-compliance with the compulsory filling-in of a certain field; e.g. total area of the holding not filled in), relational/consistency checks (if a certain field is/is not recorded, another field must be/does not have to be filled in; e.g. existence of irrigated area, where the field irrigated land had not been filled in) and range checks (data must be included within a certain range, or cannot attain a certain value; e.g. rice fields in Entre Douro e Minho). Correction of these errors was mandatory. There were also so-called warning errors, which warn the respondent that the situation being described is unusual. These errors are mostly relational/consistency checks (e.g. the holding has an irrigation system and does not record irrigated land, or the only labour force of the holding is the manager).
Face-to-face data collection mode:
Interviewer/staff member using a laptop to record data electronically
The interviewer's functions include data analysis, especially as regards consistency and alignment with local circumstances. Moreover, interviewers/staff using laptops also record, validate and review data in computer-readable format. In order to assist interviewers/staff in this function, 1 910 validation rules were created for recorded data using the validation rules editor of SAGR. This editor makes it possible to centrally create strings of rules. A number of fine-tuning interventions and updates were made to the original rules in the course of the operation. Validation rules triggered errors, which can be broken down into three large groups:
- Intrinsic errors (8) – those usually associated with the introduction of characters that are not accepted in specific recording fields, especially those related to the identification of the holder (e.g. characters not accepted in names, addresses, etc.). Any one of these errors prevents the questionnaire from being recorded;
- Fatal errors (1 298) – this type of error allows the questionnaire to be recorded; however, validation labels the questionnaire as incorrect, which prevents its conclusion. These errors can be completeness checks (non-compliance with the compulsory filling-in of a certain field; e.g. legal personality not filled in), relational/consistency checks (if a certain field is/is not recorded, another field must be/does not have to be filled in; e.g. existence of irrigated area, where the field irrigated land had not been filled in), arithmetic checks (wrong totals; e.g. wrong total cereal), range checks (data must be included within a certain range, or cannot attain a certain value; e.g. rice fields in Entre Douro e Minho), or sequential checks (non-sequential filling-in of certain fields; e.g. non-sequential filling-in of members of the holder's household). Correction of these errors is mandatory;
- Warning errors (604) – this type of error allows the questionnaire to be recorded and concluded. These errors may be range checks (e.g. rice field in the Algarve or rice exceeding 50 acres in Beira Litoral), relational/consistency checks (e.g. the holding has an irrigation system and does not record irrigated land) or completeness checks (e.g. the telephone was not filled in). These errors play a warning role: the interviewer/staff member using a laptop analyses the data triggering them and either confirms or corrects the data.
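The three groups of errors above differ mainly in what they allow to happen to the questionnaire. The following Python sketch illustrates that logic with a handful of rules modelled on the examples given in the text. It is not the SAGR rules editor: the rule codes, field names and status labels are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Severity(Enum):
    INTRINSIC = "intrinsic"  # prevents the questionnaire from being recorded
    FATAL = "fatal"          # recorded, but cannot be concluded
    WARNING = "warning"      # recorded; must be confirmed or corrected


@dataclass(frozen=True)
class Rule:
    code: str                          # hypothetical rule identifier
    severity: Severity
    message: str
    violated: Callable[[Dict], bool]   # True when the rule is broken


# Hypothetical rules and field names, modelled on the examples in the text.
RULES: List[Rule] = [
    Rule("I001", Severity.INTRINSIC, "characters not accepted in holder name",
         lambda q: any(ch.isdigit() for ch in q.get("holder_name", ""))),
    Rule("F001", Severity.FATAL, "total area of the holding not filled in",
         lambda q: q.get("total_area_ha") is None),            # completeness
    Rule("F002", Severity.FATAL, "wrong total cereal",
         lambda q: abs(sum(q.get("cereals_ha", {}).values())
                       - q.get("cereals_total_ha", 0.0)) > 1e-9),  # arithmetic
    Rule("F003", Severity.FATAL, "rice fields in Entre Douro e Minho",
         lambda q: q.get("region") == "Entre Douro e Minho"
         and q.get("rice_ha", 0.0) > 0),                        # range
    Rule("W001", Severity.WARNING, "irrigation system but no irrigated land",
         lambda q: bool(q.get("has_irrigation_system"))
         and not q.get("irrigated_land_ha")),                   # relational
    Rule("W002", Severity.WARNING, "telephone not filled in",
         lambda q: not q.get("telephone")),                     # completeness
]


def validate(questionnaire: Dict) -> List[Rule]:
    """Return every rule that the questionnaire violates."""
    return [rule for rule in RULES if rule.violated(questionnaire)]


def status(questionnaire: Dict) -> str:
    """Summarise what the triggered errors allow, most severe first."""
    severities = {rule.severity for rule in validate(questionnaire)}
    if Severity.INTRINSIC in severities:
        return "cannot be recorded"
    if Severity.FATAL in severities:
        return "recorded, cannot be concluded"
    if Severity.WARNING in severities:
        return "can be concluded after warnings are confirmed or corrected"
    return "can be concluded"


example = {
    "holder_name": "Maria Silva",
    "region": "Alentejo",
    "total_area_ha": 12.5,
    "cereals_ha": {"wheat": 3.0, "maize": 2.0},
    "cereals_total_ha": 5.0,
    "has_irrigation_system": True,
    "irrigated_land_ha": 0.0,
    "telephone": "",
}
print([rule.code for rule in validate(example)])
print(status(example))
```

Keeping severity as data on each rule, rather than hard-coding it in the checks, mirrors the idea of a central rule list that can be fine-tuned during the operation without touching the validation logic.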
Validation rules may cover the whole national territory (443) or be restricted to the Mainland (413), Madeira (459), the Azores (144) or specific agricultural regions (451). For the Mainland questionnaire, specific validation rules were implemented for each region, so as to identify and validate certain characteristics. Errors in SAGR were triggered automatically during the data recording procedure, enabling the staff member using a laptop to correct or analyse data immediately. See 3.4 Example of the list of errors triggered during the data recording procedure (in annex, only available in Portuguese). After the correction of fatal errors and the analysis of warning errors (reflected in confirmations and/or corrections of recorded data), interviewers who considered their work on a questionnaire/holding to be concluded would label it as “Concluded” in SAGR and then inform the local technical staff member that the recorded information was ready to be analysed.
Local technical staff member
Local technical staff members analysed the information to detect possible inconsistencies in data collected and recorded, as well as incorrectly implemented concepts or misalignments with local/regional circumstances. An analysis was made of the information contained in the questionnaires concluded by the interviewers (those that they deemed ready to be analysed by the local technical staff member). For this purpose, the local technical staff member could resort to the Error Report, the Validate function, the Selection module and/or the Comparison with other Sources module. The local technical staff member could introduce corrections/changes or return the questionnaire to the interviewer (Return to lower level). After a critical appraisal and analysis of the information, the local technical staff member would certify the questionnaire, thereby signalling to the regional coordinating body that their work had been concluded and the questionnaire was ready to be analysed.
Regional coordinating body
After analysis, the regional coordinating body could return the questionnaire to the local technical staff member, who could in turn resort to the interviewer, but only if it was necessary to contact the person responsible for the information supplied.
National coordinating body
In the course of the operation, the national coordinating body, like the other elements in the data collection chain of the FSS, prepared regular analyses of the data collected and, in order to complement and support regional analyses, submitted the output to be validated at the different levels in the chain of collection. As a result, the analyses led to the justification or correction of the data collected. For different geographic levels, they usually covered information regarding:
- Comparison with other sources;
- Frequency of errors;
- Maximum permissible errors;
- Selections;
- Totalisers.
In addition to validating the information registered through warning and fatal errors, aggregate data and microdata in the FSS were also analysed. Information was analysed through the SAGR software application, by using features specifically developed for the purpose, in particular: totalisers, selections of holdings and comparison with external sources (microdata and aggregate data).
Both data collection modes:
The analysis of totalisers, i.e. aggregate information per geographic level, is essential to evaluate the consistency of collected data vis-à-vis local circumstances. Totalisers of the different geographic levels were analysed according to profiles in the chain of collection, thereby ensuring that an analysis would be carried out for all geographic levels. Selections refer to the search for holdings according to selected conditions, with a view to detecting errors in data collection. The critical appraisal of aggregate data made it possible to obtain elements for the analysis of microdata, particularly in the identification of overvalued variables and high variable values – maximum permissible errors. Selections were frequently based on the existence of a given warning error (so as to identify potential systematic errors made by interviewers) and were continually adjusted to local agricultural specificities, which made this a dynamic process.
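As an illustration of the two features described above, the sketch below computes a totaliser (an aggregate per geographic level) and runs two selections (holdings matching a condition). It is not SAGR code: the records, field names, warning codes and threshold are invented for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical microdata: one record per holding.
holdings: List[Dict] = [
    {"id": 1, "region": "Region A", "interviewer": "INT-01",
     "rice_ha": 0.0, "uaa_ha": 14.0, "warnings": ["W001"]},
    {"id": 2, "region": "Region A", "interviewer": "INT-01",
     "rice_ha": 0.0, "uaa_ha": 9.5, "warnings": ["W001"]},
    {"id": 3, "region": "Region B", "interviewer": "INT-02",
     "rice_ha": 35.0, "uaa_ha": 420.0, "warnings": []},
    {"id": 4, "region": "Region B", "interviewer": "INT-03",
     "rice_ha": 2.0, "uaa_ha": 6.5, "warnings": ["W002"]},
]


def totalise(records: List[Dict], level: str, variable: str) -> Dict[str, float]:
    """Totaliser: sum a variable for each value of a geographic level."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record[level]] += record[variable]
    return dict(totals)


def select(records: List[Dict], condition: Callable[[Dict], bool]) -> List[Dict]:
    """Selection: holdings that satisfy an arbitrary condition."""
    return [record for record in records if condition(record)]


# Aggregate view per region, to be compared with local knowledge.
uaa_by_region = totalise(holdings, "region", "uaa_ha")

# Selection on unusually high values (the threshold is an arbitrary example).
high_uaa = select(holdings, lambda r: r["uaa_ha"] > 100.0)

# Selection on a given warning error, grouped by interviewer, to look for
# warnings that recur systematically for the same interviewer.
with_w001 = select(holdings, lambda r: "W001" in r["warnings"])
w001_by_interviewer: Dict[str, int] = defaultdict(int)
for record in with_w001:
    w001_by_interviewer[record["interviewer"]] += 1

print(uaa_by_region)
print([r["id"] for r in high_uaa])
print(dict(w001_by_interviewer))
```

Passing the condition as a function keeps selections easy to adjust, which matches the way they were adapted to local agricultural specificities during the operation.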
The comparison of recorded data with other sources was instrumental for validating the information. For further information, see item 8.3 Coherence - cross domain. |
Annexes: 3.4. Example of the error report (only available in Portuguese) 3.4. Example of the list of errors triggered during the data recording procedure (only available in Portuguese) |
3.5. Data compilation |
Methodology for determination of weights (extrapolation factors) |
1. Design weights |
See annex: 3.5-1. Design weights |
2. Adjustment of weights for non-response |
See annex: 3.5-2. Adjustment of weights for non-response |
3. Adjustment of weights to external data sources |
Not applicable. |
4. Any other applied adjustment of weights |
Not applicable. |
Annexes: 3.5-1. Design weights 3.5-2. Adjustment of weights for non-response |
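The actual weighting methodology is documented in the annexes and is not reproduced here. For readers unfamiliar with the terms, the following Python sketch shows a generic textbook form of a design weight and of a re-weighting adjustment for units with unknown eligibility. It should not be read as the method applied in the FSS: the formulas, function names and stratum figures are assumptions made for illustration only.

```python
def design_weight(frame_count: int, sample_count: int) -> float:
    """Generic design weight for a stratum: frame units per sampled unit."""
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    return frame_count / sample_count


def adjusted_weight(design_w: float, responding: int, ineligible: int,
                    unknown: int) -> float:
    """Generic re-weighting for unknown eligibility (illustrative assumption).

    Units with unknown eligibility are assumed to be eligible in the same
    proportion as units whose status is known; their estimated eligible
    share is then spread over the responding units of the stratum.
    """
    if responding <= 0:
        raise ValueError("responding must be positive")
    eligible_share = responding / (responding + ineligible)
    return design_w * (responding + eligible_share * unknown) / responding


# Hypothetical stratum: 1200 holdings in the frame, 100 sampled, of which
# 85 responded, 10 were ineligible and 5 had unknown eligibility status.
w_design = design_weight(frame_count=1200, sample_count=100)
w_final = adjusted_weight(w_design, responding=85, ineligible=10, unknown=5)

print(w_design)
print(round(w_final, 4))
```

In this hypothetical stratum the adjustment raises each respondent's weight by a factor of 100/95, because 5 of the 100 sampled units could not be classified.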
3.6. Adjustment |
[Not requested] |