Last updated: 19/01/2006

Methodological bases
Evaluation process (How?)





What does this mean?

The counterfactual, or counterfactual scenario, is an estimate of what would have occurred in the absence of the evaluated intervention.
The main approaches to constructing counterfactuals are:

  • Comparison group
  • Modelling


What is the purpose?

By subtracting the counterfactual from the observed change (factual), the evaluation team can assess the effect of the intervention, e.g. effect on literacy, effect on individual income, effect on economic growth, etc.
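
The subtraction described above can be illustrated with a minimal sketch. The function name and the figures are hypothetical and serve only to show the arithmetic.

```python
def intervention_effect(observed_change: float, counterfactual_change: float) -> float:
    """Effect = observed (factual) change minus the estimated counterfactual change."""
    return observed_change - counterfactual_change


# Hypothetical figures: the literacy rate rose by 12 points among beneficiaries,
# and is estimated to have risen by 7 points in the absence of the intervention.
effect = intervention_effect(observed_change=12.0, counterfactual_change=7.0)
print(effect)  # prints 5.0
```

The estimate is only as good as the counterfactual: any error in the counterfactual is carried over one-for-one into the effect.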


Comparison group

One of the main approaches to counterfactuals consists of identifying a comparison group that resembles the beneficiaries in all respects, except that it is unaffected by the intervention. The quality of the counterfactual depends heavily on the comparability of beneficiaries and non-beneficiaries. Four approaches may be considered for that purpose.

Randomised control group

This approach, also called experimental design, consists of recruiting and surveying two statistically comparable groups. Several hundred potential participants are identified and assigned at random either to take part in the intervention or not. The approach is fairly demanding in terms of preconditions, time and human resources. When the approach is workable and properly implemented, the influence of most external factors (ideally all) is neutralised statistically, because these factors are spread evenly across the two groups, and the only remaining difference is participation in the intervention.
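
As an illustration, random assignment might be sketched as follows. The function name, the identifiers and the 50/50 split are assumptions made for the example, not a prescribed procedure.

```python
import random


def randomise(candidates, seed=None):
    """Split candidates at random into a participant group and a control group.

    Assumption: an even split; real designs may use other ratios.
    """
    rng = random.Random(seed)      # a seed makes the draw reproducible
    shuffled = list(candidates)    # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical pool of 400 potential participants.
candidates = [f"person_{i}" for i in range(400)]
participants, control = randomise(candidates, seed=1)
```

Because chance alone decides who participates, both groups can be expected to have similar profiles when they are large enough.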


Adjusted comparison group

In this approach, also called quasi-experimental design, a group of non-participants is recruited and surveyed, for instance people who applied to participate but were rejected for one reason or another. In order to allow for a proper comparison, the structure of the comparison group needs to be adjusted until it is similar enough to that of participants as regards key factors like age, income, or gender. Such factors are identified in advance in an explanatory model. The structure of the comparison group (e.g. by age, income and gender) is adjusted by over- or under-weighting appropriate members until both structures are similar.
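
The over- and under-weighting can be sketched as follows, assuming each person is classified into a single cell (for example an age band) and that every cell found among participants also contains at least one comparison member. The function names and data are hypothetical.

```python
from collections import Counter


def adjustment_weights(participant_cells, comparison_cells):
    """Weight each comparison member so that the weighted structure of the
    comparison group matches the structure of the participant group."""
    p_counts = Counter(participant_cells)
    c_counts = Counter(comparison_cells)
    n_p, n_c = len(participant_cells), len(comparison_cells)
    weights = []
    for cell in comparison_cells:
        p_share = p_counts[cell] / n_p   # share of this cell among participants
        c_share = c_counts[cell] / n_c   # share of this cell in the comparison group
        weights.append(p_share / c_share)
    return weights


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: participants are mostly young, the comparison group mostly old.
participant_cells = ["young", "young", "young", "old"]
comparison_cells = ["young", "old", "old", "old"]
comparison_outcomes = [10.0, 4.0, 4.0, 4.0]

weights = adjustment_weights(participant_cells, comparison_cells)
adjusted_outcome = weighted_mean(comparison_outcomes, weights)
```

In this example the single young comparison member is over-weighted and the three older members are under-weighted, so the adjusted average reflects the age structure of the participants.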


Matching pairs

In this approach a sample of non-participants is associated with a sample of beneficiaries on an individual basis. For each beneficiary (e.g. a supported farmer), a matching non-participant is found with a similar profile in terms of key factors which need to be controlled (e.g. age, size of farm, type of farming). This approach often has the highest degree of feasibility and may be considered when other approaches are impractical.
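
One possible way of finding matches is a simple nearest-profile search, sketched below. The greedy procedure, the squared-distance measure and the data are assumptions made for the example; factors measured in different units would normally be rescaled first, and categorical factors such as type of farming would need exact matching.

```python
def match_pairs(beneficiaries, non_participants, factors):
    """For each beneficiary, pick the closest non-participant not yet used.

    Profiles are dicts of numeric factors; closeness is the sum of squared gaps.
    """
    available = dict(non_participants)   # copy: each non-participant is used once
    pairs = {}
    for b_id, b_profile in beneficiaries.items():
        if not available:
            break                        # no non-participants left to match
        best = min(
            available,
            key=lambda c: sum((b_profile[f] - available[c][f]) ** 2 for f in factors),
        )
        pairs[b_id] = best
        del available[best]
    return pairs


# Hypothetical farmers described by age and farm size (hectares).
beneficiaries = {
    "b1": {"age": 35, "farm_ha": 10},
    "b2": {"age": 60, "farm_ha": 40},
}
non_participants = {
    "n1": {"age": 58, "farm_ha": 42},
    "n2": {"age": 33, "farm_ha": 12},
    "n3": {"age": 45, "farm_ha": 25},
}
pairs = match_pairs(beneficiaries, non_participants, factors=["age", "farm_ha"])
print(pairs)  # prints {'b1': 'n2', 'b2': 'n1'}
```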


Generic comparison

The counterfactual may be constructed in abstracto by using statistical databases. The evaluation team starts with an observation of a group of participants. For each participant, the observed change is compared to what would have occurred for an "average" individual with the same profile, as derived from an analysis of statistical databases, most often at national level.
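
A minimal sketch of this comparison is given below. The profiles, the reference figures and the function name are hypothetical; in practice the reference changes would be derived from the statistical database used.

```python
# Hypothetical reference table: average change for an "average" individual
# with a given profile, as might be derived from national statistics.
reference_change = {
    ("rural", "primary"): 2.0,
    ("urban", "primary"): 3.5,
}

# Hypothetical participants with their profile and observed change.
participants = [
    {"profile": ("rural", "primary"), "change": 5.0},
    {"profile": ("urban", "primary"), "change": 6.5},
    {"profile": ("rural", "primary"), "change": 4.0},
    {"profile": ("urban", "primary"), "change": 7.5},
]


def generic_effect(participants, reference_change):
    """Average gap between each participant's observed change and the
    change expected for an average individual with the same profile."""
    gaps = [p["change"] - reference_change[p["profile"]] for p in participants]
    return sum(gaps) / len(gaps)


print(generic_effect(participants, reference_change))  # prints 3.0
```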


Comparative approaches

Different forms of comparison exist, each with pros and cons, and varying degrees of validity.

  • An "afterwards only" comparison involves analysis of the differences between the two groups (participants and non-participants) after the participants have received the subsidy or service. This approach is easy to implement but neglects the differences that may have existed between the two groups at the outset.
  • A "before-after" comparison focuses on the evolution of both groups over time. It requires baseline data (e.g. through monitoring or in the form of statistics, or through an ex ante evaluation), which do not always exist. Baseline data may have to be reconstructed retrospectively, which involves risks of unreliability.
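
The two forms of comparison can be contrasted with a short sketch. The figures and function names are hypothetical. The before-after calculation shown here, in which the change in the comparison group is subtracted from the change among participants, is commonly known as difference-in-differences.

```python
def mean(values):
    return sum(values) / len(values)


def afterwards_only(participants_after, comparison_after):
    """Gap between the two groups after the intervention only."""
    return mean(participants_after) - mean(comparison_after)


def before_after(participants_before, participants_after,
                 comparison_before, comparison_after):
    """Change among participants minus change in the comparison group."""
    participant_change = mean(participants_after) - mean(participants_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return participant_change - comparison_change


# Hypothetical incomes: participants were already better off at the outset.
p_before, p_after = [10.0, 12.0], [16.0, 18.0]
c_before, c_after = [8.0, 10.0], [11.0, 13.0]

print(afterwards_only(p_after, c_after))                   # prints 5.0
print(before_after(p_before, p_after, c_before, c_after))  # prints 3.0
```

In this example the "afterwards only" estimate includes the gap of 2 that already existed between the groups at the outset, whereas the "before-after" estimate removes it.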


Strengths and weaknesses in practice

A well-designed comparison group provides a convincing estimate of the counterfactual, and therefore a credible basis for attributing a share of the observed changes to the intervention. A limitation of this approach stems from the need to identify key external factors to be controlled. The analysis may be totally flawed if an important external factor has been overlooked or ignored. Another shortcoming stems from the need to rely upon large enough samples in order to ensure statistical validity. It is not always easy to predict the sample size which will ensure validity, and it is not uncommon to reach no conclusion after several weeks of a costly survey.



Modelling

The principle is to run a model which correctly simulates what actually occurred (the observed change), and then to run the model again with a set of assumptions representing a "without intervention" scenario. In order to be used in an evaluation, a model must include all relevant causes and effects which are to be analysed. These are at least the following:

  • Several causes including the intervention itself and other explanatory factors.
  • The effect to be evaluated.
  • A mathematical relation between the causes and the effect, including adjustable parameters.

Complex models (e.g. macro-economic ones) may include hundreds of causes, hundreds of effects, hundreds of mathematical relations, hundreds of adjustable parameters, and complex cause-and-effect mechanisms such as causality loops. When using a model, the evaluation team proceeds in three steps:

  • A first simulation is undertaken with real life data. The parameters are adjusted until the model reflects all observed change correctly.
  • The evaluation team identifies the "primary impacts" of the intervention, e.g. increase in the Government's budgetary resources, reduction of public debt, reduction of interest rates, etc. A set of assumptions is drawn up in order to simulate the "without intervention" scenario, that is to say, a scenario without the "primary impacts".
  • The model is run once again in order to simulate the "without intervention" scenario (i.e. the counterfactual). The impact estimate is derived from a comparison between the two simulations.
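
The three steps can be sketched with a deliberately simple toy model. The linear form, the single adjustable parameter, the variable names and all figures are assumptions made for the example and bear no relation to a real macro-economic model.

```python
def model(params, inputs):
    """Toy model: outcome = baseline + sensitivity * budget_resources + other_factor."""
    return (params["baseline"]
            + params["sensitivity"] * inputs["budget_resources"]
            + inputs["other_factor"])


def calibrate(observed_outcome, inputs, baseline):
    """Step 1: adjust the parameter until the model reproduces the observed change."""
    sensitivity = ((observed_outcome - baseline - inputs["other_factor"])
                   / inputs["budget_resources"])
    return {"baseline": baseline, "sensitivity": sensitivity}


# Step 1: first simulation with real-life data (hypothetical figures).
factual_inputs = {"budget_resources": 100.0, "other_factor": 1.0}
observed_outcome = 4.0
params = calibrate(observed_outcome, factual_inputs, baseline=1.0)
factual = model(params, factual_inputs)

# Step 2: assumptions for the "without intervention" scenario. The assumed
# primary impact is an extra 20 units of budgetary resources, which is removed.
counterfactual_inputs = dict(factual_inputs, budget_resources=80.0)

# Step 3: second simulation, then comparison of the two runs.
counterfactual = model(params, counterfactual_inputs)
impact = factual - counterfactual
print(impact)
```

With only one cause and one parameter the calibration can be solved directly; real models generally require iterative adjustment of many parameters.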

Modelling techniques are fairly demanding in terms of data and expertise. The workload required for building a model is generally not proportionate to the resources available to an evaluation. The consequence is that the modelling approach is workable only when an appropriate model and the corresponding expertise already exist.