A Fresh Look at the Intervention Logic of Structural Funds


Paper presented at the European Evaluation Society Conference in Helsinki, 4th October 2012

Veronica Gaffey
Head of Evaluation, Directorate General for Regional Policy, European Commission
veronica.gaffey@ec.europa.eu

Abstract: In the 1990s, the European Commission initiated the MEANS programme of evaluation guidance for socio-economic programmes, primarily for those co-financed with the Structural Funds. This initiative came up with an intervention logic which has remained in place ever since. The Directorate General for Regional Policy has recently been taking a fresh look at the logical framework. We have examined it drawing on experiences from three programming periods: from the perspective of an intensive ex post evaluation of the previous programming period; from the perspective of reporting on the ongoing performance of current programmes; and from the perspective of designing a policy for the next period with a stronger result orientation. The conclusion of this work is that our intervention logic was never entirely clear. We cannot in practice distinguish between a short-term direct effect (result) and a longer-term, indirect effect (impact), and we have never actually measured impacts defined like this. With the increasing focus on outcomes in the international literature, and the developments concerning the evaluation of impact defined as the "change that can credibly be attributed to an intervention", we realise that we need to clarify our intervention logic. This paper outlines the experiences of the Directorate General for Regional Policy and its proposals for a re-articulation of the logic of our interventions and the terminology we use in this regard.

Background

When the Structural Funds were reformed in 1988 and again in 1993, evaluation gained a prominence it had not had before. With multi-annual programmes, the Structural Funds became more of a policy and less of a financial instrument. The role of evaluation was not only to evaluate results and impacts, but to contribute to the design and implementation of better programmes which would deliver economic and social (and later territorial) cohesion. In this context, in 1995 the European Commission established the MEANS programme: Means for Evaluating Actions of a Structural Nature.

Supported by a group of independent experts, the programme culminated in 1999 in a six-volume set of handbooks on monitoring and evaluation approaches and techniques. MEANS was a valuable resource and contains much guidance which is still relevant. It was succeeded by EVALSED, the online evaluation guidance resource, which is updated regularly (but now needs another update to reflect the thinking presented in this paper). Through MEANS, the logical framework for the Structural Funds was put in place and remained essentially unchanged until 2009/2010.

Graph 1: Logic of Intervention (Source: European Commission Guidance on Indicators)

While the framework represented state-of-the-art thinking in 1995, we have more recently come to question its concepts and their usefulness in designing, implementing, monitoring and evaluating public policies. Has the logical framework worked in practice? Have all elements of the model been clear? This process of reflection was prompted by the experience of evaluating the Structural Funds programmes, of trying to summarise what the current programmes are delivering, and of designing proposals for the next period. In parallel, the DG for Regional Policy has engaged in debate with some leading evaluation experts on concepts for the evaluation and monitoring of Structural Funds programmes. In addition, we have experimented with counterfactual impact evaluation methods in relation to area-based and enterprise support measures and have explored more rigorous methods in different intervention areas.

At the start of our process of reflection, we were conscious that the traditional logic model was never clear about how the bottom-up chain (inputs, outputs, results) related to the top-down chain (impacts, results, outputs, inputs). The particular challenge was always with "impact". Can we observe an impact? Can it be captured by an indicator? Should all inputs relate in a linear fashion to an impact?

How do we take account of other contributing factors? What is the difference between a result and an impact? The various elements of guidance in MEANS use terms in different ways, but in essence the suggestion seems to be that results are short-term direct effects, while impacts are longer-term indirect effects. But what about long-term direct effects? And finally, what are the respective roles of monitoring and evaluation in relation to the different aspects of the logical framework?

Indicators for the 2000-2006 Programming Period

In 1999 and 2000, the Commission put a significant effort into ensuring that indicators were built into programmes in a systematic way, in line with the MEANS intervention logic. In 2003 all programmes were subject to a mid-term evaluation, and in 2004 a performance reserve was allocated to the programmes deemed to be performing best. In fact, the mid-term evaluation took place too early for many programmes, and the process of allocating the performance reserve brought to our attention some of the weaknesses of the indicator systems. Some programmes at this early stage radically over-achieved their targets, highlighting that indicator selection and target setting were not reliable in many Member States. Many programmes used the process of the mid-term evaluation to improve their indicator systems.

In 2007, the European Commission launched the ex post evaluation of the Structural and Cohesion Fund programmes and projects. During the five years of this exercise, and through the 21 evaluation contracts the DG for Regional Policy designed and managed as part of the process, a number of findings emerged related to the quality and use of intervention logics in the design and implementation of the programming period. At the time the ex post evaluation of the European Regional Development Fund was concluded, in early 2010, we still did not have the final reports from the programmes; monitoring data used in the ex post evaluation came from the Annual Implementation Reports. In 2011, in order to complete the information available, the DG for Regional Policy launched a final exercise to extract the data from the Final Implementation Reports relating to the achievements of the programmes [1].

[1] This data is available for researchers and other interested parties.

For the mainstream ERDF programmes (Objectives 1 and 2), a total of 22,600 indicators were reported across 227 programmes, with an average of 106 indicators per programme, ranging from 25 in Denmark to 192 in Italy. While 25 indicators might be reasonable for relatively large programmes, those in Denmark were small. Over 100 indicators suggests a lot of counting, but also dispersed effort and a lack of concentration. 51% of these indicators were classified as outputs, 30% as results and 19% as impacts.

For the purposes of this paper, we take a closer look at the impact indicators: what they were, how they were defined and how they were reported against. When we look more closely at the 4,985 impact indicators, we find that 854 (17%) had no values at all for baseline, target or achievement. That leaves us with 4,131 impact indicators with at least some quantified values. Of these:
- 94% had final achievements;
- 58% had targets;
- 6% had baselines;
- 55% had targets and achievements;
- 5% had baselines, targets and achievements; and
- 0.5% had baselines but no targets and no achievements.
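To make the arithmetic behind such a breakdown concrete, the following is a minimal sketch in Python of how completeness shares of this kind could be tabulated from an extracted indicator table. The records and column names are invented for illustration; they are not the actual data from the exercise.

```python
# Minimal sketch of a value-completeness breakdown for impact indicators.
# The records below are invented placeholders; the real exercise extracted
# thousands of indicators from the Final Implementation Reports.
import pandas as pd

indicators = pd.DataFrame([
    # type, baseline, target, achievement (None = not reported)
    {"type": "impact", "baseline": None, "target": 500.0, "achievement": 430.0},
    {"type": "impact", "baseline": 900.0, "target": 600.0, "achievement": 700.0},
    {"type": "impact", "baseline": None, "target": None, "achievement": None},
    {"type": "output", "baseline": None, "target": 120.0, "achievement": 118.0},
])

impact = indicators[indicators["type"] == "impact"]
has = {c: impact[c].notna() for c in ("baseline", "target", "achievement")}
quantified = impact[has["baseline"] | has["target"] | has["achievement"]]

print(f"impact indicators: {len(impact)}, with no values at all: {len(impact) - len(quantified)}")
for label, mask in [
    ("achievements", has["achievement"]),
    ("targets", has["target"]),
    ("baselines", has["baseline"]),
    ("targets and achievements", has["target"] & has["achievement"]),
    ("baselines, targets and achievements",
     has["baseline"] & has["target"] & has["achievement"]),
]:
    # share among impact indicators that reported at least one value
    print(f"share with {label}: {mask[quantified.index].mean():.0%}")
```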

Interestingly, the percentage of impact indicators with reported achievement values increased from 42% in the 2006 Annual Implementation Reports to 78% in the Final Reports, representing a serious amount of effort on the part of those responsible for programmes. But it is notable that so few of the impact indicators had baselines. How can we assess impacts if we have no idea of the starting point? While having a target and an achievement is useful, without the baseline we cannot assess the scale of the problem and the extent of the achievements reported.

The concept of baselines was implicitly referred to in the Structural Funds regulation, in the article on ex ante evaluation [2]: "[The ex ante evaluation] shall assess the consistency of the strategy and targets selected [...] and the expected impact of the planned priorities for action, quantifying their specific targets in relation to the starting situation, where they lend themselves thereto." There was also a section on baselines in the guidance on indicators issued by the Commission. However, as we see above, baselines featured in very few cases where they were most relevant, namely in identifying the aspects of a region or a sector on which the programmes aimed to have an impact.

Ten Member States reported baselines, targets and achievements for some of their impact indicators. Interestingly, four of these were Member States which joined the EU in 2004. Poland, for example, had a very clear impact indicator on reducing the number of fatalities on the roads, with a baseline, a target and an achievement. Of course, evaluation would be needed to find out the extent to which the projects co-financed (e.g. the building of motorways) contributed to the reduction.

Where we have targets and achievements, we can assess achievement ratios, although the reliability of such an exercise is undermined as we know that targets were often changed to align with actual performance. When we examine the impact indicators with targets and achievements reported, we find most in Italy, followed by Spain, the UK, Austria and then Germany. Austria is a surprise as it had fewer programmes and fewer resources than the other countries. However, we find that Austria included the number of projects having a positive or neutral impact on gender, the environment or rural areas as impacts, which seems of dubious value. Across most Member States, jobs created or maintained were reported, but with no baselines. Some specify that they report net new jobs, but this would be more meaningful with a baseline. Jobs safeguarded is another indicator which is difficult to define.

[2] Article 41(2) of Council Regulation (EC) No 1260/1999 of 21 June 1999.

Several Member States (for example, Greece and Spain) report jobs created during construction, which by definition is not an impact. Reporting only achievements is still less satisfactory, as the achievements are related neither to the need nor to the objective. None of this is to criticise the great effort which has gone into gathering and reporting the data. But we need to ask why we are gathering all this data. Much of the data are not meaningful and certainly do not represent the impact of the Structural Funds. Their purpose seems primarily to provide a figure which will unlock the final payment from the Structural Funds, rather than being an exercise in accountability and learning. There has been little or no public debate on these figures in any Member State. There must be scope to rationalise, streamline and focus, but also to clarify conceptually what we should count and what these figures mean.

A New Interest in Outcomes

While we were still deep in the experience of the ex post evaluation of the previous period, the process of planning for the new period began. Both the evaluators and we within the Commission were frustrated that success in Cohesion Policy seemed to be defined, by many in Member States as well as within the European Commission, as absorption of funds. Reflection within the DG for Regional Policy led to a shared view that there was a need for a decisive shift from a focus on absorption alone to one driven by concerns for performance and results. But this would not be achieved by continuing current practice, with a collection of indicators of varying relevance included as the last element in the finalisation of programmes. This was reflected in the conclusions of the 5th Cohesion Report, adopted by the Commission in November 2010, which stated that the "impact of cohesion policy is difficult to measure" and that:

"The starting point for a results-oriented approach is ex ante setting of clear and measurable targets and outcome indicators. Indicators must be clearly interpretable, statistically validated, truly responsive and directly linked to policy intervention, and promptly collected and publicised."

The draft Regulations for the new period were adopted by the Commission in October 2011, and discussions are underway in both the Council and the European Parliament with a view to adoption in time for the new period. The results orientation is incorporated into the draft regulations and accompanied by a draft Guidance Paper developed by the Directorate General for Regional Policy [3]. This Guidance Paper on Concepts and Recommendations fundamentally reviews the intervention logic of Cohesion Policy. The new intervention logic is defined as follows:

Graph 2: Outputs, results and impact in relation to programming, monitoring and evaluation

[3] Some of these ideas were inspired by the work of a Task Force on Outcome Indicators, involving international academics and led by Fabrizio Barca and Philip McCann, advisors to the Commissioner for Regional Policy.

[Diagram elements: Programming (Strategy): Needs, Specific Objective, Intended Result, Allocated Inputs, Project selection, Targeted Outputs. Monitoring and Evaluation: Actual Inputs, Achieved Outputs, Actual Result, Other Factors, Contribution (impact).]

Results Orientation for Future Cohesion Policy

What the DG for Regional Policy proposes for future Cohesion Policy is as follows:
- The specific objectives of future programmes (the pre-determined investment priorities, defined in a regional or national context) must have a corresponding result indicator and a baseline.
- The result indicator is a proxy for the intended change; we recognise that an indicator can never capture everything which happens.
- Policy monitoring reports on the evolution of the result indicator and feeds debate on the evolution of the need the programme aims to tackle.
- The evolution of the result indicator is a consequence of the policy action and of other factors.
- The result indicator should therefore be close enough to the policy for the policy to have a discernible effect on it.
- Evaluation is required to disentangle the effects of the policy from those of other factors, as well as to explore any unintended effects of the policy.

The implication of this approach is that we have dropped the traditional distinction between result and impact indicators. The notion of "results" is similar to that of "outcomes" often used in the literature. We use the term results because the translation of "outcomes" into most EU languages uses the same word as for "results". While "impact indicators" have been dropped, impact has not: it is now explicitly the contribution of the policy to change in the result indicator. It is not the longer-term effects on the wider population which may (or may not!) be in some way linked to the policy; we have found very few examples of such impacts being credibly linked to the policy. Instead, we are more modest: let us identify a change we seek where we believe our policy can have an effect, and let us evaluate this effect as we implement the policy.
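To make the proposed logic concrete, the following is a minimal sketch, in Python, of a priority carrying a specific objective, a result indicator with baseline and target, and the distinction between the observed change in that indicator (driven by the policy and by other factors) and the impact, i.e. the contribution that only evaluation can estimate. All names and values are hypothetical and purely illustrative.

```python
# Illustrative sketch of the proposed intervention logic; all names and values
# are hypothetical, not drawn from any actual programme.
from dataclasses import dataclass

@dataclass
class ResultIndicator:
    name: str          # proxy for the intended change
    baseline: float    # value at the start of the period
    target: float      # intended value
    latest: float      # latest observed value from policy monitoring

    def observed_change(self) -> float:
        """Change since the baseline: caused by the policy AND by other factors."""
        return self.latest - self.baseline

@dataclass
class Priority:
    specific_objective: str
    result: ResultIndicator
    allocated_inputs_meur: float
    targeted_outputs: dict   # e.g. {"km of motorway built": 120}

# Monitoring observes the evolution of the result indicator ...
priority = Priority(
    specific_objective="Reduce road fatalities in region X",
    result=ResultIndicator("road fatalities per year", baseline=900, target=600, latest=700),
    allocated_inputs_meur=250.0,
    targeted_outputs={"km of motorway built": 120},
)
print("observed change:", priority.result.observed_change())   # -200

# ... but impact is only the part of that change credibly attributable to the
# policy. It has to be estimated by evaluation (counterfactual or theory-based),
# not read off an indicator; here it is simply a placeholder argument.
def impact(observed_change: float, estimated_change_without_policy: float) -> float:
    return observed_change - estimated_change_without_policy

print("contribution (impact):", impact(-200, estimated_change_without_policy=-80))  # -120
```

The point of the sketch is that monitoring can only deliver the observed change, while impact requires an estimate of what would have happened without the policy; that estimate comes from evaluation, not from an indicator.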

The fundamental change in approach is that we aim to start the programming process with the identification of the intended result, with a corresponding indicator. Then we consider what scale of policy intervention and what resources should be applied to contribute to change. This is radically different from the approach of the past, which in practice and in theory started with the allocation of resources (see Graph 1, which starts at the bottom with the inputs and operations, leading upwards finally to impacts).

An essential element of the new approach is transparency on the result indicator and regular monitoring of, and debate on, its evolution. It is quite possible that the result indicator will not move in the desired direction. Then it would be important to reflect on whether the policy action is the correct one (bearing in mind that effects take time), or whether the "other factors" are too dominant (and should perhaps become the focus of a different policy). It may also be that the indicator selected does not reflect the intended change and should be reviewed. If programmes are designed with this intervention logic, at least we are clear about what the policy makers want to change and what success should be measured against, with regular debate and review of the policy and its effects.

Current practice

We have an idea of how we want to move forward and how we might capture the impact of future Cohesion Policy programmes. Our belief is that the results orientation must be built into programmes from the beginning; it cannot be bolted on at the end. It should also express the objectives of the programme, which seems obvious but clearly is not when we have indicator systems involving thousands of indicators.

The regulatory background for the current period was one where many Member States, in the Council negotiations, took the view that monitoring and evaluation requirements created administrative burdens which should be simplified. We can interpret this as frustration with the amount of effort which went into gathering data for the previous period and with the fact that the data gathered did not seem to be meaningful. Our contention is that part of that frustration arises from a lack of clarity on some of the basic concepts of the logical framework. The requirement was for: "information on the priority axes and their specific targets. Those targets shall be quantified using a limited number of indicators for outputs and results [...]. The indicators shall make it possible to measure the progress in relation to the baseline situation and the achievement of the targets of the priority axis" [4].

[4] Article 37(1)(c) of Council Regulation (EC) No 1083/2006 of 11 July 2006.
[5] The New Programming Period: Indicative Guidelines on Evaluation Methods: Monitoring and Evaluation Indicators, Working Document No. 2.

While the Commission's guidance for the selection of indicators [5] maintained the use of the traditional logical framework and the notion of impact indicators referring to longer-term effects for beneficiaries or the wider population, there was a shift away from the use of "impact indicators", with a call for a greater emphasis on result indicators. It was recognised that impact could only be dealt with through evaluation and that other factors would contribute to change in such indicators. Those of us involved in developing the guidance were struggling with the fact that we knew impact indicators were not delivering much meaningful information, but we did not have sufficient knowledge at that stage to challenge them more radically. In fact, we now know that the guidance we provided and the early reflections on how to use result indicators were often ignored, in a context where most Member States (and some colleagues within the DG for Regional Policy) regarded indicators as an unnecessary administrative burden.

We have insights from three sources of information. The first is reporting against some common (mostly output) indicators in Annual Implementation Reports. While not obligatory, Member States agreed in 2008 to report against these indicators so that some aggregate figures could be generated at EU level to communicate the achievements of the policy. What this experience shows us is that, while practice is improving, there are still simple errors in reporting which undermine the reliability of the figures. However, the fact that these data have to be reported annually, and can simply be aggregated and compared, feeds a process where the Commission regularly asks Member States and regions for explanations when data seem strange. Moreover, these large aggregate figures, while impressive in themselves, give rise to the "so what?" question. What do these outputs lead to? What changes as a result?

More insights come from a pilot exercise in which we explored, with volunteer regions from a number of Member States, what the results orientation we propose for the future would look like in current programmes. The basic questions we asked during these pilots were simple, and we remain convinced that if we can answer them we will have better quality programmes which are more likely to achieve their intended results. For each priority we examined, we asked:
- What do you want to change?
- What indicator can capture this change?
- Do you know the baseline for 2007 or for now (data sources)?
- Will your output indicators contribute to change in the result indicator? How?

[6] See the report of the pilot exercise on the DG Regional Policy website.

We found the following [6]:
- The new approach is feasible, but not without a significant change in the practice of those designing programmes. None of the pilot regions currently use result indicators in the manner proposed by the Commission. The objectives of the priorities examined were expressed in very general terms, and in most cases current indicators do not capture the intended effects of the programmes.
- The results focus must become part of the development of the programme, which needs a stronger and more explicit intervention logic; it cannot be added afterwards.
- The main change required is concentration. But concentration has to be the outcome of a process of deliberation and policy choice. This emphasises the importance of political debate on the choices which drive programme design.
- If there is concentration, there will be fewer indicators. Some pilot regions had very many indicators, but none which captured the motivations for policy action.
- Whatever result indicator is selected, baselines and targets are essential.
- Finally, it is important to recall that indicators do not tell us everything. The evolution of the result indicator should prompt a debate; it is not the last word on the performance of the policy.

While the pilots did not go so far as to explore the interactions between the different priorities of complex, integrated programmes, it seems clear that clarity on the focus of individual priorities is essential before one can assess the interactions (hopefully coherence and synergy rather than conflict) between them.

A further finding, from analysis of Annual Implementation Reports in the current programming period [7], is that they are not a good source of evidence on the performance of programmes. This is hardly surprising: if the indicators did not reflect the real objectives of programmes in the first place, reports against them are unlikely to shed much light on performance.

[7] Expert Evaluation Network delivering policy analysis on the performance of cohesion policy.

Impact evaluation, not impact indicators

This paper confronts the theory of the logical framework of the Structural Funds with the practice of how it has been used, from 2000 until now. Examining what has been reported as "impact" gives insights into what impact actually might be. The traditional notion of "impact" as long-term effects, including those which are direct and indirect, intended and unintended, seems not to be very meaningful. It seems to represent aspirations rather than anything which can be traced back with any credibility to particular policies or interventions. Many impact indicators reported in the past by Member States in their Structural Fund programmes do not tell us about the impact of the Funds: some capture the activity of a programme but not the change which results; others are so distant from policy actions (e.g. GDP or productivity growth) that they tell us little about the effects or effectiveness of policy. "Impact indicators" suggest that there are simple, quantifiable indicators which capture impact.

We know this is not the case. Therefore, is it not time we dropped this use of the word impact and the concept of impact indicators?

What we have proposed as the results orientation is very similar to what is referred to in some of the literature as "outcome monitoring" or "results monitoring" [8]. We have used the term "policy monitoring". The important point is that monitoring is to observe; evaluation must go beyond observation to estimate, capture or judge impact. We have clarified that impact is what can credibly be attributed to an intervention. This means that we abolish "impact indicators" and instead focus on impact evaluation, using different methods, both quantitative and qualitative, counterfactual and theory-based. In this way we clarify the different roles of monitoring and evaluation. We also emphasise that indicators alone cannot tell the whole story. To suggest that they can is to undermine the extremely important role that evaluation can and should play in policy design and implementation, and the feedback loop which should exist between the two. It is time for evaluators to reclaim impact and to insist on their role in evaluating the effects of policy.

[8] See White and Stern respectively.
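As a toy illustration of impact defined in this way, the sketch below estimates the change in a result indicator that can be attributed to support by comparing supported units with a comparison group (a naive difference-in-differences). The data are invented and the design is deliberately simplified; it should be read as an illustration of the counterfactual idea, not as a description of any evaluation the Commission has carried out or as a recommended design.

```python
# Toy difference-in-differences sketch: impact as the change credibly
# attributable to the intervention, not the raw change in the indicator.
# Data, units and values are entirely hypothetical.

# result indicator (e.g. firm employment) before/after, for supported firms
# and for comparison firms judged similar at baseline
supported  = {"before": [20, 35, 12, 50], "after": [26, 41, 15, 58]}
comparison = {"before": [22, 30, 14, 48], "after": [24, 32, 15, 50]}

def mean(values):
    return sum(values) / len(values)

change_supported  = mean(supported["after"])  - mean(supported["before"])
change_comparison = mean(comparison["after"]) - mean(comparison["before"])

# The raw change mixes the policy effect with "other factors" (the business
# cycle, other policies); the comparison group's change proxies those factors.
estimated_impact = change_supported - change_comparison

print(f"change in supported firms:   {change_supported:+.2f}")
print(f"change in comparison firms:  {change_comparison:+.2f}")
print(f"estimated impact of support: {estimated_impact:+.2f}")
```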

Bibliography

Barca, F. et al.: Outcome Indicators and Targets: Towards a New System of Monitoring and Evaluation in EU Cohesion Policy, European Commission (2011).

European Commission: 5th Report on Economic, Social and Territorial Cohesion (2010).

European Commission, Directorate General for Regional Policy: Results Indicators 2014+: Report on Pilot Tests in 12 Regions across the EU (2012).

Martini, A.: How Counterfactuals Got Lost on the Way to Brussels, prepared for an Evaluation Symposium in Strasbourg (2008).

Stern, E. et al.: Broadening the Range of Designs and Methods for Impact Evaluations, Working Paper 38, Department for International Development (2012).

White, H.: A Contribution to Current Debates in Impact Evaluation, Evaluation, Sage (2010).