Evaluating the Evaluators: Some Lessons from a Recent World Bank Self-Evaluation

Editor’s Note: The World Bank’s Independent Evaluation Group (IEG) recently published a self-evaluation of its activities. Besides representing current thinking among evaluation experts at the World Bank, it also more broadly reflects some of the strengths and gaps in the approaches that evaluators use to assess and learn from the performance of the international institutions with which they work. The old question “Quis custodiet ipsos custodes?” – loosely translated as “Who evaluates the evaluators?” – remains as relevant as ever. Johannes Linn served as an external peer reviewer of the self-evaluation and provides a bird’s-eye view on the lessons learned.

An Overview of the World Bank’s IEG Self-Evaluation Report

In 2011 the World Bank’s Independent Evaluation Group (IEG) carried out and published a self-evaluation of its activities. The self-evaluation team was led by an internal manager, but involved a respected external evaluation expert as the principal author and also an external peer reviewer.

The IEG self-evaluation follows best professional practices as codified by the Evaluation Cooperation Group (ECG). This group brings together the evaluation offices of seven major multilateral financial institutions in joint efforts designed to enhance evaluation performance and cooperation among their evaluators. One can therefore infer that the approach and focus of the IEG self-evaluation are representative of a broader set of practices currently used by the evaluation community of international financial organizations.

At the outset the IEG report states that “IEG is the largest evaluation department among Evaluation [Cooperation] Group (ECG) members and is held in high regard by the international evaluation community. Independent assessments of IEG’s role as an independent evaluation function for the Bank and IFC rated it above the evaluation functions in most other ECG members, international nongovernmental organizations, and transnational corporations and found that IEG follows good practice evaluation principles.”

The self-evaluation report generally confirms this positive assessment. For four out of six areas of its mandate IEG gives itself the second highest rating (“good”) out of six possible rating categories. This includes (a) the professional quality of its evaluations, (b) its reports on how the World Bank’s management follows up on IEG recommendations, (c) cooperation with other evaluation offices, and (d) assistance to borrowing countries in improving their own evaluation capacity. In the area of appraising the World Bank’s self-evaluation and risk management practices, the report offers the third highest rating (“satisfactory”), while it gives the third lowest rating (“modest”) for IEG’s impact on the Bank’s policies, strategies and operations. In addition the self-evaluation concludes that overall the performance of IEG has been “good” and that it operates independently, effectively and efficiently.

The report makes a number of recommendations for improvement, which are likely to be helpful but to have only limited impact on IEG’s activities. They cover measures to further enhance the independence of IEG and the consistency of evaluation practices across the three branches of the World Bank Group, namely the World Bank, the International Finance Corporation (IFC), and the Multilateral Investment Guarantee Agency (MIGA); to improve the design of evaluations and the upstream engagement with Bank management for greater impact; and to monitor the impact of recent organizational changes in IEG in terms of results achieved. The report also recommends that more be done to evaluate the Bank’s analytical work and that evaluations draw on comparative evidence.


In terms of the parameters of self-evaluation set by the prevailing practice among the evaluators of international financial agencies, the IEG self-evaluation is accurate and helpful. From my own experience as an operational manager in the Bank whose activities were evaluated by IEG in years past, and as a user of IEG evaluations (and of evaluations of other international aid organizations) for my research on aid effectiveness, I concur that IEG is independent and effective in meeting its mandate as defined. Moreover, the self-evaluation produces useful quantitative evidence (including survey results, budget analysis, etc.) to corroborate qualitative judgments.

However, the self-evaluation suffers from a number of limitations in approach and gaps in focus, which are broadly representative of the practices prevalent among many of the evaluation offices of international aid agencies.

Approach of the IEG self-evaluation

The core of the self-evaluation report is about the evaluation process followed by IEG, with very little said about the substance of IEG’s evaluations. One question could usefully have been raised, but was not: do evaluations cover the right issues with the right intensity? Such issues include growth and poverty; environmental, governance, and gender impacts; regional dimensions versus exclusive country or project focus; effectiveness in addressing the problems of fragile and conflict states; effectiveness in dealing with global public goods; and sustainability and scaling up. As a result, the report does not deal with the question of whether IEG effectively responds in its evaluations to the many important strategic debates and issues with which the development community is grappling.

Related to this limitation is the fact that the report assessed the quality of IEG’s work mostly in terms of (a) whether its approach and processes meet certain standards established by the Evaluation Cooperation Group; and (b) how it is judged by stakeholders in response to a survey commissioned for this evaluation. Both these approaches are useful, but they do not have any basis in professional assessments of the quality of individual products. This is equivalent to IEG evaluating the World Bank’s projects on the quality of the Bank’s processes (e.g., appraisal and supervision processes) and on the basis of stakeholder surveys, without evaluating individual products and their impacts.

Gaps in the Self-Evaluation and in Evaluation Practice

Careful reading of the report reveals six important gaps in the IEG self-evaluation, in the prevailing evaluation practice in the World Bank, and more generally in the way international financial organizations evaluate their own performance. The first three gaps relate to aspects of the evaluation approach used, and the last three relate to the lack of focus in the self-evaluation on key internal organizational issues:

1. Impact Evaluations: The report notes that IEG carries out two to three impact evaluations per year, but it sidesteps the debate in the current evaluation literature and practice as to what extent the “gold standard” of randomized impact evaluation should occupy a much more central role. Given the importance of this debate and divergence of views, it would have been appropriate for the self-evaluation to assess IEG’s current practice of very limited use of randomized evaluations.

2. Evaluation of Scaling Up: The report does not address the question of to what extent current IEG practice assesses the performance of individual projects not only in terms of their outcomes and sustainability, but also in terms of whether the Bank has systematically built on its experience in specific projects to help scale up their impact through support for expansion or replication in follow-up operations or through effective hand-off to the government or other partners. In fact, IEG currently does not explicitly and systematically consider scaling up in its project and program evaluations. For example, in a recent IEG evaluation of World Bank funded municipal development projects (MDPs), IEG found that the Bank has supported multiple MDPs in many countries over the years, but the evaluation did not address the obvious question of whether the Bank systematically planned for the project sequence or built on its experience from prior projects in subsequent operations. While most other evaluation offices, like IEG, do not consider scaling up, some (in particular those of the International Fund for Agricultural Development and the United Nations Development Programme) have started doing so in recent years.

3. Drawing on the Experience of and Benchmarking Against Other Institutions: The self-evaluation report does a good job in benchmarking IEG performance in a number of respects against that of other multilateral institutions. In the main text of the report it states that “IEG plans to develop guidelines for approach papers to ensure greater quality, in particular in drawing on comparative information from other sources and benchmarking against other institutions.” This is a welcome intention, but it is inadequately motivated in the rest of the report and not reflected in the Executive Summary. The reality is that IEG, like most multilateral evaluation offices, so far has not systematically drawn on the evaluations and relevant experience of other aid agencies in its evaluations of World Bank performance. This has severely limited the learning impact of the evaluations.

4. Bank Internal Policies, Management Processes and Incentives: IEG evaluations traditionally do not focus on how the Bank’s internal policies, management and incentives affect the quality of Bank engagement in countries. Therefore evaluations cannot offer any insights into whether and how Bank-internal operating modalities contribute to results. Two recent exceptions are notable. The first is the IEG evaluation of the Bank’s approach to harmonization with other donors and alignment with country priorities, which assesses the incentives for staff to support harmonization and alignment. The evaluation concludes that there are insufficient incentives, a finding disputed by management. The second is the evaluation of the Bank’s internal matrix management arrangements, which is currently under way. The self-evaluation notes that Bank management tried to quash the matrix evaluation on the grounds that it did not fall under the mandate of IEG. This is an unfortunate argument, since an assessment of the institutional reasons for the Bank’s performance is an essential component of any meaningful evaluation of Bank-supported programs. While making a good case for the specific instance of the matrix evaluation, the self-evaluation report shies away from a more general statement in support of engaging IEG on issues of Bank-internal policies, management processes and incentives. It is notable that IFAD’s Independent Office of Evaluation appears to be more aggressive in this regard: it is currently carrying out a full evaluation of IFAD’s internal efficiency, and previous evaluations (e.g., an evaluation of innovation and scaling up) did not shy away from assessing internal institutional dimensions.

5. World Bank Governance: The IEG self-evaluation is even more restrictive in how it interprets its mandate regarding the evaluation of the World Bank’s governance structures and processes (including its approach to members’ voice and vote, the functioning of its board of directors, the selection of its senior management, etc.). It considers these topics beyond IEG’s mandate. This is unfortunate, since the way the Bank’s governance evolves will substantially affect its long-term legitimacy, effectiveness and viability as an international financial institution. Since IEG reports to the Bank’s board of directors, and many of the governance issues involve questions of the board’s composition, role and functioning, there is a valid question of how effectively IEG could carry out such an evaluation. However, it is notable that the IMF’s Independent Evaluation Office, which similarly reports to the IMF board of directors, published a full evaluation of the IMF’s governance in 2008, which effectively addressed many of the right questions.

6. Synergies between World Bank, IFC and MIGA: The self-evaluation report points out that the recent internal reorganization of IEG aimed to assure more effective and consistent evaluations across the three member branches of the World Bank Group. This is welcome, but the report does not assess how past evaluations addressed the question of whether the World Bank, IFC and MIGA effectively capitalized on the potential synergies among the three organizations. The recent evaluation of the World Bank Group’s response to the global economic crisis of 2008/9 provided parallel assessments of each agency’s performance, but did not address whether they work together effectively in maximizing their synergies. The reality is that the three organizations have deeply engrained institutional cultures and generally go their own ways rather than closely coordinating their activities on the ground. Future evaluations should explicitly consider whether the three effectively cooperate or not. While the World Bank is unique in the way it has organizationally separated its private sector and guarantee operations, other aid organizations also have problems of a lack of cooperation, coordination and synergy among different units within the agency. Therefore, the same comment also applies to their evaluation approaches.


Self-evaluations are valuable tools for performance assessment and IEG is to be congratulated for carrying out and publishing such an evaluation of its own activities. As for all self-evaluations, it should be seen as an input to an independent external evaluation, a decision that, for now, has apparently been postponed by the Bank’s board of directors.

IEG’s self-evaluation has many strengths and provides an overall positive assessment of IEG’s work. However, it does reflect some important limitations of analysis and certain gaps in approach and coverage, which an independent external review should consider explicitly, and which IEG’s management should address. Since many of these issues likely also apply to the approaches of most other evaluation offices, the lessons have relevance beyond IEG and the World Bank.

Key lessons include:

  • An evaluation of evaluations should focus not only on process, but also on the substantive issues that the institution is grappling with.
  • An evaluation of the effectiveness of evaluations should include a professional assessment of the quality of evaluation products.
  • An evaluation of evaluations should assess:
    o How effectively impact evaluations are used;
    o How scaling up of successful interventions is treated;
    o How the experience of other comparable institutions is utilized;
    o Whether and how the internal policies, management practices and incentives of the institution are effectively assessed;
    o Whether and how the governance of the institution is evaluated; and
    o Whether and how internal coordination, cooperation and synergy among units within the organization are assessed.

Evaluations play an essential role in the accountability and learning of international aid organizations. Hence it is critical that evaluations address the right issues and use appropriate techniques. If the lessons above were reflected in the evaluation practices of the aid institutions, this would represent a significant step forward in the quality, relevance and likely impact of evaluations.