Evaluations should enable development practitioners to learn from current programmes and improve future ones. Sadly, they currently fall far short of that ideal. Most evaluations are short, shallow, and completely ignored. There are several reasons for this:
- Bad knowledge management. Large NGOs and consulting firms may commission hundreds of evaluations a year. However, these are generally not stored publicly, and often not stored internally either. Consequently, there is no way for this information to be used.
- Easy to suppress negative evaluations. While some evaluations do get read, they will generally be the positive ones. It is extremely easy to suppress any negative evaluations.
- Poor design. The terms of reference for evaluations are often impossible to implement. They may have vague or inappropriate evaluation questions, or pack a huge number of questions into a very limited number of days.
- Poor execution. Evaluation consultants are often underqualified and overworked. Especially for more technical jobs, they may just not have the skills or knowledge required.
This is ultimately a problem of incentives. There is no real incentive for evaluations to be transparent or high-quality. Neither donors nor NGOs have any desire to showcase their failures, and overwork and staff turnover prevent them from learning from experience. Donors sometimes hold organisations accountable for the results of evaluations – but they seldom consider the quality or appropriateness of the work.
So what could the solution be? I believe that we need a peer-review system for evaluations, similar to the one for academic articles. This could establish and apply criteria for a quality evaluation, and the peer review itself could be performed by academics, consultants, or monitoring and evaluation specialists at other organisations.
This could be linked to an evaluation database. Each planned evaluation should be registered before being conducted, and the peer-reviewed report then placed in the database, alongside the peer reviews and any other relevant material. The Millennium Challenge Corporation offers a great example of this.
This will need to be enforced and supported by donors, at least initially. Donors should signal that participation in this quality assurance process will be rewarded. Perhaps bids that commit to this approach could be favoured over those that don’t.
If successfully managed, this could address a number of the problems that began this article. The design and execution may still be poor – but with the knowledge that the evaluation will be peer-reviewed and openly available, there are stronger incentives to improve. It would no longer be possible to cherry-pick the best evaluations for publication. Finally, and perhaps most importantly, all these evaluations would be readily available, so the knowledge wouldn’t be lost.
There are of course challenges facing this approach. The database would need to cover a wide range of potential evaluation types, and not be trapped into only considering a certain approach (such as randomised controlled trials). Finding peer reviewers would be the hardest part – this may need to be a paid rather than voluntary role, especially given the time pressure to produce and use evaluations.
Such a database could be managed by a small, independent secretariat. The best approach, however, would be for it to be taken on by an organisation specialising in evaluations: perhaps ODI or IDS, 3ie (if they can get over their RCT obsession), Learn MandE, or Better Evaluation. Any takers?
p.s. If you enjoyed the post, see our discussion on the rough notes page, and please leave your comments below!
p.p.s. I noticed after writing this that the Big Push Forward conference report proposed a different but related idea:
“Promoting a ‘kite mark’ for evaluators and purchasers of evaluations who ascribe to certain ethical, moral and developmental standards of monitoring, evaluation and impact assessment. This would seek to create a ‘guild’ who would promote these standards and in so doing publically confront poor evaluation processes, evaluators and inappropriate ToRs.”