Program evaluation

Program evaluation is essentially a set of philosophies and techniques to determine if a program 'works'. It is a practice field that has emerged, particularly in the USA, as a disciplined way of assessing the merit, value, and worth of projects and programs. Evaluation became particularly relevant in the 1960s during the period of the Great Society social programs associated with the Kennedy and Johnson administrations. Extraordinary sums were invested in social programs, but the means of knowing what happened, and why, were not available.

Behind the seemingly simple question of whether the program works are a host of other, more complex questions. The first is: what is the program supposed to do? This is often difficult to define, so indirect indicators may be used instead. For example, schools are supposed to 'educate' people. But what does 'educate' mean? Give knowledge? Teach how to think? Give specific skills? If the exact goal cannot be defined well, it is difficult to say whether the program 'works'.

Another question about programs is, what else do they do? There may be unintended or unforeseen consequences of a program. Some consequences may be positive and some may be negative. These unintended consequences may be as important as the intended consequences. So evaluations should measure not just whether the program does what it should be doing, but what else it may be doing.

Perhaps the most difficult part of evaluation is determining whether it is the program itself that is producing the outcome. Other events or processes may really be causing the outcome, or preventing the hoped-for outcome. Because of the way most programs are run, many evaluations cannot determine whether the program itself, or something else, is the 'cause'.

One main reason that evaluations cannot determine causation is self-selection. That is, people select themselves to participate in a program. For example, in a jobs training program, some people decide to participate, and others, for whatever reason, do not. It may be that those who do participate are those who are most determined to find a job, or who have the best support resources, which allows them both to participate and to find a job. The people who participate are somehow different from those who don't, and it may be this difference, not the program, that leads to a successful outcome for the participants, that is, finding a job.

If programs could, somehow, use random assignment, then they could determine causation. That is, if a program could randomly assign people to participate or not to participate, then, theoretically, the group of people who participate would on average be the same as the group who do not, apart from the program itself, and an evaluation could 'rule out' other causes.
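The contrast between self-selection and random assignment can be illustrated with a small simulation. The following is only a sketch under invented assumptions: each person has a hidden 'motivation' score, motivation raises the chance of finding a job, and the program adds a fixed boost (`TRUE_EFFECT`). None of these numbers or function names come from a real program; they are hypothetical and chosen only to make the effect visible.

```python
import random

# Assumed (hypothetical) boost the program gives to the chance of finding a job.
TRUE_EFFECT = 0.10


def found_job(motivation, participates, rng):
    """Job-finding chance rises with motivation, plus the program boost."""
    p = 0.20 + 0.40 * motivation + (TRUE_EFFECT if participates else 0.0)
    return rng.random() < p


def self_selected(motivation, rng):
    """More motivated people are more likely to sign up."""
    return rng.random() < motivation


def randomly_assigned(motivation, rng):
    """A coin flip decides participation, regardless of motivation."""
    return rng.random() < 0.5


def naive_difference(assign, n=100_000, seed=0):
    """Job rate of participants minus job rate of non-participants."""
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # [jobs found, group size]
    for _ in range(n):
        motivation = rng.random()  # hidden trait, uniform on [0, 1)
        participates = assign(motivation, rng)
        counts[participates][0] += found_job(motivation, participates, rng)
        counts[participates][1] += 1
    rate = {g: jobs / size for g, (jobs, size) in counts.items()}
    return rate[True] - rate[False]


if __name__ == "__main__":
    print("self-selected:    ", round(naive_difference(self_selected), 3))
    print("randomly assigned:", round(naive_difference(randomly_assigned), 3))
```

In this toy model the self-selected comparison overstates the program's effect, because participants were more motivated to begin with, while the randomly assigned comparison lands close to the assumed boost.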

However, since most programs cannot use random assignment, causation cannot be determined. Evaluations can still provide useful information. For example, the outcomes of the program can be described. Thus the evaluation can say something like, "People who participated in program xyz were more likely to find a job, while people who did not participate were less likely to find a job."

If the program is fairly large, and there are many participants, and there is enough data, statistical analysis can be used sometimes to make a 'reasonable' case for the program by showing, for example, that other causes are unlikely.
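One common way of making such a case is to compare participants and non-participants within groups that share an observed characteristic, rather than across the whole population. The sketch below uses invented counts and a hypothetical 'support' grouping; it is an illustration of stratified comparison, not a method prescribed by any particular evaluation framework.

```python
# Hypothetical counts: stratum -> {participated: (found_job, group_size)}
data = {
    "high support": {True: (48, 80), False: (10, 20)},
    "low support": {True: (6, 20), False: (16, 80)},
}


def pooled_difference(data):
    """Compare job rates across everyone, ignoring the strata."""
    jobs = {True: 0, False: 0}
    sizes = {True: 0, False: 0}
    for groups in data.values():
        for participated, (found, size) in groups.items():
            jobs[participated] += found
            sizes[participated] += size
    return jobs[True] / sizes[True] - jobs[False] / sizes[False]


def stratified_difference(data):
    """Compare job rates within each stratum, then average the
    differences, weighting each stratum by its share of all people."""
    grand_total = sum(
        size for groups in data.values() for _, size in groups.values()
    )
    weighted = 0.0
    for groups in data.values():
        (found_p, size_p), (found_n, size_n) = groups[True], groups[False]
        diff = found_p / size_p - found_n / size_n
        weighted += diff * (size_p + size_n) / grand_total
    return weighted


if __name__ == "__main__":
    print("pooled difference:    ", round(pooled_difference(data), 3))
    print("stratified difference:", round(stratified_difference(data), 3))
```

With these made-up counts, the pooled comparison suggests a much larger advantage for participants than the comparison within strata does, because people with high support were both more likely to participate and more likely to find a job. Stratifying can only account for characteristics that were actually measured; unmeasured differences between the groups remain a possible explanation.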

Another approach is to use the evaluation to analyze the program process. So instead of focusing on the outcome (for example, did people in a jobs training program get jobs), the evaluation would focus on what the program was doing. For example, did people seem to learn the skills being taught? Did people stay in the program or did they drop out part way through? Were the teachers teaching appropriate skills? And so forth. This information could help improve how the program operates.

People who do program evaluation can come from many different backgrounds, such as sociology, psychology, economics, social work or many other areas. Some graduate schools also have specific training programs for program evaluation.

Program evaluations can involve quantitative methods of social research or qualitative methods or both.

Types of evaluation
Program evaluation is often divided into several types, according to when the evaluation takes place and what question it asks.

Formative Evaluation occurs early in the program. The results are used to decide how the program is delivered, or what form the program will take. For example, an exercise program for elderly adults would seek to learn what activities are motivating and interesting to this group. These activities would then be included in the program.

Process Evaluation is concerned with how the program is delivered. It deals with things such as when the program activities occur, where they occur, and who delivers them. In other words, it asks the question: Is the program being delivered as intended? An effective program may not yield desired results if it is not delivered properly.

Outcome Evaluation addresses the question of what the results are. It is common to speak of short-term outcomes and long-term outcomes. For example, in an exercise program, a short-term outcome could be a change in knowledge about the health effects of exercise, or it could be a change in exercise behavior. A long-term outcome could be a lower likelihood of dying from heart disease.

CDC framework
In 1999, the Centers for Disease Control and Prevention (CDC) published a six-step framework for conducting evaluation of public health programs. The publication of the framework is a result of the increased emphasis on program evaluation of government programs in the US. The six steps are:
 * 1) Engage stakeholders, a term referring to anyone with an interest in the program.
 * 2) Describe the program.
 * 3) Focus the evaluation.
 * 4) Gather credible evidence.
 * 5) Justify conclusions.
 * 6) Ensure use and share lessons learned.