The CIPP model
A model of evaluation outlines the criteria for and focus of an evaluation (Desimone & Werner, 2008). There are many frameworks an organization can choose from; the most popular, however, is the one developed by Donald Kirkpatrick. Other HRD theorists have modified existing frameworks, including Kirkpatrick's, and put forward models of their own. This paper examines the components of the Kirkpatrick model alongside the CIPP model, which was created by Daniel Stufflebeam and further developed by James Galvin. Both models are examined in detail, from their main components to their criticisms, along with recommendations for their improvement.
Donald L. Kirkpatrick, Professor Emeritus at the University of Wisconsin (where he earned his BBA, MBA and PhD), first published his ideas in 1959 in a series of articles in the Journal of the American Society of Training Directors. The articles were subsequently included in Kirkpatrick's book Evaluating Training Programs. He was president of the American Society for Training and Development (ASTD) in 1975. Kirkpatrick's 1994 book Evaluating Training Programs refined his originally published ideas of 1959, further increasing awareness of them, so that his theory has arguably become the most widely used and popular model for the evaluation of training and learning. Kirkpatrick's four-level model is now considered an industry standard across the HR and training communities [Ref #2].
As a result, his model is the most well-known and widely used for measuring the effectiveness of training programs. The basic structure of Kirkpatrick's evaluation model focuses on "Reaction, Learning, Behavior and Results":
- Reaction: what the learners thought and felt about the training;
- Learning: the resulting increase in knowledge or capability;
- Behaviour: the extent of behaviour and capability improvement and its implementation/application; and
- Results: the effects on the business or environment resulting from the trainee's performance.
Daniel Stufflebeam retired in 2007 as Distinguished University Professor at Western Michigan University (WMU) [Ref #3].
His major contributions were to the development and advocacy of the evaluation profession. Stufflebeam was responsible for having developed one of the first models for systematic evaluation, the CIPP Model. The model's core concepts are denoted by the acronym CIPP, which stands for evaluations of an entity's context, inputs, processes, and products.
Kirkpatrick learning evaluation model
This particular piece of literature goes into detail about each level of Kirkpatrick's model, examining what each level evaluates and how the evaluation is carried out.
Starting with Reaction: as the word implies, evaluation at this level measures how the learners react to the training. Information for this level is typically gathered with attitude questionnaires distributed after most training classes. The target outcome is the learner's perception (reaction) of the course. Learners are often keenly aware of what they need to know to accomplish a task. If the training program fails to satisfy their needs, a determination should be made as to whether the fault lies in the program's design or its delivery.
Reaction evaluation is how the delegates felt and their personal reactions to the training or learning experience. Common examples of determining this are:
- Did the trainees like and enjoy the training?
- Did they consider the training relevant?
- Was it a good use of their time?
- Did they like the venue, the style, timing, domestics, etc?
- Level of participation.
- Ease and comfort of experience.
- Level of effort required to make the most of the learning.
- Perceived practicability and potential for applying the learning.
The next level, Learning, is the extent to which participants change attitudes, improve knowledge, and increase skill as a result of participating in the learning process. It addresses the question: did the participants learn anything?
The learning evaluation requires some type of post-testing to ascertain what skills were learned during the training. In addition, the post-testing is only valid when combined with pre-testing, so that you can differentiate between what they already knew prior to training and what they actually learned during the training program. Learning evaluation is the measurement of the increase in knowledge or intellectual capability from before to after the learning experience.
This can be assessed by considering if:
- Did the trainees learn what was intended to be taught?
- Did the trainee experience what was intended for them to experience?
- What is the extent of advancement or change in the trainees after the training, in the direction or area that was intended?
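The pre-test/post-test differencing described above can be made concrete with a small sketch. This example is illustrative only and not from the source; the scores, the 100-point scale, and the normalized-gain formula (gain divided by the room left to improve) are all assumptions.

```python
# Illustrative sketch (not from the source): comparing hypothetical
# pre-test and post-test scores to isolate what was learned in training.

def learning_gain(pre_scores, post_scores):
    """Return (average raw gain, average normalized gain).

    Normalized gain scales each trainee's improvement by the room
    they had to improve, assuming scores out of 100.
    """
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    avg_gain = sum(gains) / len(gains)
    norm = [
        (post - pre) / (100 - pre) if pre < 100 else 0.0
        for pre, post in zip(pre_scores, post_scores)
    ]
    avg_norm = sum(norm) / len(norm)
    return avg_gain, avg_norm

# Hypothetical scores for five trainees (out of 100)
pre = [40, 55, 60, 70, 50]
post = [70, 75, 80, 85, 65]
gain, ngain = learning_gain(pre, post)
print(f"average raw gain: {gain:.1f} points")
print(f"average normalized gain: {ngain:.2f}")
```

Without the pre-test baseline, the raw post-test average would conflate prior knowledge with learning that actually occurred during the program, which is exactly the distinction the level is meant to capture.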
With Performance (behaviour), the evaluation involves testing the student's capability to perform learned skills while on the job, rather than in the classroom. It determines whether the correct performance is now occurring by answering the question, “Do people use their newly acquired learning on the job?” Behaviour evaluation is the extent to which the trainees applied the learning and changed their behaviour; this can be measured immediately after the training and again several months later, depending on the situation. To narrow down the results at this level, we ask:
- Did the trainees put their learning into effect when back on the job?
- Were the relevant skills and knowledge used?
- Was there noticeable and measurable change in the activity and performance of the trainees when back in their roles?
- Was the change in behaviour and new level of knowledge sustained?
- Would the trainee be able to transfer their learning to another person?
- Is the trainee aware of their change in behaviour, knowledge, skill level?
The final level, Results, measures the training program's effectiveness, that is, “What impact has the training achieved?” These impacts can include monetary gains, efficiency, morale, teamwork, and so on. Results evaluation is the effect on the business or environment resulting from the improved performance of the trainee.
Measures would typically be business or organizational key performance indicators, such as volumes, values, percentages, timescales, return on investment, and other quantifiable aspects of organizational performance, for instance; numbers of complaints, staff turnover, attrition, failures, wastage, non-compliance, quality ratings, achievement of standards and accreditations, growth, retention, etc.
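One of the quantifiable measures listed above, return on investment, lends itself to a short worked sketch. The figures and the simple net-benefit formula below are hypothetical assumptions, not from the source.

```python
# Illustrative sketch (not from the source): a basic training ROI
# calculation using hypothetical cost and benefit figures.

def training_roi(benefits, costs):
    """Return ROI as a percentage: net benefit relative to cost."""
    return (benefits - costs) / costs * 100

costs = 50_000     # hypothetical: design, delivery, trainee time
benefits = 80_000  # hypothetical: reduced errors, faster throughput
print(f"ROI: {training_roi(benefits, costs):.0f}%")
```

In practice the hard part is not the arithmetic but attributing a monetary value to outcomes such as reduced complaints or lower staff turnover, which is why Results-level evaluation is the most demanding of the four levels.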
The CIPP Model by Amy McLemore
February 04, 2009
The CIPP Model is a comprehensive framework for guiding formative and summative evaluations of projects, programs, personnel, products, institutions, and systems. According to McLemore's article, this model attempts to make evaluation directly relevant to the needs of decision-making during the different phases and activities of a program.
In the CIPP approach, in order for an evaluation to be useful, it must address those questions which key decision-makers are asking, and must address the questions in ways and language that decision-makers will easily understand. The approach aims to involve the decision-makers in the evaluation planning process as a way of increasing the likelihood of the evaluation findings having relevance and being used.
However, some of the commissioning agencies who receive the reports from participative evaluation say they do not always find them helpful in decision-making, because of the nature of the reports produced and a lack of clear indications for decision-making, or conflicting conclusions.
According to Stufflebeam, evaluation should be a process of delineating, obtaining and providing useful information to decision-makers, with the overall goal of program or project improvement.
Kirkpatrick's model treats evaluation as an end-of-process activity, whereas the objective should be to treat evaluation as an ongoing activity that begins during the pre-training phase.
As noted above, evaluation at the Reaction level measures how the learners react to the training, typically through attitude questionnaires passed out after class. This level measures one thing: the learner's perception (reaction) of the course.
This level is not indicative of the training's performance potential, as it does not measure what new skills the learners have acquired or what they have learned that will transfer back to the working environment. This has caused some evaluators to downplay its value. However, the interest, attention and motivation of the participants are often critical to the success of any training process: people often learn better when they react positively to the learning environment and see its importance.
Learning, again, is the extent to which participants change attitudes, improve knowledge, and increase skill as a result of participating in the learning process, addressing the question: did the participants learn anything? As described above, a valid learning evaluation combines pre-testing with post-testing, so that what participants already knew prior to training can be differentiated from what they actually learned during the program.
Measuring the learning that takes place in a training program is important in order to validate the learning objectives. Evaluating the learning that has taken place typically focuses on such questions as:
- What knowledge was acquired?
- What skills were developed or enhanced?
- What attitudes were changed?
Learner assessments are created to allow a judgment to be made about the learner's capability for performance. There are two parts to this process: the gathering of information or evidence (testing the learner) and the judging of the information (what does the data represent?). This assessment should not be confused with evaluation. Assessment is about the progress and achievements of the individual learners, while evaluation is about the learning program as a whole.
It is important to measure performance because the primary purpose of training is to improve results by having the students learn new skills and knowledge and then actually apply them on the job. Learning new skills and knowledge is of no use to an organization unless the participants actually use them in their work activities. Since Level 3 measurements must take place after the learners have returned to their jobs, they will typically involve someone closely involved with the learner, such as a supervisor.
Although it takes a greater effort to collect this data than it does to collect data during training, its value is important to the training department and organization as the data provides insight into the transfer of learning from the classroom to the work environment and the barriers encountered when attempting to implement the new techniques learned in the program.
The final level, Results, measures the training program's effectiveness, asking “What impact has the training achieved?” These impacts can include monetary gains, efficiency, morale, teamwork, and so on.
The four levels of evaluation mean very little to other business units. The model reflects a learning-centric perspective that tends to confuse rather than clarify issues and contributes to the lack of understanding between business and learning functions. Although there are a variety of definitions of training, it is generally considered an HRD intervention or process for fixing a performance problem through some type of learning program.
Stufflebeam, by contrast, viewed evaluation in terms of the types of decisions it served and categorized it according to its functional role within a system of planned social change. Critics of CIPP have said that it holds an idealistic notion of what the process should be rather than reflecting its actuality, and that it is too top-down or managerial in approach, depending on an ideal of rational management rather than recognizing its messy reality. In practice, the informative relationship between evaluation and decision-making has proved difficult to achieve, and the model perhaps does not take sufficient account of the politics of decision-making within and between organizations.
The Model is configured to enable and guide comprehensive, systematic examination of efforts that occur in the dynamic, septic conditions of the real world, not the controlled conditions of experimental psychology and split plot crop studies in agriculture.
Context evaluations assess needs, problems, assets, and opportunities to help decision makers define goals and priorities and help the broader group of users judge goals, priorities, and outcomes.
Input evaluations assess alternative approaches, competing action plans, staffing plans, and budgets for their feasibility and potential cost-effectiveness to meet targeted needs and achieve goals. Decision makers use input evaluations in choosing among competing plans, writing funding proposals, allocating resources, assigning staff, scheduling work, and ultimately in helping others judge an effort's plans and budget.
Process evaluations assess the implementation of plans to help staff carry out activities and later help the broad group of users judge program performance and interpret outcomes.
Product evaluations identify and assess outcomes - intended and unintended, short term and long term - both to help a staff keep an enterprise focused on achieving important outcomes and ultimately to help the broader group of users gauge the effort's success in meeting targeted needs.
In sum, the product evaluation component may be divided into assessments of impact, effectiveness, sustainability, and transportability.
These product evaluation subparts ask:
- Were the right beneficiaries reached?
- Were their targeted needs met?
- Were the gains for beneficiaries sustained?
- Did the processes that produced the gains prove transportable and adaptable for effective use elsewhere?
In finalizing a summative report, the evaluator refers to the store of context, input, process, and product information and obtains additionally needed information. The evaluator uses this information to address the following retrospective questions: Were important needs addressed? Was the effort guided by a defensible plan and budget? Was the service design executed competently and modified as needed? Did the effort succeed?
The four-level evaluation model is often criticized today for being outdated. Rather than being just about evaluation, it should have been presented as both a planning and an evaluation model. According to Clark (2008), this requires flipping the model upside-down: rearranging the steps into a “backwards planning” tool that starts with the end in mind.
Thus, planning and analysis need to work backward. This is done by identifying: the desired impact that will improve the performance of the business; the level of performance the learners must achieve to create that impact; the knowledge and skills they need to learn in order to perform; and what they need to perceive in order to learn.
Planning backwards helps to ensure a circular causality. The learners' perception of the need to learn should motivate them to learn, which in turn causes the desired performance that drives the impact desired by the customer. This causality should continue in a circular fashion: the results achieved should now drive the performers' perception of the need to learn more and perform better in order to achieve even better results. Of course, this assumes not only that the customer understands the level of impact achieved, but also that the performers/learners perceive how close they came to achieving the desired result.
This model offers a way of overcoming top-down approaches to evaluation. It is argued that all stakeholders have a right to be consulted about concerns and issues and to receive reports which respond to their information needs; in practice, however, it can be difficult to serve or prioritize the needs of a wide range of stakeholders (Worthen & Sanders, 1987). In the stakeholder approach, decisions emerge through a process of accommodation (or democracy based on pluralism and the diffusion of power), so the shift in this type of approach is from decision-maker to audience.
Cronbach (1982) argues that the evaluator's mission is to facilitate a democratic, pluralist process by enlightening all the participants. The CIPP Model thus looks at evaluation as an essential concomitant of improvement and accountability within a framework of appropriate values and clear answers.
There are many frameworks an organization can choose from when selecting a model of evaluation. Our focus in this paper was on the Kirkpatrick model and the CIPP model, both of which are comprehensive frameworks that can together target all the areas of the organization that demand attention or have needs to be addressed from an HRD perspective.
The Kirkpatrick model takes a more technical approach to obtaining the desired results. It treats evaluation as an end-of-process activity, whereas the objective should be to treat evaluation as an ongoing activity that begins during the pre-training phase. An evaluation at each level answers whether a fundamental requirement of the training program was met; it is not that conducting an evaluation at one level is more important than another. All levels of evaluation are important.
However, where the Kirkpatrick model stops, the CIPP model can adequately address the areas its predecessor did not target. It focuses more on management training, addresses the questions of the decision makers, and links evaluations directly back to their needs.
Consequently, CIPP has been preferred by training specialists when evaluating management training and development programs, while Kirkpatrick's model may be more appropriate for evaluating manual and technical skills training; it is too narrow for evaluating management training. The Kirkpatrick model is essentially an intervention for fixing a problem through a learning program, whereas CIPP stresses that its overall goal is project improvement.
- Desimone, Randy L. and Jon M. Werner (2008). Human Resource Development. Southwestern College Publications.
- Dubois, David D. (1993). Competency-Based Performance Improvement: A Strategy for Organizational Change. 1st ed. HRD Press.
- Article #1:
- Article #2:
The CIPP Model, Amy McLemore, February 04, 2009.
- Clark, D. (2008). Flipping Kirkpatrick. bdld.blogspot.com. Dec. 17, 2008.
- Cronbach, L. J. (1982). Designing Evaluations of Educational and Social Programs. San Francisco: Jossey-Bass.
- Worthen, B. R. & Sanders, J. R. (1987). Educational Evaluation: Alternative Approaches and Practical Guidelines. New York: Longman.