A COMPARATIVE ANALYSIS OF THE EFFICACY OF THREE PROGRAM EVALUATION MODELS: A REVIEW OF THEIR IMPLICATIONS IN EDUCATIONAL PROGRAMS

Purpose of the study: This article reviews the comparative efficacy and the theoretical and practical background of three program evaluation models (Stufflebeam’s CIPP model, Kirkpatrick’s model, and outcome-based evaluation models) and their implications for educational programs. It discusses the strengths and limitations of the three models. Methodology: Peer-reviewed scholarly journals were searched for articles related to program evaluation models and their importance.


INTRODUCTION
Based on societal needs, policymakers initiate programs and projects after comprehensive planning that considers their structure, design, costs, and intended outcomes. Programs may be designed for the short term, when the desired outcomes are urgently required, or for the longer term, when the goals are broader. Neither short-term nor long-term programs are necessarily efficient or flawless, so evaluation is needed to monitor a program's progress. Evaluation provides sound information about a program's direction of progress, its functionality, and its goal achievement. Without continuous monitoring and evaluation, opportunities for the growth and development of a program are overlooked. Evaluation also helps in deciding whether to continue, modify, or terminate a program, based on its linkages with finance, careers, and the welfare of the people (Butler, 2020). Program evaluation efforts have long been recognized. A systematic expansion and development was observed during the 1960s in the USA, where the evaluation of programs in the military, health care services, social organizations, and educational institutions was encouraged in order to attain accountability, competency, and reform (Stufflebeam, 2001).
Educational programs and organizations are important drivers of social, behavioural, professional, and economic development, and their evaluation is as necessary as that of other programs and projects. The evaluation of educational programs and organizations helps in making the right decisions, modifying previous decisions, establishing or modifying attitudes, and building the capacity of organizations (Alkin and King, 2017). The contextual benefits of program evaluation have been recognized throughout the world, and stakeholders employ a variety of evaluation models to enhance the efficacy of different programs. Several evaluation models, such as the logic model, outcome-based evaluation, Stufflebeam's CIPP model, and Kirkpatrick's model, among others, have been developed and extensively used during the last few decades to assess the progress of different programs (Schalock, 2001a; Stufflebeam, 2001). The choice of a particular evaluation model for measuring the progress of an educational program depends on the structure of the model, its objectivity, feasibility, and cost-effectiveness, the intentions of the evaluation, and the nature of the program to be evaluated. Theoretical and empirical studies indicate that program evaluation models help improve decision making, which leads to quality enhancement and goal achievement in educational programs (Mizikaci, 2006; Rooholamini et al., 2017; Darma, 2019).
The objective of this review is to discuss the importance of three evaluation models, namely Stufflebeam's CIPP model, the outcome-based evaluation model, and Kirkpatrick's model, in the quality enhancement of different educational programs. The three models are critically analysed, and their strengths and shortcomings are highlighted.

METHODOLOGY
To structure and organize this article, a literature survey was conducted to collect relevant information about program evaluation models. Peer-reviewed scholarly journals were searched for articles related to program evaluation models and their importance. Keywords included 'program evaluation', 'assessment', 'CIPP model', 'evaluation of educational programs', 'outcome-based model', and 'planning'. Major databases such as Google Scholar, PubMed, NIH Library, and Elsevier were searched by entering the keywords. A total of 38 journal articles indexed in Scopus and Web of Science (WoS) were selected for this review. Articles on Stufflebeam's CIPP model, Kirkpatrick's model, and outcome-based evaluation models received particular attention because the review aimed at analysing these three models. The strengths and inadequacies of the three models were weighed and presented.

PROGRAM EVALUATION MODELS: THEORETICAL AND PRACTICAL IMPLICATIONS
Program evaluation is a systematic approach to assessing, analysing, and using information about the progress, outcomes, goal achievement, and effectiveness of policies, projects, and programs (Usun, 2016). Every program and project has primary purposes, which are designed and executed through rigorous planning and policy inputs by stakeholders. Shadish et al. (1991) argued that program evaluation models could not be developed without the assistance of theories. They asserted that, before any practical manipulation of a program, theories must supply the necessary knowledge, basic concepts, and rules. Theories thus provide a basic framework for designing and developing a program evaluation model. Chen (2016) stated that theories provide assumptions on how a useful program evaluation model can be designed by understanding the basic components and contextual aspects of the program. From a theoretical perspective, logical conciseness, the systematic interaction of different program components, and fundamental rules providing efficacy guidelines are the core elements in developing a program. Without linking the program to a sound philosophical and theoretical background, the purpose of designing an ideal evaluation model cannot be met.
Historically, different theories have played influential roles in constructing and reforming evaluation models for education and other programs. Widely acknowledged theories are reductionism, complexity theory, and general system theory (Figure 1). According to Chen (2016), reductionism theory suggests breaking a program up into its core components, which may then be critically analysed and understood. Frye & Hemmer (2012) stated that reductionism theory focuses first on understanding the integral program and then on analysing its constituent elements and their contribution to the outcome of the integral program. They further elaborated that a linear relationship between the elements of the program can create room for changes that have a predictable impact on the program's outcome, depending on the magnitude of the changes, as is evident in the Logic evaluation model. The general system theory, initially proposed by Bertalanffy in the 1920s for medicine but later adopted in several other disciplines, is the reverse of reductionism theory: it weighs the whole system (program) as more important than its components (Frye & Hemmer, 2012). Mizikaci (2006) explained the general system theory by highlighting its assumption of "the wholeness and universal application of the principles of organization". As per this explanation, the "whole" is more crucial to the general system theory than the components, and the "whole" specifies the nature of the components, which are difficult to understand if they are isolated from the "whole". Complexity theory, or the theory of complex adaptive systems (CAS), states in general that systems (programs, organizations, environments, etc.) consist of interacting components, where diversity prevails, certainty is rare, ambiguity is certain, equilibrium is rare, evolution is common, and changes frequently occur due to the interacting components of the system and due to cause-and-effect phenomena (Frenken, 2006; Morrison, 2008; Norberg & Cumming, 2008; Walton, 2014). Cunningham (2003) stated that complexity theory considers a system to be dynamic, not at equilibrium, indeterminate, open to sharing information with its surroundings, based on feedback, and one in which the whole is regarded as more than its parts.
How are these and other theories helpful in constructing program evaluation models? To answer this question, evaluators and those involved in developing a program evaluation model should consider the framework of the theory and the nature of the program. Theories provide mechanisms and concepts for constructing an evaluation model. Evaluators consider theoretical guidelines such as which methods are suitable for the evaluation of a specific

DIFFERENT EVALUATION MODELS AND THEIR APPLICATION IN EDUCATIONAL PROGRAMS
Educational institutions, notably higher education institutes, in most countries have no proper mechanisms to provide information about the outcomes of their educational programs; instead, they emphasize only courses, activities and subject contributions, and resource and research output, which according to Nusche (2008) are not adequate indicators of the quality of educational programs. To obtain a clear picture of the quality, efficiency, and effectiveness of educational programs, a comprehensive evaluation mechanism is needed, which can lead to the removal of flaws and the improvement of the programs. Over the last few decades, efforts have been made to improve the quality, integrity, and standards of educational programs by employing different approaches. One of these approaches is the use of a systematically structured tool termed a "program evaluation model". The purpose of an evaluation model is to assess whether the program fulfills the required needs, provides the anticipated services, delivers the desired outcomes, achieves its goals and objectives, and functions in the way it was planned (Posavac, 2015). Evaluation of an educational program is intended to check the program's progress, identify merits and flaws, and acquire information. After the quality indicators of an educational program have been analysed with an evaluation model, previous decisions are either maintained or modified to achieve better program outcomes. In recent years, the evaluation of programs has become a dynamic profession with wide application in health, government, organizations, and education (Madaus & Kellaghan, 2000). In the succeeding paragraphs, three evaluation models (the outcome-based evaluation model, the Kirkpatrick four-level model, and Stufflebeam's CIPP model) are discussed along with their implementation in educational programs.

OUTCOME-BASED EVALUATION MODEL
The Institute of Museum and Library Services defines outcome-based evaluation as a systematic approach to estimating a program's intended outcomes, its effectiveness, and the benefits that the program has provided to the participants (http://www.nysl.nysed.gov/libdev/obe/). According to Brewer (2011), the outcome-based evaluation model has its origin in the United States, and it focuses on the results and effectiveness of the program and the benefits clients draw from it. The author noted that information obtained through an outcome-based evaluation model may serve summative or formative purposes for the program. He suggested its usefulness in educational programs, health systems, and organizational evaluation. Schalock (2001a)  f. The use of information and feedback to improve the program.
In another article, Schalock (2001b) explained that the outcome-based evaluation model takes into account stakeholders, promoters, and program evaluators. He elaborated that the model employs four major approaches during the evaluation of a given program, i.e. program evaluation, effectiveness evaluation, impact evaluation, and policy evaluation (Table 1).

Limitations and strengths
The outcome-based evaluation model stresses the outcomes of the program. It has been widely used in the evaluation of educational programs and has been found to be an effective tool; however, the model has some limitations as well. According to Schalock (2001b), the model's efficiency increases when evaluators use appropriate measures and scientific methods while evaluating the program. However, he also highlighted that the influence of internal and external factors, and problems with the validity and reliability of tools, might produce flawed assessments and decrease the efficiency of the outcome-based evaluation model. Ewell (2008) noted that the outcome-based model is flexible, transparent, comparable, and portable at the implementation phase. Based on the views of other critics, Tam (2014) highlighted some limitations of outcome-based approaches in education. Those limitations are specificity, narrowness, quantifiability, and observability, which lead to reductionism and the neglect of integrative assessment of the educational program.

THE KIRKPATRICK EVALUATION MODEL
The Kirkpatrick evaluation model is a valuable tool for evaluating educational programs and professional training programs, and it has been used by several organizations and institutions to evaluate their progress in the US and throughout the world (Smidt et al., 2009; Praslova, 2010; Gill & Sharma, 2013). The model was initially developed by Kirkpatrick in 1959 for evaluating training programs; it was later modified and made applicable for assessing the effectiveness of programs ranging from health care, medical education, and organizational performance to higher education (Praslova, 2010; Liao & Hsu, 2019). Kirkpatrick & Kirkpatrick (2006) justified the need for program evaluation to produce better program outcomes and enhance effectiveness, but they also emphasized considering the needs, objectives, subject contents, participants, appropriate facilities, and several other factors before planning, designing, and executing a particular program. They explained the Kirkpatrick evaluation model on the basis of its four interdependent levels:
1. Reaction: how the participants react, i.e. whether they are satisfied or not, and how they feel about the program.
2. Learning: whether the program is effective in enhancing the learning capabilities of the participants and increasing their knowledge and skills.
3. Behaviour: to what extent the program is effective in changing the attitude and behaviour of the participants.
, and e-learning (Galloway, 2005) has been well established.

Strengths and limitations
Bates (2004) Kaufman and Keller (1994), the Kirkpatrick model was developed with the intention of evaluating training programs, and it is still mainly used for that purpose. Organizations and institutions need to evaluate not only training but also other components of their programs, which requires a more comprehensive model that covers all the elements of a program. In their review, Reio et al. (2017) outlined some limitations of the model. They asserted that in the Kirkpatrick model more weight is given to the upper levels (behaviour and outcomes) than to the lower levels (reaction and learning); therefore, most organizations and professionals tend to neglect the lower levels.
The interdependency of the four levels is another drawback, because the execution of one level does not necessarily lead to better outcomes at the next level. Similarly, the difficulty of evaluating levels 3 and 4 is considered by some researchers to be a drawback of the model (Moreau, 2017). Cahapay (2021) suggested that, when applying the Kirkpatrick model in higher education, evaluators must consider its limitations, such as the treatment of lower levels as less important, rigidity, and the assumed causal linkage of the four levels.

STUFFLEBEAM'S CIPP EVALUATION MODEL
The CIPP evaluation model was designed by Stufflebeam in 1971 and comprises four quality indicators, i.e. Context, Input, Process, and Product (Aziz et al., 2018). The model has been successfully used to monitor and improve the quality of projects, evaluation systems, institutions, and educational programs throughout the world. According to Stufflebeam (2000), the CIPP model uses these four key indicators (context, input, process, and product) to evaluate a program (Figure 2). He illustrated that context evaluation covers the needs, complications, and prospects of the program. In educational programs, context may involve the identification of resources, goals, policies, and potential problems, as well as the beneficiaries of the programs. The overall mission and goals, background information, and cultural context of the educational program are also covered in context evaluation (https://poorvucenter.yale.edu/). Context evaluation concerns current information about the prospective functioning of the program, and evaluators use different techniques such as surveys, interviews, and document review during this stage (Brewer, 2011). Stufflebeam & Zhang (2017) suggested that before designing a program, stakeholders should address the needs, goals, priorities, anticipated problems, potential risks, and opportunities. Context evaluation provides information to decision-makers in the form of evaluation reports, which help them plan and set goals appropriately (Stufflebeam & Zhang, 2017).
Input evaluation assesses the strategies, financial and service resources, mechanisms and designs of the program's functioning, action plans, cost-effectiveness, and arrangements (Stufflebeam & Zhang, 2017). In the case of educational programs, the input may be financial resources, human resources, infrastructural resources, and a documented work plan. In this stage, the availability of financial and service resources, infrastructure and environment, and feasibility are judged, and it is reported whether they are appropriate for executing the program. Process evaluation is an essential phase of the CIPP model, which focuses on the progress of the program. It involves monitoring, assessing, and documenting the implementation of the designed plan in light of the set goals (Stufflebeam & Zhang, 2017). The final component of the CIPP model is product evaluation, which examines whether the outcomes of the program cohere with its objectives. Stufflebeam & Zhang (2017) illustrated that product evaluation measures the outcomes and cost-effectiveness of the program. In an educational program, product evaluation focuses on the final results, the appropriateness of the objectives in relation to the outcomes, and the overall cost-effectiveness of the program.

WHICH PROGRAM EVALUATION MODEL SHOULD BE EMPLOYED IN EDUCATION?
Like other programs and projects, educational programs must be evaluated to achieve high standards and better outcomes and to meet their objectives. The evaluation can be done before a particular educational program is designed or while the program is already being executed. The purpose of the evaluation is to make an educational program effective in all respects, both before and after its initiation. The three models discussed in this review have some similarities, differences, strengths, and weaknesses (Table 2). Educational stakeholders should consider several aspects of both the evaluation models and the educational programs to be evaluated, since both have diverse characteristics. When selecting a suitable evaluation model, stakeholders should consider its feasibility, ease of application, cost-effectiveness, objectivity, comprehensiveness, and time consumption. The outcome-based evaluation model focuses on the results and effectiveness of the program and the benefits the clients draw from it, and it has been applied in specific educational programs, health systems, and organizational evaluation (Brewer, 2011). However, the model is, in general, specific, narrow, quantifiable, and observable in nature (Tam, 2014) and thus has limited applicability across a diverse range of educational programs. The Kirkpatrick model is more comprehensive and broader than OBE because it focuses on reaction, learning, behaviour, and outcomes, and it tends to simplify the complex process; however, limitations such as its focus on training, its weighting of higher levels over lower levels, and the difficulty of evaluating levels 3 and 4 make it less appropriate for evaluating educational programs (Moreau, 2017). The CIPP model seems to possess a balanced approach towards evaluation despite its few limitations. First, the model offers a comprehensive, feasible, and straightforward framework, which makes it a more suitable tool for evaluating diverse educational programs. Second, it takes into account pre-execution, in-progress, and post-execution approaches for evaluating a program. Moreover, based on specific needs, either a single component or all components may be employed while evaluating an educational program.
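The selection criteria listed above (feasibility, ease of application, cost-effectiveness, objectivity, comprehensiveness, and time consumption) could, as one possible aid, be organized into a simple weighted decision matrix. The Python sketch below is purely illustrative and is not a procedure taken from the reviewed literature: the weights, the 1 to 5 rating scale, the function names, and the candidate labels "model_a" and "model_b" are all hypothetical placeholders that stakeholders would replace with their own judgments.

```python
from typing import Dict, List, Tuple

# Criteria named in the text. The weights are hypothetical placeholders,
# not values reported in any of the reviewed studies.
CRITERIA_WEIGHTS: Dict[str, float] = {
    "feasibility": 0.20,
    "ease_of_application": 0.15,
    "cost_effectiveness": 0.20,
    "objectivity": 0.15,
    "comprehensiveness": 0.20,
    "time_efficiency": 0.10,  # higher rating = less time consumed
}


def weighted_score(ratings: Dict[str, float],
                   weights: Dict[str, float] = CRITERIA_WEIGHTS) -> float:
    """Return the weighted mean of the ratings (assumed 1-5 scale)."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def rank_models(candidates: Dict[str, Dict[str, float]],
                weights: Dict[str, float] = CRITERIA_WEIGHTS
                ) -> List[Tuple[str, float]]:
    """Sort candidate models from highest to lowest weighted score."""
    scores = {name: weighted_score(r, weights) for name, r in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical ratings supplied by stakeholders for two unnamed candidates.
    candidates = {
        "model_a": {"feasibility": 4, "ease_of_application": 3,
                    "cost_effectiveness": 4, "objectivity": 3,
                    "comprehensiveness": 5, "time_efficiency": 2},
        "model_b": {"feasibility": 3, "ease_of_application": 4,
                    "cost_effectiveness": 3, "objectivity": 4,
                    "comprehensiveness": 3, "time_efficiency": 4},
    }
    for name, score in rank_models(candidates):
        print(f"{name}: {score:.2f}")
```

Such a matrix only makes the stakeholders' own priorities explicit; it does not replace the qualitative comparison of the models, and different weights may lead to a different ranking.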

CONCLUSION
Program evaluation is a necessary process for designing, executing, and improving the progress of a program. During the last few decades, different evaluation models have been developed and extensively used for evaluating different projects and programs. Educational stakeholders have been applying many models to evaluate educational programs for improvement. The three models discussed in this review (the outcome-based evaluation model, the Kirkpatrick model, and the CIPP model) have some strengths and weaknesses. Among the compared models, the CIPP model seems