Peer Review of the Evaluation Framework and Methodology Report for Early Learning and Child Care Programs 2004


Arnold J. Love, Ph.D.
August 11, 2004

Background and Introduction

In March 2003, the Federal, Provincial and Territorial (F/P/T) Ministers responsible for Social Services agreed on a Multilateral Framework on Early Learning and Child Care (ELCC) to promote early childhood development and to support the participation of parents in employment or training by improving access to affordable, quality early learning and child care programs and services. An F/P/T Subcommittee on ELCC Evaluation was formed to manage the task of producing an Evaluation Framework to guide jurisdictions undertaking evaluations of early learning and child care programs and services.

The Evaluation Framework is a joint effort by the F/P/T Subcommittee on ELCC Evaluation, a working group composed of representatives from seven provincial governments, representatives from the Social Policy Directorate and the Audit and Evaluation Directorate of Social Development Canada, and evaluation specialists. The Evaluation Framework is not intended to be a unified evaluation plan to be applied across Canada. Rather, it provides suggested approaches and ideas that jurisdictions can use when undertaking evaluations of ELCC programs and services. Consequently, the Evaluation Framework should be flexible enough to accommodate differences among jurisdictions, yet rigorous enough to provide credible evaluation data about ELCC programs and services.

This Peer Review Report is part of the Evaluation Framework development process. The ELCC Evaluation Working Group commissioned the Peer Review to provide an objective, expert assessment of the quality, integrity and reliability of the draft Evaluation Framework and Methodology Report, and to identify any improvements that would enhance the usefulness of the document. The Peer Review will assist the ELCC Evaluation Working Group in determining whether the Evaluation Framework is broad enough in scope to be inclusive of ELCC programs and services in Canada, and will provide advice about the logic models, evaluation issues and questions, and measures.

Question 1. To what extent did the Evaluation Framework and Methodology Report meet its objectives?

In my opinion, the Draft Evaluation Framework met the overall objective of suggesting approaches and ideas that can be used by jurisdictions when undertaking evaluations of early learning and child care programs and services. The following table presents my assessment of the extent to which the Draft Evaluation Framework met the component objectives.

Component Objectives (each assessed as Met Objective Fully, Met Objective Partially, or Did Not Meet Objective At All):

- Drafting an overarching ELCC logic model
- Identifying groups that benefit from ELCC programs and activities
- Demonstrating the applicability of the overarching logic model through actual ELCC examples
- Identifying sample evaluation questions
- Reviewing potential measures and current sources of data that address key evaluation questions
- Providing a reference tool for the range of possible types of evaluations and associated methodologies
- Providing a literature review of recent ELCC evaluations in Canada and other jurisdictions

The Draft Evaluation Framework did draft an overarching ELCC logic model, but in my opinion it raised several questions that merit discussion with the evaluation subcommittee, such as:

- Is the overarching logic model so vague that it does not accurately portray the program theory implied in the ELCC Framework Agreement?
- Is the overarching logic model sufficient for complex partnership programs and for programs that are catalysts for change?
- Given the importance of outputs in the ELCC Framework Agreement, should they not be made explicit in the overarching ELCC logic model?
- Given the ELCC Framework Agreement and the findings of the literature review, should systems-level outcomes not logically precede client outcomes?

With respect to the cluster and individual application logic models, it would be useful to provide separate examples at the policy and program levels. The examples provided should be carefully re-examined to clarify their theory of change, and further details that document the cause-and-effect relationships should be added to the models. To illustrate, in the Accreditation example, the overall program activity (the Accreditation Agency confers accredited status) requires expansion to identify the theoretically potent activities that lead logically to higher quality ELCC programs and services, such as the development of widely recognized accreditation standards by the field itself; the formation of an accreditation team by the program; a self-study conducted by the program's Board and staff; consumer and stakeholder surveys conducted by the program; and the correction of deficiencies identified during the self-study and survey process.

Although the Draft Evaluation Framework does identify a sample of evaluation questions, sources of data, and data collection tools, in my opinion some changes are necessary for readers to derive maximum value from them. These changes include: making a clear distinction between policy-level and program-level questions, data sources, and data gathering methods; describing the industry-standard process for planning evaluations, so that readers receive guidance on tailoring the sample evaluation questions to their own unique information needs and local situations; describing proven tools for matching questions to methods, such as instructions for developing their own evaluation frameworks; and providing clear recommendations about the preferred evaluation measures and tools for answering specific types of questions. More details may be found in my responses to Questions 2 through 6.

In my view, the Draft Evaluation Framework does present a valuable reference tool, and it has collected and reviewed a broad range of recent ELCC policy and program evaluation literature. As a reference tool, however, it would benefit from summaries and clear recommendations based on the research and practice literature, succinct descriptions of the psychometric properties of major outcome measurement tools (e.g., the Early Development Instrument) and/or data collection methods (e.g., key informant surveys), and more explanation of methods for monitoring programs, outputs, and short-term outcomes. More details may be found in my responses to Questions 2 through 6.

Question 2. Is the Evaluation Framework broad enough in scope to capture early learning and child care programs and services in Canada? If not, please explain why.

In my view, the adequacy of the breadth of the Evaluation Framework should be considered in light of the purposes that evaluations of early learning and child care programs will serve for the F/P/T governments. These purposes are summarized in Section I, page 2:

- Assessing the progress of programs and services in achieving expected outcomes for young children and their families
- Fostering continuous improvement of early learning and child care programs and services
- Promoting best practices
- Making evidence-based decisions

The Draft Evaluation Framework includes a varied mix of evaluation approaches and methodologies that address evaluation questions at both the policy and program levels. From the vantage point of a reader, however, the Draft Evaluation Framework presents an overwhelming amount of information, largely descriptive, that requires further analysis and synthesis by the Framework authors to identify the methods best suited (a) to meet each of the purposes for the evaluations noted above and (b) to meet the unique information needs of policy makers and those responsible for ELCC programs (e.g., government program supervisors, community planners, program managers and staff). This analysis and synthesis should also provide summary comments about the comparative strengths and weaknesses of evaluation approaches and methods for addressing these evaluation purposes across the range of early learning and child care programs in Canada. For example, the Evaluation Framework should make clear that measuring outcomes using standardized tests of child development may not be appropriate for some ELCC programs and/or communities, and that such tests must be pre-tested for factors such as cultural relevance, community acceptance, and the supply of trained observers before being implemented by a provincial or territorial government.

Question 3. Is the framework comprehensive in coverage of evaluation issues and questions? If there are omissions, please identify and elaborate.

The Draft Evaluation Framework offers an extensive Evaluation Matrix that includes examples of evaluation questions, examples of measures, and examples of data sources and collection strategies.

Regarding the evaluation issues, although the four issues identified in the Draft Evaluation Framework (program rationale/continued relevance, program design and delivery, program success, and cost-effectiveness) help impart order by categorizing a vast array of evaluation issues, in my opinion they do not adequately reflect the contemporary use of monitoring and evaluation to improve performance and achieve results. The current focus of evaluation rests more on identifying and managing the contributions of various factors to achieving the intended outcomes. Policy makers, government program supervisors, and managers of programs are being asked to actively apply the information gained through evaluations to support informed decision making, enhance organizational learning and development, ensure accountability and the achievement of strategic outcomes, and build organizational and community capacity. These concepts are not new to the ELCC field in Canada, which has long advocated a holistic approach to policies and programs that places children, families and communities at the core, and continuous improvement in the quality of ELCC programs to ensure the attainment of important outcomes for children and families while building stronger and healthier communities.

Because the Draft Evaluation Framework makes extensive use of the four issues throughout the document and to organize and analyze the evaluation literature, perhaps a practical approach would be to link these four issues together holistically in a continuous process, rather than altering or abandoning them. For example: learning from monitoring and evaluation contributes to better decision making about program design and to the continuous improvement of service delivery; better decision making and effective programs lead to the attainment of intended outcomes and greater accountability to stakeholders; and better decision making and close partnering with stakeholders promote shared knowledge about good/best practices, the transfer of skills, the assessment of cost-effective alternatives and continued relevance, and the development of the ELCC sector's capacity for planning, monitoring, and evaluation. In short, my suggestion is to include another section immediately before Section III, 8 (p. 27) that clearly situates the four evaluation issues noted in the Draft Evaluation Framework within current approaches to monitoring and evaluation in the context of results-based management, organizational learning, and capacity building.

It is also my opinion that the terms formative and summative evaluation are less useful today, given the current focus on results, the use of ongoing evaluation for the active management of activities and outputs to achieve outcomes, and the emphasis on evaluation knowledge creation and mobilization. The Draft Evaluation Framework identifies a key point in the first sentence of Section III, 9 (p. 27): in addition to assessing the evaluation issues, consideration must be given to the program's phase of maturity. The terms formative and summative tend to break program evaluation into two phases, whereas more current approaches to results-based management see evaluation as being useful throughout the entire program development cycle: from initial needs assessment, program design, and early program implementation, through ongoing monitoring and improvement of service delivery practices to achieve short-term outcomes and cost-effective implementation, to the assessment of intermediate outcomes and revisions to ensure continued relevance and the attainment of strategic policy and program outcomes. Whereas formative and summative evaluation often suggest that outcome evaluation occurs 3-5 years after start-up and recurs on a cyclical basis every 5-7 years, under the more current approach to evaluation these timeframes become shorter and more results-oriented. Annual reports and annual reviews that emphasize the evaluation of progress towards results (outputs and outcomes, including outcomes that are difficult to measure), with strong stakeholder participation, and that generate lessons learned are supplemented by a limited number of in-depth evaluations based on policy relevance or accountability requirements.

These distinctions are important because adopting a results-based approach to monitoring and evaluation requires a change in roles, tools, and evaluation processes. For example, F/P/T governments will need flexible monitoring tools (program profiles, evaluation plans, program progress sheets, annual reports, stakeholder meetings) to help program supervisors and policy analysts better understand which program components and strategies are contributing to the desired client, system, and strategic outcomes.

Before presenting the Evaluation Matrix (Section IV, p. 29), I think it would be helpful to provide a summary that gives readers guidance about measuring individual child, parent/family, organizational, and community outcomes. This guidance should link to current best practices for assessing ELCC policies and programs. For example, child outcome evaluation traditionally focused on the child alone, whereas current best practice in the ELCC field assesses child development and functioning in relation to the family, caregivers, and the larger ecosystem (neighbourhood, community, society). As another example, considerable research shows that effective mother-child interactions and strong mother-child attachment are linked to positive child outcomes, and that the mother-child relationship mediates risk. These orienting comments should also identify factors that could influence the measurement of outputs and outcomes, such as cultural bias, geographic differences, and child/family/community factors (e.g., poverty, oppression, homelessness).

Regarding the evaluation questions, although the examples offered in the Evaluation Matrix are helpful, in my opinion the ELCC Evaluation Framework should outline the steps needed to identify the specific evaluation questions that are unique to each evaluation. In other words, the ELCC Evaluation Framework should briefly describe the standard practices used by evaluators for planning an evaluation: a process that includes systematic ways of working with stakeholders to identify key evaluation questions and set question priorities, and then of creating a simple evaluation framework that links the specific evaluation questions to sources of information, specific data gathering methods, an evaluation work plan, and a dissemination plan. To my mind, by adding this material the ELCC Evaluation Framework would provide readers with a clear and coherent overview of the evaluation process, a useful tool employed by professional evaluators (the evaluation framework), and directions showing readers how to use the examples provided in the Evaluation Matrix to form specific evaluation questions, potential sources of information, and data collection tools tailored to their own unique evaluation needs. It would also help to build readers' evaluation capacity by sharing with them the basic tools used by professional evaluators to plan and manage their evaluations.

Question 4. Are the examples of data sources and collection strategies appropriate? Are there others that should be considered?

My major concerns have to do with the measures, data sources, and data collection strategies taken together.

My first concern is with the overall design of the Evaluation Matrix. The Evaluation Matrix includes a variety of measures and data collection sources/strategies that address evaluation questions at both the policy and program levels. In my experience, presenting the matrix in this manner can be very confusing, especially for readers at the program level. For example, the use of primary and secondary sources of data for measuring improvements in children's social, intellectual, emotional, and physical development can be quite different for a policy analyst who wants to evaluate aggregate changes across a province or territory versus a program manager who wants to assess progress on individual child outcomes for children in an innovative parent-child drop-in program. The data sources and data collection strategies useful for policy analysis should be presented separately from those suitable for program-level evaluation. Presenting the full range of measures, data sources, and data collection options in the same table can be intimidating for those who are not sophisticated regarding evaluation methodology.

In my opinion, useful evaluation requires selecting the appropriate measures, data sources, and data collection tools to match the evaluation questions, stakeholder information needs, and demands of the situation. This requires a clear explication of the process used by evaluators for making appropriate matches between questions, measures, and methods (see my comments in response to Question 3, above). In my view, the Evaluation Matrix requires introductory and summary commentaries by the authors of the ELCC Evaluation Framework to identify the following for each issue or major category of evaluation question:

- The most commonly used measures, data sources, and data collection methods for policy- and program-level evaluations
- The strengths and weaknesses of the most commonly used approaches
- The preferred approaches in the Canadian context, if resources and circumstances permit

Without such commentary, I feel there is a danger that the Evaluation Matrix might, for example, lead government policy analysts into thinking that sophisticated evaluation methods should be employed simply because those methods are presented in the matrix, even though the literature review makes clear that descriptive evaluation studies, using available program records and surveys, remain the workhorse data sources and tools in most jurisdictions. It seems to me that the authors of the Evaluation Framework are in the best position to analyze the lessons learned about data sources and collection strategies, assess their comparative merits within the Canadian context and against the evaluation needs of the Multilateral Framework on Early Learning and Child Care, and provide commentary about the preferred tools that may be used flexibly by F/P/T governments at the policy and program levels to achieve the intended outcomes in their respective jurisdictions.

With respect to the data sources and data collection strategies, there is a need for an Appendix that briefly describes each data source or data collection strategy (e.g., needs assessment, program utilization studies), perhaps limited to one page per method, because program evaluators use many technical terms with specific meanings that might not be understood by all readers.

As presented in the Evaluation Matrix, there are redundancies that can be eliminated or reduced. For example, in Section IV, 10 (p. 29), the use of documents could be mentioned only once, with specific instances given as examples in parentheses (e.g., parental priorities, policy documents). Although the redundancies increase length, they can also help maintain clarity, so any reduction should be considered carefully.

Question 5. Is the evaluation methodology adequate for addressing the suggested evaluation issues and questions? If not, please identify areas of improvement and ways to achieve them.

The Draft Evaluation Framework provides a wide variety of methods for addressing the suggested evaluation issues and questions. Given space constraints, my comments about evaluation methodology will focus on several specific areas only.

My response to Question 3 (above) noted that the ELCC Evaluation Framework should briefly describe the standard practices used by evaluators for planning an evaluation: a process that includes systematic ways of working with stakeholders to identify key evaluation questions and set question priorities, and then of creating an evaluation framework that links the specific evaluation questions to sources of information, specific data gathering methods, an evaluation work plan, and a dissemination plan. By following this process for planning an evaluation, readers should be better able to use the information about evaluation methodology and make choices appropriate to their unique situations.

In current approaches to results-based management and evaluation, program supervisors and program managers assume greater responsibility for managing their policy portfolios or individual programs and for being accountable to their key stakeholders for results. Program templates are one of the key evaluation tools used to support this process. A program template briefly summarizes the key aspects of a program in a format that is easily understood by program managers, staff, and evaluators. Program templates systematically describe key program processes, resources, and structures in a way that facilitates program implementation, program monitoring, and the achievement of program outcomes. They are similar to the program profiles recommended by Treasury Board and developed during the Results-based Management and Accountability Framework (RMAF) process. The inclusion of program templates as a core evaluation methodology in the ELCC Evaluation Framework would greatly facilitate the design and implementation of effective ELCC programs and would facilitate annual reporting by Ministers on key performance measures and outcomes.

In my opinion, the monitoring of program delivery requires greater explanation in Section V, especially given the intention of Ministers to report annually on key performance measures related to expenditures, availability and accessibility, affordability, quality, inclusiveness, and parental choice, as stated in the Multilateral Framework on Early Learning and Child Care. Monitoring is an essential methodology within a results-based framework: a continuous process involving partners and focused on measuring key service delivery processes and progress towards outcomes. Monitoring regularly assesses the program logic and results chain to determine what is working well and what is not, makes recommendations, and follows up with documented decisions and actions. Evaluations of outcomes and continued relevance enhance monitoring by assessing the links between program activities and processes, outputs, and outcomes based on empirical evidence. The findings of both monitoring and program evaluation can be important components of an ELCC Knowledge Base that identifies good practices and summarizes lessons learned to enhance organizational learning and foster capacity building in the sector.

Question 6. What are the strengths and weaknesses of the draft framework? Are there areas of the draft framework that need to be improved? If so, in what ways?

In my view, the Draft Evaluation Framework has numerous strengths. The literature review is comprehensive and provides a very valuable summary of the published ELCC evaluation literature from Canada and other jurisdictions. To my knowledge, this information does not exist in similar form elsewhere. Together with the extensive tables that analyze these evaluations, readers can obtain an excellent overview of the range of evaluation topics addressed and methodologies employed in evaluations of ELCC policies and programs. The Draft Evaluation Framework makes clear which groups benefit from ELCC programs and activities and identifies the major types of outcomes associated with each group. This type of information can greatly facilitate deliberations about evaluation designs and help clarify the attribution of results in situations involving multiple stakeholders.

In my opinion, the greatest strength of the Draft Evaluation Framework also contributes to its weaknesses: its comprehensive approach to developing a flexible framework that may be applied by F/P/T governments to a vast array of policies and programs. The weaknesses include the overwhelming amount of information and array of choices; the lack of a clear description of a systematic process that policy makers and program managers could use to involve stakeholders and tailor appropriate evaluation questions and methods from those suggested in the Evaluation Matrix; and insufficient detail about the selection and use of specific data collection strategies and measurement tools; that is, the weaknesses associated with being a kilometre wide and a millimetre deep. Please do not misconstrue this as harsh criticism. The Draft Evaluation Framework is without a doubt an impressive piece of work, but I nonetheless think that, without some additional refinements, it is likely to present daunting obstacles, especially to those fairly new to performance measurement and outcomes evaluation.

The Draft Evaluation Framework makes a strong effort at the very difficult task of creating an overarching logic model, sample logic models for specific programs, and individual cluster logic models linking groups of common programs and services by client group. In my opinion, however, the logic model should conform to the evaluation industry's standard practice of including a box for resources (inputs), because this will help expenditure reporting, facilitate cost-effectiveness evaluation, and aid in the fine-tuning of program designs and implementation strategies.

Another important aspect of building logic models is the practice of including a thorough description of the contextual and environmental factors, together with a clearly defined statement of the problem being addressed by the policy or program. Contextual factors are also called enablers and constraints, and they are important because they might influence the performance of the program. For example, an enabler might be higher socioeconomic status (SES) among ELCC program clients, while constraints might include environmental factors, such as increased economic pressures, or program context factors, such as an influx of refugee families or a factory closure in the community. These enablers and constraints can affect program outputs and outcomes dramatically, and they need to be assessed and recorded during the development of the logic model, as well as periodically afterwards. Because these factors can influence evaluation results, their influence should be assessed carefully when designing monitoring and evaluation strategies.

To improve its usefulness for program evaluation, in my opinion, the Draft Evaluation Framework should provide a summary of the major standardized data collection tools that the Evaluation Framework authors recommend for evaluating child, family, community, systems, or strategic outcomes and related factors. As an example, the authors might recommend the Early Development Instrument for assessing child development and the Early Childhood Environment Rating Scale for measuring the quality of child care environments. It would be very helpful to have a summary of the psychometric properties of these instruments, including the domains each instrument assesses, appropriate populations (e.g., toddlers, preschoolers, Aboriginal children), methods of administration (e.g., direct observation or self-report by parent/caregiver), types of reports generated, and the costs and qualifications necessary for use or to receive reports.