Empirical Education Inc.

Report Released on the Effectiveness of SRI/CAST's Enhanced Units

Summary of Findings

Empirical Education has released the results of a semester-long randomized experiment on the effectiveness of SRI/CAST’s Enhanced Units (EU). This study was conducted in cooperation with one district in California, and with two districts in Virginia, and was funded through a competitive Investing in Innovation (i3) grant from the U.S. Department of Education. EU combines research-based content enhancement routines, collaboration strategies and technology components for secondary history and biology classes. The goal of the grant is to improve student content learning and higher order reasoning, especially for students with disabilities. EU was developed during a two-year design-based implementation process with teachers and administrators co-designing the units with developers.

The evaluation employed a group randomized control trial in which classes were randomly assigned within teachers to receive the EU curriculum, or continue with business-as-usual. All teachers were trained in Enhanced Units. Overall, the study involved three districts, five schools, 13 teachers, 14 randomized blocks, and 30 classes (15 in each condition, with 18 in biology and 12 in U.S. History). This was an intent-to-treat design, with impact estimates generated by comparing average student outcomes for classes randomly assigned to the EU group with average student outcomes for classes assigned to control group status, regardless of the level of participation in or teacher implementation of EU instructional approaches after random assignment.
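To make the intent-to-treat logic concrete, here is a minimal sketch, not the model used in the report. It assumes a simplified estimator (the average within-block difference in class means) and uses invented records; the names `classes` and `itt_estimate` are hypothetical. The point it illustrates is that classes are grouped by the condition they were assigned to, with no adjustment for how fully EU was implemented.

```python
from statistics import mean

# Hypothetical class-level records: (block, assigned_condition, class_mean_score).
# The numbers are invented for illustration and are not study data.
classes = [
    ("teacher_A", "EU", 72.0),
    ("teacher_A", "control", 68.0),
    ("teacher_B", "EU", 65.0),
    ("teacher_B", "control", 66.0),
    ("teacher_C", "EU", 80.0),
    ("teacher_C", "control", 75.0),
]


def itt_estimate(records):
    """Average within-block difference (EU minus control) by ASSIGNED condition.

    Implementation level is deliberately ignored: a class assigned to EU
    counts as EU even if the teacher used little of the curriculum.
    """
    by_block = {}
    for block, condition, score in records:
        by_block.setdefault(block, {}).setdefault(condition, []).append(score)
    # Only blocks that contain both conditions contribute a difference.
    diffs = [
        mean(groups["EU"]) - mean(groups["control"])
        for groups in by_block.values()
        if "EU" in groups and "control" in groups
    ]
    return mean(diffs)


estimate = itt_estimate(classes)
```

A full analysis would also account for student-level variation and report uncertainty around the estimate; this sketch only shows which groups are being compared.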

Overall, we found a positive impact of EU on student learning in history, but not in biology or across the two domains combined. Within biology, we found that students experienced greater impact on the Evolution unit than the Ecology unit. These findings support a theory developed by the program developers that EU works especially well with content that progresses in a sequential and linear way. We also found a positive differential effect favoring students with disabilities, which is an encouraging result given the goal of the grant.

Final Report of CAST Enhanced Units Findings

The full report for this study can be downloaded using the link below.

Enhanced Units final report

Dissemination of Findings

2023 Dissemination

In April 2023, the U.S. Department of Education’s Office of Innovation and Early Learning Programs (IELP) within the Office of Elementary and Secondary Education (OESE) compiled cross-project summaries of completed Investing in Innovation (i3) and Education Innovation and Research (EIR) projects. Our CAST Enhanced Units study is included in one of the cross-project summaries. Read the 16-page summary using the link below.

Findings from Projects with a Focus on Serving Students with Disabilities

2020 Dissemination

Hannah D’Apice presented these findings at the 2020 virtual conference for the Society for Research on Educational Effectiveness (SREE) in September 2020. Watch the recorded presentation using the link below.

Symposium Session 9A. Unpacking the Logic Model: A Discussion of Mediators and Antecedents of Educational Outcomes from the Investing in Innovation (i3) Program

2019-12-26

Posted by: Hannah D'Apice

Tags: California, CAST, effectiveness, Enhanced Units, randomization, RCT, research, stem and Virginia

New Multi-State RCT with Imagine Learning

Empirical Education is excited to announce a new study on the effectiveness of Imagine Math, an online supplemental math program that helps students build conceptual understanding, problem-solving skills, and a resilient attitude toward math. The program provides adaptive instruction so that students can work at their own pace and offers live support from certified math teachers as students work through the content. Imagine Math also includes diagnostic benchmarks that allow educators to track progress at the student, class, school, and district level.

The research questions to be answered by this study are:

What is the impact of Imagine Math on student achievement in mathematics in grades 6–8?
Is the impact of Imagine Math different for students with diverse characteristics, such as those starting with weak or strong content-area skills?
Are differences in the extent of use of Imagine Math, such as the number of lessons completed, associated with differences in student outcomes?

The new study will use a randomized control trial (RCT) or randomized experiment in which two equivalent groups of students are formed through random assignment. The experiment will specifically use a within-teacher RCT design, with randomization taking place at the classroom level for eligible math classes in grades 6–8.

Eligible classes will be randomly assigned to either use or not use Imagine Math during the school year, with academic achievement compared at the end of the year to determine the impact of the program on mathematics achievement in grades 6–8. In addition, Empirical Education will make use of Imagine Math’s usage data for potential analysis of the program’s impact on different subgroups of users.
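As an illustration of what within-teacher random assignment at the classroom level can look like, here is a minimal sketch. It is not the study's actual procedure. The function name `assign_within_teacher`, the roster, and the rule for handling an odd number of classes are all assumptions made for the example.

```python
import random


def assign_within_teacher(classes_by_teacher, seed=0):
    """Randomly split each teacher's eligible classes between two conditions.

    Each teacher acts as a block, so every teacher contributes classes to
    both conditions whenever they teach at least two eligible classes.
    """
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    assignment = {}
    for teacher, class_ids in sorted(classes_by_teacher.items()):
        shuffled = list(class_ids)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        # Assumed rule: with an odd number of classes, a coin flip decides
        # which condition receives the extra class.
        if len(shuffled) % 2 == 1 and rng.random() < 0.5:
            half += 1
        for i, class_id in enumerate(shuffled):
            assignment[class_id] = "Imagine Math" if i < half else "control"
    return assignment


# Hypothetical roster of eligible grade 6-8 math classes.
roster = {
    "teacher_1": ["6A", "6B"],
    "teacher_2": ["7A", "7B", "7C", "7D"],
    "teacher_3": ["8A", "8B", "8C"],
}
assignment = assign_within_teacher(roster, seed=42)
```

Blocking on the teacher means that differences between teachers cannot be confused with the effect of the program, since each teacher appears in both groups.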

This is Empirical Education’s first project with Imagine Learning, and it builds on our extensive experience conducting large-scale, rigorous, experimental impact studies. The study is commissioned by Imagine Learning and will take place in multiple school districts and states across the country, including Hawaii, Alabama, Alaska, and Delaware.

2018-08-03

Posted by: Hannah D'Apice

Tags: achievement, imagine math, mathematics, randomization, RCT, research, students and usage data

AERA 2018 Recap: The Possibilities and Necessity of a Rigorous Education Research Community

This year’s AERA annual meeting on “The Dreams, Possibilities, and Necessity of Public Education,” was fittingly held in the city with the largest number of public school students in the country—New York. Against this radically diverse backdrop, presenters were encouraged to diversify both the format and topics of presentations in order to inspire thinking and “confront the struggles for public education.”

AERA’s sheer size may risk overwhelming its attendees, but in other ways, it came as a relief. At a time when educators and education remain under-resourced, it was heartening to be reminded that a large, vibrant community of dedicated and intelligent people exists to improve educational opportunities for all students.

One theme that particularly stood out is that researchers are finding increasingly creative ways to use existing usage data from education technology products to measure impact and implementation. This is a good thing when it comes to reducing the cost of research and making it more accessible to smaller businesses and nonprofits. For example, in a presentation on a software-based knowledge competition for nursing students, researchers used usage data to identify components of player styles and determine whether these styles had a significant effect on student performance. In our Edtech Research Guidelines, Empirical similarly recommends that edtech companies take advantage of their existing usage data to run impact and implementation analyses, without using more expensive data collection methods. This can help significantly reduce the cost of research studies—rather than one study that costs $3 million, companies can consider multiple lower-cost studies that leverage usage data and give the company a picture of how the product performs in a greater diversity of contexts.

Empirical staff themselves presented on a variety of topics, including quasi-experiments on edtech products; teacher recruitment, evaluation, and retention; and long-term impact evaluations. In all cases, Empirical reinforced its commitment to innovative, low-cost, and rigorous research. You can read more about the research projects we presented in our previous AERA post.

photo of Denis Newman presenting at AERA 2018

Finally, Empirical was delighted to co-host the Division H AERA Reception at the Supernova bar at Novotel Hotel. If you ever wondered if Empirical knows how to throw a party, wonder no more! A few pictures from the event are below. View all of the pictures from our event on Facebook!

We had a great time and look forward to seeing everyone at the next AERA annual meeting!

2018-05-03

Posted by: Hannah D'Apice

Tags: AERA, conference, Division H, Efficacy, Program Evaluation, Quantitative and Randomization

Partnering with SRI and CAST on an RCT

Empirical Education and CAST are excited to announce a new partnership under an Investing in Innovation (i3) grant.

We’ll evaluate the Enhanced Units program, which was written as a development proposal by SRI and CAST. The project aims to integrate content enhancement routines with learning and collaboration strategies in order to improve student content learning, higher order reasoning, and collaboration.

We will conduct the experiment within up to three school districts in California and Virginia—working with teachers of high school science and social studies students. This is our first project with CAST, and it builds on our extensive experience conducting large-scale, rigorous, experimental impact studies, as well as formative and process evaluations.

For more information on our evaluation services and our work on i3 projects, please visit our i3/EIR page and/or contact us.

2017-07-27

Posted by: Robin Means

Tags: CAST, i3, randomization, RCT and SRI

Presenting at AERA 2017

We will again be presenting at the annual meeting of the American Educational Research Association (AERA). Join the Empirical Education team in San Antonio, TX from April 27 – 30, 2017.

Research Presentations will include the following.

Increasing Accessibility of Professional Development (PD): Evaluation of an Online PD for High School Science Teachers
Authors: Adam Schellinger, Andrew P Jaciw, Jenna Lynn Zacamy, Megan Toby, & Li Lin
In Event: Promoting and Measuring STEM Learning
Saturday, April 29 10:35am to 12:05pm
Henry B. Gonzalez Convention Center, River Level, Room 7C

Abstract: This study examines the impact of an online teacher professional development, focused on academic literacy in high school science classes. A one-year randomized control trial measured the impact of Internet-Based Reading Apprenticeship Improving Science Education (iRAISE) on instructional practices and student literacy achievement in 27 schools in Michigan and Pennsylvania. Researchers found a differential impact of iRAISE favoring students with lower incoming achievement (although there was no overall impact of iRAISE on student achievement). Additionally, there were positive impacts on several instructional practices. These findings are consistent with the specific goals of iRAISE: to provide high-quality, accessible online training that improves science teaching. Authors compare these results to previous evaluations of the same intervention delivered through a face-to-face format.

How Teacher Practices Illuminate Differences in Program Impact in Biology and Humanities Classrooms
Authors: Denis Newman, Val Lazarev, Andrew P Jaciw, & Li Lin
In Event: Poster Session 5 - Program Evaluation With a Purpose: Creating Equal Opportunities for Learning in Schools
Friday, April 28 12:25 to 1:55pm
Henry B. Gonzalez Convention Center, Street Level, Stars at Night Ballroom 4

Abstract: This paper reports research to explain the positive impact in a major RCT for students in the classrooms of a subgroup of teachers. Our goal was to understand why there was an impact for science teachers but not for teachers of humanities, i.e., history and English. We have labelled our analysis “moderated mediation” because we start with the finding that the program’s success was moderated by the subject taught by the teacher and then go on to look at the differences in mediation processes depending on the subject being taught. We find that program impacts on teacher practices differ by mediator (as measured in surveys and observations) and that mediators are differentially associated with student impact based on context.

Are Large-Scale Randomized Controlled Trials Useful for Understanding the Process of Scaling Up?
Authors: Denis Newman, Val Lazarev, Jenna Lynn Zacamy, & Li Lin
In Event: Poster Session 3 - Applied Research in School: Education Policy and School Context
Thursday, April 27 4:05 to 5:35pm
Henry B. Gonzalez Convention Center, Ballroom Level, Hemisfair Ballroom 2

Abstract: This paper reports a large-scale program evaluation that included an RCT and a parallel study of 167 schools outside the RCT, which provided an opportunity to study the growth of a program and to compare the two contexts. Teachers in both contexts were surveyed, and a large subset of the questions were asked of both scale-up teachers and teachers in the treatment schools of the RCT. We find large differences in the level of commitment to program success in the school. Far less was found in the RCT, suggesting that a large-scale RCT may not be capturing the processes at play in the scale-up of a program.

We look forward to seeing you at our sessions to discuss our research. You can also view our presentation schedule here.

2017-04-17

Posted by: Robin Means

Tags: AERA, conference, empirical education, evaluation, methodology, presentation, program evaluation, quantitative, quasi-experiment, randomization, RCT, research and scaling up

Empirical Education Presents Initial Results from i3 Validation Grant Evaluation

Our director of research operations, Jenna Zacamy, joined Cheri Fancsali from IMPAQ International and Cyndy Greenleaf from the Strategic Literacy Initiative (SLI) at WestEd at the Literacy Research Association (LRA) conference in Dallas, TX on December 4. Together, they conducted a symposium, which was the first formal presentation of findings from the Investing in Innovation (i3) Validation grant, Reading Apprenticeship Improving Secondary Education (RAISE). WestEd won the grant in 2010 with Empirical Education and IMPAQ serving as the evaluators.

There are two ongoing evaluations: the first includes a randomized control trial (RCT) of over 40 schools in California and Pennsylvania investigating the impact of Reading Apprenticeship on teacher instructional practices and student achievement; the second is a formative evaluation spanning four states and 150+ schools investigating how the school systems build capacity to implement and disseminate Reading Apprenticeship and sustain these efforts. The symposium’s discussant, P. David Pearson (UC Berkeley), praised the design and effort of both studies, stating that he has “never seen such thoughtful and extensive evaluations.”

Preliminary findings from the RCT show that Reading Apprenticeship teachers provide students more opportunities to practice metacognitive strategies and foster and support more student collaboration opportunities. Findings from the second year of the formative evaluation suggest high levels of buy-in and commitment from both school administrators and teachers, but also identify competing initiatives and priorities as a primary challenge to sustainability. Initial findings of our five-year, multi-state study of RAISE are promising, but reflect the real-world complexity of scaling up and evaluating literacy initiatives across several contexts. Final results from both studies will be available in 2015.

View the information presented at LRA here and here.

2013-12-19

Posted by: Robin Means

Tags: conference, evaluation, evaluator, formative, grant, i3, LRA, presentation, randomization, RCT, Reading Apprenticeship and validation

Empirical Presents about Aspire Public Schools’ t3 System at AEA 2013

Empirical Education presented at the annual conference of the American Evaluation Association (AEA) in Washington, DC. Our newest research manager, Kristen Koue, along with our chief scientist, Andrew Jaciw, reflected on striking the right balance between conducting a rigorous randomized control trial that meets i3 grant parameters and conducting an implementation evaluation that provides useful formative feedback to the Aspire population.

2013-10-15

Posted by: Robin Means

Tags: AEA, Aspire, conference, evaluation, formative, i3, implementation, presentation, randomization and RCT

Study Shows a “Singapore Math” Curriculum Can Improve Student Problem Solving Skills

A study of HMH Math in Focus (MIF) released today by research firm Empirical Education Inc. demonstrates a positive impact of the curriculum on Clark County School District elementary students’ math problem solving skills. The 2011-2012 study was contracted by the publisher, which left the design, conduct, and reporting to Empirical. MIF provides elementary math instruction based on the pedagogical approach used in Singapore. The MIF approach to instruction is designed to support conceptual understanding, and is said to be closely aligned with the Common Core State Standards (CCSS), which focus more on in-depth learning than previous math standards.

Empirical found an increase in math problem solving among students taught with HMH Math in Focus compared to their peers. The Clark County School District teachers also reported an increase in their students’ conceptual understanding, as well as an increase in student confidence and engagement while explaining and solving math problems. The study addressed the difference between the CCSS-oriented MIF and the existing Nevada math standards and content. While MIF students performed comparatively better on complex problem solving skills, researchers found that students in the MIF group performed no better than the students in the control group on the measure of math procedures and computation skills. There was also no significant difference between the groups on the state CRT assessment, which has not fully shifted over to the CCSS.

The research used a group randomized control trial to examine the performance of students in grades 3-5 during the 2011-2012 school year. Each grade-level team was randomly assigned to either the treatment group that used MIF or the control group that used the conventional math curriculum. Researchers used three different assessments to capture math achievement contrasting procedural and problem solving skills. Additionally, the research design employed teacher survey data to conduct mediator analyses (correlations between percentage of math standards covered and student math achievement) and assess fidelity of classroom implementation.

You can download the report and research summary from the study using the links below.
Math in Focus research report
Math in Focus research summary

2013-04-01

Posted by: Robin Means

Tags: CCSD, CCSS, classroom, education, elementary school, fidelity, implementation, math, Math in Focus, mediator, Nevada, problem solving, randomization, RCT, research and Singapore math

Can We Measure the Measures of Teaching Effectiveness?

Teacher evaluation has become the hot topic in education. State and local agencies are quickly implementing new programs spurred by federal initiatives and evidence that teacher effectiveness is a major contributor to student growth. The Chicago teachers’ strike brought out the deep divisions over the issue of evaluations. There, the focus was on the use of student achievement gains, or value-added. But the other side of evaluation—systematic classroom observations by administrators—is also raising interest. Teaching is a very complex skill, and the development of frameworks for describing and measuring its interlocking elements is an area of active and pressing research. The movement toward using observations as part of teacher evaluation is not without controversy. A recent OpEd in Education Week by Mike Schmoker criticizes the rapid implementation of what he considers overly complex evaluation templates “without any solid evidence that it promotes better teaching.”

There are researchers engaged in the careful study of evaluation systems, including the combination of value-added and observations. The Bill and Melinda Gates Foundation has funded a large team of researchers through its Measures of Effective Teaching (MET) project, which has already produced an array of reports for both academic and practitioner audiences (with more to come). But research can be ponderous, especially when the question is whether such systems can impact teacher effectiveness. A year ago, the Institute of Education Sciences (IES) awarded an $18 million contract to AIR to conduct a randomized experiment to measure the impact of a teacher and leader evaluation system on student achievement, classroom practices, and teacher and principal mobility. The experiment is scheduled to start this school year and results will likely start appearing by 2015. However, at the current rate of implementation by education agencies, most programs will be in full swing by then.

Empirical Education is currently involved in teacher evaluation through Observation Engine: our web-based tool that helps administrators make more reliable observations. See our story about our work with Tulsa Public Schools. This tool, along with our R&D on protocol validation, was initiated as part of the MET project. In our view, the complexity and time-consuming aspects of many of the observation systems that Schmoker criticizes arise from their intended use as supports for professional development. The initial motivation for developing observation frameworks was to provide better feedback and professional development for teachers. Their complexity is driven by the goal of providing detailed, specific feedback. Such systems can become cumbersome when applied to the goal of providing a single score for every teacher representing teaching quality that can be used administratively, for example, for personnel decisions. We suspect that a more streamlined and less labor-intensive evaluation approach could be used to identify the teachers in need of coaching and professional development. That subset of teachers would then receive the more resource-intensive evaluation and training services such as complex, detailed scales, interviews, and coaching sessions.

The other question Schmoker raises is: do these evaluation systems promote better teaching? While waiting for the IES study to be reported, some things can be done. First, look at correlations of the components of the observation rubrics with other measures of teaching such as value-added to student achievement (VAM) scores or student surveys. The idea is to see whether the behaviors valued and promoted by the rubrics are associated with improved achievement. The videos and data collected by the MET project are the basis for tools to do this (see earlier story on our Validation Engine.) But school systems can conduct the same analysis using their own student and teacher data. Second, use quasi-experimental methods to look at the changes in achievement related to the system’s local implementation of evaluation systems. In both cases, many school systems are already collecting very detailed data that can be used to test the validity and effectiveness of their locally adopted approaches.
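The first suggestion, correlating rubric components with value-added scores, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the component names, the scores, and the helper `pearson_r` are hypothetical, and a real analysis would also need to consider measurement error and sample size.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one entry per teacher, in the same teacher order.
rubric_scores = {
    "questioning": [2.0, 3.0, 3.5, 4.0, 2.5],
    "classroom_management": [3.0, 3.0, 2.5, 3.5, 4.0],
}
value_added = [-0.2, 0.1, 0.3, 0.4, 0.0]

# One correlation per rubric component, to see which observed behaviors
# track the achievement-based measure.
correlations = {
    component: pearson_r(scores, value_added)
    for component, scores in rubric_scores.items()
}
```

A component with a correlation near zero is not necessarily unimportant, but a pattern of weak associations across the rubric would be a reason to look more closely at what the observations are capturing.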

2012-10-31

Posted by: Denis Newman

Tags: achievement, education, evaluation systems, evidence, framework, LEA, leader, observation, observation engine, PD, randomization, research, SEA, student, teacher, teacher effectiveness, teacher evaluation, validity and value-added

Study of Alabama STEM Initiative Finds Positive Impacts

On February 21, 2012, the U.S. Department of Education released the final report of an experiment that Empirical Education has been working on for the last six years. The report, titled Evaluation of the Effectiveness of the Alabama Math, Science, and Technology Initiative (AMSTI), is now available on the Institute of Education Sciences website. The Alabama State Department of Education held a press conference to announce the findings, attended by Superintendent of Education Bice, staff of AMSTI, educators, students, and the co-principal investigator of the study, Denis Newman, CEO of Empirical Education.

AMSTI was developed by the state of Alabama and introduced in 2002 with the goal of improving mathematics and science achievement in the state’s K-12 schools. Empirical Education was primarily responsible for conducting the study, including the design, data collection, analysis, and reporting, under its subcontract with the Regional Education Lab, Southeast (the study was initiated through a research grant to Empirical). Researchers from the Academy for Educational Development, Abt Associates, and ANALYTICA made important contributions to design, analysis, and data collection.

The findings show that after one year, students in the 41 AMSTI schools experienced an impact on mathematics achievement equivalent to 28 days of additional student progress over students receiving conventional mathematics instruction. The study found, after one year, no difference for science achievement. It also found that AMSTI had an impact on teachers’ active learning classroom practices in math and science that, according to the theory of action posited by AMSTI, should have an impact on achievement. Further exploratory analysis found effects for student achievement in both mathematics and science after two years. The study also explored reading achievement, where it found significant differences between the AMSTI and control groups after one year. Exploration of differential effect for student demographic categories found consistent results for gender, socio-economic status, and pretest achievement level for math and science. For reading, however, the breakdown by student ethnicity suggests a differential benefit.

Just about everybody at Empirical worked on this project at one point or another. Besides the three of us (Newman, Jaciw and Zacamy) who are listed among the authors, we want to acknowledge past and current employees whose efforts made the project possible: Jessica Cabalo, Ruthie Chang, Zach Chin, Huan Cung, Dan Ho, Akiko Lipton, Boya Ma, Robin Means, Gloria Miller, Bob Smith, Laurel Sterling, Qingfeng Zhao, Xiaohui Zheng, and Margit Zsolnay.

With solid cooperation of the state’s Department of Education and the AMSTI team, approximately 780 teachers and 30,000 upper-elementary and middle school students in 82 schools from five regions in Alabama participated in the study. The schools were randomized into one of two categories: 1) those that received AMSTI starting the first year, or 2) those that received “business as usual” the first year and began participation in AMSTI the second year. With only a one-year delay before the control group entered treatment, the two-year impact was estimated using statistical techniques developed by, and with the assistance of, our colleagues at Abt Associates. The Academy for Educational Development assisted with data collection and analysis of training and program implementation.

Findings of the AMSTI study will also be presented at the Society for Research on Educational Effectiveness (SREE) Spring Conference taking place in Washington D.C. from March 8-10, 2012. Join Denis Newman, Andrew Jaciw, and Boya Ma on Friday March 9, 2012 from 3:00pm-4:30pm, when they will present findings of their study titled, “Locating Differential Effectiveness of a STEM Initiative through Exploration of Moderators.” A symposium on the study, including the major study collaborators, will be presented at the annual conference of the American Educational Research Association (AERA) on April 15, 2012 from 2:15pm-3:45pm at the Marriott Pinnacle ⁄ Pinnacle III in Vancouver, Canada. This session will be chaired by Ludy van Broekhuizen (director of REL-SE) and will include presentations by Steve Ricks (director of AMSTI); Jean Scott (SERVE Center at UNCG); Denis Newman, Andrew Jaciw, Boya Ma, and Jenna Zacamy (Empirical Education); Steve Bell (Abt Associates); and Laura Gould (formerly of AED). Sean Reardon (Stanford) will serve as the discussant. A synopsis of the study will also be included in the Common Guidelines for Education Research and Development.

2012-02-21

Posted by: Robin Means

Tags: achievement, AERA, Alabama, ALSDE, AMSTI, analysis, classroom, conference, differential, effectiveness, elementary school, evaluation, experiment, exploratory, grant, IES, math, middle school, moderator, presentation, press, randomization, reading, REL, report, research, science, SREE and STEM
