blog posts and news stories

Determining the Impact of MSS on Science Achievement

Empirical Education is conducting an evaluation of Making Sense of SCIENCE (MSS) under an Investing in Innovation (i3) five-year validation grant awarded in 2014. MSS is a teacher professional learning approach that focuses on science understanding, classroom practice, literacy support, and pedagogical reasoning. The primary purpose of the evaluation is to assess the impact of MSS on teachers’ science content knowledge and student science achievement and attitudes toward science. The evaluation takes place in 66 schools across two geographic regions—Wisconsin and the Central Valley of California. Participating Local Educational Agencies (LEAs) include: Milwaukee Public Schools (WI), Racine Unified School District (WI), Lodi Unified School District (CA), Manteca Unified School District (CA), Turlock Unified School District (CA), Stockton Unified School District (CA), Sylvan Unified School District (CA), and the San Joaquin County Office of Education (CA).

Using a Randomized Control Trial (RCT) design, we randomly assigned the schools (32 in Wisconsin and 34 in California) in 2015-16 to either receive the MSS intervention or continue with business-as-usual district professional learning and science instruction. Professional learning activities and program implementation take place during the 2016-17 and 2017-18 school years, with delayed treatment for the control schools planned for 2018-19 and 2019-20.
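For readers curious what school-level random assignment can look like in practice, here is a minimal Python sketch. It is an illustration only, not the procedure used in the study: blocking by state, the even split within each state, the placeholder school IDs, the seed, and the function name `assign_schools` are all assumptions made for the example.

```python
import random


def assign_schools(schools_by_region, seed=0):
    """Randomly split the schools in each region into 'MSS' and 'control'.

    Hypothetical helper: assigns half of each region's schools to each
    condition (an assumed design, not taken from the evaluation).
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    assignment = {}
    for region, schools in schools_by_region.items():
        shuffled = list(schools)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for school in shuffled[:half]:
            assignment[school] = "MSS"
        for school in shuffled[half:]:
            assignment[school] = "control"
    return assignment


# Placeholder IDs standing in for the 32 Wisconsin and 34 California schools.
schools_by_region = {
    "WI": [f"WI-{i:02d}" for i in range(1, 33)],
    "CA": [f"CA-{i:02d}" for i in range(1, 35)],
}
assignment = assign_schools(schools_by_region, seed=2015)
mss_count = sum(1 for condition in assignment.values() if condition == "MSS")
print(len(assignment), "schools assigned;", mss_count, "to MSS")
```

Randomizing within each region keeps the two conditions balanced across states, which is one common reason to block on geography.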

Confirmatory impacts on student achievement and teacher content knowledge will be assessed in 2018. Confirmatory research questions include:

What is the impact of MSS at the school level, after two years of full implementation, on science achievement in Earth and physical science among 4th and 5th grade students in intervention schools, compared to 4th and 5th grade students in control schools receiving the business-as-usual science instruction?


What is the impact of MSS on science achievement among low-achieving students in intervention elementary schools with two years of exposure to MSS (in grades 4-5) compared to low-achieving students in control elementary schools with business-as-usual instruction for two years (in grades 4-5)?

What is the impact of MSS on teachers’ science content knowledge in Earth and physical science compared to teachers in the business-as-usual control schools, after two full years of implementation in schools?

Additional exploratory analyses are currently being conducted and will continue through 2018. Exploratory research questions examine the impact of MSS on students’ ability to communicate science ideas in writing, as well as non-academic outcomes, such as confidence and engagement in learning science. We will also explore several teacher-level outcomes, including teachers’ pedagogical science content knowledge, and changes in classroom instructional practices. The evaluation also includes measures of fidelity of implementation.

We plan to publish the final results of this study in fall of 2019. Please check back to read the research summary and report.

2017-06-19

Determining the Impact of CREATE on Math and ELA Achievement

Empirical Education is conducting the evaluation of Collaboration and Reflection to Enhance Atlanta Teacher Effectiveness (CREATE) under an Investing in Innovation (i3) development grant awarded in 2014. The CREATE evaluation takes place in schools throughout the state of Georgia.

Approximately 40 residents from the Georgia State University (GSU) College of Education (COE) are participating in the CREATE teacher residency program. Using a quasi-experimental design, outcomes for these teachers and their students will be compared to those from a matched comparison group of close to 100 teachers who simultaneously enrolled in GSU COE but did not participate in CREATE. Implementation for cohort 1 started in 2015, and cohort 2 started in 2016. Confirmatory outcomes will be assessed in years 2 and 3 of both cohorts (2017 - 2019).

Confirmatory research questions we will be answering include:

What is the impact of one year of exposure of students to a novice teacher in their second year of teacher residency in the CREATE program, compared to the business-as-usual GSU teacher credential program, on mathematics and ELA achievement of students in grades 4-8, as measured by the Georgia Milestones Assessment System?

What is the impact of CREATE on the quality of instructional strategies used by teachers, as measured by the Teacher Assessment of Performance Standards (TAPS) scores, at the end of the third year of residency, relative to the business-as-usual condition?

What is the impact of CREATE on the quality of the learning environment created by teachers, as measured by Teacher Assessment of Performance Standards (TAPS) scores, at the end of the third year of residency, relative to the business-as-usual condition?

Exploratory research questions will address additional teacher-level outcomes including retention, effectiveness, satisfaction, collaboration, and levels of stress in relationships with students and colleagues.

We plan to publish the results of this study in fall of 2019. Please check back to read the research summary and report.

2017-06-06

Academic Researchers Struggle with Research that is Too Expensive and Takes Too Long

I was in DC for an interesting meeting a couple of weeks ago. The “EdTech Efficacy Research Academic Symposium” was very much an academic symposium.

The Jefferson Education Accelerator (out of the University of Virginia school of education) and Digital Promise (an organization that invents ways for school districts to make interesting use of edtech products and concepts) sponsored the get-together. About 32% of the approximately 260 invited attendees were from universities or research organizations that conduct academic-style research. About 16% represented funding or investment organizations and agencies, and another 20% were from companies that produce edtech (often being funded by the funders). About 6% were school practitioners and, as would be expected at a DC event, about 26% were from associations and the media.

I represented a research organization with a lot of experience evaluating commercial edtech products. While in the midst of writing research guidelines for the software industry, through the Software & Information Industry Association (SIIA), I felt a bit like an anthropologist among the predominantly academic crowd. I was listening to the language and trying to discern thinking patterns of professors and researchers, both federally- and foundation-funded. A fundamental belief is that real efficacy research is expensive (in the millions of dollars) and slow (a minimum of several years for a research report). A few voices said the cost could be lowered, especially for a school-district-initiated pilot, but the going rate for a simple study, according to discussions at the meeting, starts at $250,000. Given a recent estimate of 4,000 edtech products (and assuming that new products and versions of existing products are being released at an accelerating rate), the annual cost of evaluating all edtech products would be around $1 billion, an amount unlikely to be supported in the current school funding climate.
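The $1 billion figure follows from simple multiplication, sketched here in Python. The sketch assumes one simple study per product per year at the $250,000 starting rate; that one-study-per-year reading is an assumption made for the example, and the variable names are purely illustrative.

```python
# Figures quoted in the post.
edtech_products = 4_000         # recent estimate of edtech products
cost_per_simple_study = 250_000  # going rate in dollars for a simple study

# Assumption: each product gets one simple study per year.
annual_cost = edtech_products * cost_per_simple_study

print(f"${annual_cost:,}")  # prints $1,000,000,000
```

Any study priced above the starting rate, or any product evaluated more than once a year, would push the total higher.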

Does efficacy research need to be that expensive and slow given the widespread data collection by schools, widely available datasets, and powerful computing capabilities? Academic research is expensive for several reasons. There is little incentive for research providers to lower costs. Federal agencies offer large contracts to attract the large research organizations with experience and high overhead rates. Other funders are willing to pay top dollar for the prestige of such organizations. University grant writers aim to support a whole scientific research program and need to support grad students and generally conduct unique studies that will be attractive to journals. In conventional practice, each study is a custom product. Automating repeatable processes is not part of the culture. Actually, there is an odd culture clash between the academic researchers and the edtech companies needing their services.

Empirical Education is now working with Reach Capital and their portfolio to develop an approach for edtech companies and their investors to get low-cost evidence of efficacy. We are also getting our recommendations down in the form of guidelines for edtech companies to get usable evidence. The document is expected to be released at SIIA’s Education Impact Symposium in July.

2017-05-30

Carnegie Summit 2017 Recap

If you’ve never been to Carnegie Summit, we highly recommend it.

This was our first year attending Carnegie Foundation’s annual conference in San Francisco, and we only wish we had checked it out sooner. Chief Scientist Andrew Jaciw attended on behalf of Empirical Education, and he took over our Twitter account for the duration of the event. Below is a recap of his live tweeting, interspersed with additional thoughts too verbose for Twitter’s strict character limit.

Day 1


Curious about what I will learn. On my mind: Tony Bryk’s distinction between evidence-based practice and practice-based evidence. I am also thinking of how the approaches to be discussed connect to ideas of Lee Cronbach - he was very interested in timeliness and relevance of research findings and the limited reach of internal validity.

I enjoyed T. Bryk’s talk. These points resonated.


Improvement Science involves a hands-on approach to identifying systemic sources of predictable failure. This is appealing because it puts problem solving at the core, while realizing the context-specificity of what will actually work!

Day 2

Jared Bolte - Great talk! Improvement Science contrasts with traditional efficacy research by jumping right in to solve problems, instead of waiting. This raises an important question: What is the cost of delaying action to wait for efficacy findings? I am reminded of Lee Cronbach’s point: the half-life of empirical propositions is short!



This was an excellent session with Tony Bryk and John Easton. There were three important questions posed.



Day 3

Excited to learn about PDSA cycles

2017-04-27

Presenting at AERA 2017

We will again be presenting at the annual meeting of the American Educational Research Association (AERA). Join the Empirical Education team in San Antonio, TX from April 27 – 30, 2017.

Research Presentations will include the following.

Increasing Accessibility of Professional Development (PD): Evaluation of an Online PD for High School Science Teachers
Authors: Adam Schellinger, Andrew P Jaciw, Jenna Lynn Zacamy, Megan Toby, & Li Lin
In Event: Promoting and Measuring STEM Learning
Saturday, April 29 10:35am to 12:05pm
Henry B. Gonzalez Convention Center, River Level, Room 7C

Abstract: This study examines the impact of an online teacher professional development program focused on academic literacy in high school science classes. A one-year randomized control trial measured the impact of Internet-Based Reading Apprenticeship Improving Science Education (iRAISE) on instructional practices and student literacy achievement in 27 schools in Michigan and Pennsylvania. Researchers found a differential impact of iRAISE favoring students with lower incoming achievement (although there was no overall impact of iRAISE on student achievement). Additionally, there were positive impacts on several instructional practices. These findings are consistent with the specific goals of iRAISE: to provide high-quality, accessible online training that improves science teaching. Authors compare these results to previous evaluations of the same intervention delivered through a face-to-face format.


How Teacher Practices Illuminate Differences in Program Impact in Biology and Humanities Classrooms
Authors: Denis Newman, Val Lazarev, Andrew P Jaciw, & Li Lin
In Event: Poster Session 5 - Program Evaluation With a Purpose: Creating Equal Opportunities for Learning in Schools
Friday, April 28 12:25 to 1:55pm
Henry B. Gonzalez Convention Center, Street Level, Stars at Night Ballroom 4

Abstract: This paper reports research to explain the positive impact in a major RCT for students in the classrooms of a subgroup of teachers. Our goal was to understand why there was an impact for science teachers but not for teachers of humanities, i.e., history and English. We have labelled our analysis “moderated mediation” because we start with the finding that the program’s success was moderated by the subject taught by the teacher and then go on to look at the differences in mediation processes depending on the subject being taught. We find that program impacts on teacher practices differ by mediator (as measured in surveys and observations) and that mediators are differentially associated with student impact based on context.


Are Large-Scale Randomized Controlled Trials Useful for Understanding the Process of Scaling Up?
Authors: Denis Newman, Val Lazarev, Jenna Lynn Zacamy, & Li Lin
In Event: Poster Session 3 - Applied Research in School: Education Policy and School Context
Thursday, April 27 4:05 to 5:35pm
Henry B. Gonzalez Convention Center, Ballroom Level, Hemisfair Ballroom 2

Abstract: This paper reports a large-scale program evaluation that included an RCT and a parallel study of 167 schools outside the RCT, which provided an opportunity to study the growth of a program and compare the two contexts. Teachers in both contexts were surveyed, and a large subset of the questions was asked of both scale-up teachers and teachers in the treatment schools of the RCT. We find large differences in the level of commitment to program success in the school. Far less was found in the RCT, suggesting that a large-scale RCT may not be capturing the processes at play in the scale-up of a program.

We look forward to seeing you at our sessions to discuss our research. You can also view our presentation schedule here.

2017-04-17

SREE Spring 2017 Conference Recap

Several Empirical Education team members attended the annual SREE conference in Washington, DC from March 4th - 5th. This year’s conference theme, “Expanding the Toolkit: Maximizing Relevance, Effectiveness and Rigor in Education Research,” included a variety of sessions focused on partnerships between researchers and practitioners, classroom instruction, education policy, social and emotional learning, education and life cycle transitions, and research methods. Andrew Jaciw, Chief Scientist at Empirical Education, chaired a session about Advances in Quasi-Experimental Design. Jaciw also presented a poster on developing a “systems check” for efficacy studies under development. For more information on this diagnostic approach to evaluation, watch this Facebook Live video of Andrew’s discussion of the topic.

Other highlights of the conference included Sean Reardon’s keynote address highlighting uses of “big data” in creating context and generating hypotheses in education research. Based on data from the Stanford Education Data Archive (SEDA), Sean shared several striking patterns of variation in achievement and achievement gaps among districts across the country, as well as correlations between achievement gaps and socioeconomic status. Sean challenged the audience to consider how to expand this work and use this kind of “big data” to address critical questions about inequality in academic performance and educational attainment. The day prior to the lecture, our CEO, Denis Newman, attended a workshop led by Sean and colleagues that provided a detailed overview of the SEDA data and how it can be used in education research. The psychometric work to generate equivalent scores for every district in the country, the basis for his findings, was impressive, and we look forward to their solving the daunting problem of extending the database to encompass individual schools.

2017-03-24

New Mexico Implementation


Empirical Education and the New Mexico Public Education Department (NMPED) are entering their fourth year of collaboration using Observation Engine to increase educator effectiveness by improving understanding of the NMTEACH observation protocol and inter-rater reliability among observers using it. During the implementation, Observation Engine has been used for calibration and professional development with over 2,000 educators across the state annually. In partnership with the Southern Regional Education Board (SREB), which is providing training on best practices, the users in New Mexico have pushed the boundaries of what is possible with Observation Engine. Observation Engine was initially used solely for certifying observers prior to live classroom observations. Now, observers are relying on Observation Engine’s lesson functionality to provide professional development throughout the year. In addition, some administrators are now using videos and content from Observation Engine directly with teachers to provide them with models of what good instruction looks like.

The exciting news is that the collaborative efforts of NMPED, SREB, and Observation Engine are demonstrating impressive results across New Mexico, especially when compared to the rest of the nation. In a compilation of teacher performance ratings from 19 states that have reformed their evaluation system since the seminal Widget Effect Report, Kraft and Gilmour (2016) found that in a majority of these states, fewer than 3 percent of teachers are rated below proficient. New Mexico stood out as an outlier among these states with 26.2% of teachers rated below proficient, a percentage comparable with more realistic pilots of educator effectiveness ratings. This is likely a sign of excellent professional development, as well as a willingness to realistically adjust the thresholds for proficiency based on data from actual practice, such as data captured within Observation Engine.

Kraft, M.A., & Gilmour, A.F. (2016). Revisiting the Widget Effect: Teacher Evaluation Reforms and the Distribution of Teacher Effectiveness. Brown University working paper. Retrieved July 21, 2016, from http://scholar.harvard.edu/mkraft/publications/revisiting-widget-effect-teacher-evaluation-reforms-and-distribution-teacher.

2016-12-02

Arkansas Implements Observation Engine Statewide

BloomBoard’s observation tool, EdReflect, has been used across the state of Arkansas since fall 2014. Last year, the Arkansas Department of Education piloted Observation Engine, an online observation training and calibration tool from Empirical Education Inc., in four districts under the state’s Equitable Access Plan. Accessible through the BloomBoard platform, Observation Engine allows administrators and other teacher evaluators to improve scoring calibration and reliability through viewing and rating videos of classroom lessons collected in thousands of classrooms across the country.

Paired with BloomBoard resources and training, the results were impressive. In one district, the number of observers scoring above target increased from 43% to 100%. Not only that, but the percent discrepancy (scores that were two levels above or below the target) decreased from 9% to 0%. Similar results were found in the other three pilot districts, prompting decision makers to make Observation Engine readily available to districts throughout the state.

“EdReflect has proven to be a valuable platform for educator observations in Arkansas. The professional conversation, which results from the ability to provide timely feedback and shared understanding of effective practice, has proven to ensure a transparency and collaboration that we have not experienced before. With the addition of Empirical Education’s Observation Engine, credentialed teacher observers have ready access to increase inter-rater reliability and personal skill. For the first time this year, BloomBoard Collections and micro-credentials have begun meeting individualized professional learning needs for educators all over the state.”
– Sandra Hurst, Arkansas Department of Education

In July, the Arkansas Department of Education decided to offer Observation Engine to the entire state. About half of all districts in the state opted in to receive the service, with the implementation spanning three groups of users in Arkansas. The Beginning Administrators group has already started pursuing a micro-credential based on Observation Engine. Micro-credentials are a digital form of certification indicating that a person has demonstrated competency in a specific skill set. The Beginning Administrators group can earn their “Observation Skills for Beginning Administrators” micro-credential by demonstrating observation skill competencies using Observation Engine’s online observer calibration tool to practice and assess observation skills.

Next month, 26 more districts under the Equitable Access Plan and the remaining Arkansas districts will begin using Observation Engine. We look forward to following and reporting on the progress of these districts during the 2016-17 school year.

2016-11-02

Presentation at the 2016 Learning Forward Annual Conference

Learning Forward announced that our proposal was accepted for the 2016 annual conference being held in Vancouver, Canada this year. Teacher Evaluation Specialist K.C. MacQueen will join Fort Wayne Community Schools’ (FWCS) Todd Cummings and Learning Forward’s Kay Psencik in presenting “Principals Collaborating to Deepen Understanding of High-Quality Instruction.” They will highlight how FWCS is engaged in a process to ensure equitable evaluation of teacher effectiveness using Observation Engine™. If you or someone you know is attending the annual conference in December 2016, here are the details of the presentation.

  • Day/time: Tuesday, December 6, 2016 from 10AM-Noon
  • Session: I 15

2016-08-02

Report of the Evaluation of iRAISE Released

Empirical Education Inc. has completed its evaluation (read the report here) of an online professional development program for Reading Apprenticeship. WestEd’s Strategic Literacy Initiative (SLI) was awarded a development grant under the Investing in Innovation (i3) program in 2012. iRAISE (internet-based Reading Apprenticeship Improving Science Education) is an online professional development program for high school science teachers. iRAISE trained more than 100 teachers in Michigan and Pennsylvania over the three years of the grant. Empirical’s randomized control trial measured the impact of the program on students with special attention to differences in their incoming reading achievement levels.

The goal of iRAISE was to improve student achievement by training teachers in the use of Reading Apprenticeship, an instructional framework that describes the classroom in four interacting dimensions of learning: social, personal, cognitive, and knowledge-building. The inquiry-based professional development (PD) model included a week-long Foundations training in the summer; monthly synchronous group sessions and smaller personal learning communities; and asynchronous discussion groups designed to change teachers’ understanding of their role in adolescent literacy development and to build capacity for literacy instruction in the academic disciplines. iRAISE adapted an earlier face-to-face version of Reading Apprenticeship professional development, which was studied under an earlier i3 grant, Reading Apprenticeship Improving Secondary Education (RAISE), into a completely online course, creating a flexible, accessible platform.

To evaluate iRAISE, Empirical Education conducted an experiment in which 82 teachers across 27 schools were randomly assigned to either receive the iRAISE Professional Development during the 2014-15 school year or continue with business as usual and receive the program one year later. Data collection included monthly teacher surveys that measured their use of several classroom instructional practices and a spring administration of an online literacy assessment, developed by Educational Testing Service, to measure student achievement in literacy. We found significant positive impacts of iRAISE on several of the classroom practice outcomes, including teachers providing explicit instruction on comprehension strategies, their use of metacognitive inquiry strategies, and their levels of confidence in literacy instruction. These results were consistent with the prior RAISE research study and are an important replication of the previous findings, as they substantiate the success of SLI’s development of a more accessible online version of their teacher PD. After a one-year implementation of iRAISE, we did not find an overall effect of the program on student literacy achievement. However, we did find that levels of incoming reading achievement moderated the impact of iRAISE on general reading literacy, such that lower-scoring students benefited more. The success of iRAISE in adapting immersive, high-quality professional development to an online platform is promising for the field.

You can access the report and research summary from the study using the links below.
iRAISE research report
iRAISE research summary

2016-07-01