blog posts and news stories

Determining the Impact of CREATE on Math and ELA Achievement

Empirical Education is conducting the evaluation of Collaboration and Reflection to Enhance Atlanta Teacher Effectiveness (CREATE) under an Investing in Innovation (i3) development grant awarded in 2014. The CREATE evaluation takes place in schools throughout the state of Georgia.

Approximately 40 residents from the Georgia State University (GSU) College of Education (COE) are participating in the CREATE teacher residency program. Using a quasi-experimental design, outcomes for these teachers and their students will be compared to those from a matched comparison group of close to 100 teachers who simultaneously enrolled in GSU COE but did not participate in CREATE. Implementation for cohort 1 started in 2015, and cohort 2 started in 2016. Confirmatory outcomes will be assessed in years 2 and 3 of both cohorts (2017 - 2019).

Confirmatory research questions we will be answering include:

What is the impact of one-year of exposure of students to a novice teacher in their second year of teacher residency in the CREATE program, compared to the Business as Usual GSU teacher credential program, on mathematics and ELA achievement of students in grades 4-8, as measured by the Georgia Milestones Assessment System?

What is the impact of CREATE on the quality of instructional strategies used by teachers, as measured by the Teacher Assessment of Performance Standards (TAPS) scores, at the end of the third year of residency, relative to the business as usual condition?

What is the impact of CREATE on the quality of the learning environment created by teachers, as measured by Teacher Assessment of Performance Standards (TAPS) scores, at the end of the third year of residency, relative to the business as usual condition?

Exploratory research questions will address additional teacher-level outcomes including retention, effectiveness, satisfaction, collaboration, and levels of stress in relationships with students and colleagues.

We plan to publish the results of this study in fall of 2019. Please check back to read the research summary and report.

2017-06-06

Academic Researchers Struggle with Research that is Too Expensive and Takes Too Long

I was in DC for an interesting meeting a couple weeks ago. The “EdTech Efficacy Research Academic Symposium” was very much an academic symposium.

The Jefferson Education Accelerator—out of the University of Virginia school of education—and Digital Promise—an organization that invents ways for school districts to make interesting use of edtech products and concepts—sponsored the get together. About 32% of the approximately 260 invited attendees were from universities or research organizations that conduct academic style research. About 16% represented funding or investment organizations and agencies, and another 20% were from companies that produce edtech (often being funded by the funders). 6% were school practitioners and, as would be expected at a DC event, about 26% were from associations and the media.

I represented a research organization with a lot of experience evaluating commercial edtech products. While in the midst of writing research guidelines for the software industry, i.e., the Software & Information Industry Association (SIIA), I felt a bit like an anthropologist among the predominantly academic crowd. I was listening to the language and trying to discern thinking patterns of professors and researchers, both federally- and foundation-funded. A fundamental belief is that real efficacy research is expensive (in the millions of dollars) and slow (a minimum of several years for a research report). A few voices said the cost could be lowered, especially for a school-district-initiated pilot, but the going rate—according to discussions at the meeting—for a simple study starts at $250,000. Given a recent estimate of 4,000 edtech products, (and assuming that new products and versions of existing products are being released at an accelerating rate), the annual cost of evaluating all edtech products would be around $1 billion—an amount unlikely to be supported in the current school funding climate.

Does efficacy research need to be that expensive and slow given the widespread data collection by schools, widely available datasets, and powerful computing capabilities? Academic research is expensive for several reasons. There is little incentive for research providers to lower costs. Federal agencies offer large contracts to attract the large research organizations with experience and high overhead rates. Other funders are willing to pay top dollar for the prestige of such organizations. University grant writers aim to support a whole scientific research program and need to support grad students and generally conduct unique studies that will be attractive to journals. In conventional practice, each study is a custom product. Automating repeatable processes is not part of the culture. Actually, there is an odd culture clash between the academic researchers and the edtech companies needing their services.

Empirical Education is now working with Reach Capital and their portfolio to develop an approach for edtech companies and their investors to get low-cost evidence of efficacy. We are also getting our recommendations down in the form of guidelines for edtech companies to get usable evidence. The document is expected to be released at SIIA’s Education Impact Symposium in July.

2017-05-30

Carnegie Summit 2017 Recap

If you’ve never been to Carnegie Summit, we highly recommend it.

This was our first year attending Carnegie Foundation’s annual conference in San Francisco, and we only wish we had checked it out sooner. Chief Scientist Andrew Jaciw attended on behalf of Empirical Education, and he took over our twitter account for the duration of the event. Below is a recap of his live tweeting, interspersed with additional thoughts too verbose for twitter’s strict character limitations.

Day 1


Curious about what I will learn. On my mind: Tony Bryk’s distinction between evidence-based practice and practice-based evidence. I am also thinking of how the approaches to be discussed connect to ideas of Lee Cronbach - he was very interested in timeliness and relevance of research findings and the limited reach of internal validity.

I enjoyed T. Bryk’s talk. These points resonated.


Improvement Science involves a hands-on approach to identifying systemic sources of predictable failure. This is appealing because it puts problem solving at the core, while realizing the context-specificity of what will actually work!

Day 2

Jared Bolte - Great talk! Improvement Science contrasts with traditional efficacy research by jumping right in to solve problems, instead of waiting. This raises an important question: What is the cost of delaying action to wait for efficacy findings? I am reminded of Lee Cronbach’s point: the half-life of empirical propositions is short!



This was an excellent session with Tony Bryk and John Easton. There were three important questions posed.



Day 3

Excited to Learn about PDSA cycles





2017-04-27

Presenting at AERA 2017

We will again be presenting at the annual meeting of the American Educational Research Association (AERA). Join the Empirical Education team in San Antonio, TX from April 27 – 30, 2017.

Research Presentations will include the following.

Increasing Accessibility of Professional Development (PD): Evaluation of an Online PD for High School Science Teachers
Authors: Adam Schellinger, Andrew P Jaciw, Jenna Lynn Zacamy, Megan Toby, & Li Lin
In Event: Promoting and Measuring STEM Learning
Saturday, April 29 10:35am to 12:05pm
Henry B. Gonzalez Convention Center, River Level, Room 7C

Abstract: This study examines the impact of an online teacher professional development, focused on academic literacy in high school science classes. A one-year randomized control trial measured the impact of Internet-Based Reading Apprenticeship Improving Science Education (iRAISE) on instructional practices and student literacy achievement in 27 schools in Michigan and Pennsylvania. Researchers found a differential impact of iRAISE favoring students with lower incoming achievement (although there was no overall impact of iRAISE on student achievement). Additionally, there were positive impacts on several instructional practices. These findings are consistent with the specific goals of iRAISE: to provide high-quality, accessible online training that improves science teaching. Authors compare these results to previous evaluations of the same intervention delivered through a face-to-face format.


How Teacher Practices Illuminate Differences in Program Impact in Biology and Humanities Classrooms
Authors: Denis Newman, Val Lazarev, Andrew P Jaciw, & Li Lin
In Event: Poster Session 5 - Program Evaluation With a Purpose: Creating Equal Opportunities for Learning in Schools
Friday, April 28 12:25 to 1:55pm
Henry B. Gonzalez Convention Center, Street Level, Stars at Night Ballroom 4

Abstract: This paper reports research to explain the positive impact in a major RCT for students in the classrooms of a subgroup of teachers. Our goal was to understand why there was an impact for science teachers but not for teachers of humanities, i.e., history and English. We have labelled our analysis “moderated mediation” because we start with the finding that the program’s success was moderated by the subject taught by the teacher and then go on to look at the differences in mediation processes depending on the subject being taught. We find that program impact teacher practices differ by mediator (as measured in surveys and observations) and that mediators are differentially associated with student impact based on context.


Are Large-Scale Randomized Controlled Trials Useful for Understanding the Process of Scaling Up?
Authors: Denis Newman, Val Lazarev, Jenna Lynn Zacamy, & Li Lin
In Event: Poster Session 3 - Applied Research in School: Education Policy and School Context
Thursday, April 27 4:05 to 5:35pm
Henry B. Gonzalez Convention Center, Ballroom Level, Hemisfair Ballroom 2

Abstract: This paper reports a large scale program evaluation that included an RCT and a parallel study of 167 schools outside the RCT that provided an opportunity for the study of the growth of a program and compare the two contexts. Teachers in both contexts were surveyed and a large subset of the questions are asked of both scale-up teachers and teachers in the treatment schools of the RCT. We find large differences in the level of commitment to program success in the school. Far less was found in the RCT suggesting that a large scale RCT may not be capturing the processes at play in the scale up of a program.

We look forward to seeing you at our sessions to discuss our research. You can also view our presentation schedule here.

2017-04-17

SREE Spring 2017 Conference Recap

Several Empirical Education team members attended the annual SREE conference in Washington, DC from March 4th - 5th. This year’s conference theme, “Expanding the Toolkit: Maximizing Relevance, Effectiveness and Rigor in Education Research,” included a variety of sessions focused on partnerships between researchers and practitioners, classroom instruction, education policy, social and emotional learning, education and life cycle transitions, and research methods. Andrew Jaciw, Chief Scientist at Empirical Education, chaired a session about Advances in Quasi-Experimental Design. Jaciw also presented a poster on developing a “systems check” for efficacy studies under development. For more information on this diagnostic approach to evaluation, watch this Facebook Live video of Andrew’s discussion of the topic.

Other highlights of the conference included Sean Reardon’s keynote address highlighting uses of “big data” in creating context and generating hypotheses in education research. Based on data from the Stanford Education Data Archive (SEDA), Sean shared several striking patterns of variation in achievement and achievement gaps among districts across the country, as well as correlations between achievement gaps and socioeconomic status. Sean challenged the audience to consider how to expand this work and use this kind of “big data” to address critical questions about inequality in academic performance and education attainment. The day prior to the lecture, our CEO, Denis Newman, attended a workshop lead by Sean and colleagues that provided a detailed overview of the SEDA data and how it can be used in education research. The psychometric work to generate equivalent scores for every district in the country, the basis for his findings, was impressive and we look forward to their solving the daunting problem of extending the database to encompass individual schools.

2017-03-24

New Mexico Implementation


Empirical Education and the New Mexico Public Education Department (NMPED) are entering into their fourth year of collaboration using Observation Engine to increase educator effectiveness by improving understanding of the NMTEACH observation protocol and inter-rater reliability amongst observers using it. During the implementation, Observation Engine has been used for calibration and professional development with over 2,000 educators across the state annually. In partnership with the Southern Regional Education Board (SREB), who is providing training on best practices, the users in New Mexico have pushed the boundaries of what is possible with Observation Engine. Observation Engine was initially used solely for certifying observers prior to live classroom observations. Now, observers are relying on Observation Engine’s lesson functionality to provide professional development throughout the year. In addition, some administrators are now using videos and content from Observation Engine directly with teachers to provide them with models of what good instruction looks like.

The exciting news is that the collaborative efforts of NMPED, SREB, and Observation Engine are demonstrating impressive results across New Mexico that are noteworthy, especially when compared to the rest of the nation. In a compilation of teacher performance ratings from 19 states that have reformed their evaluation system since the seminal Widget Effect Report, Kraft and Gilmour (2016) found that in a majority of these states, fewer than 3 percent of teachers are rated below proficient. New Mexico stood out as an outlier among these states with 26.2% of teachers rated below proficient, a percentage comparable with more realistic pilots of educator effectiveness ratings. This is likely a sign of excellent professional development, as well as a willingness to realistically adjust the thresholds for proficiency based on the data that is being yielded and examined from actual practice, such as data captured within Observation Engine.

Kraft, M.A., & Gilmour, A.F. (2016). Revisiting the Widget Effect: Teacher Evaluation Reforms and the Distribution of Teacher Effectiveness. Brown University working paper. Retrieved July 21, 2016, from http://scholar.harvard.edu/mkraft/publications/revisiting-widget-effect-teacher-evaluation-reforms-and-distribution-teacher.

2016-12-02

Empirical Education Publication Productivity


Empirical Education’s research group, led by Chief Scientist Andrew Jaciw, has been busy publishing articles that address key concerns of educators and researchers.

Our article describing the efficacy trial of Math in Focus program that was accepted by JREE earlier this year just arrived in print to our Palo Alto office a couple of weeks ago. If you subscribe to JREE, it’s the very first article in the current issue (volume 9, number 4). If you don’t subscribe, we have a copy in our lobby if anyone would like to stop by and check it out.

Another article that the analysis team has been working on is called “An Empirical Study of Design Parameters for Assessing Differential Impacts for Students in Group Randomized Trials.” This one has recently been accepted for publication in the Evaluation Review in the issue that should be printed any day now. The paper grows out of our work on many cluster randomized trials and our interest in differential impacts of programs. We believe that the question of “what works” has limited meaning without systematic exploration of “for whom” and “under what conditions”. The common perception is that these latter concerns are secondary and our designs have too little power to assess them. We challenge these notions and provide guidelines for addressing these questions.

In another issue of Evaluation Review, we published two companion articles:

Assessing the Accuracy of Generalized Inferences From Comparison Group Studies Using a Within-Study Comparison Approach: The Methodology


Applications of a Within-Study Comparison Approach for Evaluating Bias in Generalized Causal Inferences from Comparison Groups Studies

This work further extends our interest in issues of external validity and equip researchers with a strategy for testing the limits of generalizations from randomized trials. Written for a technical audience, the work extends an approach commonly used to assess levels of selection bias in estimates from non-experimental studies to examine bias in generalized inferences from experiments and non-experiments.

It’s always exciting for our team to share the findings from our experiments, as well as the things we learn during the analysis that can help the evaluation community provide more productive evidence for educators. Much of our work is done in partnership with other organizations and if you’re interested in partnering with us on this kind of work, please email Robin Means.

2016-11-18

Pittsburgh Public Schools Uses 20 New Content Suite Videos


In June 2015, Pittsburgh Public Schools began using Observation Engine for calibration and training their teacher evaluators. They were one of our first clients to use the Content Suite to calibrate and certify classroom observers.

The Content Suite contains a collection of master scored videos along with thoughtful, objective score justifications on all observable elements of teaching called for by the evaluation framework used in the district. And it includes short video clips focused on one particular aspect of teaching. The combination of full-length and short clips makes it possible to easily and flexibly set up practice exercises, collaborative calibration sessions, and formal certification testing. Recently, we have added 20 new videos to the Content Suite collection!

The Content Suite can be used with most frameworks, either as-is or modified to ensure that the scores and justifications are consistent with the local context and observation framework interpretation. Observation Engine support staff will work closely with each client to modify content and design a customized implementation plan that meets the goals of the school system and sets up evaluators for success. For more information about the Content Suite, click here.

2016-11-10

Arkansas Implements Observation Engine Statewide

BloomBoard’s observation tool, EdReflect, has been used across the state of Arkansas since fall 2014. Last year, the Arkansas Department of Education piloted Observation Engine, an online observation training and calibration tool from Empirical Education Inc., in four districts under the state’s Equitable Access Plan. Accessible through the BloomBoard platform, Observation Engine allows administrators and other teacher evaluators to improve scoring calibration and reliability through viewing and rating videos of classroom lessons collected in thousands of classrooms across the country.

Paired with BloomBoard resources and training, the results were impressive. In one district, the number of observers scoring above target increased from 43% to 100%. Not only that but the percent discrepancy (scores that were two levels above or below the target) decreased from 9% to 0%. Similar results were found in the other three pilot districts, prompting decision makers to make Observation Engine readily available to districts throughout the state.

“EdReflect has proven to be a valuable platform for educator observations in Arkansas. The professional conversation, which results from the ability to provide timely feedback and shared understanding of effective practice, has proven to ensure a transparency and collaboration that we have not experienced before. With the addition of Empirical Education’s Observation Engine, credentialed teacher observers have ready access to increase inter-rater reliability and personal skill. For the first time this year, BloomBoard Collections and micro-credentials have begun meeting individualized professional learning needs for educators all over the state.”
– Sandra Hurst, Arkansas Department of Education

In July, the Arkansas Department of Education decided to offer Observation Engine to the entire state. About half of all districts in the state opted in to receive the service, with the implementation spanning three groups of users in Arkansas. The Beginning Administrators group has already started pursuing a micro-credential based on Observation Engine. Micro-credentials are a digital form of certification that indicate a person has demonstrated competency in a specific skill set. The Beginning Administrators group can earn their “Observation Skills for Beginning Administrators” micro-credential by demonstrating observation skill competencies using Observation Engine’s online observer calibration tool to practice and assess observation skills.

Next month, the 26 more districts under the Equitable Access Plan and the remaining Arkansas districts will begin using Observation Engine. We look forward to following and reporting on the progress of these districts during the 2016-17 school year.

2016-11-02

Presenting Research-Based Tools and Resources for Family & Community Engagement at the NIEA Convention

On October 8, 2016, Jenna Zacamy, Erica Plut, and Haidee Williams (AIR) presented a workshop at the National Indian Education Association convention in Reno, Nevada. The workshop, Research-Based Tools and Resources for Family & Community Engagement, provided approximately 50 attendees with research literature on this topic, examples of promising practices to engage families and communities in various contexts gathered from the literature, and a method for exploring local programs and planning for improvements. Conversations were rich as attendees shared their current strategies and discussed new ideas. Implementing research-based practices for family and community engagement specific to Native Americans can help improve engagement and achievement outcomes in these unique communities.

2016-10-18
Archive