blog posts and news stories

A Rebellion Against the Current Research Regime

Finally! There is a movement to make education research more relevant to educators and edtech providers alike.

At various conferences, we’ve been hearing about a rebellion against the “business as usual” of research, which fails to answer the question of, “Will this product work in this particular school or community?” For educators, the motive is to find edtech products that best serve their students’ unique needs. For edtech vendors, it’s an issue of whether research can be cost-effective, while still identifying a product’s impact, as well as helping to maximize product/market fit.

The “business as usual” approach against which folks are rebelling is that of the U.S. Education Department (ED). We’ll call it the regime. As established by the Education Sciences Reform Act of 2002 and the Institute of Education Sciences (IES), the regime anointed the randomized control trial (or RCT) as the gold standard for demonstrating that a product, program, or policy caused an outcome.

Let us illustrate two ways in which the regime fails edtech stakeholders.

First, the regime is concerned with the purity of the research design, but not whether a product is a good fit for a school given its population, resources, etc. For example, in an 80-school RCT that the Empirical team conducted under an IES contract on a statewide STEM program, we were required to report the average effect, which showed a small but significant improvement in math scores (Newman et al., 2012). The table on page 104 of the report shows that while the program improved math scores on average across all students, it didn’t improve math scores for minority students. The graph that we provide here illustrates the numbers from the table and was presented later at a research conference.

[Figure: bar graph of math, science, and reading scores for minority vs. non-minority students]

IES had reasons couched in experimental design for downplaying anything but the primary, average finding. However, this ignores the needs of educators with large minority student populations, as well as of edtech vendors that wish to better serve minority communities.

Our RCT was also expensive and took many years, which illustrates the second failing of the regime: conventional research is too slow for the fast-moving innovative edtech development cycles, as well as too expensive to conduct enough research to address the thousands of products out there.

These issues of irrelevance and impracticality were highlighted last year in an “academic symposium” of 275 researchers, edtech innovators, funders, and others convened by the organization now called Jefferson Education Exchange (JEX). A popular rallying cry coming out of the symposium is to eschew the regime’s brand of research and begin collecting product reviews from front-line educators. This would become a Consumer Reports for edtech. Factors associated with differences in implementation are cited as a major target for data collection. Bart Epstein, JEX’s CEO, points out: “Variability among and between school cultures, priorities, preferences, professional development, and technical factors tend to affect the outcomes associated with education technology. A district leader once put it to me this way: ‘a bad intervention implemented well can produce far better outcomes than a good intervention implemented poorly’.”

Here’s why the Consumer Reports idea won’t work. Good implementation of a program can translate into gains on outcomes of interest, such as improved achievement, reduction in discipline referrals, and retention of staff, but only if the program is effective. Evidence that the product caused a gain on the outcome of interest is needed or else all you measure is the ease of implementation and student engagement. You wouldn’t know if the teachers and students were wasting their time with a product that doesn’t work.

We at Empirical Education are joining the rebellion. The guidelines for research on edtech products that we recently prepared for the industry and made available here are a step toward showing an alternative to the regime while adopting important advances in the Every Student Succeeds Act (ESSA).

We share the basic concern that established ways of conducting research do not answer the basic question that educators and edtech providers have: “Is this product likely to work in this school?” But we have a different way of understanding the problem. From years of working on federal contracts (often as a small business subcontractor), we understand that ED cannot afford to oversee a large number of small contracts. When there is a policy or program to evaluate, they find it necessary to put out multi-million-dollar, multi-year contracts. These large contracts suit university researchers, who are not in a rush, and large research companies that have adjusted their overhead rates and staffing to perform on these contracts. As a consequence, the regime becomes focused on the perfection in the design, conduct, and reporting of the single study that is intended to give the product, program, or policy a thumbs-up or thumbs-down.

[Photo: students in a classroom on computers]

There’s still a need for a causal research design that can link conditions such as resources, demographics, or teacher effectiveness with educational outcomes of interest. In research terminology, these conditions are called “moderators,” and in most causal study designs, their impact can be measured.

The rebellion should be driving an increase in the number of studies by lowering their cost and turn-around time. Given our recent experience with studies of edtech products, this reduction can reach a factor of 100. Instead of one study that costs $3 million and takes 5 years, think in terms of a hundred studies that cost $30,000 each and are completed in less than a month. If, for each product, 5 to 10 such studies are combined, they would provide enough variation and numbers of students and schools to detect differences in kinds of schools, kinds of students, and patterns of implementation so as to find where the product works best. As each new study is added, our understanding of how it works and with whom improves.
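To make the arithmetic concrete, here is a minimal sketch in Python. The cost figures come from the paragraph above. The effect sizes and standard errors are invented for illustration, and inverse-variance (fixed-effect) pooling is just one common way to combine studies, not necessarily the approach any particular set of guidelines prescribes.

```python
import math

def pool_fixed_effect(estimates):
    """Combine (effect, standard_error) pairs by inverse-variance weighting."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Cost arithmetic from the text: one $3 million study vs. $30,000 studies.
large_study_cost = 3_000_000
small_study_cost = 30_000
studies_for_same_budget = large_study_cost // small_study_cost

# Hypothetical results from five small studies of one product:
# (standardized effect size, standard error). These numbers are made up.
small_studies = [(0.10, 0.08), (0.05, 0.08), (0.15, 0.08), (0.08, 0.08), (0.12, 0.08)]
pooled_effect, pooled_se = pool_fixed_effect(small_studies)

print(f"Small studies per large-study budget: {studies_for_same_budget}")
print(f"Pooled effect: {pooled_effect:.3f} (SE {pooled_se:.3f})")
```

In this made-up case no single study is precise on its own, but the pooled standard error shrinks by a factor of the square root of the number of studies. That added precision is what makes it feasible to look for differences among kinds of schools and students.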

It won’t be enough to have reviews of product implementation. We need an independent measure of whether—when implemented well—the intervention is capable of a positive outcome. We need to know that it can make (i.e., cause) a difference AND under what conditions. We don’t want to throw out research designs that can detect and measure effect sizes, but we should stop paying for studies that are slow and expensive.

Our guidelines for edtech research detail multiple ways that edtech providers can adapt research to better work for them, especially in the era of ESSA. Many of the key recommendations are consistent with the goals of the rebellion:

  • The usage data collected by edtech products from students and teachers gives researchers very precise information on how well the program was implemented in each school and class. It identifies the schools and classes where implementation met the threshold for which the product was designed. This is a key to lowering cost and turn-around time.
  • ESSA offers four levels of evidence which form a developmental sequence, where the base level is based on existing learning science and provides a rationale for why a school should try it. The next level looks for a correlation between an important element in the rationale (measured through usage of that part of the product) and a relevant outcome. This is accepted by ESSA as evidence of promise, informs the developers how the product works, and helps product marketing teams get the right fit to the market. [Figure: a pyramid representing the 4 levels of ESSA]
  • The ESSA level that provides moderate evidence that the product caused the observed impact requires a comparison group matched to the students or schools that were identified as the users. The regime requires researchers to report only the difference between the user and comparison groups on average. Our guidelines insist that researchers must also estimate the extent to which an intervention is differentially effective for different demographic categories or implementation conditions.
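The difference between reporting only the average and also reporting subgroup impacts can be sketched in a few lines of Python. Everything here is hypothetical: the records, the subgroup labels "A" and "B", and the scores are invented, and a raw difference in means stands in for what would in practice be a model with covariates, an interaction term, and standard errors.

```python
from statistics import mean

def mean_difference(records, subgroup=None):
    """Mean outcome of users minus matched comparison, optionally within one subgroup."""
    rows = [r for r in records if subgroup is None or r["subgroup"] == subgroup]
    users = [r["score"] for r in rows if r["group"] == "user"]
    comparison = [r["score"] for r in rows if r["group"] == "comparison"]
    return mean(users) - mean(comparison)

# Hypothetical matched sample: each record is one student.
records = [
    {"group": "user", "subgroup": "A", "score": 60.0},
    {"group": "user", "subgroup": "A", "score": 62.0},
    {"group": "comparison", "subgroup": "A", "score": 50.0},
    {"group": "comparison", "subgroup": "A", "score": 52.0},
    {"group": "user", "subgroup": "B", "score": 50.0},
    {"group": "user", "subgroup": "B", "score": 52.0},
    {"group": "comparison", "subgroup": "B", "score": 50.0},
    {"group": "comparison", "subgroup": "B", "score": 52.0},
]

overall = mean_difference(records)
effect_a = mean_difference(records, subgroup="A")
effect_b = mean_difference(records, subgroup="B")

print(f"Average effect: {overall:.1f}")
print(f"Effect for subgroup A: {effect_a:.1f}")
print(f"Effect for subgroup B: {effect_b:.1f}")
```

In this invented sample the average effect is positive, yet all of it comes from subgroup A. An educator serving mostly subgroup B students would be misled by the average alone.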

From the point of view of the regime, nothing in these guidelines actually breaks the rules and regulations of ESSA’s evidence standards. Educators, developers, and researchers should feel empowered to collect data on implementation, calculate subgroup impacts, and use their own data to generate evidence sufficient for their own decisions.

A version of this article was published in the Edmarket Essentials magazine.

2018-05-09

IES Published Our REL Southwest Study on Trends in Teacher Mobility

The U.S. Department of Education’s Institute of Education Sciences published a report of a study we conducted for REL Southwest! We are thankful for the support and engagement we received from the Educator Effectiveness Research Alliance throughout the study.

The study was published in December 2017 and provides updated information regarding teacher mobility for Texas public schools during the 2011-12 through 2015-16 school years. Teacher mobility is defined as teachers changing schools or leaving the public school system.

In the report, descriptive information on mobility rates is presented at the regional and state levels for each school year. Mobility rates are disaggregated further into destination proportions to describe the proportion of teacher mobility due to within-district movement, between-district movement, and leaving Texas public schools. This study leverages data collected by the Texas Education Agency during the pilot of the Texas Teacher Evaluation and Support System (T-TESS) in 57 school districts in 2014-15. Analyses examine how components of the T-TESS observation rubric are related to school-level teacher mobility rates.
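As a rough illustration of these definitions, the sketch below computes a mobility rate and its destination proportions in Python. The status labels and the 20 teacher records are hypothetical and are not drawn from the Texas data; the report's actual procedures may differ.

```python
from collections import Counter

def mobility_summary(statuses):
    """Return (mobility rate, destination proportions among mobile teachers)."""
    counts = Counter(statuses)
    total = len(statuses)
    movers = total - counts["stayed"]
    rate = movers / total
    if movers == 0:
        return rate, {}
    # Share of all mobility attributable to each destination.
    proportions = {
        destination: n / movers
        for destination, n in counts.items()
        if destination != "stayed"
    }
    return rate, proportions

# Hypothetical year-to-year status for 20 teachers.
statuses = (
    ["stayed"] * 16
    + ["within_district"] * 1
    + ["between_district"] * 2
    + ["left_system"] * 1
)

rate, proportions = mobility_summary(statuses)
print(f"Mobility rate: {rate:.1%}")
for destination, share in proportions.items():
    print(f"  {destination}: {share:.0%} of mobility")
```

The mobility rate counts every teacher who did not stay in the same school, while the destination proportions split those movers by where they went, so the proportions sum to one.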

During the 2011-12 school year, 18.7% of Texas teachers moved schools within a district, moved between districts, or left the Texas public school system. By 2015-16, this mobility rate had increased to 22%. Moving between districts was the primary driver of the increase in mobility rates. Results indicate significant links between mobility and teacher, student, and school demographic characteristics. Teachers with special education certifications left Texas public schools at nearly twice the rate of teachers with other teaching certifications. School-level mobility rates showed significant positive correlations with the proportion of special education, economically disadvantaged, low-performing, and minority students. School-level mobility rates were negatively correlated with the proportion of English learner students. Schools with higher overall observation ratings on the T-TESS rubric tended to have lower mobility rates.

Findings from this study will provide state and district policymakers in Texas with updated information about trends and correlates of mobility in the teaching workforce, and offer a systematic baseline for monitoring and planning for future changes. Informed by these findings, policymakers can formulate a more strategic and targeted approach for recruiting and retaining teachers. For instance, instead of using generic approaches to enhance the overall supply of teachers or improve recruitment, more targeted efforts to attract and retain teachers in specific subject areas (for example, special education), in certain stages of their career (for example, novice teachers), and in certain geographic areas are likely to be more productive. Moreover, this analysis may enrich the existing knowledge base about schools’ teacher retention and mobility in relation to the quality of their teaching force, or may inform policy discussions about the importance of a stable teaching force for teaching effectiveness.

2018-02-01

IES Publishes our Recent REL Southwest Teacher Studies

The U.S. Department of Education’s Institute of Education Sciences published two reports of studies we conducted for REL Southwest! We are thankful for the support and engagement we received from the Educator Effectiveness Research Alliance and the Oklahoma Rural Schools Research Alliance throughout the studies. The collaboration with the research alliances and educators aligns well with what we set out to do in our core mission: to support K-12 systems and empower educators in making evidence-based decisions.

The first study was published earlier this month and identified factors associated with successful recruitment and retention of teachers in Oklahoma rural school districts, in order to highlight potential strategies to address Oklahoma’s teaching shortage. This correlational study covered a 10-year period (the 2005-06 to 2014-15 school years) and used data from the Oklahoma State Department of Education, the Oklahoma Office of Educational Quality and Accountability, federal non-education sources, and publicly available geographic information systems from Google Maps. The study found that teachers who are male, those who have higher postsecondary degrees, and those who have more teaching experience are harder than others to recruit and retain in Oklahoma schools. In addition, for teachers in rural districts, higher total compensation and increased responsibilities in job assignment are positively associated with successful recruitment and retention. In order to provide context, the study also examined patterns of teacher job mobility between rural and non-rural school districts. The rate of teachers in Oklahoma rural schools reaching tenure is slightly lower than the rates for teachers in non-rural areas. Also, rural school districts in Oklahoma had consistently lower rates of success in recruiting teachers than non-rural school districts from 2006-07 to 2011-12.

The most recent study, published last week, examined data from the 2014-15 pilot implementation of the Texas Teacher Evaluation and Support System (T-TESS). In 2014-15, the Texas Education Agency piloted the T-TESS in 57 school districts. During the pilot year, teachers’ overall ratings were based solely on rubric ratings on 16 dimensions across four domains.

The study examined the statistical properties of the T-TESS rubric to explore the extent to which it differentiates teachers on teaching quality and to investigate its internal consistency and efficiency. It also explored whether certain types of schools have teachers with higher or lower ratings. Using data from the pilot for more than 8,000 teachers, the study found that the rubric differentiates teacher effectiveness at the overall, domain, and dimension levels; domain and dimension ratings on the observation rubric are internally consistent; and the observation rubric is efficient, with each dimension making a unique contribution to a teacher’s overall rating. In addition, findings indicated that T-TESS rubric ratings varied slightly in relation to some school characteristics that were examined, such as socioeconomic status and percentage of English Language Learners. However, there is little indication that these characteristics introduced bias in the evaluators’ ratings.

2017-10-30

Work has Started on Analysis of Texas Educator Evaluation (T-TESS) Pilot

Empirical Education, through its contract with the REL Southwest, has begun the data collection process for an analysis of the Texas Teacher Evaluation and Support System (T-TESS) pilot conducted by the Texas Education Agency. This is announced on the IES site. Empirical’s Senior Research Scientist Val Lazarev is leading the analysis, which will focus on the elements and components of the system to better understand what T-TESS is measuring and provide alternative approaches to forming summative or composite scores.

2015-06-05

IES Releases New Empirical Education Report on Educator Effectiveness

Our report just released by IES examines the statistical properties of Arizona’s new multiple-measure teacher evaluation model. The study used data from the pilot in 2012-13 to explore the relationships among the system’s component measures (teacher observations, student academic progress, and stakeholder surveys). It also investigated how well the model differentiated between higher and lower performing teachers. Findings suggest that the model’s observation measure may be improved through further evaluator training and calibration, and that a single aggregated composite score may not adequately represent independent aspects of teacher performance.

The study was carried out in partnership with the Arizona Department of Education as part of our work with the Regional Education Laboratory (REL) West’s Educator Effectiveness Alliance, which includes Arizona, Utah, and Nevada Department of Education officials, as well as teacher union representatives, district administrators, and policymakers. While the analysis is specific to Arizona’s model, the study findings and methodology may be of interest to other state education agencies that are developing or implementing new multiple-measure evaluation systems. We have continued this work with additional analyses for alliance members and plan to provide additional capacity building during 2015.

2014-12-16

Study of Alabama STEM Initiative Finds Positive Impacts

On February 21, 2012, the U.S. Department of Education released the final report of an experiment that Empirical Education has been working on for the last six years. The report, titled Evaluation of the Effectiveness of the Alabama Math, Science, and Technology Initiative (AMSTI), is now available on the Institute of Education Sciences website. The Alabama State Department of Education held a press conference to announce the findings, attended by Superintendent of Education Bice and staff of AMSTI, along with educators, students, and the co-principal investigator of the study, Denis Newman, CEO of Empirical Education. The press release issued by the Alabama State Department of Education and a WebEx presentation provide more detail on the study’s findings.

AMSTI was developed by the state of Alabama and introduced in 2002 with the goal of improving mathematics and science achievement in the state’s K-12 schools. Empirical Education was primarily responsible for conducting the study, including the design, data collection, analysis, and reporting, under its subcontract with the Regional Education Lab, Southeast (the study was initiated through a research grant to Empirical). Researchers from the Academy for Educational Development, Abt Associates, and ANALYTICA made important contributions to design, analysis, and data collection.

The findings show that after one year, students in the 41 AMSTI schools experienced an impact on mathematics achievement equivalent to 28 days of additional student progress over students receiving conventional mathematics instruction. The study found, after one year, no difference for science achievement. It also found that AMSTI had an impact on teachers’ active learning classroom practices in math and science that, according to the theory of action posited by AMSTI, should have an impact on achievement. Further exploratory analysis found effects for student achievement in both mathematics and science after two years. The study also explored reading achievement, where it found significant differences between the AMSTI and control groups after one year. Exploration of differential effect for student demographic categories found consistent results for gender, socio-economic status, and pretest achievement level for math and science. For reading, however, the breakdown by student ethnicity suggests a differential benefit.

Just about everybody at Empirical worked on this project at one point or another. Besides the three of us (Newman, Jaciw and Zacamy) who are listed among the authors, we want to acknowledge past and current employees whose efforts made the project possible: Jessica Cabalo, Ruthie Chang, Zach Chin, Huan Cung, Dan Ho, Akiko Lipton, Boya Ma, Robin Means, Gloria Miller, Bob Smith, Laurel Sterling, Qingfeng Zhao, Xiaohui Zheng, and Margit Zsolnay.

With solid cooperation of the state’s Department of Education and the AMSTI team, approximately 780 teachers and 30,000 upper-elementary and middle school students in 82 schools from five regions in Alabama participated in the study. The schools were randomized into one of two categories: 1) those that received AMSTI starting the first year, or 2) those that received “business as usual” the first year and began participation in AMSTI the second year. With only a one-year delay before the control group entered treatment, the two-year impact was estimated using statistical techniques developed by, and with the assistance of, our colleagues at Abt Associates. The Academy for Educational Development assisted with data collection and analysis of training and program implementation.

Findings of the AMSTI study will also be presented at the Society for Research on Educational Effectiveness (SREE) Spring Conference taking place in Washington D.C. from March 8-10, 2012. Join Denis Newman, Andrew Jaciw, and Boya Ma on Friday March 9, 2012 from 3:00pm-4:30pm, when they will present findings of their study titled, “Locating Differential Effectiveness of a STEM Initiative through Exploration of Moderators.” A symposium on the study, including the major study collaborators, will be presented at the annual conference of the American Educational Research Association (AERA) on April 15, 2012 from 2:15pm-3:45pm at the Marriott Pinnacle ⁄ Pinnacle III in Vancouver, Canada. This session will be chaired by Ludy van Broekhuizen (director of REL-SE) and will include presentations by Steve Ricks (director of AMSTI); Jean Scott (SERVE Center at UNCG); Denis Newman, Andrew Jaciw, Boya Ma, and Jenna Zacamy (Empirical Education); Steve Bell (Abt Associates); and Laura Gould (formerly of AED). Sean Reardon (Stanford) will serve as the discussant. A synopsis of the study will also be included in the Common Guidelines for Education Research and Development.

2012-02-21

Empirical is participating in recently awarded five-year REL contracts

The Institute of Education Sciences (IES) at the U.S. Department of Education recently announced the recipients of five-year contracts for each of the 10 Regional Education Laboratories (RELs). We are excited to be part of four strong teams of practitioners and researchers that received the awards.

The original request for proposals in May 2011 called for the new RELs to work closely with alliances of state and local education agencies and other practitioner organizations to build local capacity for research. Considering the close ties between this agenda and Empirical’s core mission, we joined the proposal efforts and are now part of winning teams in the West (led by WestEd), Northwest (led by Education Northwest), Midwest (led by the American Institutes for Research (AIR)), and Southwest (led by SEDL). The REL Southwest is currently under a stop work order while ED addresses a dispute concerning its review process. Empirical Education’s history in conducting Randomized Control Trials (RCTs) and in providing technical assistance to education agencies provides a strong foundation for the next five years.

2012-02-16

New RFP calls for Building Regional Research Capacity

The US Department of Education (ED) has just released the eagerly anticipated RFP for the next round of the Regional Education Laboratories (RELs). This RFP contains some very interesting departures from how the RELs have been working, which may be of interest especially to state and local educators.

For those unfamiliar with federal government organizations, the RELs are part of the National Center for Education Evaluation and Regional Assistance (abbreviated NCEE), which is within the Institute of Education Sciences (IES), part of ED. The country is divided up into ten regions, each one served by a REL, so the RFP announced today is really a call for proposals in ten different competitions. The RELs have been in existence for decades but their mission has evolved over time. For example, the previous RFP (about 6 years ago) put a strong emphasis on rigorous research, particularly randomized control trials (RCTs), leading the contractors in each of the 10 regions to greatly expand their capacity, in part by bringing in subcontractors with the requisite technical skills. (Empirical conducted or assisted with RCTs in four of the 10 regions.) The new RFP changes the focus in two essential ways.

First, one of the major tasks is building capacity for research among practitioners. Educators at the state and local levels told ED that they needed more capacity to make use of the longitudinal data systems that ED has invested in through grants to the states. It is one thing to build the data systems. It is another thing to use the data to generate evidence that can inform decisions about policies and programs. Last month at the conference of the Society for Research on Educational Effectiveness, Rebecca Maynard, Commissioner of NCEE, talked about building a “culture of experimentation” among practitioners and building their capacity for simpler experiments that don’t take so long and are not as expensive as those NCEE has typically contracted for. Her point was that the resulting evidence is more likely to be used if the practitioners are “up close and immediate.”

The second idea found in the RFP for the RELs is that each regional lab should work through “alliances” of state and local agencies. These alliances would cross state boundaries (at least within the region) and would provide an important part of the REL’s research agenda. The idea goes beyond having an advisory panel for the REL that requests answers to questions. The alliances are also expected to build their own capacity to answer these questions using rigorous research methods but applying them cost-effectively and opportunistically. The capacity of the alliances should outlive the support provided by the RELs. If your organization is part of an existing alliance and would like to get better at using and conducting research, there are teams being formed to go after the REL contracts that would be happy to hear from you. (If you’re not sure who to call, let us know and we’ll put you in touch with an appropriate team.)

2011-05-11

A Conversation About Building State and Local Research Capacity

John Q. Easton, director of the Institute of Education Sciences (IES), came to New Orleans recently to participate in the annual meeting of the American Educational Research Association. At one of his stops, he was the featured speaker at a meeting of the Directors of Research and Evaluation (DRE), an organization composed of school district research directors. (DRE is affiliated with AERA and was recently incorporated as a 501(c)(3).) John started his remarks by pointing out that for much of his career he was a school district research director and felt great affinity to the group. He introduced the directions that IES was taking, especially how it was approaching working with school systems. He spent most of the hour fielding questions and engaging in discussion with the participants. Several interesting points came out of the conversation about roles for the researchers who work for education agencies.

Historically, most IES research grant programs have been aimed at university or other academic researchers. It is noteworthy that even in a program for “Evaluation of State and Local Education Programs and Policies,” grants have been awarded only to universities and large research firms. There is no expectation that researchers working for the state or local agency would be involved in the research beyond the implementation of the program. The RFP for the next generation of Regional Education Labs (REL) contracts may help to change that. The new RFP expects the RELs to work closely with education agencies to define their research questions and to assist alliances of state and local agencies in developing their own research capacity.

Members of the audience noted that, as district directors of research, they often spend more time reviewing research proposals from students and professors at local colleges who want to conduct research in their schools than actually answering questions initiated by the district. Funded researchers treat the districts as the “human subjects,” paying incentives to participants and sometimes paying for data services. But the districts seldom participate in defining the research topic, conducting the studies, or benefiting directly from the reported findings. The new mission of the RELs to build local capacity will be a major shift.

Some in the audience pointed out reasons to be skeptical that this REL agenda would be possible. How can we build capacity if research and evaluation departments across the country are being cut? In fact, very little is known about the number of state or district practitioners whose capacity for research and evaluation could be built by applying the REL resources. (Perhaps, a good first research task for the RELs would be to conduct a national survey to measure the existing capacity.)

John made a good point in reply: IES and the RELs have to work with the district leadership, not just the R&E departments, to make this work. The leadership has to have a more analytic view. They need to see the value of having an R&E department that goes beyond test administration and is able to obtain evidence to support local decisions. By cultivating a research culture in the district, evaluation could be routinely built into new program implementations from the beginning. The value of the research would be demonstrated in the improvements resulting from informed decisions. Without a district leadership team that values research to find out what works for the district, internal R&E departments will not be seen as an important capacity.

Some in the audience pointed out that in parallel to building a research culture in districts, it will be necessary to build a practitioner culture among researchers. It would be straightforward for IES to require that research grantees and contractors engage the district R&E staff in the actual work, not just review the research plan and sign the FERPA agreement. Practitioners ultimately hold the expertise in how the programs and research can be implemented successfully in the district, thus improving the overall quality and relevance of the research.

2011-04-20

REL West Releases Report of RCT on Problem-Based Economics Conducted with Empirical Ed Help

Three years ago, Empirical Education began assisting the Regional Educational Laboratory West (REL West) housed at WestEd in conducting a large-scale randomized experiment on the effectiveness of the Problem-Based Economics (PBE) curriculum.

Today, the Institute of Education Sciences released the final report indicating a significant impact of the program for students in 12th grade as measured by the Test of Economic Literacy. In addition to the primary focus on student achievement outcomes, the study examined changes in teachers’ content knowledge in economics, their pedagogical practices, and satisfaction with the curriculum. The report, Effects of Problem Based Economics on High School Economics Instruction, is found on the IES website.

Eighty Arizona and California school districts participated in the study, which encompassed 84 teachers and over 8,000 students. Empirical Education was responsible for major aspects of research operations, which involved collecting, tracking, scoring, and warehousing all data including rosters and student records from the districts, as well as the distribution of the PBE curricular materials, assessments, and student and teacher surveys. To handle the high volume and multiple administrations of surveys and assessments, we created a detail-oriented operation that included schedules for following up on survey responses, and we achieved response rates of over 95% for both teacher and student surveys. The experienced team of research managers, RAs, and data warehouse engineers maintained a rigorous 3-day turnaround for gathering end-of-unit exams and sending score reports to each teacher. The complete, documented dataset was delivered to the researchers at WestEd as our contribution to this REL West achievement.

2010-07-30