Empirical Education Inc.

A Moment of Zen

I visited Denis in La Jolla, in March 2024 . We talked about Denis’ career, and how his philosophy about education and research shaped the mission and vision of Empirical Education. After my visit I wrote the following blog, which I now share in remembrance of Denis. — Andrew

A Moment of Zen

There are specific moments in life that mark a threshold of major change and new direction.

One such occasion was a lunch date I had with Denis Newman almost exactly 20 years ago. He had just started Empirical Education on Sherman Avenue in Palo Alto. We met at Joanie’s Café on California Avenue.

It was one for those perfect days set in the Mediterranean climate, typical of California.

Such events distill to a few concrete memories: I am served a Caesar salad, the afternoon light is streaming in, I hand Denis a copy of Judith Singer’s article on applications of Hierarchical Linear Models (HLMs) in SAS. We begin our first discussion about Randomized Control Trials (RCTs) in educational research.

Denis had just submitted several grant proposals that would fund running RCTs in school districts to evaluate whether educational programs were having beneficial impacts on their students’ learning. Denis was riding a new wave in educational research, and wanted to know if I was interested in joining. It was exciting. My dissertation at Stanford was underway, and I needed a summer job.

It's now 20 years later. That first day is so palpable that I feel I can touch it. Yet, between then and now, there have been myriad paths taken and so much rich exploration. Empirical's office has moved from Sherman Ave. in Palo Alto to University Ave. in Berkeley. We’ve conducted 35 RCTs, and dozens of quasi-experiments across content areas and grade ranges. We've partnered directly with SEAs, LEAs, edtech companies, universities, and large research firms to identify and address critical questions from the field. We've adapted to changes in the education ecosystem, and developed tools and processes to make research more efficient and practical. Our team has expanded (and contracted) and we continue to work as a unit dedicated to improving outcomes for teachers and students.

Denis is retired now and living in La Jolla, California. He actively serves on Empirical's board. I visited him last March. Some of what I write about here reflects my conversations with him, as well as some of my own memories.

A Happy Coincidence

My interest in and commitment to Empirical reflects that Denis and I have remained simpatico in some of our main beliefs.

When we met, I was reading a lot of Lee Cronbach's critiques of standard ideas about causal validity. Cronbach's ideas from the 1970's and 80's emphasized the role of context in program evaluation, and how effects of programs on educational outcomes vary depending on characteristics of students, teachers and schools. Context includes time, and time produces decade-by-treatment interactions, which means that program effects in the social sciences have short half-lives. My appreciation of the complexity of school systems also came from having taught third grade during the tumultuous integration of the class size reduction initiative in California. In my experience, context and unforeseen events undermined the best laid policy plans.

Denis had similar convictions. Ed tech products, in particular, demonstrated what Denis referred to as the short "shelf-life" of their effectiveness, with their rapid advancement leading to new program versions. Denis knew this very well from his earlier career (more on this below). At the same time, Denis was deeply intrigued by the quality of evidence that RCTs could yield. Randomizing students or classes or schools to receive the program, or to a control, yields statistically-equivalent groups. This means that any systematic difference between them in average outcomes has to be the result of one group receiving the program and the other not. That is, randomization permits a very dependable interpretation of the CAUSAL impact of the program relative to the control alternative.

The two ideas of (1) quickly evolving and context-dependent school systems, and (2) the potential of RCTs to measure causal impacts reliably, motivated a specific application of experiments that Denis recognized. The main question was: Could we achieve both context sensitivity, as well as the precision of randomization? (Call this the "relevance X precision challenge".)

Denis recognized that to respond to this challenge, RCTs had to be mobilized locally and rapidly to yield evidence that would be useful to individual school systems; that is, to support decisions concerning programs that were being considered by a district, or that had just been initiated. Context-responsive experiments of this sort contrasted with typically very large-scale and expensive experiments that took years to plan, conduct, analyze, and report, and oftentimes without a clear sense of a suitable inference population.

Denis and I arrived at some of these ideas from different places but with similar interests and passions. Three main ideas were these:

Complexity as boon not bane
The importance of statistical interactions in representing the complexity of conditions for effects
The need to respect locally-defined interests and conditions

Years ago, I shared with Denis a couple of works by Cronbach: his 1975 article Beyond the Two Disciplines of Scientific Psychology, and his 1982 book Designing Evaluations of Educational and Social Programs. After reading them, Denis asked: "What is there left to be said?" During my visit to La Jolla I reminded him of this, and asked if he had anything to add to his insight. He said "no". I smiled at the peculiarity of my question, and the obviousness of his response (given his earlier reaction to Crobach’s work.) It was a moment of Zen.

I am astounded that since those early discussions with Denis about the "relevance X precision" challenge, I can say the field has addressed only some of that challenge, and answers often are in the form of further questions. They inspire inquiry that is engaging, meaningful, and focused on applications.

It is important to point out that since I joined Empirical, much of the methodology we both employed and helped to co-develop was in the process of being worked out. It was a nascent period in the gradual development of IES' standards of evidence. The field was learning and codifying methods. We played a role in that pioneering process. During our discussion in La Jolla, Denis emphasized the importance of HLM. It was a technological capability that moved the field solidly into multivariate analysis and achieved efficiency that previously had been elusive.

Democratization of Research

The priorities noted above—the need for both needs-responsive and contextually-relevant evaluation solutions, and the use of the most rigorous (RCT) solutions for causal inference—underscore another very important idea that is well-understood in the world of program evaluation. It is the need for the research efforts and questions to be motivated by persons for whom the results have the highest stakes: those who have to make decisions, and those who will (or will not) get the program!

With a view to this, Denis often spoke about the democratization of research. This meant that district people had to be right in the mix: informing the questions, understanding the design and language of reporting, and knowing what to do with the results. Here are four short examples where this priority came through.

Denis and I often engaged in a tug-of-war of language in writing our reports. I would often write about "models", "impacts", and "random effects". Denis' response, paraphrased, was: "explain to a non-technical audience—the main consumer of the report—what you mean by these terms, and while you're at it, clarify what the p-value means, and forget about the completely arbitrary 5% significance level; instead, describe results in terms of degrees of confidence". Every instance of jargon would lead more of the main audience to tune out.
Denis took issue with the use of the word "model". Methodologists take mathematical models for granted. The acronym HLM includes "Model". But Denis' concern was with the underlying ambiguity that would make the finding inaccessible. Over time, this has made me realize that notation and jargon function as instruments for efficient communication among experts, but they can be thought-arresting clichés for those less-familiar with the abstraction. They must be used sparingly and clearly.
As noted above, Denis wanted the people on the ground to be in the thick of it. In his program of locally conducted RCTs, he would travel to the districts where the research was taking place, and lead sessions in which teachers would organize themselves into matched pairs for randomization. This approach took advantage of individual judgment and inside knowledge of stakeholders about the myriad factors affecting outcomes, instead of relying on algorithms to sort matched pairs in terms of existing and often weak administrative data. This participation also spiked participants’ interest in the research and evaluation and promoted their understanding of it.
There was also a concern Denis shared about a result from one of our main research projects. The AMSTI project demonstrated positive average impacts of the program, but not for minorities. It seemed to me that Denis was concerned about the funder-required messaging. It prioritized the positive sample-wide average impact, but underplayed the finding of differential impact: a result which would have been of primary interest to large segments of the study sample and the evaluation participants and partners.

Related to these points was Denis' insistence on setting the methods against the presenting reality. For example, he stressed what is obvious, but too easily missed in application of HLM methods: the interdependence of students and their outcomes reflect a highly complex ecology that can be reflected only roughly through mathematical models—the models cannot keep up with reality. He observed that dependencies in outcomes that happen in real school systems, and that motivate the use of HLM, often are way more complex than we can capture through meager indicators in our models. Denis was concerned with important and real facts being overlooked when results were constrained to fit statistical models and criteria. (I think Denis' objection was also on a deeper philosophical level. For example, he pondered how introducing multiple comparisons adjustments to results of an experiment can suddenly alter the interpretation of an effect from it being real to it being unreal - as if the reality depended on rules of the game that we invented for interpreting results.) It seemed to me that Denis was pointing out that the metaphysics and epistemology behind it plays an implicit but very important role because it drives the underlying assumptions behind all our methods. I have come around to this point of view. Unfortunately, on more than a few occasions, I have been reminded by some methodologists who champion experiments that philosophy is mere speculation and anti-scientific gobbledygook.

I can't say I fully understand the layers in Denis' perspective on research, but questions about the motivating philosophy were always there. In La Jolla he noted: "It (Wittgenstein's view) seems to be fundamental to everything we're doing." This is something to reflect on. I have not dug deeply into Wittgenstein, or Denis' view of his ideas, but it gives me pause to imagine all that we do in quantitative experimental research—an enterprise with blueprints for methods, accepted rules, and shared language and assumptions—as a "language game" (in Wittgenstein's sense) conducted in a specific arena. How can we improve the game? What does it mean to change the rules? (This is content for another blog.)

The Past.

The purpose of this blog is not a place to discuss Denis' many endeavors, and I am not the best person to do that. However, when visiting Denis in La Jolla, he described a few memorable milestones. Many of them were prescient of the way research and evaluation in education would move in the decades to come. His experiences included early uses of small group collaborative activities via LAN (local area networks) with children. Denis singled out School PS125 in Manhattan in South Harlem, where he and his colleagues set up a shared server, and allowed students to take work they had done in the previous year into their new year. This allowed growth through accumulated experience, and provided a sense of continuity, purpose and ownership. He noted the product was different because, by including prior years' efforts, the work space was understood by the students as "theirs".

Denis also emphasized some of his early exploration of AI in reading applications. He pointed out the discovery that children can talk-through their reading experiences [via "speech captures"], transforming what it means to read aloud. Instead of being passive, reading becomes active, like talking to an adult who is listening to the talk, and responding intelligently. Denis asserted "we were doing AI!"

One article by Denis and his colleague, Michael Cole, is very interesting to me because it was written and published about the time that Denis started Empirical. It provides a critical link between literature on developmental psychology and experimental field research in education.

The work first discusses learning theory, focusing on the contrast between how problems are presented and solved in laboratory tasks with children, and how program implementation and learning occur in classrooms. Referring to Vygotsky, the authors note that "cognitive processes begin externally and especially in social interactions with adults and peers…before becoming abstract and internal…entail(ing) that implementing educational programs requires establishing systems of interactions (p. 264)." Such interactions invariably introduce variability in implementation and outcomes that are hard to control, and purposefully reducing this variability may compromise ecological validity.

The authors then break to a discussion of what at the time (2002) was a renewed imperative by the Department of Education for scientifically strong research with a focus on experiments. They emphasize the "difficulty of replicating experimental treatments on a large scale" (p. 263). My reading of this, with risk of oversimplification, is that Denis' work in developmental psychology alerted him to the divide between how learning happens in the lab and what occurs in the classroom. The latter identifies the more-experienced adult (teacher) as central to students' internalization of concepts. The interaction between pupils and their teacher—the basic building block of development and learning in the relevant ecology—introduces variance in implementation. This variability presents a fundamental challenge to the use of field experiments in education. Experimental control and adherence to scripted implementation can shut down potential for local program improvements: "the experimental research may be trading off higher levels of overall effectiveness simply for lowering variability" (p. 265). All of these issues are still completely relevant and far from solved: one can imagine a multi-armed RCT with different levels of controlled implementation to see if there are diminishing returns to scripting that implementation, and whether there is a negative correlation between level of control in implementation and program impact.

The Future.

The points summarized above are fodder for the future.

I'm happy to say that Empirical continues in much the same spirit: figuring context into research, while attending to the context of research.

For example, later this month at SREE, I am presenting on the concept of intersectionality in educational research, and its possible implications for quantitative analysis. The idea originates in legal work addressing a "conflation in intragroup differences" (Crenshaw, 1991, p. 1242) that undermines the complexity of group identities, and their relationship to valued outcomes, and that leads to discriminatory practices. Digging into the issue reveals the need to understand and state our positionality, including philosophical commitments and methodological assumptions, in order to address the issues comprehensively and authentically, and avoid a drift towards use of quantitative platitudes.

As another example of upholding this spirit in research, I used the occasion of my article Hold The Bets! Should Quasi-Experiments Be Preferred to True Experiments When Causal Generalization Is the Goal? to provide groundwork for thinking about what it means, operationally, to evaluate the internal and external validity of causal inferences as part of the same problem. Also, concerns Denis expressed about the validity of using results from “one big study” conducted in the past to inform policy for individual locales, including for individual schools and districts, (with parallel questions by Lee Cronbach), provided an impetus to my dissertation efforts. It is reflected in my work about evaluating the accuracy of generalizations of broad causal inferences applied to local contexts—so called "large to small generalizations".

Empirical's mission is established by Denis' interest in serving individuals who are most directly impacted by the results of the research, his commitment to democratizing research, and his imperative to take measure of the methods against the reality they purport to represent. This path is lasting. Denis' early work on applications of computer networks in classroom and AI, and on the ecological validity of causal inferences, are more relevant than ever, and the issues they raise confront us forcefully today as we strive to find solutions in an ever-more challenging world.

The journey continues. Here's to the next 20 years!

2025-11-20

Posted by: Andrew Jaciw

Tags: Denis Newman

Impact of Kami on Student Achievement

Empirical Education conducted a study to evaluate the impact of Kami on student achievement in reading in Moreno Valley Unified School District in the 2022–23 school year. Students in all grades across the district use Kami to create, share, annotate, and collaborate on documents across subject areas. Teachers can use Kami to develop assignments or assessments and share feedback with their students. The main research question is whether usage of Kami has a positive impact on student achievement in reading.

This study used a correlational design aimed at establishing statistical associations between product usage metrics and student outcomes. Unlike experimental and quasiexperimental studies, it does not compare users to similar non-users. It focuses entirely on product users and the differences in outcomes among them that can be attributed to the usage, making appropriate adjustments for differences in users’ individual and class characteristics and pretest scores.

The study found a significant positive association between the use of Kami and student outcomes on the STAR Reading test. Daily usage of Kami during the school year is associated with a 4 percentile test score gain. Most notably, the impact was greater for English language learners (ELL) compared to non-ELL students, and there were no significant differences between students of different races/ethnicities.

The findings were shared widely across the district in the following school year and resulted in an energetic professional learning session with teachers, coaches, and administrators gathering with Kami staff to share best practices and explore new tools and features to support student literacy skills and formative assessment.

2025-03-31

Posted by: Adam Schellinger

Tags: Kami

Happy Holidays 2024

Hi friends,

Do you remember madlibs? You may not realize that madlibs are missing from your life. Don’t worry. We’re bringing back this little bit of history for you to enjoy as our holiday gift to you.

Here’s the original version of who Empirical Education is.

At Empirical Education Inc., our mission is to promote effective and equitable education by providing research services and context-relevant evaluations of programs, products, and policies that empower educators and bring about impactful solutions.

We bring research, data analysis, engineering, and project management expertise to a diverse range of customers including edtech companies and their investors, the U.S. Department of Education, foundations, leading research organizations, and state and local education agencies. Over the last twenty years, we have worked with school systems to conduct dozens of rigorous experiments. Over the last decade, we've been offering services to edtech companies for fast turn-around and low-cost impact studies of their products.

Here’s the madlib for you to create your own version of who Empirical Education is.

Share your results with us! You can email them to us or reply to our facebook or linkedin posts.

Happy Holidays,

The Empirical Education team

2024-12-09

Posted by: Robin Means

Tags: research

Doing Something Truly Original in the Music of Program Evaluation

Is it possible to do something truly original in science?

How about in Quant evaluations in the social sciences?

The operative word here is "truly". I have in mind contributions that are "outside the box".

I would argue that standard Quant provides limited opportunity for originality. Yet, QuantCrit forces us to dig deep to arrive at original solutions - to reinterpret, reconfigure, and in some cases reinvent Quant approaches.

That is, I contend that Quant Crit asks the kinds of questions that force us to go outside the box of conventional assumptions and develop instrumentation and solutions that are broader and better. Yet, I qualify this by saying (and some will disagree), that doing so does not require us to give up the core assumptions that are at the foundation of Quant evaluation methods.

I find that developments and originality in jazz music closely parallel what I have in mind in discussing the evolution of genres in Quant evaluations, and what it means to conceive of and address problems and opportunities outside the box. (You can skip this section, and go straight to the final thoughts, but I would love to share my ideas with you here.)

An Analogy for Originality in the Artistry of Herbie Hancock

Last week I took my daughter, Maya, to see the legendary keyboardist Herbie Hancock perform live with Lionel Loueke, Terrance Blanchard and others. CHILLS along my spine, is how I would describe it. I found myself fixating on Hancock’s hand movements on the keys, and how he swiveled between the grand piano and the KORG synthesizer, and asking: "the improvisation is on-point all the time – how does he know how to go right there?"

Hancock, winner of an Academy award, and 14 Grammys, is a (if not the) major force in the evolution of jazz through the last 60 years, up to the contemporary scene.

His main start was in the 1960's as the pianist in Miles Davis' Second Great Quintet. (When Hancock was dispirited, Davis famously advised him "don't play the butter notes"). Check out this performance by the band of Wayne Shorter's composition "Footprints" from 1967 – note the symbiosis among the group and Hancocks respectful treatment of the melody.

In the 1970’s Hancock developed styles of jazz fusion and funk with the Headhunters (e.g., Chameleon.

Then in the 1980's Hancock explored electro styles, capped by the song "Rockit" – a smash that straddled jazz, pop and hip-hop. It featured scratch styling and became a mainstay for breakdancing (in upper elementary school I co-created a truly amateurish school play that ended in an ensemble Rockit dance with the best breakdancers in our school). Here's Hancock's Grammy performance.

Below is a picture of Hancock from the other night with the strapped synth popularized through the song Rockit.

Hancock did plenty more besides what I mention here, but I narrowed his contributions to just a couple to help me make my point.

His direction, especially with funk fusion and Rockit, ruffled the feathers of more than a few jazz purists. He did not mind. His response was "I have to be true to myself…it was something that I needed to do….because it takes courage to work outside the box…and yet, that’s where the growth lies”.

He also recognized that the need for progression was not just to satisfy his creative direction, but to keep the audience listening; that is, for the music, jazz, to stay alive and relevant. If someone asserts that "Rockit" was a betrayal of jazz that sacrilegiously crossed over into pop and hip-hop, I would counter argue that it opened up the world of jazz to a whole generation of pop listeners (including me). (I recognize similar developments in the genre-crossing works of recent times by Robert Glasper.)

Hancock is a perfect case study of an artist executing his craft (a) fearlessly, (b) not with the goal of pleasing everyone, (c) with the purpose of connecting with, and reaching new audiences, (d) by being open to alternative influences, (e) to achieve a harmonious melodic fusion (moving between his KORG synth, a grand piano), and (f) with constant appreciation reflection of the roots and fundamentals.

Hancock and Band

Coming Back to the Idea of the Fusion of Quant with Quant Crit in Program Evaluation

Society today presents us with situations that require critical examination of how we use the instruments on which we are trained, and an audit of the effect they have, both intended and unintended. It also requires that we adapt the applications of methods that we have honed for years. The contemporary situation poses the question: How can we expand the range of what we can do with the instruments on which we are trained, given the solutions that society needs today, recognizing that any application has social ramifications? I have in mind the need to prioritize problems of equity and social and racial justice. How do we look past conventional applications that limit the recognition, articulation, and development of solutions to important and vexing problems in society?

Rather than feeling powerless and overwhelmed, the Quant evaluator is very well positioned to do this work. I greatly appreciate the observation by Frances Stage on this point:

"…as quantitative researchers we are uniquely able to find those contradictions and negative assumptions that exist in quantitative research frames"

This is analogous to saying that a dedicated pianist in classic jazz is very well positioned to expand the progressions and reach harmonies that reflect contemporary opportunities, needs and interests. It may also require the Quant evaluator to expand his/her arrangements and instrumentation.

As Quant researchers and evaluators, we are most familiar with the "rules of playing" that reinforce "the same old song" that needs questioning. Quant Crit can give us the momentum to push the limits of our instruments and apply them in new ways.

In making these points I feel a welcome alignment with Hancock's approach: recognizing the need to break free from cliché and convention, to keep meaningful discussion going, to maximize relevance, to get to the core of evaluation purpose, to reach new audiences and seed/facilitate new collaborations.

Over the next year I'll be posting a few creations, and striking in some new directions, with syncopations and chords that try to maneuver around and through the orthodoxy – "switching up" between the "KORG and the baby grand" so to speak.

Please stay tuned.

The Band on Stage

2024-10-15

Posted by: Andrew Jaciw

Tags: andrew jaciw, fusion and program evaluation

SREE 2024: On a Mission to Deepen my Quant and Equity Perspectives

I am about to get on the plane to SREE

I am excited, but also somewhat nervous.

Why?

I'm excited
to immerse myself in the conference – my goal is to try to straddle paradigms of criticality, and the quant tradition. SREE historically has championed empirical findings using rigorous statistical methods.

I'm excited
because I will be discussing intersectionality – a topic of interest that emerged from attending a series of Critical Perspectives webinars hosted by SREE in the last few years. I want to try to pay it back by moving the conversation forward and contributing to the critical discussion.

I'm nervous
because the topic of intersectionality is new for me. The idea cuts across many areas - law, sociology, epidemiology, education. It’s a vast subject area with various literature streams. I am new to it. It also gets at social justice issues that I am not used to talking about, and I want to express those clearly and accurately. I understand the power and privilege of my words and presentation and want the audience to continue to inquire and move the conversation forward.

I'm nervous
Because issues of quantitative criticality require a person to confront their deeper philosophical commitments, assumptions, and theory of knowledge (epistemology). I have no problem with that; however, a few of my experimentalist colleagues have expressed a deep resistance to philosophy. One described it as merely a “throat clearing exercise”. (I wonder: Will those with a positivist bent leave my talk in droves?)

Andrew staring at clock

What is intersectionality anyways, and why was I attracted to the idea? It originates in the legal-scholarly work of Kimberle Crenshaw. She describes a court case filed against GM:

"In DeGraffenreid, the court refused to recognize the possibility of compound discrimination against Black women and analyzed their claim using the employment of white women as the historical base. As a consequence, the employment experiences of white women obscured the distinct discrimination that Black women experienced."

The courts refusal to "acknowledge that Black women encounter combined race and sex discrimination implies that the boundaries of sex and race discrimination doctrine are defined respectively by white women's and Black men's experiences."

The justices refused to recognize that hiring practices by GM compounded discrimination across specific intersections of socially-recognized categories (i.e., Black women). The issue is obvious but can be made concrete with an example. Imagine the following distribution of equally-qualified candidates. The court judgment would not have recognized the following situation of compound discrimination:

graphic of gender and race

Why did intersectionality spike my interest in the first place? In the course of the SREE Critical Perspectives seminars, it occurred to me that intersectionality was a concept that bridged what I know with what I want to know.

I like representing problems and opportunities in education in quantitative terms. I use models. However, I also prioritize understanding of the limits of our models, with reality serving as the ultimate check of the validity of the representation. Intersectionality, as a concept, pits out standard models against a reality that is both complex and socially urgent.

Intersectionality as a bridge:

graphic on intersectionality

Intersectionality presents an opportunity to reconcile two worlds, which is a welcome puzzle to work on.

picture of a puzzle

Here’s how I organized my talk. (See the postscript for how it went.)

My positionality: I discussed my background "where I am coming from": including that most of my training is in quant methods, that I am interested in problems of causal generalizability, that I don’t shy away from philosophy, and that my children are racialized as mixed-race and their status inspired my first hypothetical example.
I summarized intersectionality as originally conceived. I reviewed the idea as it was developed by Crenshaw.
I reviewed some of the developments in intersectionality among quantitative researchers who describe their work and approaches as "quantitative intersectionality".
I explored an extension of the idea of intersectionality through the concept of "unique to group" variables: I argued for the need to diversify our models of outcomes and impacts to take into account moderators of impact that are relevant to only specific groups and that respect the uniqueness of their experiences. (I will discuss this more in another blog that is soon to come.)
I provided two examples, one hypothetical, and one real that clarified what I mean by the role of "unique to group" variables.
I summarized the lessons.

picture of a streetlight

There were some other exceptional talks that I attended at SREE, including:

Postscript: How it went!

The other three talks in the session in which I presented (Unpacking Heterogeneous Effects: Methodological Innovations in Educational Research) were excellent. They included a work by Peter Halpin on a topic that I have been puzzled by for a while, specifically, how item-level information can be leveraged to assess program impacts. We almost always assess impacts on scale scores from “ready-made” tests that are based on calibrations of item-level scores. In an experiment one effectively introduces variance into a testing situation and I have wondered what it means for impacts to register at the item level, because each item-level effect will likely interact with the treatment effect. So “hats off” to linking psychometrics and construct validity to discussion of impacts.

As for my presentation, I was deeply moved by the sentiments that were expressed by several conference goers who came up to me afterwards. One comment was "you are on the right track". Others voiced an appreciation for my addressing the topic. I did feel THE BRIDGING between paradigms that I hoped to at least set in motion. This was especially true when one of the other presenters in the session, who had addressed the topic of effect heterogeneity across studies, commented: “Wow, you’re talking about some of the very same things that I am thinking”. It felt good to know that this convergence happened in spite of the fact that the two talks could be seen as very different at the surface level. (And no, people did not leave in droves.)

Thank you Baltimore! I feel more motivated than ever. Thank you SREE organizers and participants.

Picture of Baltimore.

Treating myself afterwards…

A special shoutout to Jose Blackorby. In the end, I did hang up my tie. But I haven’t given up on the idea – just need to find one from a hot pink or aqua blue palette.

Andrew standing by the sree banner

2024-10-04

Posted by: Andrew Jaciw

Tags: andrew jaciw, conference, intersectionality and sree

Considerations for Conducting Research in Digital Learning Platforms

Along with Digital Promise and members of the initial SEERNet research teams, we recently authored a paper illustrating some of the mindset shifts necessary when conducting research using digital learning platforms.

Researchers with traditional backgrounds may need to think flexibly about how to frame their research questions, collaborate closely with developers, identify log data that can inform implementation, and consider iterative study designs.

The paper builds on prior publications and discussions within SEERNet and the broader DLP-as-research infrastructure movement, visit SEERNet.org for more information.

Read the full paper here

2024-07-01

Posted by: Adam Schellinger

Tags: Digital Promise, online learning platforms and research methods

AERA 2024 Annual Meeting

We had an inspiring trip to Philadelphia last month! The AERA conference theme was Dismantling Racial Injustice and Constructing Educational Possibilities: A Call to Action. We presented our latest research on the CREATE study, were able to spend time with our CREATE partners, and attend several captivating sessions on topics including intersectionality, QuantCrit methodology, survey development, race-focused survey research, and SEL. We came away from the conference energized and eager to apply this new learning to our current studies and for AERA 2025!

Thursday, April 11, 2024

Kimberlé Crenshaw 2024 AERA Annual Meeting Opening Plenary—Fighting Back to Move Forward: Defending the Freedom to Learn In the War Against Woke

Kimberle Crenshaw stands on stage delivering the opening plenary. Attendees fill the chairs in a large room, and some attendees sit on the floor.

Kimberlé Crenshaw’s opening plenary explored the relationship between our education system and our democracy, including censorship issues and what Crenshaw describes as a “violently politicized nostalgia for the past.” She brought in her own personal experience in recent years as she has witnessed terms that she coined, including “intersectionality,” being weaponized. She encouraged AERA attendees to fight against censorship in our institutions, and suggested that attendees check out the African American Policy Forum (AAPF) and the Freedom to Learn Network. To learn more, check out Intersectionality Matters!, an AAPF podcast hosted by Kimberlé Crenshaw.

Friday, April 12, 2024

Reconciling Traditional Quantitative Methods With the Imperative for Equitable, Critical, and Ethical Research

Five panelists sit on stage with a projector screen to their right. The heading on the projector screen reads Dialogue with Parents. Eleven attendees are pictured in the audience.

We were particularly excited to attend a panel on Reconciling Traditional Quantitative Methods With the Imperative for Equitable, Critical, and Ethical Research, as our team has been diving into the QuantCrit literature and interrogating our own quantitative methodology in our evaluations. The panelists embrace quantitative research, but emphasize that numbers are not neutral, and that the choices that quantitative researchers make in their research design are critical to conducting equitable research.

Nichole M. Garcia (Rutgers University) discussed her book project on intersectionality. Nancy López (University of New Mexico) encouraged researchers to consider additional questions about “street race” including “What race do you think that others assume what race you are” to better understand the role that the social construction of race plays in participants’ experiences. Jennifer Randall (University of Michigan) encouraged researchers to administer justice-oriented assessments, emphasizing that assessments are not objective, but rather subjective tools that reflect what we value and have historically contributed to educational inequalities. Yasmiyn Irizarry (University of Texas at Austin) encouraged researchers to do the work of citing QuantCrit literature when reporting quantitative research. (Check out #QuantCritSyllabus for resources compiled by Yasmiyn Irizarry and other QuantCrit scholars.)

This panel gave us food for thought, and pushed us to think through our own evaluation practices. As we look forward to AERA 2025, we hope to engage in conversations with evaluators on specific questions that come up in evaluation research, such as how to put WWC standards into conversation with QuantCrit methodology.

The Impact of the CREATE Residency Program on Early Career Teachers’ Well-Being

The Empirical Education team who presented at AERA in 2024.

Andrew Jaciw, Mayah Waltower, and Lindsay Maurer presented on The Impact of the CREATE Residency Program on Early Career Teachers’ Well-Being, focusing on our evaluation of the CREATE program. The CREATE Program at Georgia State University is a federally and philanthropically funded project that trains and supports educators across their career trajectory. In partnership with Atlanta Public Schools, CREATE includes a three-year residency model for prospective and early career teachers who are committed to reimagining classroom spaces for deep joy, liberation and flourishing.

CREATE has been awarded several grants from the U.S. Department of Education, in partnership with Empirical Education as the independent evaluators. The grants include those from Investing in Innovation (i3), Education Innovation and Research (EIR), and Supporting Effective Educator Development (SEED). CREATE is currently recruiting the 10th cohort of residents.

During our presentation, we looked back on promising results from CREATE’s initial program model (2015–2019), shared recent results suggesting possible explanatory links between mediators and outcomes (2021–22), and discussed CREATE evolving program model and how to identify/align more relevant measures (2022–current).

The following are questions that we continue to ponder.

What additional considerations should we take into account when thinking about measuring the well-being of Black educators?
Certain measures of well-being, such as the Maslach Burnout Inventory for Educators, respond to a more narrow definition of teacher well-being. Are there measures of teacher well-being that reflect the context of the school that teachers are in and/or that are more responsive to different educational contexts?
Are there culturally-responsive measures of teacher well-being?
How can we measure the impacts of concepts relating to racial and social justice in the current political context?

Please reach out to us if you have any resources to share!

Survey Development in Education: Using Surveys With Students and Parents

Much of what I do as a Research Assistant at Empirical Education is to support the design and development of surveys, so I was excited to have the chance to attend this session! The authors’ presentations were all incredibly informative, but there were three in particular that I found especially relevant. The first was a paper presented by Jiusheng Zhu (Beijing Normal University) that analyzed the impact of “information nudges” on students’ academic achievement. This paper demonstrated how personalized, specific information nudges about short-term impacts can encourage students to modify their behavior.

Jin Liu (University of South Carolina) presented a paper on the development and validation of an ultra-short survey scale aimed at assessing the quality of life for children with autism. Through the use of network analysis and strength centrality estimations, the scale, known as Quality of Life for Children with Autism Spectrum Disorder (QOLASD-C3), was condensed to a much shorter version that targets specific dimensions of interest. I found this topic particularly interesting, as we are always in the process of refining our survey development processes. Finding ways to boost response rates and minimize participant fatigue is crucial in ensuring the effectiveness of research efforts.

In the third paper, Jennifer Rotach and Davie Store (Kent ISD) demonstrated how demographics play a role in how students score on assessments. The authors explained how disaggregating the data is sometimes necessary to ensure that all students’ voices are heard. They explain that in many cases, school and district decisions are driven by average scores, often leading to the exclusion of those who are above or below the average. The authors explain that in some cases, disaggregating survey data by demographics (such as race, gender, or disability status) may be the most helpful in uncovering a different story than just the “average” will tell.

— Mayah

Sunday, April 14, 2024

Conducting Race-Focused Survey Research in the P-20 System during the Anti-Woke Political Revolt

A presentation slide titled Researcher Positionality Conceptual
Framework shows an image of a brain, with thought bubbles that say Researching the Self, Researching the Self in Relation to Others, Engaged Reflection and Representation, and Shifting from the Self to the System

The four presentations in the symposium titled Conducting Race-Focused Survey Research in the P–20 System During the Anti-Woke Political Revolt focused on tensions, challenges, and problem-solving throughout the process of developing the Knowledge, Beliefs, and Mindsets (KBMs) about Equity in Educators and Educational Leadership Survey. On the CREATE project, where we are constantly working to improve our surveys and center racial equity in our work, we are wrestling with similar dilemmas in terms of sociopolitical context. Therefore, it was very eye-opening to hear panelists talk through their decision-making throughout the entire survey development process. The North Carolina State We-LEED research team walked through their process step-by-step, from conceptualization to the grounding literature and conceptual framing, and instrument development to cognitive interviews, and sample selection to recruitment strategies.

I particularly enjoyed hearing about cognitive interviews, where researchers asked participants to voice their inner monologue while taking the survey, so that they could understand participant feedback and be responsive to participant needs. It was also very helpful to hear the panelists reflect on their positionality and how their positionality connected to their research. I am highly anticipating reviewing this survey when it is finalized!

— Lindsay

A screen projects a slide titled Contemporary Approaches to Evaluation SEL Programs. On the screen is a venn diagram with three circles. The three circles are labeled Skills-Based SEL, Adult Development SEL, and Justice Focused SEL. At the intersection of these three circles are bullet points with the words competencies, pedagogies, implementation, and outcomes. I was excited to attend a session focused on Social Emotional Learning (SEL), a topic that directly relates to the projects I am currently involved in. The symposium featured four papers that all highlighted the importance of conducting high-quality evaluations of Universal School-Based (USB) SEL initiatives.

In the first paper, Christina Cipriano (Yale University) presented a meta-analysis of studies focusing on SEL. This meta-analysis demonstrated that of the studies reviewed, SEL programs that were delivered by teachers showed greater improvements in SEL skills. This paper also provided evidence that programs that taught intrapersonal skills before teaching interpersonal skills showed greater effectiveness.

The second paper was presented by Melissa Lucas (Yale University) and underscored the necessity of including multilingual students in USB SEL evaluations, emphasizing the importance of considering these students when designing and implementing interventions.

Cheyeon Ha (Yale University) presented recommendations from the third paper, which underscored this point for me. The third paper was a meta-analysis of USB SEL studies in the U.S., and it showed that less than 15% of the studies it reviewed included student English Language Learner (ELL) status. Because students with different primary languages may respond to SEL interventions differently, understanding how these programs work on students based on ELL status is important and useful in better understanding an SEL program.

The final paper (presented by Christina Cipriano) provided methodological guidance, which I found particularly intriguing and thought-provoking. It highlighted the importance of utilizing mixed methods research, advocating for open data practices, and ensuring data accessibility and transparency for a wide range of stakeholders.

As we continue to work on projects aimed at implementing SEL and enhancing students’ social-emotional skills, the insights shared in this symposium will undoubtedly prove valuable in our efforts to conduct high-quality evaluations of SEL programs.

— Mayah

2024-05-30

Posted by: Lindsay Maurer and Mayah Waltower

Tags: critical-race-theory-education, intersectionality, racial-justice and research-design

Looking Back to Move Forward

We recently published a paper in collaboration with Digital Promise illustrating the historical precedents for the five digital learning platforms that SEERNet comprises. In “Looking Back to Move Forward,” we trace the technical and organizational foundations of the network’s current efforts along four main themes.

By situating this innovative movement alongside its predecessors, we can identify the opportunities for SEERNet and others to progress and sustain the mission of making research more scalable, equitable, and rigorous.

Read the paper here.

2024-03-27

Posted by: Adam Schellinger

Tags: digital promise, rigor and scale up

Happy Holidays from Empirical Education

All of us at Empirical Education wish you a happy holiday. As a token of our appreciation for you, we want to share this list of books we read in 2023. It comprises contributions from many of our team members. We hope you like it. Cheers to a healthy and prosperous 2024!

photo in the image from Annie Spratt

2023-12-15

Posted by: Robin Means

Tags: empirical education

EIR 2023 Proposals Have Been Reviewed and Awards Granted

While children everywhere are excited about winter break and presents in their stockings, some of us in the education space look forward to December for other reasons. That’s right, the Department of Education just announced the EIR grant winners from the summer 2023 proposal submissions. We want to congratulation all our friends who were amongst that winning list.

One of those winning teams was made up of The MLK Sr Community Resources Center, Connect with Kids Network, Morehouseand Spelman Colleges, New York City Public Schools, The Urban Assembly, Atlanta Public Schools, and Empirical Education. We will evaluate the Sankofa Chronicles: SEL Curriculum from American Diasporas with the early-phase EIR development grant funding.

The word sankofa comes from the Twi language spoken by the Akan people of Ghana. The word is often associated with an Akan proverb, “Se wo were fi na wosankofa a yenkyi.” Translated into English this proverb reminds us, “It is not wrong to go back for that which you have forgotten.” Guided by the philosophy of sankofa, this five year grant will support the creation of a culturally-responsive, multimedia, social emotional learning (SEL) curriculum for high school students.

Participating students will be introduced to SEL concepts through short films that tell emotional and compelling stories of diverse diaspora within students’ local communities. These stories will be paired with an SEL curriculum that seeks to foster not only SEL skills (e.g., self-awareness, responsible decision making) but also empathy, cultural appreciation, and critical thinking.

Our part in the project will begin with a randomized control trial (RCT) of the curriculum in the 2025–2026 school year and culminate in an impact report following the RCT. We will continue to support the program through the remainder of the five-year grant with an implementation study and a focus on scaling up the program.

Check back for updates on this exciting project!

2023-12-07

Posted by: Rebecca Dowling and Robin Means

Tags: EIR, grant, RCT and SEL

blog posts and news stories

A Moment of Zen

A Moment of Zen

A Happy Coincidence

Democratization of Research

The Past.

The Future.

2025-11-20

Impact of Kami on Student Achievement

2025-03-31

Happy Holidays 2024

2024-12-09

Doing Something Truly Original in the Music of Program Evaluation

An Analogy for Originality in the Artistry of Herbie Hancock

Coming Back to the Idea of the Fusion of Quant with Quant Crit in Program Evaluation

2024-10-15

SREE 2024: On a Mission to Deepen my Quant and Equity Perspectives

2024-10-04

Considerations for Conducting Research in Digital Learning Platforms

2024-07-01

AERA 2024 Annual Meeting

Thursday, April 11, 2024

Kimberlé Crenshaw 2024 AERA Annual Meeting Opening Plenary—Fighting Back to Move Forward: Defending the Freedom to Learn In the War Against Woke

Friday, April 12, 2024

Reconciling Traditional Quantitative Methods With the Imperative for Equitable, Critical, and Ethical Research

The Impact of the CREATE Residency Program on Early Career Teachers’ Well-Being

Survey Development in Education: Using Surveys With Students and Parents

Sunday, April 14, 2024

Conducting Race-Focused Survey Research in the P-20 System during the Anti-Woke Political Revolt

2024-05-30

Looking Back to Move Forward

2024-03-27

Happy Holidays from Empirical Education

2023-12-15

EIR 2023 Proposals Have Been Reviewed and Awards Granted

2023-12-07

Archive

blog posts and news stories

A Moment of Zen

A Happy Coincidence

Democratization of Research

The Past.

The Future.

2025-11-20

2025-03-31

2024-12-09

An Analogy for Originality in the Artistry of Herbie Hancock

Coming Back to the Idea of the Fusion of Quant with Quant Crit in Program Evaluation

2024-10-15

2024-10-04

2024-07-01

Thursday, April 11, 2024

Kimberlé Crenshaw 2024 AERA Annual Meeting Opening Plenary—Fighting Back to Move Forward: Defending the Freedom to Learn In the War Against Woke

Friday, April 12, 2024

Reconciling Traditional Quantitative Methods With the Imperative for Equitable, Critical, and Ethical Research

The Impact of the CREATE Residency Program on Early Career Teachers’ Well-Being

Survey Development in Education: Using Surveys With Students and Parents

Sunday, April 14, 2024

Conducting Race-Focused Survey Research in the P-20 System during the Anti-Woke Political Revolt

Contemporary Approaches to Evaluating Universal School-Based Social Emotional Learning Programs: Effectiveness for Whom and How?

2024-05-30

2024-03-27

2023-12-15

2023-12-07

Archive