Data Analysis

To address questions of program or product effectiveness, we often compare students’ performance on standardized tests across classes, teachers, grade-level teams, or schools that have been assigned to either treatment or control conditions. When assignment is random, we conduct experimental impact analyses. When assignment is non-random, we use quasi-experimental analyses. We estimate the mean impact using multilevel models (also called hierarchical linear models, HLM). Our studies also examine the moderating effects of teacher- and student-level covariates. In addition, we conduct mediation analyses by determining, for example, the extent to which an impact on student achievement is explained by an impact on teacher practice. The work is conducted in SAS and R, for which we have developed common libraries.
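The mediation idea described above can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not drawn from our SAS or R libraries. It uses ordinary least squares on a single level, so it ignores the multilevel structure that an actual impact model would account for. The function names (`ols`, `mediation`) and the toy teacher-level numbers are hypothetical and serve only to show how a total impact splits into a direct part and a part that runs through teacher practice.

```python
# Hypothetical sketch: product-of-coefficients mediation with plain OLS.
# Single-level only; all names and numbers are illustrative assumptions.

def ols(columns, y):
    """Return least-squares coefficients (intercept first) via normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def mediation(treat, mediator, outcome):
    """Split the total impact into direct and indirect (mediated) parts."""
    total = ols([treat], outcome)[1]                 # impact on the outcome
    a = ols([treat], mediator)[1]                    # impact on the mediator
    _, direct, b = ols([treat, mediator], outcome)   # direct impact, mediator slope
    indirect = a * b
    return {"total": total, "direct": direct, "indirect": indirect,
            "share_mediated": indirect / total}


# Toy data for eight teachers (made up for illustration).
treat = [0, 0, 0, 0, 1, 1, 1, 1]                           # 1 = treatment condition
practice = [2.0, 2.5, 3.0, 2.5, 3.5, 4.0, 3.0, 4.5]        # teacher practice rating
achievement = [50.0, 53.0, 55.0, 52.0, 58.0, 61.0, 56.0, 63.0]  # class mean score

result = mediation(treat, practice, achievement)
print(result)
```

With OLS, the total impact equals the direct impact plus the indirect impact, so `share_mediated` can be read as the fraction of the achievement impact associated with the change in practice. In a real study the same decomposition would be fit with multilevel models and appropriate standard errors.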

In many cases, we analyze data for trends in student outcomes and associations between various parameters of educational processes. Examples of this type of study include:

Such analyses are not intended to establish causal links between inputs and outputs in learning but only to establish empirical regularities in the available data. These studies can be useful for:

Studies can be performed at any level, from the individual student to the district, and typically rely on data from state and local student information systems or on data accumulated in educational software systems. Our expertise in data science allows efficient auditing and cleaning of raw data to obtain high-quality samples. The strength of a relationship or a trend in the data is estimated using correlation or regression analysis (linear, logistic, or non-parametric), with statistical correction for selection bias when necessary and possible. We also use a variety of multivariate data mining methods as appropriate.
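As a minimal illustration of estimating the strength of a relationship, the sketch below computes a Pearson correlation and a simple linear regression line in plain Python. The variable names and the five data points (weekly hours of software use and a test score) are hypothetical, and the example describes an association only; it does not include any correction for selection bias.

```python
from math import sqrt

# Hypothetical sketch: correlation and simple linear regression on toy data.

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def linear_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up values for five students.
usage_hours = [1.0, 2.0, 3.0, 4.0, 5.0]
test_score = [52.0, 55.0, 61.0, 64.0, 68.0]

r = pearson_r(usage_hours, test_score)
slope, intercept = linear_fit(usage_hours, test_score)
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The slope gives the average difference in score associated with one additional hour of use in this toy sample, and `r` summarizes how closely the points follow a straight line.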