Research Report

Blazar, D. (2015). Effective teaching in elementary mathematics: Identifying classroom practices that support student achievement. Economics of Education Review, 48, 16-29.

Recent investigations into the education production function have moved beyond traditional teacher inputs, such as education, certification, and salary, focusing instead on observational measures of teaching practice. However, challenges to identification mean that this work has yet to coalesce around specific instructional dimensions that increase student achievement. I build on this discussion by exploiting within-school, between-grade, and cross-cohort variation in scores from two observation instruments; further, I condition on a uniquely rich set of teacher characteristics, practices, and skills. Findings indicate that inquiry-oriented instruction positively predicts student achievement. Content errors and imprecisions are negatively related, though these estimates are sensitive to the set of covariates included in the model. Two other dimensions of instruction, classroom emotional support and classroom organization, are not related to this outcome. Findings can inform recruitment and development efforts aimed at improving the quality of the teacher workforce. 

Kelcey, B., Hill, H. C., & McGinn, D. (2014). Approximate measurement invariance in cross-classified rater-mediated assessments. Frontiers in Psychology, 5 (1469).

An important assumption underlying meaningful comparisons of scores in rater-mediated assessments is that measurement is commensurate across raters. When raters differentially apply the standards established by an instrument, scores from different raters are on fundamentally different scales and no longer preserve a common meaning and basis for comparison. In this study, we developed a method to accommodate measurement noninvariance across raters when measurements are cross-classified within two distinct hierarchical units. We conceptualized random item effects cross-classified graded response models and used random discrimination and threshold effects to test, calibrate, and account for measurement noninvariance among raters. By leveraging empirical estimates of rater-specific deviations in the discrimination and threshold parameters, the proposed method allows us to identify noninvariant items and empirically estimate and directly adjust for this noninvariance within a cross-classified framework. Within the context of teaching evaluations, the results of a case study suggested substantial noninvariance across raters and that establishing an approximately invariant scale through random item effects improves model fit and predictive validity.

Blazar, D., Braslow, D., Charalambous, C., & Hill, H. C. (2015). Attending to General and Content-Specific Dimensions of Teaching: Exploring Factors Across Two Observation Instruments.

New observation instruments used in research and evaluation settings assess teachers along multiple domains of teaching practice, both general and content-specific. However, this work infrequently explores the relationship between these domains. In this study, we use exploratory and confirmatory factor analyses of two observation instruments - the Classroom Assessment Scoring System (CLASS) and the Mathematical Quality of Instruction (MQI) - to explore the extent to which we might integrate both general and content-specific views of teaching. Importantly, bi-factor analyses that account for instrument-specific variation enable more robust conclusions than in existing literature. Findings indicate that there is some overlap between instruments, but that the best factor structures include both general and content-specific practices. This suggests new approaches to measuring mathematics instruction for the purposes of evaluation and professional development.

Kane, T. J., Taylor, E., Tyler, J., & Wooten, A. (2011). Identifying Effective Classroom Practices Using Student Achievement Data. The Journal of Human Resources, 46 (3), 587-613.

This paper combines information from classroom-based observations and measures of teachers’ ability to improve student achievement as a step toward addressing the challenge of identifying effective teachers and teaching practices. The authors find that classroom-based measures of teaching effectiveness are related in substantial ways to student achievement growth. The authors conclude that the results point to the promise of teacher evaluation systems that would use information from both classroom observations and student test scores to identify effective teachers. Information on the types of practices that are most effective at raising achievement is also highlighted.

Lynch, K., Chin, M., & Blazar, D. (2013). How Well Do Teacher Observations Predict Value-Added? Exploring Variability Across Districts. In Association for Public Policy Analysis & Management Fall Research Conference. Washington, DC.

In this study we ask: Do observational instruments predict teachers' value-added equally well across different state tests and district/state contexts? And to what extent are differences in these correlations a function of the match between the observation instrument and tested content? We use data from the Gates Foundation-funded Measures of Effective Teaching (MET) Project (N=1,333) study of elementary and middle school teachers from six large public school districts, and from a smaller (N=250) study of fourth- and fifth-grade math teachers from four large public school districts. Early results indicate that estimates of the relationship between teachers' value-added scores and their observed classroom instructional quality differ considerably by district.

Hill, H. C., Gogolen, C., Litke, E., Humez, A., Blazar, D., Corey, D., Barmore, J., et al. (2013). Examining High and Low Value-Added Mathematics: Can Expert Observers Tell the Difference? In Association for Public Policy Analysis & Management Fall Research Conference. Washington, DC.

In this study, we use value-added scores and video data to mount an exploratory study of high- and low-VAM teachers' instruction. Specifically, we seek to answer two research questions: First, can expert observers of mathematics instruction distinguish between high- and low-VAM teachers solely by observing their instruction? Second, what instructional practices, if any, consistently characterize high- but not low-VAM teachers' classrooms? To answer these questions, we use data generated by 250 fourth- and fifth-grade math teachers and their students in four large public school districts. Preliminary analyses indicate that a teacher's value-added rank was often not obvious to this team of expert observers.

Kane, T. J., Jacob, B., Rockoff, J., & Staiger, D. O. (2011). Can You Recognize an Effective Teacher When You Recruit One? Association for Education Finance and Policy, 6 (1), 43-74.

The authors administered an in-depth survey to new math teachers in New York City and collected information on a number of non-traditional predictors of effectiveness: teaching-specific content knowledge, cognitive ability, personality traits, feelings of self-efficacy, and scores on a commercially available teacher selection instrument. They find that a number of these predictors have statistically and economically significant relationships with student and teacher outcomes. The authors conclude that, while there may be no single factor that can predict success in teaching, using a broad set of measures can help schools improve the quality of their teachers.

Hill, H. C., Charalambous, C. Y., Blazar, D., McGinn, D., Kraft, M. A., Beisiegel, M., Humez, A., et al. (2012). Validating Arguments for Observational Instruments: Attending to Multiple Sources of Variation. Educational Assessment, 17, 1-19.

Measurement scholars have recently constructed validity arguments in support of a variety of educational assessments, including classroom observation instruments. In this article, we note that users must examine the robustness of validity arguments to variation in the implementation of these instruments. We illustrate how such an analysis might be used to assess a validity argument constructed for the Mathematical Quality of Instruction instrument, focusing in particular on the effects of varying the rater pool, subject matter content, observation procedure, and district context. Variation in the subject matter content of lessons did not affect rater agreement with master scores, but the evaluation of other portions of the validity argument varied according to the composition of the rater pool, observation procedure, and district context. These results demonstrate the need for conducting such analyses, especially for classroom observation instruments that are subject to multiple sources of variation.

Taylor, E. S., & Tyler, J. H. (2011). The Effect of Evaluation on Performance: Evidence from Longitudinal Student Achievement Data of Mid-career Teachers.

The effect of evaluation on employee performance is traditionally studied in the context of the principal-agent problem. Evaluation can, however, also be characterized as an investment in the evaluated employee’s human capital. We study a sample of mid-career public school teachers where we can consider these two types of evaluation effect separately. Employee evaluation is a particularly salient topic in public schools, where teacher effectiveness varies substantially and where teacher evaluation itself is increasingly a focus of public policy proposals. We find evidence that a quality classroom-observation-based evaluation and performance measures can improve mid-career teacher performance both during the period of evaluation, consistent with the traditional predictions, and in subsequent years, consistent with human capital investment. However, the estimated improvements during evaluation are less precise. Additionally, the effect sizes represent a substantial gain in welfare given the program’s costs.

Hill, H. C., & Grossman, P. (2013). Learning from Teacher Observations: Challenges and Opportunities Posed by New Teacher Evaluation Systems. Harvard Educational Review.

In this article, Heather Hill and Pam Grossman discuss the current focus on using teacher observation instruments as part of new teacher evaluation systems being considered and implemented by states and districts. They argue that if these teacher observation instruments are to achieve the goal of supporting teachers in improving instructional practice, they must be subject-specific, involve content experts in the process of observation, and provide information that is both accurate and useful for teachers. They discuss the instruments themselves, raters and system design, and timing of and feedback from the observations. They conclude by outlining the challenges that policy makers face in designing observation systems that will work to improve instructional practice at scale.
