# Research Report

Chin, M., Kane, T., Kozakowski, W., Schueler, B., & Staiger, D. (Working Paper). School District Reform in Newark: Within- and Between-School Changes in Achievement Growth. NBER Working Paper 23922.

In 2011–12, Newark launched a set of educational reforms supported by a $200 million gift. Using data from 2009 through 2016, we evaluate the change in Newark students’ achievement growth relative to similar students and schools elsewhere in New Jersey. We measure achievement growth using a “value-added” model, controlling for prior achievement, demographics, and peer characteristics. By the fifth year of reform, Newark saw statistically significant gains in English and no significant change in math achievement growth. Perhaps due to the disruptive nature of the reforms, growth declined initially before rebounding in recent years. Aided by the closure of low value-added schools, much of the improvement was due to shifting enrollment from lower- to higher-growth district and charter schools. Shifting enrollment accounted for 62 percent of the improvement in English. In math, such shifts offset what would have been a decline in achievement growth.

Chin, M., Kane, T., Kozakowski, W., Schueler, B., & Staiger, D. (2017). Assessing the Impact of the Newark Education Reforms. Center for Education Policy Research at Harvard University.

Aided by $200 million in private philanthropy, city and state leaders launched a major school reform effort in Newark, New Jersey, starting in the 2011–2012 school year. In a coinciding National Bureau of Economic Research (NBER) working paper, we assessed the impact of those reforms on student achievement growth, comparing students in Newark Public Schools (NPS) district and charter schools to students with similar prior achievement, similar demographics, and similar peers elsewhere in New Jersey. This report includes key findings.
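The “value-added” growth model described in the abstract above can be sketched in generic form; the notation here is our own illustration, and the paper’s exact specification may differ:

```latex
% A_{ist}: test score of student i in school s in year t
% X_i: student demographics; \bar{X}_{st}: peer (school-average) characteristics
A_{ist} = \beta\, A_{i,t-1} + \gamma' X_i + \delta' \bar{X}_{st} + \mu_{st} + \varepsilon_{ist}
% \mu_{st}: the school-by-year "value added" to achievement growth
% \varepsilon_{ist}: idiosyncratic student-level error
```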
Hill, H. C., Kraft, M. A., & Herlihy, C. (2016). Developing Common Core Classrooms Through Rubric-Based Coaching. Center for Education Policy Research at Harvard University.

The project team is still awaiting student test data to complete the evaluation, but this brief provides a short update on survey results. Students of MQI-coached teachers report that their teachers ask more substantive questions and require more use of mathematical vocabulary than students of control teachers report. Students in MQI-coached classrooms also report more student talk in class. Teachers who received MQI coaching tended to find their professional development significantly more useful than control teachers did, and were also more likely to report that their mathematics instruction improved over the course of the year.

Kane, T. J. (2016). Let the Numbers Have Their Say: Evidence on Massachusetts' Charter Schools. Center for Education Policy Research at Harvard University.

In Massachusetts, the charter school debate has centered on four concerns:

• that the achievement of the high-scoring charter schools is due to selective admission and retention policies and not the education that the charter schools provide,
• that charter schools are underserving English language learners and special education students,
• that charter schools are disciplining students at higher rates in order to drive troublesome students back to traditional schools, and
• that charter schools are undermining traditional public schools financially.

This report summarizes the evidence pertaining to these four concerns.

West, M. R., Morton, B. A., & Herlihy, C. M. (2016). Achievement Network’s Investing in Innovation Expansion: Impacts on Educator Practice and Student Achievement.

Achievement Network (ANet) was founded in 2005 as a school-level intervention to support the use of academic content standards and assessments to improve teaching and learning. Initially developed within the Boston charter school sector, it has expanded to serve over 500 schools in nine geographic networks across the United States. The program is based on the belief that if teachers are provided with timely data on student performance from interim assessments tied to state standards, if school leaders provide support and create structures that help them use that data to identify student weaknesses, and if teachers have knowledge of how to improve the performance of students who are falling behind, then they will become more effective at identifying and addressing gaps in student learning. This will, in turn, improve student performance, particularly for high-need students.

In 2010, ANet received a development grant from the U.S. Department of Education’s Investing in Innovation (i3) Program. The grant funded both the expansion of the program to serve up to 60 additional schools in five school districts, as well as an external evaluation of the expansion. The Center for Education Policy Research (CEPR) at Harvard University partnered with ANet to design a matched-pair, school-randomized evaluation of their program’s impact on educator practice and student achievement in schools participating in its i3-funded expansion.
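A matched-pair, school-randomized design of the kind described above can be sketched in a few lines: schools are ordered by a matching covariate, paired with their nearest neighbor, and one school per pair is randomly assigned to treatment. The function and field names here are hypothetical, not CEPR’s actual procedure.

```python
import random

def matched_pair_randomize(schools, pair_key, seed=0):
    """Pair schools that are adjacent on a matching covariate, then
    randomly assign one school in each pair to treatment.
    Illustrative sketch only, not the evaluation's actual code."""
    rng = random.Random(seed)
    # Sort by the matching covariate so adjacent schools are most similar
    ordered = sorted(schools, key=pair_key)
    assignment = {}
    for i in range(0, len(ordered) - 1, 2):
        a, b = ordered[i], ordered[i + 1]
        # Coin flip decides which member of the pair is treated
        treated, control = (a, b) if rng.random() < 0.5 else (b, a)
        assignment[treated["name"]] = "treatment"
        assignment[control["name"]] = "control"
    return assignment
```

Because assignment is randomized within pairs of similar schools, treatment and control groups are balanced on the matching covariate by construction, which tightens the impact estimates relative to simple randomization.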

Hurwitz, M., Mbekeani, P. P., Nipson, M., & Page, L. C. (2016). Surprising Ripple Effects: How Changing the SAT Score-Sending Policy for Low-Income Students Impacts College Access and Success. Educational Evaluation and Policy Analysis.

Subtle policy adjustments can induce relatively large “ripple effects.” We evaluate a College Board initiative that increased the number of free SAT score reports available to low-income students and changed the time horizon for using these score reports. Using a difference-in-differences analytic strategy, we estimate that targeted students were roughly 10 percentage points more likely to send eight or more reports. The policy improved on-time college attendance and 6-year bachelor’s completion by about 2 percentage points. Impacts were realized primarily by students who were competitive candidates for 4-year college admission. The bachelor’s completion impacts are larger than would be expected based on the number of students driven by the policy change to enroll in college and to shift into more selective colleges. The unexplained portion of the completion effects may result from improvements in nonacademic fit between students and the postsecondary institutions in which they enroll.
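The difference-in-differences strategy mentioned above can be written generically; this is a textbook sketch, not the authors’ exact model:

```latex
% Y_{it}: outcome (e.g., score reports sent, enrollment) for student i in cohort t
Y_{it} = \alpha + \gamma\,\mathrm{Treated}_i + \delta\,\mathrm{Post}_t
       + \beta\,(\mathrm{Treated}_i \times \mathrm{Post}_t) + \varepsilon_{it}
% Treated_i = 1 for the targeted low-income students; Post_t = 1 for cohorts
% after the policy change; \beta is the difference-in-differences estimate.
```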

Blazar, D. (2015). Effective teaching in elementary mathematics: Identifying classroom practices that support student achievement. Economics of Education Review, 48, 16–29.

Recent investigations into the education production function have moved beyond traditional teacher inputs, such as education, certification, and salary, focusing instead on observational measures of teaching practice. However, challenges to identification mean that this work has yet to coalesce around specific instructional dimensions that increase student achievement. I build on this discussion by exploiting within-school, between-grade, and cross-cohort variation in scores from two observation instruments; further, I condition on a uniquely rich set of teacher characteristics, practices, and skills. Findings indicate that inquiry-oriented instruction positively predicts student achievement. Content errors and imprecisions are negatively related, though these estimates are sensitive to the set of covariates included in the model. Two other dimensions of instruction, classroom emotional support and classroom organization, are not related to this outcome. Findings can inform recruitment and development efforts aimed at improving the quality of the teacher workforce.

Kelcey, B., Hill, H. C., & McGinn, D. (2014). Approximate measurement invariance in cross-classified rater-mediated assessments. Frontiers in Psychology, 5(1469).

An important assumption underlying meaningful comparisons of scores in rater-mediated assessments is that measurement is commensurate across raters. When raters differentially apply the standards established by an instrument, scores from different raters are on fundamentally different scales and no longer preserve a common meaning and basis for comparison. In this study, we developed a method to accommodate measurement noninvariance across raters when measurements are cross-classified within two distinct hierarchical units. We conceptualized random item effects cross-classified graded response models and used random discrimination and threshold effects to test, calibrate, and account for measurement noninvariance among raters. By leveraging empirical estimates of rater-specific deviations in the discrimination and threshold parameters, the proposed method allows us to identify noninvariant items and empirically estimate and directly adjust for this noninvariance within a cross-classified framework. Within the context of teaching evaluations, the results of a case study suggested substantial noninvariance across raters and that establishing an approximately invariant scale through random item effects improves model fit and predictive validity.
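A graded response model with rater-specific random item effects of the kind described can be sketched as follows; the symbols are our illustration, not the authors’ exact parameterization:

```latex
% Probability that rater r assigns score category k or higher on item j
% for a teacher with latent quality \theta_i:
\Pr(Y_{ijr} \ge k \mid \theta_i) = \operatorname{logit}^{-1}\!\big(a_{jr}(\theta_i - b_{jkr})\big),
\qquad a_{jr} = a_j + u_{jr}, \quad b_{jkr} = b_{jk} + v_{jr}
% u_{jr}, v_{jr}: rater-specific random deviations in item j's discrimination
% and thresholds; nonzero variance in these effects signals noninvariance,
% and the empirical estimates allow direct adjustment for it.
```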