Making Sense of Student Performance Data

Kim Marshall draws on his 44 years’ experience as a teacher, principal, central office administrator and writer to compile the Marshall Memo, a weekly digest of articles of interest to busy educators drawn from 64 publications. He shared one of my recent articles, co-authored with doctoral students Britnie Kane and Jonee Wilson, in his latest memo and gave me permission to post his succinct and useful summary.

In this American Educational Research Journal article, Ilana Seidel Horn, Britnie Delinger Kane, and Jonee Wilson (Vanderbilt University) report on their study of how seventh-grade math teams in two urban schools worked with their students’ interim assessment data. The teachers’ district, under pressure to improve test scores, paid teams of teachers and instructional coaches to write interim assessments. These tests, given every six weeks, were designed to measure student achievement and hold teachers accountable. The district also provided time for teacher teams to use the data to inform their instruction. Horn, Kane, and Wilson observed and videotaped seventh-grade data meetings in the two schools, visited classrooms, looked at a range of artifacts, and interviewed and surveyed teachers and district officials. They were struck by how different the team dynamics were in the two schools, which they called Creekside Middle School and Park Falls Middle School. Here’s some of what they found:

  • Creekside’s seventh-grade team operated under what the authors call an instructional management logic, focused primarily on improving the test scores of “bubble” students. The principal, who had been in the building for a number of years, was intensely involved at every level, attending team meetings and pushing hard for improvement on AYP proficiency targets. The school had a full-time data manager who produced displays of interim assessment and state test results. These were displayed (with students’ names) in classrooms and elsewhere around the school. The principal also organized Saturday Math Camps for students who needed improvement. He visited classrooms frequently and had the school’s full-time math coach work with teachers whose students needed improvement. Interestingly, the math coach had a more sophisticated knowledge of math instruction than the principal, but the principal dominated team meetings.

In one data meeting, the principal asked teachers to look at interim assessment data to predict how their African-American students (the school’s biggest subgroup in need of AYP improvement) would do on the upcoming state test. The main focus was on these “bubble” students. “I have 18% passing, 27% bubble, 55% growth,” reported one teacher. The team was urged to motivate the targeted students, especially quiet, borderline kids, to personalize instruction, get marginal students to tutorials, and send them to Math Camp. The meeting spent almost no time looking at item results to diagnose ways in which teaching was effective or ineffective. The outcome: providing attention and resources to identified students. A critique: the team didn’t have at its fingertips the kind of item-by-item analysis of student responses necessary to have a discussion about improving math instruction, and the principal’s priority of improving the scores of the “bubble” students prevented a broader discussion of improving teaching for all seventh graders. “The prospective work of engaging students,” conclude Horn, Kane, and Wilson, “predominantly addressed the problem of improving test scores without substantially re-thinking the work of teaching, thus providing teachers with learning opportunities about redirecting their attention – and very little about the instructional nature of that attention… The summative data scores simply represented whether students had passed: they did not point to troublesome topics… By excluding critical issues of mathematics learning, the majority of the conversation avoided some of the potentially richest sources of supporting African-American bubble kids – and all students… Finally, there was little attention to the underlying reasons that African-American students might be lagging in achievement scores or what it might mean for the mostly white teachers to build motivating rapport, marking this as a colorblind conversation.”

  • The Park Falls seventh-grade team, working in the same district with the same interim assessments and the same pressure to raise test scores, used what the authors call an instructional improvement logic. The school had a brand-new principal, who was rarely in classrooms and team meetings, and an unhelpful math coach who had conflicts with the principal. This meant that teachers were largely on their own when it came to interpreting the interim assessments. In one data meeting, teachers took a diagnostic approach to the test data, using a number of steps that were strikingly different from those at Creekside:
  • Teachers reviewed a spreadsheet of results from the latest interim assessment and identified items that many students missed.
  • One teacher took the test himself to understand what the test was asking of students mathematically.
  • In the meeting, teachers had three things in front of them: the actual test, a data display of students’ correct and incorrect responses, and the marked-up test the teacher had taken.
  • Teachers looked at the low-scoring items one at a time, examined students’ wrong answers, and tried to figure out what students might have been thinking and why they went for certain distractors.
  • The team moved briskly through 18 test items, discussing possible reasons students missed each one – confusing notation, skipping lengthy questions, mixing up similar-sounding words, etc.

  • Teachers were quite critical of the quality of several test items – rightly so, say Horn, Kane, and Wilson – but this may have distracted them from the practical task of figuring out how to improve their students’ test-taking skills.

The outcome of the meeting: re-teaching topics with attention to sources of confusion. A critique: the team didn’t slow down and spend quality time on a few test items, followed by a more thoughtful discussion about successful and unsuccessful teaching approaches. “The tacit assumption,” conclude Horn, Kane, and Wilson, “seemed to be that understanding student thinking would support more-effective instruction… The Park Falls teachers’ conversation centered squarely on student thinking, with their analysis of frequently missed items and interpretations of student errors. This activity mobilized teachers to modify their instruction in response to identified confusion… Unlike the conversation at Creekside, then, this discussion uncovered many details of students’ mathematical thinking, from their limited grasp of certain topics to miscues resulting from the test’s format to misalignments with instruction.” However, the Park Falls teachers ran out of time and didn’t focus on next instructional steps. After a discussion about students’ confusion about the word “dimension,” for example, one teacher said, “Maybe we should hit that word.” [Creekside and Park Falls meetings each had their strong points, and an ideal team data-analysis process would combine elements from both: the principal providing overall leadership and direction but deferring to expert guidance from a math coach; facilitation to focus the team on a more-thorough analysis of a few items; and follow-up classroom observations and ongoing discussions of effective and less-effective instructional practices. In addition, it would be helpful to have higher-quality interim assessments and longer meetings to allow for fuller discussion. K.M.]

“Making Sense of Student Performance Data: Data Use Logics and Mathematics Teachers’ Learning Opportunities” by Ilana Seidel Horn, Britnie Delinger Kane, and Jonee Wilson in American Educational Research Journal, April 2015 (Vol. 52, #2, pp. 208-242)


Shining a Light on Cultural Blindspots through Teacher Education

I have tweeted a bit about this interesting and important research in teacher education by my doctoral student Elizabeth Self. It always generates a lot of queries and conversations. Liz has really developed and conducted the research, with me and others as guides, so I have not felt right about explaining her work myself. Instead, I invited her to share the clinical simulation work she has developed to help our pre-service teachers become more culturally competent educators.

Every semester that I teach a social foundations class at Peabody, I end up telling the story about an incident I had at a charter school in Chicago where I was teaching. About how I made a dumb comment without thinking about the context – a White teacher of mostly Black and Brown students – and how, when a Black colleague tried to confront me with the racism inherent in what I’d said, I did everything wrong that White people do in these situations. I was defensive. I tone-policed him when he sent an email later. I told friends that I hadn’t meant it “that way.” Then I cried. At some point, I finally got to the place where I could hear what he was trying to say. I can’t say specifically when I finally started to listen or what made me do so, but I can say without a doubt that this incident in large part led me to where I am today.

Now a doctoral candidate at Peabody, I focus on preparing pre-service teachers for culturally responsive teaching, particularly the interactional work. In my first few semesters as an instructor, I tried a variety of approaches to get my pre-service teachers to feel the same way I did in the days and weeks that followed that incident with my coworker. When I would share my story or similar examples from case studies, they would gasp in astonishment or groan sympathetically, but at some level, they all thought, “I would never do that!” Nothing seemed to have the effect I was looking for – getting them to see their own blind spots. It was then that I read about Benjamin Dotger’s work at Syracuse University, using clinical simulations to prepare teachers and administrators for common problems of practice. I thought that with some adaptation, I could develop clinical simulations that served as potential critical incidents for my pre-service teachers.

Clinical simulation is an instructional tool in which pre-service teachers encounter an actor, playing the part of a student, parent, colleague, or administrator, in a way that mimics a real-life event. Participants receive a protocol ahead of time that gives them background on the encounter and provides them with some of the information they would likely have based on when in the school year the event is said to happen. They usually have a few days to a week to prepare. The actors also receive a protocol that they use to prepare so that all actors present the part in a standardized way. The simulation lasts 15 to 30 minutes, depending on how it’s designed. Afterwards, participants may do a “raw” debrief right away, but they usually watch their video back before doing a group debrief with the instructor.

While Dotger’s published simulations focus on common problems of practice in secondary education, mine focus on the kinds of incidents that, as Gadamer (1960) wrote, cause someone to be “pulled up short” – to see one’s assumptions about a person or event go unmet. The simulations I ran this fall were examples of this – talking with a student about an outburst in class, only to learn there is a much more serious problem to deal with; conferencing with a parent about her student, who may have a reading disability, and facing unexpected communication issues; soliciting input from a veteran teacher about new students, and getting way more than what was asked for. In the end, the pre-service teachers who participated in these simulations overwhelmingly came away feeling “pulled up short.” They did not expect the encounter to unfold the way it did, often because they did not pay attention to the relevant information in the protocol that would have prepared them for what occurred. They also struggled (by design) in the simulations because they framed the situations in unproductive ways – as opportunities for telling, rather than asking; as situations in which they wanted to defend, rather than respond. The simulations did not do this on their own; I made careful decisions each cycle (more and less successfully) about how to shape the re-watching of their own videos and what to do during the group debrief. But by the end of the course, the teachers seemed to have become more open to learning about the why and how of culturally responsive teaching and were thinking more productively about how to interact with their future students.

My goals in these simulations are several. First and foremost, I want teachers to understand that their knowledge is always partial. Without knowing their students, and in ways deeper than a first-of-the-year interest inventory reveals, they will have difficulty reaching their students, especially those who have been historically marginalized in US society and underserved by our schools. Next, I want them to recognize their blind spots and realize that they will always have some, but must be ready to acknowledge them when someone points them out. Finally, I want to give my pre-service teachers an opportunity to fail in a setting that is supportive of them but also safe for their students. Often in teaching, we send pre-service teachers out to tutor in low-income communities as their first interaction with students. In my mind, this raises the potential harm for students who are already underserved and may reinforce stereotypes for pre-service teachers. Clinical simulations in no way replace the need for teachers to spend time in the communities where they teach or to interact with real students, but I do hope that they help provide teachers with a better starting place for those interactions.

It occurs to me periodically that I am an unlikely person to be doing this work. Surely, it would seem more reasonable for the person doing this to come from an insider perspective – someone who has personally suffered the effects of racism, ethnocentrism, ableism, or homophobia. For that reason, I make efforts as I develop each simulation to draw on cultural insiders to help make the simulation authentic to their lived experience. Furthermore, I see it as imperative that people of privilege work – thoughtfully and reflectively – to spare these insiders some of the burden they have carried for so long in providing this education to folks like me. It is my desire that by doing so, my own children – both White and Black – will encounter teachers a little more ready to teach them than I was.