
Guest Commentary: Value-Added Has No Value

By: Steven J. Klees
Harold R.W. Benjamin Professor of International and Comparative Education
University of Maryland, College Park

The story of fifth-grade teacher Sarah Wysocki (March 7) is tragic and, unfortunately, this is a tragedy being repeated across the country.  By all reports and evaluations but one, Wysocki was an excellent teacher.  The one was a piece of statistical legerdemain that has been sweeping the country called “value-added,” fed, in part, by the Obama administration’s Race to the Top program, which mandates it.  Value-added is a set of statistical procedures that purports to measure scientifically what has become the holy grail of education: the impact of a teacher on student test scores.  And these statistics said that Wysocki didn’t do enough to improve test scores, so she was fired.

Unfortunately, measuring value-added in practice is simply impossible, illogical, and unscientific, and you don’t have to be a statistician to understand why.  Value-added statistical models are supposed to separate the impact of one factor — the teacher — from the literally dozens of other factors that contribute to a student’s performance on a test.

For example: access to a home computer, other resources in the home, technology access in the schools, effort at homework, parents’ education, parents’ support, influence of previous teachers, peer effects, school climate, aspirations, access to health care, better diet, a good night’s sleep, and many, many others.  Even if you had information on all these factors, believing some statistical model could sort out the relative influence of each is wishful thinking.  Moreover, value-added models only have data on very few factors, usually special education status, English proficiency, attendance, and eligibility for reduced-price lunch.  Controlling for these and attributing the rest to the teacher makes no sense.  The effect attributed to the teacher is always incorrect since omitted factors could change the teacher impact measure in either direction.  Statisticians who attempt to control for a few of these factors can do so, and the analysis will always identify so-called meritorious teachers.  But the results are completely illegitimate.  Controlling for different factors will lead to different teachers selected as meritorious, and there is no basis for deciding which factors to control.
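To make the point about omitted factors concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the eight students, the two factors, and the simple "subtract the group average" adjustment, which stands in for a real value-added regression and is not the model any district actually uses. Scores are constructed so that neither teacher has any true effect, yet the "better" teacher flips depending on which single factor is controlled.

```python
from collections import defaultdict

# Hypothetical students: (teacher, reduced_lunch, good_sleep, score).
# Scores are built as 50 + 10*good_sleep - 10*reduced_lunch,
# so by construction neither teacher has any true effect.
STUDENTS = [
    ("A", 1, 1, 50), ("A", 1, 1, 50), ("A", 1, 0, 40), ("A", 0, 1, 60),
    ("B", 0, 0, 50), ("B", 0, 0, 50), ("B", 0, 1, 60), ("B", 1, 0, 40),
]


def teacher_effects(students, control_index):
    """Average residual per teacher after subtracting the mean score of
    all students who share the same value of one control factor."""
    by_group = defaultdict(list)
    for row in students:
        by_group[row[control_index]].append(row[3])
    group_mean = {key: sum(vals) / len(vals) for key, vals in by_group.items()}

    residuals = defaultdict(list)
    for row in students:
        residuals[row[0]].append(row[3] - group_mean[row[control_index]])
    return {teacher: sum(r) / len(r) for teacher, r in residuals.items()}


lunch_only = teacher_effects(STUDENTS, 1)  # control for reduced-price lunch
sleep_only = teacher_effects(STUDENTS, 2)  # control for a good night's sleep

print("controlling for lunch:", lunch_only)  # {'A': 2.5, 'B': -2.5}
print("controlling for sleep:", sleep_only)  # {'A': -2.5, 'B': 2.5}
```

In this toy data, controlling only for lunch status credits teacher A with the benefit of well-rested students, while controlling only for sleep blames teacher A for the disadvantage of poorer students. Neither number says anything about the teaching.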

Florida tried a value-added approach to merit pay for schools in the 1980s.  This approach suffers from the same problems as a value-added merit-pay scheme for teachers.  In Florida, school district statisticians found their value-added models identified different schools as meritorious, depending on which factors they controlled for, and they realized there was no right way to decide what to control for.  These statisticians were embarrassed when it came time to award money to meritorious schools since there was no stable way to estimate which schools were meritorious, and there was no rational basis for explaining to schools why they won or lost.

That is the situation we are now in.  Teachers in the District, who generally know who the good teachers are, said they were “stunned” and “bewildered” by the firings.  The decision to fire Ms. Wysocki would never have been made if statisticians were to provide alternative estimates of teacher impact based on using different value-added models.  But they do not.  Why?  Partly, statisticians become fascinated with their models and want to believe in them.  But partly, this is now a big business and you don’t get paid if you offer equivocal answers.  Unfortunately, while value-added approaches are very complex, they are simply not science.

I am not saying test scores are irrelevant to teacher assessment.  Simple measures of a classroom’s gain in test scores, as one piece of information among many about a teacher’s performance, can be interpreted with knowledge of the local context as part of a professional peer evaluation system.  But we can no more scientifically determine teachers’ effects on test scores than we can legislators’ impact on economic growth or poverty reduction.  Sure, both have an impact, but the processes are too complicated for simplistic solutions.


Value-Added Does Not Show True Value of the Teacher

By Kristin Dobbs, WTU Field Services Specialist 

I wanted to take a moment to reflect on the recent Washington Post article that tells the story of former DCPS teacher Sarah Wysocki, who was terminated at the end of the 2010-2011 school year due to IMPACT.  In my opinion, this article (like so many other stories) shines a light on the glaring flaws in the IMPACT evaluation tool.  According to the article, Ms. Wysocki was terminated for a low IMPACT score, which was dragged down by the “value-added” scoring of her students.  Further, the article states that this “value-added” tool used by DCPS can end up being 50% of the overall data used to determine a teacher’s rating.  To me, an attempt to boil down a year of teaching into number crunching on a calculator is not the proper method for determining true success in the classroom.

I don’t believe that teaching is a mathematical equation that spits out a successful student at the end of a school year.  It takes creativity, dedication, and many other tools that teachers bring to the table to make a classroom successful.  While I have never taught in a classroom, I am the product of public schools, and come from a family of public school teachers (including my mother, aunt, cousin, and numerous friends).  I know we all can think back to the teacher that went above and beyond to make learning come alive.  I immediately think of the example of an elementary school teacher of mine who made the Oregon Trail not just a thing we learned about in a book – but an interactive and creative game that incorporated several forms of learning that taught us not only the history of settling the West, but also leadership and teamwork.  To me, this story and countless others reflect the direction we should be moving in for students.

I believe that when DCPS attempts to put such weight on the “value-added” model, it forces a creative and dedicated teacher such as Ms. Wysocki out of the District.  If DC is ever going to reform its public education system, it cannot keep losing valuable teachers to Maryland and Virginia.  The District should take a hard look at the data that is used for the “value-added” model, and see whether it can paint a realistic picture of student achievement.  According to the Washington Post article, students with special needs, learning disabilities, emotional disturbances, behavioral problems and the like can bring down a teacher’s IMPACT rating, no matter how hard he or she tries to teach the students.  Further, with accusations of cheating on the DC CAS, as well as allegations that many special education students are not receiving the proper services through DCPS’ current “inclusion” model, it is questionable, in my opinion, whether those numbers should be so heavily relied upon.

To me, DCPS should be encouraging creativity and flexibility of teachers to help their students reach their goals.  It is hard to argue that the equation-driven model for evaluating success is working in the District – or else schools wouldn’t be struggling to make AYP and the achievement gap between white and non-white children would be shrinking, not growing.  Instead, I believe the IMPACT model with the “value-added” component removes the opportunity for teachers, who know the most about their students and what it would take for them to succeed, to have a voice in the process.  Instead, the fear of termination over numbers outside of the control of the educator forces the teacher’s hand.  Teachers should be focused on helping their students solve mathematical equations, not fearing being at the wrong end of one in June.


The Washington Post: ‘Creative … motivating’ and fired

Jahi Chikwendiu/The Washington Post - Sarah Wysocki was out of work for only a few days after she was fired by DCPS last year. She is now teaching at Hybla Valley Elementary School in Fairfax County.

By Bill Turque

By the end of her second year at MacFarland Middle School, fifth-grade teacher Sarah Wysocki was coming into her own.

“It is a pleasure to visit a classroom in which the elements of sound teaching, motivated students and a positive learning environment are so effectively combined,” Assistant Principal Kennard Branch wrote in her May 2011 evaluation.

He urged Wysocki to share her methods with colleagues at the D.C. public school. Other observations of her classroom that year yielded good ratings.

Two months later, she was fired.

Wysocki, 31, was let go because the reading and math scores of her students didn’t grow as predicted. Her undoing was “value-added,” a complex statistical tool used to measure a teacher’s direct contribution to test results. The District and at least 25 states, under prodding from the Obama administration, have adopted or are developing value-added systems to assess teachers.

When her students fell short, the low value-added trumped her positives in the classroom. Under the D.C. teacher evaluation system, called IMPACT, the measurement counted for 50 percent of her annual appraisal. Classroom observations, such as the one Branch conducted, represented 35 percent, and collaboration with the school community and schoolwide testing trends made up the remaining 15 percent.
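As a rough illustration of how those weights combine, the sketch below computes a weighted composite in Python. Only the 50/35/15 split comes from the article. The component names, the assumption that every component sits on the same 1-to-4 scale, and the sample scores are hypothetical, and the District's actual scoring formula may differ.

```python
# Weights as reported in the article. Component names are illustrative.
WEIGHTS = {
    "value_added": 0.50,
    "observation": 0.35,
    "community_and_schoolwide": 0.15,
}


def composite(scores, weights=WEIGHTS):
    """Weighted average of component scores (assumed to share a 1-to-4 scale)."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted components")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical teacher: strong observations, weak value-added.
example = {
    "value_added": 1.0,
    "observation": 3.2,
    "community_and_schoolwide": 3.0,
}

print(round(composite(example), 2))  # 2.07
```

With half the weight on one component, a low value-added number pulls the composite to about 2.07 in this made-up case, even though the other two components are at 3.0 or above.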

Her story opens a rare window into the revolution in how teachers across the country are increasingly appraised — a mix of human observation and remorseless algorithm that is supposed to yield an authentic assessment of effectiveness. In the view of school officials, Wysocki, one of 206 D.C. teachers fired for poor performance in 2011, was appropriately judged by the same standards as her peers. Colleagues and friends say she was swept aside by a system that doesn’t always capture a teacher’s true value.

Proponents of value-added contend that it is a more meaningful yardstick of teacher effectiveness — growth over time — than a single year’s test scores. They also contend that classroom observations by school administrators can easily be colored by personal sentiments or grudges. Researchers for the Bill & Melinda Gates Foundation reported in 2010 that a teacher’s value-added track record is among the strongest predictors of student achievement gains.

Which is why D.C. school officials have made it the largest component of their evaluation system for teachers in grades with standardized tests. The District aims to expand testing so that 75 percent of classroom teachers can be rated using value-added data. Now, only about 12 percent are eligible.

“We put a lot of stock in it,” said Jason Kamras, chief of human capital for D.C. schools.

Yet even researchers and educators who support value-added caution that it can, in essence, be overvalued. Test results are too vulnerable to conditions outside a teacher’s control, some experts say, to count so heavily in a high-stakes evaluation. Poverty, learning disabilities and random testing day incidents such as illness, crime or a family emergency can skew scores.

The District attempts to compensate for some of these factors, weighing special education status, English proficiency, attendance, and eligibility for free or reduced-price lunch — a common proxy for poverty — in developing growth predictions for students.

But some experts say it should never be a decisive factor in a teacher’s future.

“It has a place, but I wouldn’t give it pride of place,” said Henry Braun, professor of education and public policy at Boston College. He contends that only random assignment of teachers and students — wholly impractical in big school systems — can eliminate enough bias and error to obtain a valid measure of how much teachers improve student performance.

Some states are taking a more conservative approach than the District. New York recently set value-added at 20 percent of annual evaluations. Tennessee and Minnesota have the ceiling at 35 percent. Other states, such as Colorado and Ohio, mandate that 50 percent of teacher assessments must use student growth data but leave it up to local school districts whether to use value-added or other measures.

“You can get me to walk down the road with you to say value-added is relevant, but 50 percent is too weighted,” said Washington Teachers’ Union President Nathan Saunders.

Kamras said the disconnect between the observations of Wysocki’s classroom and her value-added scores was “quite rare.” Most teachers with poor ratings in one area, he said, are also substandard in the other.

“It doesn’t necessarily suggest that anything wrong happened,” he said. “Sometimes it’s just not possible to know for sure.”

Wysocki said there is another possible explanation: Many students arrived at her class in August 2010 after receiving inflated test scores in fourth grade.

Fourteen of her 25 students had attended Barnard Elementary. The school is one of 41 in which publishers of the D.C. Comprehensive Assessment System tests found unusually high numbers of answer sheet erasures in spring 2010, with wrong answers changed to right. Twenty-nine percent of Barnard’s 2010 fourth-graders scored at the advanced level in reading, about five times the District average.

D.C. and federal investigators are examining whether there was cheating, but school officials stand by the city’s test scores.

Kamras acknowledged that the Barnard data are “suggestive” of a problem but said that without clear evidence, nothing could be done. Overall, he said that Wysocki was treated fairly and that her case does not reflect a deeper issue with IMPACT.

“I stand behind my evaluation of her,” he said. “It does not, in my view, call into question anything.”

Wysocki was out of work for only a few days. She is teaching at Hybla Valley Elementary School in Fairfax County and came forward to tell her story because she believes it is one that D.C. teachers and parents should know.

“I think what it says is how flawed this system is.”

‘Needs to be clear’

Like many young educators, Wysocki struggled at first. The Chicago-born daughter of a physicist, she came to the District in 2009 from Washington state, where she was a teacher assistant in a private Waldorf school that minimized testing and focused on the emotional and ethical development of the whole child.

In D.C. schools, she found another culture entirely. IMPACT spans an exacting set of nine performance criteria covering virtually every aspect of pedagogy, including clear presentation, behavior management and skill at asking questions. Teachers are graded on a 1-to-4 scale (ineffective, minimally effective, effective and highly effective).

Wysocki’s 2009-10 evaluation was peppered with twos.

“Your instruction needs to be clear and differentiated to meet your students’ diverse needs,” Sean Precious, then MacFarland’s principal, wrote. “Instructional time should be maximized and student misbehavior should be minimized. Please review your IMPACT binder.”

For the year, classroom observers rated her just short of effective. Her value-added score was low. That left her overall rating in her rookie year as “minimally effective.” If it happened again, she would face dismissal.

MacFarland, on Iowa Avenue NW in the Petworth neighborhood, also was struggling. Four out of five students at the school come from families poor enough to qualify for meal subsidies. Fewer than three in 10 scored proficient on the 2010 city reading test.

But Wysocki got better in 2010-11, improving her ability to tailor lessons and gaining a reputation for her skill at managing multiple groups of children in various activity centers. She drew praise from Assistant Principal Branch for “new and innovative ways” of engaging parents, “dedicating a truly exceptional amount of time towards partnering with them,” through invitations to class events and walking home students who live nearby.

“One of the best teachers I’ve ever come in contact with,” said Bryan Dorsey, head of the MacFarland PTA in 2010-11, who had a daughter in Wysocki’s class. “Every time I saw her, she was attentive to the children, went over their schoolwork, she took time with them and made sure.”

The twos from her first year’s classroom observations were replaced by threes and fours.

But Wysocki was worried. Some students who had scored advanced in fourth grade, she said, could barely read.

“I’m getting a little nervous about testing,” she wrote in an e-mail to Branch and new Principal Andre Samuels in February 2011.

Complicated system

The calculus that ended Wysocki’s career in D.C. schools started as a way of measuring the value of strawberries turned into jam. Value-added began in agriculture, where it was employed to establish the worth of farm products as they changed form. Statistician William Sanders pioneered its conversion to classroom use, starting in Tennessee in the early 1990s.

It’s complicated enough that D.C. schools hired Mathematica Policy Research of Princeton, N.J., to crunch the numbers for each of the 471 teachers in the District from fourth through eighth grades whose students took reading and math tests.

In Wysocki’s case, the firm took the fourth-grade scores of each student in her class and searched for all students in the city with the same numbers. Then, after the students took the spring 2011 tests, Mathematica averaged the scores, weighted for the actual amount of time each student spent in her class and taking into account demographic variables.

Wysocki’s actual average reading score was 54.2 out of 99, less than Mathematica’s predicted average of 59. Her math score, 56.2, was more than 6 points shy of the forecast. Her classroom observation score was 3.2 out of a possible 4, but she was still rated minimally effective and fired in July.
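The comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not Mathematica's model: the function names, the three-student class and the time-in-class shares are hypothetical, and the predicted scores are taken as given rather than derived from comparable students citywide. Only the 54.2 and 59 reading figures come from the article.

```python
def weighted_mean(values, weights):
    """Average of values, weighted by each student's share of time in class."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


def value_added_gap(actual, predicted, time_share):
    """Class's weighted actual average minus its weighted predicted average."""
    return weighted_mean(actual, time_share) - weighted_mean(predicted, time_share)


# Hypothetical class: the third student was enrolled for half the year.
actual = [50.0, 60.0, 40.0]
predicted = [55.0, 60.0, 50.0]
time_share = [1.0, 1.0, 0.5]

print(value_added_gap(actual, predicted, time_share))  # -4.0

# Reading figures reported in the article: actual 54.2, predicted 59.
reported_reading_gap = round(54.2 - 59, 1)
print(reported_reading_gap)  # -4.8
```

A negative gap means the class scored below its prediction. Everything then depends on how trustworthy the predictions are, which is the question raised by the disputed fourth-grade scores.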

Wysocki was furious. “I want to know how my IVA [Individual Value-Added] can be so OUTRAGEOUSLY different from ALL my other data,” she wrote to the central office on July 19.

School officials said that if she had concerns about cheating she should have alerted the Office of Data and Accountability with specific information, including names of teachers and students from whom she heard allegations of cheating. They said that the office told her this in July 2011 but never heard back from her.

She appealed her dismissal in August to a three-member panel of central office staff, writing a detailed letter outlining concerns about possible cheating on the Barnard scores.

“I was under the impression that the letter with my appeal would be enough to prompt an investigation,” Wysocki said.

It was December before she learned that the firing was upheld. Panel members repeated that Wysocki should have gone to the accountability office sooner. But the panel added that it wouldn’t have mattered.

“The Board and the Chancellor note that investigations of cheating are outside the scope of the Chancellor’s appeals process,” the panel wrote. “As a result, the [value-added] score remains valid.”

Colleagues said they were stunned to hear of the firing. MacFarland’s other fifth-grade teacher, also highly regarded by administrators, also was let go. Teachers said they were bewildered because Samuels and Branch had repeatedly pointed out the progress fifth-graders were making in reading.

“It was celebrated within the school,” said one of several former colleagues of Wysocki’s who spoke on the condition of anonymity to avoid reprisals.

Samuels, who did not respond to calls or e-mails for comment, left Wysocki a sterling recommendation. He endorsed her “without reservation” and described her as “enthusiastic, creative, visionary, flexible, motivating and encouraging.”

Wysocki said she is comfortable in Fairfax. Hybla Valley Elementary, in the Alexandria section of the county, is not a cushy suburban posting. Of her 18 fifth-graders, half are children of immigrants. She is taking courses toward a master’s degree in education at Trinity University.

“We feel fortunate to have Sarah supporting the students at Hybla Valley,” Principal Lauren Sheehy said in an e-mail. “She is a positive and valued team player. Sarah has created a positive learning environment allowing students to be successful academically and socially.”

Wysocki said she feels more at ease generally, especially in seeking help from Sheehy and other mentors. In the District, she said, she often felt that reaching out was considered a sign of weakness.

“Teaching is an art,” she said. “There are so many things to improve on.”

