I wrote a blog post last night about the failure of generous stipends to lure top-rated teachers to struggling schools, focusing on Fulton County’s $20,000 offer three years ago. As occurred in other districts that tried incentive pay, few Fulton teachers signed on, and the pilot was eventually discarded.
Fulton spokeswoman Susan Hale shared several reasons with me. I want to dig deeper into one of them – “the limitations of the tools used to identify high performing teachers (Student Growth Percentiles, TKES).”
Finding ways to identify and reward successful teachers has been on the national agenda for 20 years. Reformers complained virtually every teacher in America earned a satisfactory rating and charged such broad-brush reviews protected ineffective teachers and shortchanged their effective peers.
An oft-quoted source was the 2009 New Teacher Project “Widget Effect” study, which found less than 1 percent of teachers were rated as unsatisfactory even though 81 percent of administrators and 57 percent of teachers could identify a teacher in their school who was ineffective. That study was held up in legislatures across the country, including Georgia’s, as lawmakers pushed for greater honesty in teacher evaluations, a cause taken up by the Obama White House. (Here is a good Education Week story on this issue.)
One of the reasons Georgia won a $400 million Race to the Top grant in 2010 was its pledge to tie teacher evaluations to teacher effectiveness. Speaking to the National Education Association in 2009, then-Secretary of Education Arne Duncan called for teacher rating systems that weighted student test scores:
A recent report from the New Teacher Project found that almost all teachers are rated the same. Who in their right mind really believes that? We need to work together to change this...Data can also help identify and support teachers who are struggling. And it can help evaluate them. The problem is that some states prohibit linking student achievement and teacher effectiveness.
I understand that tests are far from perfect and that it is unfair to reduce the complex, nuanced work of teaching to a simple multiple choice exam. Test scores alone should never drive evaluation, compensation, or tenure decisions. That would never make sense. But to remove student achievement entirely from evaluation is illogical and indefensible.
In 2014, Georgia rolled out an evaluation system built on classroom observation and four ratings: exemplary, proficient, needs improvement and ineffective. Last year, the state relaxed the criteria, reducing the reliance on student test scores and granting principals more discretion in how many times they observe teachers.
With all the attention to an allegedly broken teacher evaluation system, are fewer teachers being graded as satisfactory?
Not really, says a recent study that examined revamped teacher rating systems in 24 states including Georgia and surveyed 200 principals.
There are finer gradations of teacher ratings around the proficiency level, but still few teachers earn the scarlet U for unsatisfactory. Only two states in the study, Maryland and New Mexico, rated more than 1 percent of teachers in the very lowest category of unsatisfactory/ineffective.
Why? While principals surveyed by researchers Matthew A. Kraft and Allison F. Gilmour estimated 19.9 percent of teachers in their schools were below proficient in 2014-2015, the principals had a host of good reasons not to label those teachers as such. In fact, the principals rated only 6.3 percent of their staffs below proficient.
Interviews revealed why school leaders were reluctant to ding teachers. These are excerpts from the report:
Time constraints: Rating a teacher as below proficient required intensive amounts of time to document their performance and to provide support for their professional growth. Several principals questioned whether they could collect sufficient evidence in a few observations to justify a rating below proficient. As a middle school principal with nine years of experience put it, “I just feel like sometimes you have to have a lot of detail before you can give somebody a Needs Improvement.” The increased requirements on evaluators of writing detailed improvement plans and conducting up to four unannounced formal observations for teachers whom they rated as unsatisfactory led some principals to use low ratings selectively. An elementary school principal explained: “There were some areas that they could have been needs improvement. Because I was focusing on two or three other teachers who really needed needs improvement. I gave them Proficient in those areas. I did it because I couldn’t tackle that many teachers at the same time as far as writing prescriptions and then following through on the work that I would need to do.”
Teachers’ potential and motivation: Principals reported that they sometimes factored in teachers’ potential when assigning an evaluation rating. For example, one principal spoke about giving new teachers more leeway: “A first year teacher, I tend to give a little more the benefit of doubt. Like, give you a little time, the opportunity to improve, here are some suggestions…Sometimes someone who’s fairly new teaching in the building, they are more apt to accept that feedback.” Principals felt that new teachers were still learning and that it was unfair to rate new teachers as below proficient if they were working to improve their practice.
Personal discomfort: One experienced principal nearing retirement articulated this view clearly: “The most difficult part of the job is probably to deliver those difficult messages, and not everyone is capable of that. That’s where administrators actually fall down is when they’re unable to deliver those type of messages.” Principals spoke about how there was “definitely emotion” involved in assigning below proficient ratings. A middle school principal told us, “I was pretty communicative and still people would be crying, or, ‘I can’t believe you think that.’” Principals were keenly aware that an unsatisfactory rating could lead to teachers losing their jobs. A first year high school principal said: “The last thing I think I want to do as a human being is to watch another human being walk out with their head down; dejected, because they just lost their job because they couldn’t do it. This is something that they wanted to do. That’s a little bit harsh, you know?”
The challenges of removing and replacing teachers: Several principals mentioned they also sought to avoid the “long, laborious, legal, draining process” of evaluating out a teacher. Although the evaluation reforms implemented by the district aimed to streamline the dismissal process, it is unclear whether these principals’ perceptions were accurate or a justification for not utilizing the new process. Two principals found it easier to remove teachers outside of the evaluation process. As one principal stated frankly: “I didn’t give her a negative evaluation in certain terms of then having to evaluate her out. That would’ve meant that she would have to stay in my school for another year and I had to go through the whole long process thing. She was clearly not going to work out anyway and she was going to leave. She agreed to leave.”
The researchers found troubling state variations in ratings, noting, “The wide variability in teacher ratings across states suggests that system design features as well as local norms and implementation practices play large roles in shaping ratings distributions. Differences in underlying teacher effectiveness alone cannot account for why 1% or fewer teachers are below proficient in Hawaii but 28.7% are below proficient in New Mexico, or why only 6% of teachers in Georgia and 9% of teachers in Massachusetts are above proficient but 62% meet this higher standard in Tennessee.”