Education, From The Capitol To The Classroom

Why Some Educators Say The State’s A-F Grading System Isn’t An Improvement

A teleprompter aids state superintendent Tony Bennett as he delivers a televised address in September 2011.

Kyle Stokes / StateImpact Indiana

Superintendents offered a harsh critique of the state’s A-F grading scale for public schools at the statehouse earlier this month, writes Karen Francisco of the Fort Wayne Journal Gazette:

How uncomfortable Friday afternoon must have been for Republican members of the Indiana Select Commission on Education. Listening to hours of overwhelmingly critical testimony of the A-F school grading system — a centerpiece of the Republican administration’s so-called reform agenda — would be bad enough, but the critics included the Indiana Chamber of Commerce. Nothing like irritating the state chamber if you’re intent on maintaining your pro-business profile.

The problem, speakers said, is not the fact that the state is assessing schools. It’s how the state is making the assessments. And to understand why educators don’t like the system that will replace No Child Left Behind in Indiana, we dive into the world of norm- and criterion-referenced testing.

‘If We Don’t Implement, We Lose The Waiver’

Make no mistake: Indiana schools will get a letter grade next year, Superintendent of Public Instruction Tony Bennett said. It’s one of the conditions Indiana must meet to be released from the national NCLB standards.

“What we should be trying to do is really to build systems,” Bennett told legislators after nearly four hours of public testimony on the A-F grading scale. “Because as a group, we can’t probably solve the specifics but we can build systems that allow the (superintendents who spoke) to do just those things with their teachers and schools.”

Bennett agreed with the educators who said Indiana’s system isn’t perfect. But he also said the state didn’t have time to work out every kink before rolling out the A-F grading scale. Now that Indiana has its waiver, the legislature can use input to improve the system for the future.

Jon Gubera, the chief accountability officer at the Indiana Department of Education, stressed that without the rewired A-F scale, Indiana would have to again adopt NCLB standards.

“If we don’t implement, then we lose the waiver,” Gubera said. “The immediate effects of that is the C-capping mechanism would be back. So in other words, schools that failed in one single subgroup — even if they’re an A school — they’d be capped at a C, which was a major issue last year.”

In 2011, more than 300 Indiana schools would have received As and Bs if not for the cap.

Norm- And Criterion-Referenced Tests, Explained

But some of the superintendents and educators who provided testimony say Indiana’s letter grade system still isn’t the answer. They said Indiana’s growth model places too much emphasis on measuring how students’ test scores improve relative to their peers, and too little emphasis on what students actually know.

Here’s how the current system works: The state looks at all of the students who scored the same on the ISTEP+ in a given year. Then it tracks their academic progress over the next school year, again using standardized test scores. That helps the Department of Education determine how much growth on the state’s test is representative of one academic year. Students are then assigned a percentile based not on how well they scored, but on how their year-to-year scores ranked against their peers.
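
To make those mechanics concrete, here is a minimal Python sketch of the idea, not the state's actual calculation, which may differ in its details. It assumes an academic peer group is simply every student with the same prior-year score, and it ranks each student's current-year score within that group. The function names and sample scores are invented for illustration; the band cutoffs follow the Department of Education FAQ quoted in the reader comments.

```python
from collections import defaultdict


def growth_percentiles(students):
    """Rank each student's current score against peers with the same prior score.

    students: list of (name, prior_score, current_score) tuples.
    Returns {name: percentile}, with percentiles clamped to 1..99.
    Simplifying assumption: a "peer group" is an exact prior-score match.
    """
    peers = defaultdict(list)
    for name, prior, current in students:
        peers[prior].append((name, current))

    result = {}
    for group in peers.values():
        scores = [current for _, current in group]
        n = len(scores)
        for name, current in group:
            below = sum(1 for s in scores if s < current)
            equal = sum(1 for s in scores if s == current)
            # Mid-rank percentile: count half of the ties (including self).
            pct = round(100 * (below + 0.5 * equal) / n)
            result[name] = max(1, min(99, pct))
    return result


def growth_band(percentile):
    """Bands as quoted from the DOE FAQ: 66-99 high, 35-65 typical, 1-34 low."""
    if percentile >= 66:
        return "high"
    if percentile >= 35:
        return "typical"
    return "low"


# Hypothetical peer group: three students who all scored 400 last year.
cohort = [("A", 400, 450), ("B", 400, 420), ("C", 400, 400)]
percentiles = growth_percentiles(cohort)
bands = {name: growth_band(p) for name, p in percentiles.items()}

# The same students, each scoring 50 points higher this year.
improved = [(name, prior, current + 50) for name, prior, current in cohort]
improved_percentiles = growth_percentiles(improved)
```

In this sketch, raising every student's score by the same amount leaves the percentiles unchanged, because the ranking depends only on how students compare with one another. That is the behavior the superintendents objected to.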

And here’s where norm- and criterion-referenced tests come into play. But first, the difference between the two:

  • Norm-referenced tests compare one test taker to the others. Norm-referenced tests can tell an educator if a student knows more or less about a subject than his peers. But they can’t say how much the student knows. When test scores are reported as a percentile, then the exam has probably been norm-referenced.
  • Criterion-referenced tests measure how much the test taker knows about the subject. In theory, everyone can pass a criterion-referenced test because a student’s score isn’t based on how well his peers perform on the test. A good example of a criterion-referenced test is a driving exam — there’s no limit to the number of drivers on the road, so as long as you follow the rules, you get your license.

Another example of a criterion-referenced test is Indiana’s ISTEP+ exam. That’s where things get tricky for the DOE. Indiana’s exam might be criterion-referenced, but the state’s growth model — borrowed from Colorado — is normative.
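
The tension can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the cut score of 450 and the four sample scores are hypothetical, and the percentile formula is a simple mid-rank calculation, not the state's method.

```python
def criterion_results(scores, cut_score):
    """Criterion-referenced: each student passes or fails against a fixed bar."""
    return [score >= cut_score for score in scores]


def percentile_ranks(scores):
    """Norm-referenced: each student's standing depends on everyone else's score."""
    n = len(scores)
    ranks = []
    for score in scores:
        below = sum(1 for other in scores if other < score)
        equal = sum(1 for other in scores if other == score)
        ranks.append(100 * (below + 0.5 * equal) / n)
    return ranks


# Hypothetical scores and cut score, chosen so that every student passes.
scores = [480, 500, 520, 540]
passed = criterion_results(scores, cut_score=450)
ranks = percentile_ranks(scores)

print(passed)  # every entry is True: all four students clear the bar
print(ranks)   # the ranks still spread from low to high
```

Every student here clears the fixed bar, yet the percentile ranks still run from the bottom of the group to the top. A criterion-referenced score can improve for everyone at once; a normative ranking cannot.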

‘Bottom 33 Percent’ Still Needs Help, Superintendent Says

Here’s how Merrillville Superintendent Tony Lux explains his problem comparing Indiana students to their peers (the Fort Wayne Journal Gazette has the full text of his remarks, which are worth a read):

The phrase ‘poverty is not an excuse’ has become an excuse — an excuse to ignore poverty and disclaim any responsibility for it and its devastating effects. The strategy is becoming all too clear — ignore poverty, blame the effects of poverty on teachers, maintain the public perception of failing teachers and schools with an A-F formula that is designed to rank order students so that the bottom 33 percent will always exist (no matter how much achievement gains are made), use it to designate teachers and schools with low grades, then create a red herring for an impatient public by offering a placebo known as charter schools and school choice to appease them.

Invariably, Lux said, Indiana will have to designate one-third of its students as “low-growth,” and the state hasn’t given school districts the tools they need to help these students achieve. As a possible fix, he proposed mandatory summer school for students who aren’t meeting grade-level expectations.

“If the state can take a year out of a student’s life for not reading at grade level by grade three, how can it not be willing to take a month out of a student’s summer?” Lux asked.

Outgoing Lafayette Superintendent Ed Eiler echoed Lux’s concerns, saying he wasn’t sure how to explain to a parent whose child scored well on standardized tests two years in a row why that student was low-growth. It’s more than a philosophical difference between norm- and criterion-referenced tests, he said.

“We really believe one of the flaws is we’re trying to use one measure, which is a student growth measure, for multiple purposes,” Eiler said. “It is appropriate as a student growth measure, but when you begin to apply and use the same measure to measure the performance of teachers and schools, that’s where it breaks down.”

Eiler said he doesn’t buy the argument that the growth model averages out when it’s applied across the state. At the end of the day, educators in low-income or high-mobility areas aren’t teaching in a classroom with a perfect distribution of high-growth students, he said.

All Students ‘Can Be High Growth,’ According To Bennett

Bennett disagrees that the state’s model divides students into high-, typical- and low-growth. He says there’s been a tremendous amount of misinformation about “fixed percents” and the A-F grading scale.

“Frankly, 100 percent of your students can be high-growth,” Bennett told educators at the Select Commission on Education meeting.

He says he and Lux have been discussing the use of a criterion-referenced test at the center of a norm-referenced model since the Department of Education unveiled its A-F scale. Bennett called the state’s model “statistically valid” and promised to work with educators to help dispel the myth that the system will divide Indiana students into thirds.

Bennett says the change he’d most like to see to the current system is more support for high-growth kids in the upper 25 percent.


  • Karyn

    Is Bennett saying that 100% of all students in the state of Indiana can be “high growth”…or that an individual teacher or individual principal could possibly have 100% of the students in his/her classroom or school rated as “high growth” students? The first does not sound at all possible under my understanding of the growth model. From the DOE website…
    Q: How much growth is good enough?

    A: We classify student growth into three bands:

    High Growth is from 66th to the 99th percentile,
    Typical Growth is from 35th to the 65th percentile,
    Low Growth is from 1st to the 34th percentile.

    Q: What average score is used in determining the 50th percentile at building level, district, state?

    A: The average is for all Indiana students in the same grade level and content area with that particular scale score, or the academic peer group.
    If the 50th percentile is based on the statewide average of the academic peer group…and a student at the 50th percentile is considered “typical growth”…how on earth can EVERY student in an academic peer group be “high growth”?!?

    I guess my primary question for the DOE would be why, if Indiana superintendents and other educators are having such a hard time understanding a model that has been in place for the past three years, they haven’t done a much better job of explaining exactly how it works! I have actually witnessed two IN DOE employees being asked a question about the growth model…and disagreeing with each other on the correct answer (and it was on a major concept). If this is basically the “be all and end all” of measuring student achievement, why is it okay that the DOE’s FAQ on the growth model hasn’t been updated on their website since 2009? (see date at the bottom, and see years referenced within the document itself)

    As a teacher, I don’t get to just shake my head and roll my eyes when my students don’t understand an important concept — I need to fix that immediately. The DOE doesn’t get to complain about “misinformation” being out there when they are the ones responsible for developing this system and educating people about it. If there is too much “misinformation” out there, it’s because there was a dearth of correct information out there in the first place, leaving educators to fill-in-the-blanks.

    • David

      Can someone ensure that Karyn’s feedback gets to IDOE? It is all too common that programs are shoved in, and those who are responsible for doing so, and for meting out consequences when people don’t go along with them, get a pass on doing what they grade others on. I’m always appalled when education leaders do stupid things. Education is supposed to teach people to know and do the right things. I just don’t get it. The public deserves leaders who are servants of the needs of those they purport to lead! If our leaders won’t check their egos themselves, there are things citizens can do to help them do so.

    • Daniel Wydo

      We are about to implement student growth measures in the evaluation of teachers down here in NC, and there is legislation being offered by the Republicans to implement this school grading scale based on an aggregate of student growth per school (modeling it after Florida’s). I am just a peon teacher, so I can tell you what I think, but I am not guaranteeing that I am correct.

      I think he is stating that it is possible that every student in Indiana shows growth, but I think it stops there, because obviously when you employ the mean and percentiles, you are automatically pigeon-holing your groups. I think it could be likely to see whole schools with very high growth among students, but it would indeed be difficult to see EVERY student even at the school level show growth. I think the author should have been clearer as to the context of this statement, but who could blame her? What I have noticed about growth scores and Value-Added Methods of calculating growth scores is that they rely on doctoral-level statistics and use complex matrices within multiple linear equations. Not very user-friendly obviously.

      The statement from Tony Lux is dead on.

      Also, this whole growth thing is taken directly from Bill Gates’s philosophy of “stack-ranking” (which occurs at Microsoft) where someone in a group is going to get the boot (no matter how well the performance) – the looming question for our families, and in terms of the time and energy we put into our professional degrees, is who?

      I would say it is high time for teachers to find a new line of work, but on the other hand we leave our kids and schools out to hang because that is exactly what the democrats and republicans want right now. It would be cheaper to sit a student down in front of a computer in a privatized “virtual learning environment” for 6 hours where there is low overhead (no teacher wages, benefits, or pensions). This is the ultimate goal.
