The Value-Added Side of Standards

author:	Chris Pipho
description:	"Standards and assessments have become the universal recipe for improving education. While national goals and state standards are active agents in this mix, the debates over outcome-based education and the second- and third-generation rewrites of standards documents have made the standards more useful. In some states, the wider acceptance of standards, followed by the release of one or more yearly assessments, has added up to a formidable combination. But the unobtrusive and symbiotic value-added activities that are now possible (because a steady stream of usable data has become available) are no less important than the standards themselves. Research activities and new reporting mechanisms tied to high-stakes consequences of assessments are building a base of support around these accountability tools. Instead of seeing standards and testing as the problem, even former critics can sometimes find a way to use the new data to support reform ideas." Reproduced with permission of Phi Delta Kappan (January, 1998, pp. 341-342).
published in:	Phi Delta Kappan
published:	01/01/1998
posted to site:	03/19/1998

THE VALUE-ADDED SIDE OF STANDARDS

BY CHRIS PIPHO

Reproduced with permission of Phi Delta Kappan (January, 1998, pp. 341-342).

Standards and assessments have become the universal recipe for improving education. While national goals and state standards are active agents in this mix, the debates over outcome-based education and the second- and third-generation rewrites of standards documents have made the standards more useful. In some states, the wider acceptance of standards, followed by the release of one or more yearly assessments, has added up to a formidable combination. But the unobtrusive and symbiotic value-added activities that are now possible (because a steady stream of usable data has become available) are no less important than the standards themselves. Research activities and new reporting mechanisms tied to high-stakes consequences of assessments are building a base of support around these accountability tools. Instead of seeing standards and testing as the problem, even former critics can sometimes find a way to use the new data to support reform ideas.

Tennessee's Value-Added Approach

Initiated in 1990, the Tennessee Comprehensive Assessment Program (TCAP) measures student performance in mathematics, reading, language arts, science, and social studies in grades 2 through 8. In 1992 the state legislature mandated an extension of this program, which is known as the Tennessee Value-Added Assessment System (TVAAS). Under the direction of William Sanders and a group of researchers at the University of Tennessee, a strong statistical model was developed that uses the scaled scores from the TCAP to develop a profile of academic growth for individual students. Unlike stanines or percentile scores, which are commonly used in reporting norm-referenced test results, the scaled scores can indicate a student's current level of academic attainment. When collected over a period of years, these data can establish a profile of past and future academic growth. Sanders reports that, by statistically aggregating the "dimples" and "bubbles" in these curves from a population of students, the impact of school systems, school buildings, and individual teachers on student academic gains "can be fairly estimated."
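For readers who want a concrete picture of what "aggregating" growth means, the following is a deliberately simplified sketch in Python. It is not the TVAAS model, which Sanders describes as a multivariate, longitudinal analysis; it only computes each student's year-to-year gain in scaled score and compares each teacher's average gain with the overall average. The student and teacher names, the scores, and the function name are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student, teacher, prior-year scaled score, current-year scaled score).
# Every name and number here is invented; none comes from TCAP or TVAAS data.
records = [
    ("s1", "teacher_a", 700, 730),
    ("s2", "teacher_a", 650, 685),
    ("s3", "teacher_b", 710, 720),
    ("s4", "teacher_b", 640, 655),
]


def mean_gain_by_teacher(rows):
    """Return each teacher's average student gain minus the overall average gain.

    This is a naive gain-score comparison, assumed here only to illustrate the
    idea of measuring growth rather than attainment. It ignores measurement
    error, multiple years, and multiple subjects.
    """
    gains = defaultdict(list)
    for _student, teacher, prior, current in rows:
        gains[teacher].append(current - prior)
    overall = mean(g for per_teacher in gains.values() for g in per_teacher)
    return {teacher: mean(g) - overall for teacher, g in gains.items()}


effects = mean_gain_by_teacher(records)
for teacher, effect in sorted(effects.items()):
    print(teacher, effect)
```

Because the comparison is based on gains rather than on raw score levels, a teacher whose students start low but grow quickly is not penalized for the starting point, which is the intuition behind the value-added approach.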

Following a recent presentation to a group of Ohio policy makers, Sanders prepared a summary of this program. Large portions of that summary are reproduced here with permission.

"In the United States, individual states, districts, and schools have administered student achievement tests in various academic subjects for decades," Sanders' summary begins. In most instances, he argues, the use of these data has been restricted to placing an individual student somewhere along the distribution of the general population of students or to comparing simple mean scores between districts and schools.

The latter use has been correctly criticized for being "fraught with unfair misinterpretation because of severe socioeconomic biases which affect these rather simplistic views of the test data. The legacy of these misuses has created a commonly held belief that this type of data has little to offer constructively in both the accountability and assessment areas," Sanders writes. In contrast, the TVAAS uses these very traditional achievement data as its input and applies a statistical methodology not previously used in educational assessment, which has demonstrated that the relative effectiveness of school districts, schools, and teachers in facilitating academic growth of student populations can be estimated with considerable sensitivity. "This methodology enables a massive multivariate, longitudinal analysis," Sanders states, "that in turn yields direct measures of the educational influences on student academic progress, free of the undesirable socioeconomic confoundings."

Without such longitudinal analysis, TVAAS would offer no advantages over traditional practice. According to Sanders, "The minimal requirements to implement a TVAAS-like process are: 1) test each student each year in important academic subjects, [and] 2) use test instruments which a) provide linear scales with appropriate standard errors of measurement, b) provide sufficient stretch to measure progress for the lowest- and highest-achieving students, and c) are highly correlated with curricular objectives." The following list is a partial summary of the research findings from TVAAS.

The single largest factor affecting academic growth of student populations is differences in effectiveness of individual classroom teachers.
The effects of class size and the degrees of heterogeneity of prior achievement within a classroom are but two factors whose impact on student academic gain pales in comparison with the differences in teacher effectiveness.
Research findings from TVAAS, recently confirmed by other research efforts, suggest that teacher effects are cumulative and additive, with little evidence of later compensatory gain.
The latent effects of teachers, both positive and negative, can be measured for at least three years after students have left the classroom, regardless of the effectiveness of the subsequent teachers.
Lower-achieving students are the first to benefit as teacher effectiveness improves. With many exceptions, the higher-achieving students do not have the opportunity to demonstrate academic growth at the same rate as lower-achieving students.
More variability in teacher effectiveness exists in the higher elementary grades than in the lower elementary grades. As the grade level increases, teacher variability increases, and for math the increase continues into high school.
In the aggregate, school principals have very little impact on the academic growth of their school population. Teachers are functioning as independent entities with little evidence of a community effect.
When populations of students change buildings, there is a measurable drop in academic growth for the first year in the new building. This is true regardless of the grade level.

Since 1991, the changes measured in Tennessee by TVAAS have shown that eighth-grade averages in math, language arts, and science have trended slowly upward, while averages in reading comprehension have trended downward. (Math results from the National Assessment of Educational Progress have confirmed the TVAAS results.) In addition, the percentage of extremely low-performing schools is decreasing slowly.

There's a good news/bad news part of this story of the new assessment process in Tennessee, and it stems from the same fact: Sanders has identified individual teachers as the most important factor in student achievement growth. Teachers probably always assumed this when assessment scores went up, but they looked to socioeconomic excuses when scores didn't go up. This program provides evidence that supports good teachers and can put the spotlight on poor teachers, and so it could give policy makers the assessment tool they have been looking for. The role of the principal in providing building-level leadership for instruction and in assigning students to teachers will become increasingly important as principals learn how to use the new tools. Sanders reports that variations in teacher effectiveness are often greater within a single building than across buildings within a district. With such data in hand, a principal will have to make sure that a student doesn't get assigned to a low-performing teacher for two years in a row. In Tennessee the district- and building-level information is covered by the state's open records law, but individual teacher data are excluded.

The Downside of Accountability

The use of standards and assessment programs for high-stakes consequences (such as grade promotion, high school graduation, or district academic bankruptcy takeovers) seems to be on the rise. In all cases, early and multiple warnings followed by remediation efforts and improvement plans are important value-added elements of an accountability plan. While the passing rate for first-time takers of high-stakes tests is important, the impact of remediation and other programs can be tracked as high school students move on toward graduation.

In Tennessee, the sixth annual 21st-Century Schools Report Card, issued by Commissioner of Education Jane Walters, does include information on the percentage of first-time takers who pass the state's competency test in mathematics and language arts.

In North Carolina, the Johnston County school district has been sued by 14 students who are opposed to the use of "end-of-grade" exams that can deny grade-level promotion. They maintain that the tests were designed to measure districts and schools, not individual students. The state has three different versions of such a test being used in grades 3 through 8. When the tests are administered, different versions are used in each classroom. The state agency concedes that a single test would not give an accurate individual score but contends that using all three could provide useful data of other kinds. Plaintiffs in the case charge that the use of the tests is unfair to students who do not perform well on tests in general, such as minority and special education students. Currently, at the elementary level, 4.1% of white students in Johnston County, 11.4% of black students, and 10.2% of Hispanic students are being held back.

The Bottom 'Value-Added' Line

There appears to be growing support for more rigorous academic standards. Recent action by the New York Board of Regents to toughen graduation standards for high school students is one case in point.

The Sanders model, which divides teachers into five categories (from low to high effectiveness) based on whether their pupils score better or worse than anticipated over a four-year period, could change the whole landscape of building-level management. The big question is, Will these assessments be viewed as tools useful to the world of teaching and administration, or will they be viewed as something to attack? Meanwhile, standards and assessments are clearly becoming the backbone of the education reform movement.

For more information about TVAAS, contact: William Sanders, Professor and Director, University of Tennessee Value-Added Research and Assistance Center, P.O. Box 1071, Knoxville, TN 37901-1071; ph. 423/974-7189; e-mail: wsander2@utk.edu.

CHRIS PIPHO is a research professor at the University of Colorado, Denver, and senior fellow at the Education Commission of the States, Denver.