Searching for Answers to the Testing Question | Teachers College Columbia University

Searching for Answers to the Testing Question

Assessment and Evaluation Research Initiative's second conference commemorates TC's legacy in testing, advocating assessment for learning.

By Siddhartha Mitter

“Assessment has become the driver of education. If we don’t get assessment right, I suspect we won’t get education right.”

That admonition, delivered by TC emeritus professor Edmund Gordon, was the central focus of “Testing Then and Now,” a day-long conference held at TC in early December that sought to move the debate over evaluation beyond the reflexive reactions that have made it one of the most contentious issues in American education.

The conference sought to connect the historical roots of the educational assessment enterprise with contemporary issues as well as to bridge the gap between the “psychometricians” who design tests and the policymakers and educators who implement them. Building consensus among those groups is part of the mission of the Assessment and Evaluation Research Initiative (AERI), the TC center which hosted the event.

“Test developers and affiliated researchers should make reports and evidence more understandable to the public and lay users,” said conference organizer Madhabi Chatterji, Associate Professor of Measurement, Evaluation and Education and Director of AERI. “And test users could become more informed consumers of tests and test-based information. They should make decisions that are supported with validity evidence, and curb the over-interpretation or over-ambitious use of tests and test-based information.”

“Testing Then and Now” was co-organized by AERI, TC’s Institute for Urban and Minority Education, the College’s Department of Education Policy and Social Analysis, and the Gordon Commission on the Future of Assessment in Education, a group of 31 leaders from academia and policy chaired by Gordon himself for the past two years at the behest of the Educational Testing Service.

While the conference ultimately sought to chart a stronger future for educational measurement and evaluation, much of the focus was on mining lessons from the field’s rich past, in which TC has played a huge part.

“Since the time of Dewey and Thorndike, this institution has been a place of debate about psychology, measurement, its use in learning and the fairness of that use,” said TC Provost and education historian Thomas James.

James was referring to psychologist Edward Lee Thorndike, who during the first decade of the 20th century published Theory of Mental and Social Measurements, and developed the first “standard scale” to measure student performance. Thorndike’s son Robert Ladd Thorndike also made “major contributions as a test developer, psychometrician, industrial and military psychologist, and educational researcher,” in the words of Neal Kingston of the University of Kansas.

Many of the conference participants argued that a return to the Thorndikes’ emphasis on the diagnostic uses of testing is critical to the field’s future.

“If there was any consensus” among the Gordon Commission’s members, “it’s that it’s time to use assessment to inform teaching and learning,” Gordon told the conference. The Commission released recommendations earlier this year that centered on transforming assessment to serve as an ongoing, context-sensitive, and supportive input in education – rather than as a snapshot exercise with punitive policies tied to results.

But how to realize that vision?

Chatterji argued for a stronger focus on understanding the contexts in which tests are administered, and on using tests for the purposes for which they were designed. Diagnostic assessments that support individual learning in the classroom and large-scale standardized assessments that evaluate school programs each have their place and purpose, she said, but there is work ahead to improve both.

“The ‘use’ is part of the validity,” she said. “Validity, test use, and consequences are inseparable.”

Yet at present, Chatterji and other speakers contended, policymakers fail to recognize those connections.

“The current assessment infrastructure is at odds with the challenge” of public education today, said Kent McGuire, president of the Southern Education Foundation. “The focus is disproportionately on accountability, not learning. We are preoccupied with generating summative information that doesn’t really provide policy-makers or school leaders with the information they need to bring about large improvements in learning.”

For example, McGuire said, “the testing window is in April and May, the results come back in the summer, there’s nothing for parents and nothing that goes back to the students for them to improve.”

These longstanding issues threaten to become even more problematic with the advent of the Common Core standards for mathematics and English language arts, which 45 states have adopted. Chatterji said that while the Common Core is well intended and offers the chance to raise standards for vast numbers of students, the initiative’s implementation of the testing program could jeopardize those hopes. “The tests are used as a top-down policy tool; there are high stakes tied to results of student testing for schools, teachers, and school leaders; and there is an unrealistic timeline for policy implementation.”

Until the focus changes, others argued, the current public backlash against testing is likely to continue, uniting a broad coalition of disparate players and limiting the potential contributions of the field.

Jeffrey Henig, Professor of Political Science and Education, argued that the anti-testing “movement” is not so much an actual movement as a “coalition of strange bedfellows.” It includes suburban parents and urban “opt-outers” who won’t let their children take the tests, public-school funding advocates who decry “dumbing-down,” and groups that see testing as a facet of federal intrusion or an unnecessary government expenditure. “These groups are linked by dissatisfaction but have different core goals and values,” Henig said. “This could lead to many different outcomes depending on how the groups shift and align.”

The conference offered hope that the assessment community will join forces with educators and policy-makers to shape new assessment systems and influence their use in ways proposed by the Gordon Commission.

“Testing as a technical matter has been getting better,” Henig said. And Robert Mislevy, the Frederic M. Lord Chair in Measurement and Statistics at ETS, pointed to assessments that are more customized to individual learners and make creative uses of new technology in areas ranging from an AP studio art class in New Jersey to the Cisco Networking Academy, National Board of Medical Examiners, and certain community colleges.

“The projects where I’ve seen the biggest changes are at the margins, not in the big education machine,” Mislevy said. “Psychometricians and testing companies have the concepts and are already working on them. They can adapt to new kinds of assessment quickly or slowly – but it depends on what people want.”

Published Thursday, Dec. 12, 2013