Jeffrey Hinton
How Did We Get Here? A Brief History of American Standardized Testing
Updated: Dec 9, 2020

Schools in the early 20th century looked much different than they do today. At that time, the American economy was driven mostly by agriculture and manufacturing, two fields that required little more than a rudimentary knowledge of the three Rs: reading, writing, and arithmetic. Rural students could achieve this proficiency level with only a few years of formal schooling, usually in one-room schoolhouses. Urban students, many of whom were immigrants, became "Americanized," learning English, personal hygiene, and American history, including the democratic process. Professionalization of the teaching field was in its nascent stages, and only about ten percent of 14-to-17-year-olds graduated from high school. It wasn't until the 1920s that Progressive leaders enacted compulsory education laws requiring students to attend school. Between 1910 and 1940, in what has become known as the "high school movement," secondary schools expanded rapidly, mostly in non-Southern states. Schools were segregated by race and would remain so until the 1950s. High school attendance grew quickly as schools began diversifying their curriculum. The newly created I.Q. test steered students into college preparation or vocational tracks. Despite the growing importance of K-12 education, it remained largely a local endeavor. However, America's participation in the Cold War would forever change the educational landscape by strengthening the federal government's influence on educational policy, including accountability measures and the use of high-stakes tests.
On October 4, 1957, the Soviet Union launched Sputnik into low Earth orbit, changing not only the trajectory of the Cold War but American education as we knew it. As Sputnik circled the Earth, it emitted battery-powered radio pulses for the world to hear. For weeks Sputnik was a constant reminder that America's Cold War foe had struck a decisive blow in what would become known as the Space Race. Soviet technological supremacy struck a nerve and sent a chill down the collective spine of Americans, who believed that a missile gap had developed between the Cold War rivals. Sputnik was a clarion call to American policymakers, who argued that Soviet preeminence in rocket technology would give the Soviets a tactical advantage and that it resulted from severe flaws in an education system that did not produce the scientists, engineers, and mathematicians necessary to defeat our Cold War foe. In response to Sputnik, in 1958, the federal government passed the National Defense Education Act (NDEA). This was the first time the federal government passed comprehensive education legislation that would provide direct aid to public and private learning institutions at all levels. This law forever expanded the federal government's power over education policy by aligning education priorities with national priorities. As the Department of Education later summarized, "To help ensure that highly trained individuals would be available to help America compete with the Soviet Union in scientific and technical fields, the NDEA included support for loans to college students, the improvement of science, mathematics, and foreign language instruction in elementary and secondary schools, graduate fellowships, foreign language, and area studies, and vocational-technical training" (U.S. Department of Education, 2008).
The federal government's encroachment into the education space was not an entirely new idea. In 1787 the Continental Congress passed the Northwest Ordinance, which paved the way to statehood for the territories adjacent to the Great Lakes. Congress directed each territory to create schools for "good government and the happiness of mankind." During the Civil War, Congress passed the Morrill Act of 1862, which allowed for the creation of land-grant colleges using the proceeds from Western federal land sales. These colleges were to teach agricultural sciences, which many believed were essential to the national economy's continued development. Still, before the 20th century, lawmakers understood that matters of education were state issues protected by the 10th Amendment of the Constitution, which states that "The powers not delegated to the United States by the Constitution, nor prohibited by it to the states, are reserved to the States respectively, or to the people." The federal government's role in K-12 education would not significantly change until the 1960s and the Civil Rights Movement.
In 1954, the Supreme Court unanimously ruled in the landmark Brown v. Board of Education case that segregation of the nation's public schools based on race was unconstitutional. The Brown decision overturned a nearly sixty-year precedent of legal segregation known as the "separate but equal" doctrine. The court, led by Chief Justice Earl Warren, decided that the legal segregation upheld by the Plessy v. Ferguson decision violated the 14th Amendment's equal protection clause. Writing for the unanimous Court, Warren penned, "We conclude that, in the field of public education, the doctrine of 'separate but equal' has no place. Separate educational facilities are inherently unequal. Therefore, we hold that the plaintiffs and others similarly situated for whom the actions have been brought are, by reason of the segregation complained of, deprived of the equal protection of the laws guaranteed by the Fourteenth Amendment." The Brown case was a historic milestone in the fight for African American civil rights. With legal segregation dismantled in the courts, civil rights leaders would test the limits of de facto segregation across the South, launching the modern Civil Rights Movement. Brown did more than just dismantle legal segregation, however. It led to other important ideas concerning equity and access, most significantly in the realm of public education. The Brown decision was successful in changing the law, but it would take activists' work on the ground to see that the law was carried out as intended. One such group was the Little Rock Nine. Composed of nine African American students and organized by the National Association for the Advancement of Colored People, the Little Rock Nine endured a year of verbal and physical abuse to desegregate Little Rock's Central High School in Arkansas. Ultimately, the students had to be escorted to class by the 101st Airborne Division and the Arkansas National Guard.
The Little Rock Nine was but the first salvo in a protracted fight to integrate America's classrooms. But as schools gradually integrated across the country, new conversations about the quality of education provided to America's students became a national issue. No longer would it be acceptable to have persistent gaps in achievement between Black and White children or rich and poor.
In response to the growing gaps in achievement, in 1965, Congress passed the Elementary and Secondary Education Act (ESEA) as part of President Lyndon Johnson's War on Poverty. Johnson's landslide victory in the 1964 presidential election gave him the political capital he needed to get one of the most sweeping pieces of public education legislation through Congress. The law would provide "quality and equality" by delivering over 1 billion dollars annually through its first statutory section, called Title I, to help cover the costs of teaching disadvantaged children. Additionally, the federal government authorized funds for teacher professional development, instructional materials including textbooks and library books, special education and parental involvement programs, and scholarships for low-income college students. ESEA was a radical departure from all federal education legislation that preceded it because it forever cemented the federal government's role in providing every child a high-quality education. Under ESEA, school had become the great social equalizer, intended to level the playing field for historically disadvantaged students. As Johnson stated in his remarks on signing ESEA into law, "It represents a major new commitment of the federal government to quality and equality in the schooling that we offer our young people. I predict that all of those of both parties of Congress who supported the enactment of this legislation will be remembered in history as men and women who began a new day of greatness in American society…as President of the United States, I believe deeply no law I have signed or will ever sign means more to the future of America." Johnson could not have known then that the law he had guided through Congress to provide resources for America's neediest children would also begin the country's ambivalent relationship with testing and accountability. The writing was on the wall, however.
During Senate hearings to debate the ESEA, Senator Robert Kennedy asked rhetorically: "…I wonder if we couldn't have some system of reporting…through some testing system that would be established (by) which the people at the local community would know periodically…what progress had been made."
The assessment tool that Kennedy longed for was realized beginning in the mid-1960s, when Congress authorized the Exploratory Committee for the Assessment of Progress in Education (ECAPE) to create a national test. The first national assessment of student learning, which would later be known as the National Assessment of Educational Progress or NAEP, was given in 1969. Administered by the National Center for Education Statistics (NCES), NAEP is the only national assessment that evaluates public and private school students in grades four and eight. NAEP's testable subjects include reading, mathematics, writing, science, history, civics, economics, foreign language, geography, and the arts. Referred to as the "nation's report card," NAEP is the oldest continuously administered assessment of what students know and are able to do, which allows for longitudinal comparisons of student learning across state lines. NAEP is significant because it marked a shift in the federal government's educational data from inputs, such as class sizes, attendance, and budget expenditures, to outputs, such as student achievement. "In the early 1960s, Francis Keppel, then U.S. Commissioner of Education, recognized the need for a national assessment that would provide technically sound and valid data regarding pupils' knowledge, skills, and abilities... For nearly 100 years, reports issued by previous commissioners dealt primarily with summary descriptive statistics of 'input' variables in the education system, such as per-pupil expenditures, attendance, number of classrooms, teacher salaries, enrollment, and so forth... Only during Keppel's tenure... (1962-1965) was any attention paid to gathering data on such 'output' variables as how much students were learning and what progress was being made in U.S. education."
Nothing did more to cement into the public consciousness the notion that America's schools are failing and in need of immediate reform than A Nation at Risk. In 1983, the National Commission on Excellence in Education released to the Department of Education its seminal report, A Nation at Risk: The Imperative for Educational Reform. The Department of Education created the commission to assess the quality of teaching and learning at the country's primary, secondary, and postsecondary levels. There was a growing fear that the nation was not keeping up with the rest of the world in educating a competitive workforce. The report's primary author was James J. Harvey, who synthesized the findings of the eighteen-member commission, which was made up of professionals from the private sector, government, and education. In its "open letter to the American people," the commission determined that "the widespread public perception that something is seriously remiss in our educational system" had been substantiated, as the nation's students had fallen behind the rest of the world in the areas of commerce, industry, science, and technological innovation. Indicators of the risk included the following statistics: Some 23 million American adults are functionally illiterate by the simplest tests of everyday reading, writing, and comprehension. About 13 percent of all 17-year-olds in the United States can be considered functionally illiterate. Functional illiteracy among minority youth may run as high as 40 percent. Average achievement of high school students on most standardized tests is now lower than 26 years ago when Sputnik was launched. Many 17-year-olds do not possess the "higher order" intellectual skills we should expect of them. Nearly 40 percent cannot draw inferences from written material; only one-fifth can write a persuasive essay; and only one-third can solve a mathematics problem requiring several steps. Business and military leaders complain that they are required to spend millions of dollars on costly remedial education and training programs in such basic skills as reading, writing, spelling, and computation (U.S. Department of Education, 1983).
Harvey went on to point out that "the educational foundations of our society are presently being eroded by a rising tide of mediocrity that threatens our very future as a Nation and a people...If an unfriendly foreign power had attempted to impose on America the mediocre educational performance that exists today, we might well have viewed it as an act of war." The findings contained in the report are not without criticism, however. Two of the report's original authors pointed out that they did not set out to be objective in analyzing the American public school system; quite the opposite. They began already alarmed by what they believed was a decline in education and looked for facts to fit that narrative. A governmental report published just seven years after the publication of A Nation at Risk showed that students' standardized test scores indicated "steady or slightly improving trends" in student achievement.
While A Nation at Risk may have been a politically motivated attempt to prevent the Reagan administration from dismantling the Department of Education, the report used language that spoke directly to the American people about the need for increased achievement and accountability in the nation's schools. Eventually, A Nation at Risk received bipartisan support and would influence the school reform movement for years to come, with its focus on "character, content, and choice." Whether the report's findings were an accurate representation of students' abilities is still debated. However, the report did establish what would become a national refrain: our schools are broken and need to be fixed. That refrain provoked a national conversation about the need for accountability and choice-based reforms that would shape education policy for decades.
A few years after the publication of A Nation at Risk, President George H.W. Bush and a coalition of governors launched "America 2000: An Education Strategy." While not a formal law, it was a framework that sought to improve American schools and student achievement through the attainment of six national educational goals by the year 2000. Among the goals were that 90 percent of students would graduate high school, that students would become first in the world in science and math, and that students "will leave grades four, eight, and twelve having demonstrated competency in challenging subject matter including English, mathematics, science, history, and geography; and every school in America will ensure that all students learn to use their minds well, so they may be prepared for responsible citizenship, further learning, and productive employment in our modern economy." In 1994, President Bill Clinton signed into law the Goals 2000: Educate America Act. The law contained many of the reforms he had overseen as Governor of Arkansas. This act transformed the America 2000 framework into an outcomes-based law resting upon voluntary national standards. The states would receive federal grant money if they committed themselves to the law’s goals. However, many lawmakers feared that it would result in a federal takeover of local education. And while the law did not authorize the "Federal Government to mandate, direct, or control a State, local educational agency, or school's curriculum," it could coerce them into compliance with the lure of federal money. By 2000 few of the goals had been met, and Congress defunded the program in 2001.
One of the most transformational pieces of education legislation of the last twenty years was the No Child Left Behind Act (NCLB) of 2001. Signed into law by President George W. Bush as an update to the ESEA, NCLB resulted from an overwhelmingly bipartisan effort to transform American schools by providing an equitable, high-quality education for every student without regard to race or income. Despite the 200 billion dollars invested in ESEA since 1965, many believed that too many of America's children had fallen behind, creating an ever-widening education gap that separated the rich from the poor and left behind English language learners, students with special needs, and minority children. Further, many believed that the nation's continued educational malaise was responsible for its lost standing in the world. Speaking to the National Association for the Advancement of Colored People's 91st convention, then-Governor Bush of Texas pointed out that "A great movement of education reform has begun in this country built on clear principles: to raise the bar of standards, expect every child can learn; to give schools the flexibility to meet those standards; to measure progress and insist upon results; to blow the whistle on failure; to provide parents with options to increase their option, like charters and choice; and also remember the role of education is to leave no child behind." Bush wanted to replicate at the federal level what many considered to be the "Texas Miracle" in state education policy: reduced drop-out rates and higher test scores and achievement, primarily among disadvantaged students. Today, there is considerable doubt about the integrity of the claims made about the Texas Miracle because of the creative management of educational data. One example was the deliberate omission of low test scores when calculating student achievement (Haney, 2000).
NCLB marked a turning point in federal education policy by holding states responsible for all students' academic progress. Bush believed that minority students had been allowed to languish in educational failure, and NCLB was created to fight the "soft bigotry of low expectations." For example, in reading and math, the law required that states report test scores annually both as an aggregate of all tested students and for individual "subgroups," including English language learners, special education students, racial minorities, and children from low-income families. Additionally, NCLB required states to bring all students to the "proficient" level in reading and math by the 2013-2014 school year, a goal seen by many as a "set up for failure." In fact, by 2015, no state had reached the one hundred percent proficiency benchmark. States were kept on track toward their goal by a policy known as adequate yearly progress, or AYP. In perhaps one of the most controversial aspects of NCLB, a school's failure to reach AYP triggered a cascade of punitive measures. For example, if a school was labeled failing two years in a row, it had to allow students to transfer to a better-performing school. After three years, students were able to receive free tutoring. Persistently failing schools faced state interventions such as shutdowns, conversions to charter schools, or other "turn around" strategies such as replacing administrators and teachers (Klein, 2020). Additionally, NCLB required that all teachers be "highly qualified," meaning that they must have a bachelor's degree and state teaching certification. The law stipulated that highly qualified teachers be evenly distributed between wealthier schools and schools with high poverty rates, but this was seldom carried out.
Criticism of NCLB has been significant, including the fact that few students left failing schools for better-performing ones or took advantage of tutoring opportunities. However, the most significant criticism of NCLB has been the federal government's domination of educational policy, including the increased importance of standardized tests. Standardized testing in education is not a new phenomenon. Students have been taking tests such as NAEP and the Iowa Test of Basic Skills for years. What is different is that those tests were used for diagnostic purposes, meaning that the data collected informed instructional and policy decisions aimed at increasing student achievement. NCLB, by contrast, used standardized test scores to coerce schools into compliance. Because reaching AYP depended on reading and math scores, many schools significantly narrowed their curriculums by drastically reducing or eliminating time spent on untested subjects such as social studies, art, music, physical education, and foreign languages. Further, "drill and kill" test-taking strategies became a dominant pedagogy for many teachers hoping to raise student achievement scores. This was especially true with "bubble kids," or students who were within a few points of reaching proficiency. Unfortunately, the emphasis placed on standardized test scores incentivized the marginalization of students at the very top and bottom of the test-taking range.
Under NCLB, states were able to administer their own standardized tests for measuring student achievement. However, the law also required biennial NAEP assessments in reading and math in fourth and eighth grade and at least once in high school beginning in 2003. NCLB mandated that the National Center for Education Statistics report achievement data broken out by demographic categories such as race and ethnicity, gender, economic status, English language ability, disabilities, and parental education. This way, a more accurate picture of student academic achievement could be ascertained. Rather than being judged on state averages that lumped all students together, states would be held responsible for individual subgroups in reporting AYP. NAEP tests in reading and math were reported for each subgroup using a 0-500 point scale. For eighth-graders, a reading score of 243-280 is considered basic, 281-322 is proficient, and 323-500 is advanced. In mathematics, a score of 262-298 is basic, 299-332 is proficient, and 333-500 is advanced.
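To make the score bands above concrete, here is a minimal Python sketch that maps a scale score to its achievement level using only the cut scores quoted in the preceding paragraph. The function name `naep_level`, the dictionary layout, and the "below basic" label for scores under the lowest cut are illustrative assumptions, not part of any official NAEP tool.

```python
from bisect import bisect_right

# Lowest score of each level (basic, proficient, advanced),
# copied from the cut scores quoted in the paragraph above.
CUT_SCORES = {
    "reading": (243, 281, 323),
    "mathematics": (262, 299, 333),
}

# "below basic" is an assumed label for scores under the first cut.
LEVELS = ("below basic", "basic", "proficient", "advanced")


def naep_level(subject: str, score: int) -> str:
    """Return the achievement level for a 0-500 scale score."""
    if not 0 <= score <= 500:
        raise ValueError("scale scores run from 0 to 500")
    # bisect_right counts how many cut scores are <= score,
    # which is exactly the index of the matching level.
    return LEVELS[bisect_right(CUT_SCORES[subject], score)]
```

For example, `naep_level("reading", 280)` falls in the basic band, while `naep_level("reading", 281)` crosses into proficient. A one-point difference at a cut score changes the label, which is part of why students near a threshold drew so much attention under AYP.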
The federal government required NAEP assessments for two reasons. The first was to make sure that all fourth- and eighth-graders reached proficient reading and math levels by 2014, closing the achievement gap between socioeconomically advantaged and disadvantaged students. The second reason was that the NAEP would serve as an indicator of the rigor of state-level tests. Subsequent studies of NAEP results between 2003 and 2017 indicate that most students were, in fact, left behind. While there were some gains for low-income children scoring below NAEP basic levels, a large percentage of eighth-graders failed to demonstrate minimum levels of achievement by reaching NAEP's "basic" threshold in reading and math (Stevens, Tracy, Baker, & Wolters, 2020). A Brookings Institution study titled The Impact of No Child Left Behind on Students, Teachers, and Schools (2010) found that there were some targeted gains for younger students in mathematics, particularly those from disadvantaged backgrounds. There were, however, no gains in reading achievement. And most importantly, studies indicate that NCLB did nothing to mitigate the persistent and pernicious achievement gap. In other words, NCLB and Title I spending under ESEA have failed to achieve their primary goal of student academic parity in the nation. This failure, however, has come at a hefty price.
From 1965 and the implementation of ESEA to its most recent iteration, the Every Student Succeeds Act, signed into law by President Barack Obama in 2015, the cost of public education has skyrocketed with little progress in standardized testing to show for it. Since 1972 per-pupil funding has doubled in thirty-one states and by 2016 had tripled in fourteen states and the District of Columbia. In 2017 the total combined spending of local, state, and federal agencies on public education was almost $708 billion. An international comparison of educational investment reveals that in 2015 the United States spent on average $12,800 per full-time-equivalent (FTE) student on elementary and secondary education, thirty-five percent higher than the Organization for Economic Cooperation and Development (OECD) average of $9,500 (Education Expenditures by Country, 2020). In that same period, the United States spent 3.5% of its GDP on education, the third-highest in the world behind Norway and New Zealand, and a higher percentage than was spent on the military, according to the World Bank. NCLB has dramatically contributed to the growth in education spending. For example, the law increased average per-pupil expenditure by six hundred dollars, with little matching federal support to offset the cost, requiring cash-strapped states to shoulder the enormous financial burden (Dee & Jacob, 2010). Further, state testing required under NCLB has been estimated to cost states 1.7 billion dollars annually (Ujifusa, 2020). The return on investment of American education dollars has been less than overwhelming.
High-stakes standardized testing under ESEA not only wasted billions of dollars with no appreciable educational achievement and tightened federal control over education policy and dollars, but it also created incentives for teachers and administrators to cheat to game the system. The most prominent example is the Atlanta Public Schools cheating scandal of 2009, considered by many to be the largest school cheating scandal in the nation's history. Eleven Atlanta educators were found guilty of felony racketeering after an investigation into dramatic increases in student scores on standardized tests. Nearly one hundred and eighty employees across forty-four schools, including thirty-eight principals, were accused of manipulating test materials to inflate test scores and avoid penalties set by NCLB. Atlanta was not an isolated case, however. In 2011, USA Today investigated standardized test scores in six states and Washington D.C. and found "1,610 examples of anomalies in which public school classes—a school's entire fifth grade, for example—boasted what analysts regard as statistically rare, perhaps suspect, gains on state tests." The image of handcuffed educators being led out of the courtroom convicted of felonies will be forever linked to NCLB as an indelible reminder of the pernicious unintended consequences of high-stakes standardized tests.
ESEA has failed to produce the academic gains promised by politicians for almost sixty years, despite significant investments from state and federal government. The intended goal of ESEA, and its various iterations over the years, has been to even the academic playing field so that all children, regardless of race or zip code, have access to a high-quality public education. While the intended purpose of ESEA was noble, the reality is that significant factors outside of the school's control have affected student achievement, particularly for students living in poverty. According to Harvard professor Robert Putnam in his book Our Kids: The American Dream in Crisis, the ability of education to act as the "great leveler" has significantly eroded over the years due to disparities in access to enrichment programs, such as daycare and extracurriculars. Further, students in less affluent communities do not have the support that their more affluent peers have in moving through the education system. The achievement gap therefore "is created more by what happens to kids before they get to school, by things that happen outside of school, and by what kids bring (or don't bring) with them to school—some bringing resources and others bringing challenges—than by what schools do to them."
What has not been commonly discussed, however, is the opportunity cost associated with ESEA. While schools were preoccupied with raising test scores using "drill and kill" approaches, students and teachers suffered the consequences of a curriculum stripped of the things they enjoyed about school, the things that ignited their curiosity and passion, such as hands-on learning, role-play, simulations, and field trips, to name a few. Under NCLB, many schools became little more than test-taking factories where students were reduced to data points, and outputs were measured with cold dispassion without regard for the development of the "whole child." "Soft skills" such as creativity, problem-solving, collaboration, and critical thinking were ignored as teachers were discouraged from teaching anything that was not tested. The emphasis on testable subjects to the exclusion of almost everything else will significantly impair American children's ability to become the innovators needed to be competitive in the future. What innovation we do achieve will be despite our education system, not because of it.