Testing, testing: A history of the SAT

Photo by Renee Ge

Throughout its history, the SAT has struggled to define what it means to assess academic ability.

Renee Ge, In-Depth Editor

School closures. Canceled SATs. Test-optional or test-blind college applications. Before the COVID-19 pandemic, standardized testing was a ubiquitous part of the high school experience: each year, millions of students took the SAT and ACT, which were administered nearly monthly. However, the pandemic halted most testing around the world and accelerated a movement among universities to go test-optional for applicants seeking admission during the 2020-21 school year. The decision casts a shadow on the future viability of standardized tests such as the SAT, but to answer the question of how the SAT will change in the near future, it is important to look back at how it came to be.

The first SAT, then called the Scholastic Aptitude Test, was adapted from IQ tests used in the Army during World War I by Carl Brigham, who worked in the military and would later join Princeton University as a professor of psychology. The Army used these IQ tests to assign recruits to tasks based on perceived intellectual ability, limiting participation in officer training to those who scored well.

At the time, the tests received positive publicity and helped lend respect to the emerging field of psychology, but the results were dubious and reaffirmed racist and nationalist sentiments in the U.S. In Brigham’s book, “A Study of American Intelligence,” in which he analyzed data from the World War I Army IQ tests, he concluded that native-born Caucasian Americans had the highest intelligence and that Eastern Europeans, Southern Europeans and Black Americans had inferior intelligence. On that premise, he warned of racial mixture lowering overall American intelligence. But the data from the IQ tests was deeply flawed. The test administrators assumed that the test-takers were equally fluent in English, and the questions tested for familiarity with American culture instead of intelligence.

“[The administrators] are going to look at [the immigrants] who are not performing well on the test, and then extrapolate that there’s a problem with the newer immigrants coming in being genetically deficient, and the extrapolation they make doesn’t necessarily correlate to the actual data,” said U.S. History teacher Kyle Howden. “Because they were finding that if you’ve been here for a year, you did better on the IQ test. Well, that should be an indication that the IQ test is somewhat faulty.”

Sample questions from IQ tests used in the military. (Photo by Carl Brigham from “A Study of American Intelligence”)

Despite weaknesses in Brigham’s conclusions, his findings reinforced the idea that Caucasians were superior to other races, which was a widely held belief at that time.

“Social Darwinism is coming out of the late 1800s, eugenics is becoming popular at that time, and then the IQ test fits within that,” Howden said.

Brigham later backtracked on his results, acknowledging that prejudiced administration and flawed research techniques made his conclusions baseless, but his work had already influenced anti-immigration legislation, such as the implementation of immigration quotas.

At Princeton, Brigham devised his own admissions test, adapted from the IQ tests he worked with in the past, and administered it to small samples of Princeton freshmen and applicants to Cooper Union, then an all-scholarship technical college in New York. 

His adaptation arrived at an ideal time. Before World War I, a national effort to centralize college admissions was already on the rise. In the seventeenth and eighteenth centuries, only the upper classes were expected to receive a college education, and colleges evaluated prospective students on the basis of character, background and proficiency in Latin and Greek. However, by the middle of the nineteenth century, and especially with the passage of the Morrill Act of 1862, which allowed for the establishment of land-grant colleges specializing in agriculture and mechanics, colleges began to favor a more practical education that emphasized a wider range of subjects, including arithmetic and the sciences. 

As entrance requirements became more complex, secondary schools, especially public schools, struggled to guide their students through the process. The College Entrance Examination Board, more commonly known as the College Board, was formed in 1899 by 12 universities to standardize the college admissions process. The first exams in 1901 required students to write essays for English, history, mathematics, chemistry, French, German, Latin and Greek.

The College Board asked Brigham to chair a new committee and develop the SAT, which would be used for a wide range of school admissions. The test was first administered in 1926 and later popularized by James Bryant Conant, president of Harvard University, who in 1934 established a scholarship program for low-income students in an effort to make the university more inclusive and required its candidates to take the SAT. A year later, Harvard began requiring the test for all applicants, and many universities soon followed.

Sample questions from the SAT in 1926. (Photo by Smithsonian Magazine)

In the following decades, the SAT became the standard test for college admissions in America. This was partly due to the development of machine-based scoring, which made it possible to mass-distribute standardized multiple-choice tests and made the SAT cheaper than ever to administer. But the biggest factor contributing to the SAT’s widespread use was the rise in college applicants resulting from the G.I. Bill, a law passed during World War II that provided a wide range of benefits to military veterans, including college tuition payments.

“After World War II, when there was explosive growth of students going to college because of the G.I. Bill, colleges began looking around for a device to help them sort through these piles of applications, particularly large public universities,” said Bob Schaeffer, interim executive director of FairTest, an organization that addresses fairness and accuracy for standardized tests. “The SAT test was fair, accurate, simple and very cheap for colleges, as colleges pay nothing to use the test. All the costs are borne by students, their parents, and by their schools who provide the labor to administer tests. Colleges get the scores and a host of other information they use for recruitment for free. So it was a good deal.”

This dependence on standardized test scores when evaluating applicants is encouraged by organizations like the College Board and the Educational Testing Service (ETS), a nonprofit organization founded in 1947 to take over testing activities and conduct research on educational measurement. The College Board turned over the SAT’s development operations to ETS, although the College Board still maintains ownership of and administers the SAT. ETS also develops and administers the Test of English as a Foreign Language, Test of English for International Communication, Graduate Record Examination, HiSET and the Praxis test. Both the College Board and ETS have drawn criticism for monopolizing the testing market, earning excessive profits and paying high trustee and executive compensation.

The College Board and the ETS make their argument for the importance of the SAT test on the basis of predictive validity — that evaluating applicants’ academic performance purely by their high school GPA is not as effective or accurate as evaluating their GPA in combination with their SAT score.

“The combination of grades and test scores are the strongest predictor of a student’s likelihood to succeed in college. Grades and test scores serve as a check and balance,” Jerome White, the College Board’s Director of Media Relations and External Communications, said in an email.

Studies from the College Board support its claim of predictive validity. One conducted in 2001 found that SAT scores and high school records predicted academic performance, nonacademic achievement, leadership roles and postgraduate income.

“The College Board’s mission isn’t to ensure all colleges require the SAT, it’s to expand access to college for more students and help them succeed when they get there,” White said. “Whether required for admission or not, SAT scores help colleges create data-driven programs to ensure admitted students get the support they need to graduate.”

However, this claim has been questioned several times when checked by independent organizations, and sometimes by prominent figures in higher education. In 2001, Richard C. Atkinson, then the president of the University of California (UC), proposed abandoning the SAT, then called the SAT I, criticizing the American educational system’s overemphasis on an exam that, from his point of view, was not a meaningful measure of academic achievement.

“The framers of [IQ] tests assumed that intelligence was a unitary inherited attribute, that it was not subject to change over a lifetime, and that it could be measured and individuals could be ranked and assigned their place in society accordingly,” Atkinson said in the paper Achievement Versus Aptitude in College Admissions. “Although the SAT I is more sophisticated from a psychometric standpoint, it evolved from the same questionable assumptions about human talent and potential.”

In a later speech, he cited a study that analyzed the records of nearly 78,000 freshmen who entered the UC system and found minimal correlation between SAT I scores and UC freshman grades, even when controlling for family income and socioeconomic status. On the other hand, the SAT II, a set of specific subject tests, was a much better predictor of freshman grades and less sensitive to differences in socioeconomic status. Atkinson argued that standardized tests should reflect the curriculum taught in schools and that admissions policies should approach applications more comprehensively and holistically.

This criticism from the UC system contributed to changes made to the SAT in 2005. Questions like the analogies from the verbal section and comparison items from the math section were removed from the test to better reflect high school curricula, and a new mandatory writing section similar to the former SAT II Writing Test was added. With that addition, the SAT’s maximum score rose from 1600 to 2400.

However, Les Perelman, a research affiliate at the Massachusetts Institute of Technology, found in 2005 that there was a high correlation between essay length and score received, and his criticism contributed to the College Board’s decision to make the essay section optional in 2016. To better reflect course material taught at high schools across America, the College Board also overhauled the exam in 2016 by shifting away from testing obscure vocabulary, focusing more on evidence-based questions, cutting down on areas covered in the math section, returning to the 1600-point scale and removing penalties for wrong answers.

In January 2020, the UC Academic Senate released a report which found that SAT scores aided in predicting undergraduate GPA, retention and graduation rates. However, the report recommended that the UCs continue to decrease their reliance on standardized test scores, although it did not recommend a test-optional policy at that time because studies on the effects of evaluating applicants solely on high school GPA for universities as large as the UCs were limited.

Still, some believe that standardized tests like the SAT act as a common measuring stick for schools across the nation.

“I feel like there should be some kind of standardized test just so students have a way to show their college preparedness other than grades, because grades are so variable across schools,” said senior Neeraja Sripada.

While each university weighs the SAT differently, many universities in recent years have been weighing standardized test scores less and less.

“It’s more than just tests, even though that has been a big chunk of it,” said guidance counselor Nikki Dang. “The other big chunk would be what classes are you taking, and what grades are you receiving in those classes.”

Amid the ongoing debate over the effectiveness of the SAT, the most damaging blow to its credibility has been SAT scores’ correlation with socioeconomic status. 

“Test scores correlate very strongly with socioeconomic status,” said Schaeffer. “Kids from families with higher incomes and more parental education on average score significantly higher, since those tests are highly susceptible to pricey test prep courses.”

In a paper from the Brookings Institution that analyzed scores on the math section of the SAT using the College Board’s 2015 population data, the authors highlighted racial gaps in SAT scores, observing that Black and Latino students generally scored lower than white and Asian students. They acknowledged that family income is also a factor but that it is difficult to disentangle from race because publicly available College Board data on class and SAT scores is limited. They also noted that, because colleges rely on test scores for admissions, gaps in SAT scores due to class and race perpetuate inequality in American society.

While the SAT plays a role in reproducing inequalities in America, it is also an indicator of broader systemic issues relating to education and social mobility. A paper published in 2013 in the Teachers College Record explored the relationship between class and race by studying the effects of family income on SAT scores for Black and white students. It found that family income had a nonlinear, differential direct effect on total SAT performance for both groups and that in some cases the effect was nearly twice as large for Black students. The paper speculated that one contributing factor was the effect of schooling: because of residential racial and economic segregation, Black students are likely to attend poorer-quality schools, and property values and tax policies contribute to the quality of schooling in an area. It also suggested that parental education levels had some influence on SAT scores.

Some of these inequalities can be seen at Lynbrook as a result of varying access to test prep resources.

“There are things beyond the school that I can’t control,” said Dang. “Like whose families are going to put out money for SAT lessons, versus who are not.”

The inequalities in standardized testing and the college admissions process were keenly felt when Operation Varsity Blues, an investigation into a criminal conspiracy to influence college admissions decisions at several top American universities, was made public in 2019 and was well documented by the media. Many involved in the scandal were affluent figures such as Felicity Huffman and Lori Loughlin, who were among 50 individuals charged with fraud and bribery-related offenses in a scheme that involved cheating on college entrance examinations and fabricating sports credentials. Methods for cheating on the SAT and the ACT included bribing psychologists to falsify paperwork certifying that students had a learning disability to give them access to test accommodations, bribing proctors to correct students’ answers and paying other people to pose as students to take the tests.

“That kind of shows some of the problems with the admissions process in general, that people can just pay their way into schools,” Sripada said. “The SAT is supposed to be this measure that is universal, but even that now is being affected by how much money the parent has and how much they can pay for other people to help their student take the test.”

The College Board has made some efforts to address the issue of inequality. At the start of the 2015 school year, it partnered with Khan Academy to provide access to free SAT preparation. It later introduced an “adversity score,” which it dropped in August 2019. The adversity score sought to address the common criticism that wealthier students scored higher on the SAT than low-income students by introducing an additional score on a scale from 1 to 100, in which a higher score meant that a student faced more adversity. The College Board considered neighborhood crime rate, poverty level and school quality, among other factors, to calculate the adversity score and planned to send the score to colleges along with the SAT scores, although test takers would not be able to see their adversity score. However, the College Board withdrew the adversity score due to criticism that adversity in life could not be accurately distilled into a single number.

“The idea of a single score was confusing because it seemed that all of a sudden the College Board was trying to score adversity. That’s not the College Board’s mission,” said David Coleman, the College Board’s CEO, in an interview with NPR. “The College Board scores achievement, not adversity.”

Instead of calculating a score from students’ backgrounds, the College Board launched a tool called Landscape, which provides admissions counselors with data like average neighborhood income and crime rates but lets universities interpret the data themselves.

Because of the present COVID-19 pandemic, which caused school closures and the cancellation of SAT administrations in the first half of 2020, many universities have elected to use a test-optional admissions policy. Test-optional means that while the universities still accept SAT and ACT scores, they no longer require them in the application.

The trend of eliminating SAT and ACT requirements for college admissions is on the rise due to COVID-19 and closer scrutiny, as universities reassess how much weight they should give standardized tests. Some believe that standardized testing will become obsolete in the near future. On May 21, the UC Board of Regents unanimously passed UC President Janet Napolitano’s proposal to eliminate the SAT or ACT requirement, planning to be test-optional for the high school classes of 2021 and 2022 before going test-blind for the classes of 2023 and 2024. Going test-blind means that UC would no longer accept test scores from the SAT or the ACT in prospective students’ college applications. However, a court ruling by Alameda County Superior Court Judge Brad Seligman on Sept. 1 has accelerated that process, meaning that UC admissions will be test-blind for the class of 2021.

“People are going to understand that they can still evaluate a student without tests, and are still really going to get a good idea of whether or not that student is going to be successful at their institution and fit with their institution or not,” Dang said.

However, many states across the U.S. require students, especially high school juniors, to take the SAT or ACT due to contracts with College Board or ACT, Inc. These states include Colorado, Connecticut, Delaware, Illinois, Michigan, Rhode Island and West Virginia, among others. It is unclear how the pandemic has affected these contracts and whether these states will continue to require the SAT after the pandemic.

There is much uncertainty surrounding the future of standardized testing and college admissions in the U.S., but two conclusions remain clear: the SAT is central to America’s struggle to measure student academic performance in an education system where inequality is widespread, and the SAT plays a significant role in America’s continuous pursuit of equality. Whether that role is negative or positive is up for debate.

At its inception, the SAT was intended to objectively and fairly assess prospective students’ academic abilities for college admissions, but the test perpetuates the same socioeconomic and racial inequality that has been ingrained in American society for centuries. America’s obsession with measuring intelligence was rooted in beliefs of white supremacy, which were strengthened by Brigham’s results from the IQ tests. In the 21st century, the SAT is used widely as a measuring stick for college applications, but critics have raised questions about whether its content reflects high school curricula, its validity as a predictor of academic performance and its dependence on socioeconomic status.

The SAT has gone through many changes over the years, but some questions are still unanswered: How is academic or intellectual ability defined? How does one measure academic achievement in a world where access to educational resources is so unequal? How much of a role should this measurement play in one’s future?