Corpus Linguistics in the Supreme Court

I have been fascinated by this topic for many years, so it was a pleasure to have Justice Thomas Lee (Utah Supreme Court) and Stephen Mouritsen guest-blogging in 2017 about their outstanding work. Yesterday's oral argument discussed a forthcoming article by Prof. James Phillips and Prof. Jesse Egbert, a corpus linguistic analysis of "foreign tribunal," so I am very happy to have the opportunity to pass along the following from Professor Phillips.

Yesterday, at oral argument in ZF Automotive US, Inc., the Supreme Court discussed a paper that we had recently written. We performed a corpus analysis of the phrase "foreign tribunal" to determine how it was being used at the time it was added to the relevant statutory provision. A few Justices seemed unsure whether they could rely on our findings.

Chief Justice Roberts admitted, "I don't know what to do with that. That's … something new. Have we ever relied upon that source in the past?" Counsel replied that this was the type of method the Court used in a case called Muscarello, where the majority opinion looked at New York Times articles that used the verb "carry." The Chief Justice then asked, "[H]ave I done this before?"

Counsel replied that the Chief Justice's own opinion in AT&T also employed this method. Justice Barrett then stated that the Court had never before used corpus linguistics. She noted that two lower courts have (the Sixth Circuit and the Utah Supreme Court) but repeated that "this Court has not." She also described what the Court did in Muscarello and in the Chief Justice's opinion in AT&T as both being "a more informal study."

We have several responses to this colloquy. We also note that petitioners' counsel did an excellent job of describing and defending corpus linguistics.

First, while it is true that the Chief Justice has never used one of these "sources," and that the Court has not previously used a corpus linguistics database in a majority opinion, worrying about that is a bit like worrying about briefing that relies on cases found in LexisNexis when the Court has done all its own research in Westlaw. These corpora and databases are built from the same kinds of documents the Court has looked at in the past. For example, the Corpus of Historical American English (COHA), hosted by Brigham Young University, includes articles from the New York Times, the very articles the Court was content to rely on in its more "informal" corpus linguistic analysis in Muscarello. A New York Times article is no less informative about meaning because it was found by searching a corpus.

A second, related point: if the Court has been willing to conduct "a less formal survey," a more rigorous one should not be treated with suspicion. Doing so is like being fine with asking just a few of your neighbors whom they will vote for and then drawing an inference about who will win, while having serious reservations about polling a larger, random national sample.

The Court has long sampled legal and everyday texts to understand how a term or phrase was used. Justice Thomas has looked at the Federalist Papers. Justice Ginsburg turned her attention to poetry. Justice Kagan referenced Dr. Seuss. The Justices have always been open to this kind of analysis. The difficulty is that the texts sampled in the past were often too few, and too unrepresentative, to let the Court generalize to the speakers and the type of language it is interested in. Corpus linguistics provides both the corpora and the methods of analysis needed to have confidence in such generalizations.

We also looked at five sources in our research. Only one of them, COHA, is what Justice Barrett called a "corpus linguistic databank," and it supplied only a small part of our data. A second source is also a corpus, BYU Law School's Corpus of Supreme Court Opinions of the United States (COSCO-US), but it consists solely of Supreme Court opinions. The exact same search and the exact same analysis, in which the search results were simply read in context, could have been done in Westlaw. The other three sources are databases the Court has searched in the past: Westlaw, HeinOnline Core U.S. Journals, and HeinOnline U.S. Code. The Court can ignore the small portion of our analysis that comes from COHA if it so chooses, since a majority opinion has never cited COHA before. But we do not understand why the Court would disregard analysis drawn from sources it relies on regularly. Our analysis of the four other sources is sufficient to clarify the meaning and usage of "foreign tribunal."

Additionally, while neither Chief Justice Roberts nor a majority opinion has ever referenced one of these corpora, other members of the Court have. Justice Thomas used BYU Law School's Corpus of Founding-Era American English in his dissent in Carpenter v. United States. And Justice Alito has twice cited scholarly articles that rely on one of these corpora, in two separate opinions.

The doubts may also be about the method, perhaps more than about the source. But as we have already mentioned, the Court has long been performing "informal corpus linguistics" without actually calling it that. And what we did could be replicated in chambers.

The Chief Justice could ask one of his clerks to pull from Westlaw the 100 times the term "foreign tribunal" was used by the Supreme Court before the 1964 enactment. He could then have two clerks independently read each instance and determine whether the narrower, government-authority sense or the broader, private/non-government-authority sense was being used. The clerks could then compare how often they agreed and what proportion of the uses fell under each sense.
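The comparison step in that exercise is simple arithmetic, and a short sketch may make it concrete. The Python below is only an illustration, not the procedure or code used in our paper: the function names, the sense labels "government" and "private," and the six sample labels are all hypothetical and made up for the example.

```python
from collections import Counter


def agreement_rate(coder_a, coder_b):
    """Share of instances on which the two coders chose the same sense."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same instances")
    if not coder_a:
        raise ValueError("no instances to compare")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def sense_proportions(labels):
    """Proportion of instances that one coder assigned to each sense."""
    counts = Counter(labels)
    total = len(labels)
    return {sense: n / total for sense, n in counts.items()}


# Hypothetical labels for six instances; not data from the study.
clerk_1 = ["government", "government", "private",
           "government", "government", "private"]
clerk_2 = ["government", "government", "government",
           "government", "government", "private"]

print(f"agreement: {agreement_rate(clerk_1, clerk_2):.2f}")
print("clerk 1 proportions:", sense_proportions(clerk_1))
print("clerk 2 proportions:", sense_proportions(clerk_2))
```

With 100 real instances, the two lists would simply be longer. Instances on which the clerks disagree could then be reread and discussed, which is one common way of handling disagreements between coders.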

No wonder corpus linguistics has been called Westlaw on steroids. The Chief Justice could repeat the same analysis for the U.S. Code, Supreme Court opinions, and law reviews. The remainder of our analysis can likewise be replicated from our instructions and appendices.

Corpus linguistic analysis is also more common in the lower courts than the two courts Justice Barrett named would suggest, though we acknowledge that she was not trying to give an exhaustive list. We currently count about three dozen opinions from 22 lower courts, including six U.S. Courts of Appeals, six district courts, and four state supreme courts.

Respondent's counsel then attacked the study. He claimed that it is self-published, but it will be published in the Virginia Law Review Online, and the Court has relied on articles posted on SSRN before they were published. He claimed that it is filled with gaps, but he did not say what they are. He claimed that it was inconsistent about whether there were three or four coders, but it is not evident why that matters. It is clear that two coders reviewed the material for every analysis (though not always the same two coders).

He also claimed that the study proved only that the expression did not have any meaning prior to 1964, and that we were able to find only a few hundred uses. Both statements are false, as our research shows. We analyzed only 259 uses because, when we found hundreds in a specific corpus or database, we sampled the uses closest in time to 1964, the year the term "foreign tribunal" was adopted in the statute in question. We found thousands of possible uses of the term, but our paper focuses on those most relevant in time. It is worth noting that 259 instances of "foreign tribunal" is far more than either respondents or petitioners offered. Moreover, it is more than the Court has traditionally relied on when performing its own more "informal survey[s]."

Corpus linguistics can do what dictionaries cannot: analyze words and phrases, and show what a word means in a given context. We believe that the Court should supplement its use of dictionaries with corpus analysis when the parties present it, just as the Court moved from paper copies to looking up cases online. Doing so will allow the Court to be more confident that it is uncovering the law's meaning.
