More political scientists doing bad linguistics

There’s a new article out that uses faulty methods to study the linguistic complexity of politicians’ speech. It makes many of the same mistakes that I criticized Schoonvelde et al. (2019) for – and even references that article. But it somehow comes to the right conclusion… for the wrong reasons. I know, it’s strange. Let’s check it out.

In a recent study titled “The Language of Right-Wing Populist Leaders: Not So Simple” (2020), McDonnell & Ondelli analyze speeches by politicians in the US, UK, France and Italy. Their goal is to see whether populist politicians’ speech is simpler than that of their more mainstream political opponents. They are motivated by reports which say Trump speaks in a simpler way than Hillary Clinton.

This is all fine and good, but the authors use faulty methods to study linguistic complexity. I’ll mention at the top that neither author is a linguist. I’m not saying that being a linguist or language scholar is necessary to do linguistic research, but in this case it probably would have helped prevent the authors from making simple mistakes.

I’m going to go into detail below (a whole lotta boring detail), so I’ll summarize up here my main concerns with McDonnell & Ondelli’s study.

Main criticisms

  1. The study needs more sources for the claims made, especially the ones supporting the methodology

McDonnell & Ondelli claim multiple times that their methods are “well-established” but they’re really not – and if they were, it would be easy to slip in a reference when they make this claim. The authors cite sources which make the exact same mistakes. For example, they cite Schoonvelde et al. (2019), a study that I commented on, showing that it should not have been published. I spent a lot of time pointing out the problems with that study and I included many sources. Despite this, McDonnell & Ondelli cite that study approvingly.

  2. Garbage in, garbage out

McDonnell & Ondelli’s study uses the same faulty methodology as other studies (Schoonvelde et al. 2019, etc.), so it suffers from the same problems. These problems are:

  • Confusing written language with spoken language
  • Using an ineffectual test for written language on spoken language
  • Not taking into account how transcriptions and punctuation affect the data
  • Citing almost no linguistic sources in a study about language

The tests used in this study are dependent on periods. Periods, like all other forms of punctuation, do not appear in speech. Punctuation is a feature of written language. But this study is a study of political speech. This is a basic error.

  3. Right answer, wrong reasoning

The authors somehow come to the correct conclusion – that their methodology can’t show whether one politician’s speech is more complex – but they do so without recognizing that the issue is with their analysis. The tests that they use to study language could have shown anything. The fact that the tests showed that the politicians were quite similar in their complexity is a fluke, not a result of a sound analysis.

The authors do end up making some good points in their study. I’ll show those too because I think they deserve credit for them. But first, the bad stuff.

Lack of sources, lack of knowledge about the subject

In the Intro, McDonnell & Ondelli lay out the tests that they use to study the linguistic complexity of politicians’ speeches. They are:

  1. The Flesch-Kincaid tests
  2. Type/token ratio & Lemma/token ratio
  3. Ratio of function words to content words.
  4. The Dale-Chall test (for measuring the amount of “difficult” words)

The authors claim that these are the “main measures employed by linguistics scholars for evaluating simplicity” but they don’t offer any references for this. I don’t think linguists would use these tests, but what do I know? If these measures are so “main” it should be easy enough to provide some references. Later, in the Background, the authors state that their approach is “in line […] with a long tradition in linguistics research” but they don’t give any sources for this claim. I don’t know of any. Do you? Give me one of those studies in that long tradition in linguistics research. Instead they offer a list of recent studies which “have sought […] to analyze the simplicity of populist and other politicians’ language”, but only one of the seven sources that they cite appears in a linguistics journal (Wang and Liu 2018). This is worrying.

They do it again in their Methods section, where they state:

To be clear: we are not proposing any new methods here, but are using an array of standard, well-established methods that have long been used by linguistics researchers (and, occasionally, political scientists) to assess the simplicity of language.

Saying the same thing over and over again doesn’t make it true. If these methods are so standard and well-established, then cite some of them. Who are they? When are they? Are linguists still doing this research? Or have they abandoned it because it’s based on shoddy tests (looking at you, Flesch-Kincaid)? It shouldn’t be hard to give us some studies throughout the last few decades which use these same tests.

The lack of sources problem comes up again at the end of the Methods section:

What we have done, as explained, is to carefully assemble a sufficiently large set of comparable speeches for each of our four country-cases, which we examine with a wider range of well-established linguistic measures than has been used before, to analyze the comparative simplicity of populist and non-populist language. (emphasis mine)

I really need some sources on these “well-established linguistic measures”. Just because some people have used some of these measures in the past doesn’t mean that we should keep using them. And one of the studies that these authors cite specifically says that the readability tests are a “crude” way to measure linguistic complexity (more on that later).

By now you’ve probably figured out what’s up. Linguists don’t use these measures to study complexity in language. There have been studies which discuss how complex speech and writing are in comparison to each other, but they use other features to describe complexity. The studies by linguists discuss how many subordinate clauses are used, how many modifiers are in the noun phrases, or the various ways that cohesion is achieved. In speech, there are paralinguistic and non-verbal ways of establishing cohesion (eye gaze, gesture, intonation, etc.), while in writing, cohesion is achieved through lexical and syntactic structures (because writing can’t make use of the paralinguistic and non-verbal features, whereas practical memory constraints force speech to avoid lexical and syntactic cohesive measures). But linguists aren’t in agreement about whether speech or writing is more “complex” – there are various studies pointing different ways, and the methods used in each study are going to influence its results and conclusions. (See Biber 1988, Chapter 3, for an overview of these not-so-recent studies.)

Another way that the authors show a lack of knowledge and a lack of sources about the topic is when they discuss the average length of each word in syllables. They state: “we find that Trump uses slightly more words that are longer than three syllables (10.97% versus 10.75%, and an average word length in syllables of 1.44 versus 1.43)”

But the authors need to prove that words with more syllables are more “complex” than words with fewer syllables. This isn’t something we can take at face value. Is December more complex than May? Is Saturday more complex than Monday? Is Afghanistan more complex than Russia? And besides, the numbers between Trump and Clinton are so close that the difference could come from quirks in the English language.

There are two studies referenced in the Background that provide a foundation for this article: Oliver & Rahn (2016) and Kayam (2018). These are two studies on Trump’s language – neither of which was done by a linguist or published in a linguistics journal. Why do you think that is? Hmmmm…. Maybe this linguistic complexity stuff isn’t worth it. Because there are linguists studying Trump’s language, but not like this.

McDonnell & Ondelli do manage to cite an article that was published in a linguistics journal (Language and Society, Wang and Liu 2018), but this article uses the F-K test. If you don’t know what’s wrong with this, stay tuned for a facepalm.

And finally – saving the best for last – in the Background, McDonnell & Ondelli say that Schoonvelde et al. (2019) “provide more support for the claim that right-wing populists use simpler language than their opponents.” Friends, I lol’ed. I wonder if McDonnell & Ondelli read my comments on that article. The comment thread was titled “There are significant problems with this study”. I posted the comments almost immediately after the article was published and I showed (with sources) that the methods in that article (the same methods that McDonnell & Ondelli use) were critically flawed.

Garbage in, Garbage out

The fundamental problem with McDonnell & Ondelli’s study is their methodology. They test for linguistic complexity using measures which cannot show how simple or complex the language is.

Let’s take these one by one.

F-K test

First, McDonnell & Ondelli use the Flesch-Kincaid (F-K) test. This test measures a text for average word length (in syllables) and average sentence length. It then applies a score to the text which says which education level is needed to understand the text (based on the US education system in the 1970s). The test is therefore dependent on sentence-final punctuation, aka “periods” (or full stops for our friends across the pond). Savvy readers will immediately notice that McDonnell & Ondelli are studying speech with a test that depends on periods… even though periods don’t exist in speech! This apparently flew right over the heads of the authors. They state:

As Schoonvelde et al. (2019, 5-6) discuss, FK and similar tools have long been used to analyze political speeches. While common, their suitability for assessing the simplicity of contemporary political language has been questioned recently by Benoit, Munger, and Spirling (2019), on the grounds that such measures cannot distinguish between texts that are “clear” (which is of course positive) as opposed to “dumbed down” (which is negative).

The suitability of the F-K test was also questioned in the comment section of Schoonvelde et al. (2019)… by me. 😘 And I showed exactly why the F-K test is garbage for studying speech (so I won’t bore you with the details here). Basically, the F-K test is overly simplistic and was designed to work on one style of WRITTEN English, so it can’t be applied to SPOKEN English. These are better grounds to question the suitability of the F-K test, and are COMPLETELY OBVIOUS to people with a basic knowledge of linguistics. This problem is sort of related to the problem of a lack of (knowledgeable) sources on the topic.

In McDonnell & Ondelli’s defense, they didn’t apply the F-K test to the non-English languages that they studied. They could at least see that a test developed to study English might not be appropriate for languages that aren’t English. So good job! I’m not being sarcastic here. Schoonvelde et al. (2019) applied the F-K test to Spanish, French, German and Dutch because … fuck it, why not? So good on McDonnell & Ondelli for not doing this.

But… the tests that they do use on the other languages don’t seem to be much better. For French they use the Kandel-Moles index, which is just the F-K modified for French. For Italian, they used the Gulpease index, which is like an Italian version of the F-K test. And they supplemented these scores with something called LIX. I’m not here to discuss their analyses of French and Italian, but I would like to note that the authors say that LIX “has been proven to perform well on Western European languages” and to back this up they link to a company called SiteImprove (in endnote 9). What the heck is this? I’m supposed to believe SiteImprove when they say these tests are good stuff? Yeah, no thanks. Surprisingly, all of the tests listed by SiteImprove – tests which they will do for you if you pay them – are great tests. The best tests. A lotta people are saying they’ve never seen such tests.

And, not for nothing, what the hell are “Western European languages”? Is Basque a Western European language? German? Catalan? Irish? Maltese? Welsh? Yiddish? Forget it. “Western European languages” is not a meaningful linguistic term. It’s marketing speak from a company trying to sell you linguistic snake oil. It’s Bruce Banner in Avengers saying that Wakandan is an “African dialect”.

Type/token ratio… and beyond!

Moving on, the authors have a second language grinder to run their speeches through:

We complemented our readability analyses with a number of other widely used linguistic simplicity measures. The first is lexical richness, which is based on the premise that the higher the repetition rate of word-types in a corpus, the easier the language being used, since lexical variation, in addition to increasing difficulty per se, may also imply a broader range of contents. As an additional check of lexical richness, expressed as the Type/Token Ratio (TTR) and percentage of hapax legomena (i.e., words occurring only once in the corpus), we calculated the lemma/token ratio (LTR). This is because, when we refer in everyday speech to the “rich vocabulary of a speaker,” we tend to mean the unusually large number of different lemmas used (i.e., the basic forms for a paradigm, such as “love” as the basic form for “loves,” “loved,” “loving,” etc.) and not the great variety of inflected forms (Granger and Wynne 1999).

Hoo-boy! We finally got a reference to linguists! Too bad that these linguists (Granger and Wynne) never say what they are cited as saying. Let’s go through the claims in this paragraph.

  1. “The first is lexical richness, which is based on the premise that the higher the repetition rate of word-types in a corpus, the easier the language being used, since lexical variation, in addition to increasing difficulty per se, may also imply a broader range of contents.” I’m not sure where they got this idea, but they certainly don’t cite anyone. It’s not in Granger & Wynne (1999). Whose premise is this? Nobody knows!
  2. “As an additional check of lexical richness, expressed as the Type/Token Ratio (TTR) and percentage of hapax legomena (i.e., words occurring only once in the corpus), we calculated the lemma/token ratio (LTR).” Granger & Wynne (1999) do talk about Type/Token Ratio. They call it “a rather crude measure”. *Sad trombone*. Granger & Wynne also discuss Lemma/Token Ratio. They say “Our study shows that it is not safe to use crude type/token or lemma/token ratios with learner corpora”. The “with learner corpora” part is important. Granger & Wynne investigated texts written by English learners – not professional speech writers – and they tested whether traditional measures of lexical richness should be used on them. And they found that these measures are too crude for investigating texts by English learners. They didn’t test them on speech, or written political texts, and they never ever talked about these tests showing how “complex” or “difficult” a text is.
  3. “This is because, when we refer in everyday speech to the “rich vocabulary of a speaker,” we tend to mean the unusually large number of different lemmas used (i.e., the basic forms for a paradigm, such as “love” as the basic form for “loves,” “loved,” “loving,” etc.) and not the great variety of inflected forms (Granger and Wynne 1999).” Granger & Wynne (1999) never refer to the “rich vocabulary of a speaker”. They only say that “A learner who uses five different forms of the verb go (go/goes/going/gone/went) in one and the same text has a less varied vocabulary than the one who uses five different lemmas (such as go/come/leave/enter/return).” That’s it. No rich vocabulary, no linguistic complexity.
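To make the Granger & Wynne example concrete, here is a minimal Python sketch (mine, not theirs and not McDonnell & Ondelli’s) of how a type/token ratio and a lemma/token ratio are computed. The `LEMMAS` table and the `lexical_ratios` function are hypothetical names made up for illustration; a real study would use a proper lemmatizer.

```python
# Hypothetical toy lemma table for illustration only (not from any cited study).
LEMMAS = {"goes": "go", "going": "go", "gone": "go", "went": "go"}

def lexical_ratios(tokens):
    """Return the type/token ratio (TTR) and lemma/token ratio (LTR)."""
    lowered = [t.lower() for t in tokens]
    types = set(lowered)                          # distinct word forms
    lemmas = {LEMMAS.get(t, t) for t in lowered}  # distinct base forms
    n = len(lowered)
    return {"ttr": len(types) / n, "ltr": len(lemmas) / n}

# The two learners from the Granger & Wynne quote above.
learner_a = ["go", "goes", "going", "gone", "went"]
learner_b = ["go", "come", "leave", "enter", "return"]

ratios_a = lexical_ratios(learner_a)
ratios_b = lexical_ratios(learner_b)
print(ratios_a)
print(ratios_b)
```

Both learners get the same TTR, because every token is a distinct word form. Only the LTR separates them. And neither number says anything about how hard the text is to understand, which is the claim the authors needed a source for.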

In the analysis, the authors do indeed report this crude analysis as if it’s supposed to tell us something:

Second, we find that Clinton’s speeches are lexically richer (albeit only slightly) than those of Trump, since TTR, LTR, and hapax values are greater in her corpus. In other words, when Clinton speaks, she uses both a wider range of morphological variants of the same words (e.g., love, loves, loved, loving, lover, etc.) and a greater number of distinct words (e.g., love, hate, passion, fear, etc.). This might indicate that, given the same total amount of words, either she speaks about a greater variety of contents (requiring different words to be expressed) or, when dealing with the same contents, she uses a broader range of synonyms, in both cases increasing the complexity of her language. (emphasis mine)

Again, there’s no research cited which indicates that greater TTR or LTR values mean a text is more complex. In fact, we’re not even offered a baseline of what TTR or LTR we’re supposed to expect. How high is high? Are both of the speakers high on the type-token ratio? Or low? This is a problem that will reoccur in the analysis.

For the time being, let’s stay on the topic of type/token ratio. We’ve already seen that the source cited by McDonnell & Ondelli calls it a “crude” form of analysis (along with LTR). But they’re not the only ones. In the Longman Grammar of Spoken and Written English (which, to those who aren’t grammar G’s like me, is one of the grammars of English), we see some claims which the authors could’ve cited. For example, in section, Longman says “The high TTR in news reflects the extremely high density of nominal elements in that register, used to refer to a diverse range of people, places, objects, events, etc.” So perhaps the high TTR of Clinton really does mean that she refers to a greater variety of contents.

But the next sentence in Longman says “At the other extreme, academic prose has the second lowest TTR, reflecting the fact that a great deal of academic writing has a restricted technical vocabulary and is therefore less variable than fiction and news reportage.” I think we can all agree that academic English is harder to follow than news reports – so a high TTR does not mean that a text is more complex. This is exactly the kind of thing I’m talking about. The authors make assumptions based on little or no evidence.

And the very next sentence after that in Longman says that TTR is a “crude” measure. But that’s not all!

Halliday also discusses TTR. But he cautions against using TTR to say that a text is more complex:

One caution should be given: By expressing the distinction in this way, we have already ‘loaded’ it semantically. To say that written language is ‘more dense’ is to suggest that, if we start from spoken language, then written language will be shown to be more complex. (Halliday 1989: 62)

Halliday notes that our conception of language and its complexity is therefore crucially dependent on which part of it we’re looking at. It’s not hard to see spoken language as more complex. If we want to use TTR to indicate complexity, we have to ground that analysis in something – more complex than what?

Biber’s (1988) foundational work on the variation between different genres of language used multiple aspects of language to analyze how each genre differed. He (1988: 202) says about complexity in language: “the fact that [markers of discourse complexity] occur in a largely complementary pattern shows that there are different types of complexity, and that it is not adequate to simply characterize particular genres as complex or not – rather, different genres are complex in different ways to different extents. The discourse complexity of spoken informational genres takes the form of structural elaboration, while the complexity of planned, written genres takes the form of lexical elaboration and precision.”

That raises the question of what McDonnell & Ondelli are studying here – written language or spoken language? Because complexity is likely to show up in different ways. But more on that question right now!

Function words vs. Content words. Fight!

The authors go on to describe another way that they are apparently complementing their use of the F-K tests and other readability measures:

The second is lexical density, expressed as the share of grammar words in a text, i.e., words that are used primarily to build syntactic structures (conjunctions, determiners, prepositions, and pronouns) and content words, i.e., words conveying meaning (adjectives, adverbs, nouns, and verbs). If a text conveys more information than another of equal length, then it is more difficult to understand. (emphasis mine)

Says who? How do we know that a text which conveys more information is more difficult to understand? A source would be really nice here. Alas, we wait in vain. (And not for nothing, the term is “function word”, not “grammar word”. That’s the one used by the large majority of linguistic grammars.)

There’s another very important question behind this assumption though. Basically, we don’t know – and the authors don’t tell us – what kinds of figures we’re supposed to expect. What do the numbers mean? As the authors show, Trump and Clinton have very similar percentages of content and grammar words. Maybe that’s just normal for English. No matter what kind of text you pick up, about 40-45% of the words are going to be function words and 50-55% of them are going to be content words (allowing for some percentage of “words” which don’t fit into either category – interjections, numbers, etc.). But we aren’t told what to expect and we aren’t told what an informative/complex text looks like. That means we have nothing to compare the figures to. We can only compare the figures for Trump and Clinton to each other. This doesn’t tell us much. The authors should have either cited a source or given us examples from other SPOKEN genres.

Because that’s what you want to see, ain’t it? Don’t worry. I got you, bro/sis.

For example, here’s a comparison of the percentages of function and content words in various corpora and novels:

[Figure: percentages of function and content words across corpora]

The percentages for other corpora of spoken language (from COCA and the BNC, two large representative corpora) look pretty close to Trump and Clinton. The corpus of TV and Movie dialogue is pretty darn close too. Even the novels aren’t that far off. Maybe a comparison of function and content words isn’t a good way to tell if one text is more complex than another.
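If you want to see what is actually being counted, here is a rough Python sketch. It is not the code behind the comparison above. The `FUNCTION_WORDS` set is a deliberately tiny, hypothetical list that follows the categories in the authors’ quoted definition (conjunctions, determiners, prepositions, pronouns), so auxiliaries like “was” and “could” end up as content words here. A serious count would use a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately tiny function-word list for illustration only.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "but", "or", "so", "of", "to", "in", "on",
    "at", "for", "with", "i", "we", "you", "he", "she", "it", "they",
    "this", "that", "nobody",
}

def function_word_share(text):
    """Return (function-word count, token count, function-word percentage)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_function = sum(1 for t in tokens if t in FUNCTION_WORDS)
    return n_function, len(tokens), 100 * n_function / len(tokens)

sample = "The crowd was huge but the microphone failed and nobody could hear the speech"
n_function, n_tokens, pct = function_word_share(sample)
print(f"{n_function} of {n_tokens} tokens are function words ({pct:.1f}%)")
```

The percentage moves around depending on what goes into the word list, which is one more reason a single figure per speaker needs a baseline before it can mean anything.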

But Chall Knew That…?

The authors use a test called the Dale-Chall score to see how many “difficult” words the English speeches contain. This is supposed to guard against the problem I pointed out above: the fact that the F-K test is based on word length… but word length doesn’t really matter in terms of complexity. McDonnell & Ondelli claim that ersatz is harder to comprehend than substitute, even though it’s shorter. Presumably this is because ersatz is less frequent, but the authors wrap it up in the idea that the average speaker can understand substitute better than ersatz. I don’t know why they do this. (We’ll talk about it more below.)

My first question is: if the tests are so bad that you need to guard against them – if they are dependent on something which doesn’t matter in terms of making the language more difficult, such as word length – then why use those tests at all? They don’t seem to be very good tests. But whatever. It’s done.

The Dale-Chall score is pretty much the same as the F-K test, except it compares the text to a list of 3,000 words that are considered to be familiar to 80% of 4th graders in the US. If a word doesn’t appear on the list, then it’s considered “difficult”… to a ten-year-old American. The Dale-Chall test also suffers from the same problems as the F-K test. Dale-Chall includes sentence length in the calculation. So, again, a test made for WRITTEN language being used on SPOKEN language. When the authors state “Clinton uses longer sentences (on average 15.02 words per sentence compared to 12.55 for Trump)” they are measuring the punctuation styles of either the transcribers or the speechwriters. They are taking the “sentences” at face value – wherever a period was placed, that’s where the speaker ended a sentence. But of course that isn’t accurate. Speech doesn’t have periods. There is absolutely no punctuation in speech. None. Zip. Zilch. Periods are a feature of written language. And the decision of where to place them in a sentence is somewhat arbitrary. Tell me if Trump or Clinton used more coordinating conjunctions. And how long they paused when using these words. That would give us a better idea of the length of the speaker’s “sentences”.

I swear to Thor, I’m going to keep repeating this until political scientists get it.

The authors claim in Appendix A that the speeches were punctuated by either the transcribers (whoever they were) or university students. They also say that the students checked the transcriptions, and that the authors checked a portion of the transcriptions (20,000 words). I don’t know if the students checked the transcriptions against the actual speeches (by watching the videos, for example) or if they just checked the punctuation and spelling. The authors had each speech transcribed by multiple students. They were doing all this “to avoid individual transcription preferences/idiosyncrasies biasing results”. But I’m not so sure that they avoided this.

First, let’s say that introducing punctuation (aka a feature of written language) into speech is methodologically fine when we want to study the complexity of language. It’s not, but go with me here. We have to put the punctuation symbols in because our complexity tests depend on them. So where do we put them? This isn’t an easy question to answer. English has these things called conjunctions. These are words like “and” and “but”. With these words, we can avoid using a period and connect two independent clauses. But we can also use a period and start the next sentence with one of these conjunctions. The decision to use more periods is going to affect the scores of our totally reasonable readability tests.
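Here is a small Python sketch of that effect. It uses the standard published Flesch-Kincaid Grade Level coefficients, but everything else is my own assumption: the two toy “transcripts” are invented, and `count_syllables` is a crude vowel-group heuristic rather than what any real readability tool does. The two transcripts contain exactly the same words in the same order. Only the punctuation differs.

```python
import re

def count_syllables(word):
    # Crude vowel-group heuristic (an assumption; real tools use dictionaries).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fkgl(text):
    """Flesch-Kincaid Grade Level with the standard published coefficients."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = re.findall(r"[.!?]+", text)  # a "sentence" is whatever ends in . ! or ?
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

# Invented example: one stretch of speech, punctuated by two imaginary transcribers.
one_sentence = ("We went to the rally and the crowd was huge, but the microphone "
                "failed and nobody could hear the speech, so we left early and "
                "we drove home.")
six_sentences = ("We went to the rally. And the crowd was huge. But the microphone "
                 "failed. And nobody could hear the speech. So we left early. And "
                 "we drove home.")

gap = fkgl(one_sentence) - fkgl(six_sentences)
print(round(gap, 2))  # difference in grade level caused by punctuation alone
```

Because the words are identical, the syllable term cancels out and the gap comes entirely from sentence count: 28 words in one “sentence” versus 28 words in six, which works out to about nine grade levels. The speaker said the same thing both times.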

Yeah, but the authors checked against this, I hear you saying. Did they, though? Check this: The Clinton sub-corpus has 218 more sentences which start with a conjunction than the Trump sub-corpus. 157 of these sentences start with a coordinating conjunction (and, or, but, etc.). Why is that? If we normalize these figures in order to make them comparable, we see that Clinton has 243 more sentences starting with a conjunction per 100,000 words. (This is some linguistics jargon. If you don’t get it, fine. I don’t have time to explain it here. But just know that the authors should know this stuff before doing their research and it appears that they don’t.)

If we carry this forward a bit, we see that 24% of Clinton’s sentences start with a conjunction, while only 16% of Trump’s do. Why is that? Could it have something to do with where the periods were placed? Perhaps the preferences and idiosyncrasies of the transcribers weren’t so successfully avoided.
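For anyone who wants to run this kind of check on a punctuated transcript, here is a minimal Python sketch. It is not the script behind the numbers above. The `COORDINATORS` set, the function name and the toy transcript are all my own assumptions for illustration. Normalizing just means dividing the raw count by the number of words in the corpus and multiplying by 100,000, so that corpora of different sizes can be compared.

```python
import re

# Assumption: a short list of coordinators. "for" is left out because it is
# usually a preposition and would need a tagger to disambiguate.
COORDINATORS = {"and", "but", "or", "so", "yet", "nor"}

def conjunction_initial_stats(transcript):
    """Count sentences that start with a coordinator, raw and normalized."""
    sentences = [s.split() for s in re.split(r"[.!?]+", transcript) if s.strip()]
    n_words = sum(len(s) for s in sentences)
    n_initial = sum(1 for s in sentences if s[0].lower().strip(",") in COORDINATORS)
    return {
        "sentences": len(sentences),
        "conjunction_initial": n_initial,
        "share_pct": 100 * n_initial / len(sentences),
        "per_100k_words": 100_000 * n_initial / n_words,
    }

# Invented toy transcript, not taken from either sub-corpus.
transcript = ("We went to the rally. And the crowd was huge. But the microphone "
              "failed. And nobody could hear the speech. So we left early. And "
              "we drove home.")
stats = conjunction_initial_stats(transcript)
print(stats)
```

A transcriber who prefers commas to periods would drive every one of these numbers down without the speaker changing a word.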

But this whole matter could’ve been avoided. The authors could have placed the speeches into the tool on and seen how frequent each word is. This would have given them an idea of which speaker uses more common vocabulary items and which speaker uses less common words. Yeah, linguists have these tools. And they’re better than some rinky-dink readability score. Why didn’t the authors do this? Oh yeah, because they’re not linguists and they don’t know what they’re doing.

The authors performed some statistical tests to check their readability scores. But this doesn’t help the matter. The readability scores are flawed from the start. No amount of T-tests is going to make them better.

Finally, remember Halliday? He was the linguist who wrote (waaay back in the days when Michael Keaton was Batman) that TTR was a bad way to analyze complexity in language. So right after Halliday advises against using TTR, he goes on to explain why we shouldn’t use “sentence” in our analyses of spoken language because the “sentence” is a feature of written language and is not applicable to spoken language. But, you know, you got these readability tests from the 70s and they’re just asking to be used. Who cares if they’re garbage? Here, have a Camel and chill, ok? Trust your T-Zone.

Wrap it up, boys!

Despite the critical methodological problems with their analysis, McDonnell and Ondelli believe that the tests have shown something significant. They write:

In this study, we have analyzed the linguistic simplicity of four right-wing populist leaders compared to their principal mainstream competitors in the United States, France, the United Kingdom, and Italy. Taken together, our cases do not show a clear pattern in support of the claim that right-wing populists use simpler language than their rivals. Trump is the populist leader most in line with the theoretical literature. Nonetheless, while he is generally simpler than Clinton, ours is not a “Trump speaks at the level of a fourth grader” story. Rather, we find that Trump and Clinton fall within the same bands on the FK and Dale-Chall scales and are only separated by one grade on the FKGL. (emphasis mine)

As I said above: garbage in, garbage out. Flawed tests revealed flawed results. It doesn’t matter that the results don’t show support for the claim that populists use simpler language, or that Trump and Clinton are more or less linguistically similar in complexity, because the methodology was flawed from the start. No matter what results came, we would have to reject them (which is what I argued in my comments on the Schoonvelde et al. 2019 article, but whatevs).
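To see how much of that one-grade gap could come from period placement alone, we can plug the averages quoted earlier in this post (words per sentence and syllables per word for each speaker) into the standard Flesch-Kincaid Grade Level formula. This is a back-of-the-envelope sketch under a big assumption: the authors may well have computed their scores differently (per speech, for instance), so these are not their reported scores.

```python
# Standard Flesch-Kincaid Grade Level coefficients.
def fkgl(words_per_sentence, syllables_per_word):
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

# Averages as quoted in this post from McDonnell & Ondelli (2020).
clinton = {"wps": 15.02, "spw": 1.43}
trump = {"wps": 12.55, "spw": 1.44}

# Split the Clinton-minus-Trump gap into its two ingredients.
sentence_part = 0.39 * (clinton["wps"] - trump["wps"])   # depends on periods
syllable_part = 11.8 * (clinton["spw"] - trump["spw"])   # depends on word length
gap = fkgl(clinton["wps"], clinton["spw"]) - fkgl(trump["wps"], trump["spw"])

print(f"sentence-length part: {sentence_part:+.2f}")
print(f"syllable part:        {syllable_part:+.2f}")
print(f"total gap:            {gap:+.2f}")
```

On these averages, sentence length contributes roughly +0.96 of a grade and word length roughly -0.12, for a gap of about 0.85. In other words, basically the whole difference rides on where someone decided to put the periods.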

They attempt to address one of the major flaws in their research – that they are studying speech with tests designed for writing. They say “it is true that we cannot consider the effect of factors such as speech speed and intonation, as well as pauses, on simplicity. However, this applies also to other studies of the simplicity of political language.” But how is this an excuse? “The other studies were flawed and so is ours”? I mean, those aren’t the only aspects of spoken language that this study (and others) doesn’t consider. There’s also dialect and pitch. These things affect how speech is evaluated by listeners (as do other language-external factors). The other studies which used readability scores weren’t good. That doesn’t mean you can do the same thing they did because screw it who cares.

Good points. Credit.

To the authors’ credit, they do manage to make some good points about language. Some of these are self-evident – although they weren’t made by previous research (ahem, Schoonvelde et al. 2019). Some of the other points, however, the authors seem to arrive at through serendipity. The methods that they use in their research are not able to tell us anything (that’s what we saw in the last section), but the authors come to some astute conclusions. Quotes followed by my comments.

Finally, we have assessed how difficult the lexis of our corpora is. While the main simplicity measures are based on word length and word variation, it can of course be the case that shorter words used by a speaker are more difficult to understand for the average citizen (e.g. “ersatz” is shorter, but considered harder to comprehend, than “substitute”).

We saw this one up above and it’s actually a good point. Credit where credit’s due. It’s not explained very well and it’s not backed up with any sources or evidence, but I think they’re on the right track here.

All of the measurements detailed here were replicated for each leader’s subcorpus to assess whether the language of right-wing populist leaders is simpler than that of their principal mainstream rivals. To be clear: while we use the same types of measures on all speakers, it only makes sense to compare between speakers within the same language and national political culture, rather than to do so across languages and cultures. In other words, we are able to say if Le Pen uses simpler language than Macron, but not if she uses simpler language than Clinton. Indeed, even comparing between English-language speakers like Trump and Farage is problematic because the U.S. and UK have different traditions and habits when it comes to political discourse.

Another good point, one that was completely lost on Schoonvelde et al. (2019).

How can we explain this counter-intuitive finding? First, we are open to the theoretical possibility that if we had examined different outlets other than speeches, we might have found our right-wing populist cases used simpler language than their rivals (although this remains an open empirical question). However, in the light of the methodological issues we raised earlier regarding debates and interviews, and the fact that speeches are the most traditional form of public political language, we are confident that studying them is sufficient to test the claim that populists are characterized by their use of simpler language than their opponents. (emphasis mine)

This is both a good and bad point. Yes, studying speeches is a way to test whether populists use simpler language. But McDonnell & Ondelli didn’t study them well. So this point is like saying “Taking tests is a way to finish fifth grade.” That’s true: you can’t fail all the tests and still pass fifth grade. But just showing up for the tests isn’t enough either.

What do these findings mean for our understanding of populism? The idea that populists portray themselves as standing with ordinary people against the elites is at the heart of the concept and scholars have long claimed that populists use simpler language than those elites in order to drive home their simplistic messages and reinforce their positions as men and women of the people (Canovan 1999; Zaslove 2008). Our research suggests, however, that we need to distinguish simple language from populist arguments that are perceived by scholars as simplistic. One does not automatically accompany the other. Similarly, the automatic pejorative association of populists (or any other politicians) with simple language should also be reconsidered. As Benoit, Munger, and Spirling (2019) rightly argue, there is a difference between simple language that is clear and simple language that is dumbed down. For example, The Catcher in the Rye and The Old Man and the Sea score 3.4 and 3.9 respectively on the Flesch Kincaid Grade Level, while Harry Potter and the Philosopher’s Stone and The Da Vinci Code score 4.7 and 5.8. But few would argue that the greater linguistic simplicity of the first two novels indicates they are more dumbed down than the latter.

Holy shit! I can’t believe it. They just came right out and said it. This is a great point. The F-K test can’t tell us whether language is complex or simple. SO WHY ARE YOU GUYS USING IT?
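To make that concrete, here’s a rough sketch of what the Flesch-Kincaid Grade Level actually measures. The formula is the standard published one, but the syllable counter is my own crude assumption (one syllable per run of vowels), and the example sentences are made up around the authors’ own “ersatz” example. This is not the tool McDonnell & Ondelli used, so treat the numbers as illustrative only.

```python
import re

def count_syllables(word):
    # Crude assumption: one syllable per run of vowels.
    # Real readability tools use better heuristics or dictionaries.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    # Standard Flesch-Kincaid Grade Level formula:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

easy = "We used a substitute plan."
hard = "We used an ersatz plan."
scrambled = "Plan ersatz an used we."

# The sentence with the rarer word scores as "simpler".
print(fk_grade(hard) < fk_grade(easy))
# Word salad gets exactly the same score as the real sentence.
print(fk_grade(hard) == fk_grade(scrambled))
```

The only inputs are sentence length and syllables per word. Word frequency, word order, and meaning never enter into it, which is why the score can’t distinguish clear language from dumbed-down language (or from nonsense).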

Our study also suggests that, if we wish to understand the defining linguistic features of populists, we need to look beyond simplicity and consider the choice of vocabulary and rhetorical instruments used to convey recurrent contents.

I mean, good on them for straight up claiming that their results aren’t really useful, except for showing that people (and I’m looking at you, political scientists) should stop using the garbage readability scores. It’s almost like McDonnell & Ondelli are suggesting that someone should study language in a more rigorous way. I wonder who would do something like that. Maybe some sort of scientist of language?

Finally, our findings should make us question other aspects of the received wisdom about populists. The claim of simpler language has long gone hand-in-hand with the idea that populists are vulgar figures who appeal to people’s baser instincts, by “lowering the level of political discourse” (Moffitt 2016, 60-61). Again, however, the notion of populists being more vulgar may seem intuitively convincing but is one that—just like linguistic simplicity—needs to be operationalized and tested. It may be clearly true in some cases, but not others. Similarly, if right-wing populists do not always use simpler language, then the related negative assumption that simple and vulgar language must be appealing to those who vote for them also needs to be reassessed. The underlying message of the Boston Globe story and much of the research literature is that right-wing populists deliberately speak like fourth graders and their (simple) supporters lap it up. Our research shows that while this may be a convenient and even comforting idea, the reality—like populist language itself—is likely to be rather more complex. (emphasis mine)

Yeah, no fucking shit. Sorry, but this good point makes me angry because it’s something a linguist could have just told you. Just fucking ask. You don’t have to go reinventing Linguistics. Just pick up a goddamn linguistics book and see that people already know this stuff – they’re called linguists and they study language for a living. Wild, huh?

The right answer for the wrong reasons

So, after this womp womp of an academic article, McDonnell & Ondelli somehow – surprisingly – come to the correct conclusion. They state:

We argue therefore that, while future studies of discourse or the complexity of ideas may reveal populists are indeed simpler in those senses than their rivals, the long-standing claim that populists are characterized by their use of simpler language than other politicians needs to be revised. (emphasis mine)

Did they come to the right answer by doing the wrong work? [Ron Howard voice: Yes. Yes, they did.]

I mean, I guess I know how they got there – but since they didn’t see the problems with their methodology, I’m kind of shocked.

I want to say one more thing though: Refuting this kind of research takes a lot of time – the work I’m doing here should have been done by the authors or reviewers. But their article is already out there and it’s been written up in The Conversation. Just like with Schoonvelde et al., lots of people are going to hear about it and believe it. This sucks and that crap is on the authors.

I welcome scholars from other disciplines into the field of Linguistics. Scholars of other fields can bring new perspectives to the study of language. But you have to show up. If you’re going to study language, you have to do it right. You can’t do some half-baked slapdash study and call it good. If linguists published this kind of stuff in your field, would you take them seriously? Didn’t think so.


Biber, Douglas. 1988. Variation across Speech and Writing. Cambridge: CUP.

Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan. 1999. Longman Grammar of Spoken and Written English. London: Longman.

Halliday, M.A.K. 1989. Spoken and Written Language (2nd ed.). Oxford: OUP.
