Abby Kaplan begins her book by explaining its two purposes. First, the book is meant for “debunking language myths” such as those about linguistic sex differences and text messaging. Second, Kaplan’s book is about “how to study language”, that is, it offers a glimpse of what linguists do (p. 2). This has my interest piqued. There is no shortage of downright nonsense about language in the news, social media, and bookstores, so Kaplan’s book, which is well suited to combat that nonsense, is a welcome addition to the shelf.

Consider Kaplan’s thesis for the book:

This book is about two things […] First, it is about popular beliefs about language: the conventional wisdom on topics from linguistic sex differences to the effects of text messaging. Sometimes, of course, popular opinion has things more or less right – but it’s more interesting to examine cases where “what everyone knows” is wrong, and so we will put a special focus on debunking language myths. […] Second, this book is about how to study language – not in the sense that it will train you to do linguistic analysis for yourself, but in the sense that it provides a glimpse of the kinds of things linguists do. (p. 2)

Kaplan’s thesis on “popular beliefs about language” vs. “how to study language” aims to strike a balance between what people think they know about language and how we (or linguists) can figure out what is really going on. Such a thesis may sound heavier than usual for a book aimed at the general public, but Kaplan’s writing makes this book very approachable. In fact, Kaplan’s goal for the book has me hoping that journalists will read it: “The goal is for you to become an informed consumer of social science research with an appreciation of how the scientific process works” (p. 2).

Kaplan picks up this theme of the gibberish that is published about language by claiming “The world is full of self-appointed experts who feel free to make pronouncements on language with little or no supporting evidence” (p. 3). She is certainly correct there. One of the problems that I don’t often see reported is that linguistics is a tricky subject. Everyone speaks a language and many people feel justified in making claims about language. This doesn’t happen with other scientific subjects. No one makes claims about mathematics because they took algebra in high school. But some people who had a strict English teacher, or who got straight A’s in English class, feel it is their right to pass judgement on what is the appropriate use of language and what is not. One of the first assignments that I give my first-year students is to have them write about their linguistic pet peeves because I want them to let go of those notions before they start to learn that studying the modern use of language is not like paleontologists studying a T-rex from its skeleton, but rather like studying a living T-rex up close and without tranquilizer darts. That said, there are people who feel comfortable having learned the “rules” of their language and who do not want to be told different. I’d like to think that there are more people who learned the “rules” but are willing to keep learning more, even though they did not pursue a degree in linguistics. Kaplan’s book is for them.


The cover of Women Talk More than Men: … And Other Myths about Language Explained.

I thoroughly enjoyed every chapter in this book, but I want to highlight a few that I thought were especially good.

Chapter 1 – “A dialect is a collection of mistakes”

Perhaps it’s good to start with a discussion of dialects (a topic that everyone seems to have an opinion on) and the Ebonics debate, an occurrence which received an incredible amount of input from people who are neither linguists nor language professionals, a.k.a. people who don’t know what they’re talking about. Kaplan quotes some people who say that African American English (AAE), also called Black English, is a way of speaking in which “you can say pretty much what you please, as long as you’re careful to throw in a lot of ‘bes’ and leave off final consonants” (p. 11). In my opinion, Kaplan is too easy on the writers who spout this nonsense (which is akin to the nonsense on the Urban Dictionary, a source that Kaplan also quotes), but the rest of the chapter is a detailed analysis of why non-standard dialects follow specific rules, just like Standard English does.

Kaplan offers a very good insight in this chapter. She writes:

There is one final point to be made here. Linguists argue that no variety of a language is linguistically superior to any other; every dialect of every language follows regular grammatical rules and is capable of fulfilling the communicative needs of its speakers. This is true even for languages and dialects that are widely thought to be crude or unsophisticated: as soon as linguists start studying what speakers actually do, we discover that these languages are just as rule-governed as any other.

But linguists also recognize that not all dialects of a given language are socially equal. Standard English is no better or worse than AAE in many social situations. Whether we like it or not, it’s a fact of life that a person who speaks Standard English will find it much easier to excel in the academic world or get certain kinds of jobs than a person who speaks only AAE. Thus, there are good pragmatic and ethical arguments for helping speakers of non-standard dialects learn Standard English too, while acknowledging that it’s only by historical accident that this particular variety is the prestigious one. (p. 20)

This is a point that you will not find in most books or articles about language that are written by non-language scholars. There is an idea (a very old idea) that the standard variety of the language is The One True Way™. Kaplan does a good job explaining that this idea, like the idea that a dialect is a collection of mistakes, is misleading. She also notes that it is “a simplification to talk about a single ‘Standard English’” (p. 11), since there are different standards in different English-speaking countries. There are also different standards among different genres of writing and speaking.

Chapter 5 – “Children have to be taught language”

Every chapter of Kaplan’s book starts with a myth about language in the title. Kaplan explains what is behind the myth and gives background information from language studies. She then offers summaries of case studies which have been done to investigate the myth. This chapter on child language acquisition is about how children who receive the most language input, i.e. those who are taught language, are likely to do better in life and it references the celebrated research done by Hart and Risley (1995), which supposedly found that children from low-income families have lower IQ scores because “low-income parents talk to their children much less, and in different ways, than high-income parents do” (p. 83). But Kaplan also highlights an important distinction in studies of this kind:

Look again at the list of things that parents can apparently do to boost their children’s language development: talk a lot, directly to the child; use a large vocabulary; treat the child as a conversational partner and engage with her intensively; ask her lots of questions; use indirect requests instead of giving demands; and so on. This picture looks suspiciously like the western mainstream middle-class model of parenting – which […] is far from universal. Not only that, but this is exactly the social group to which researchers on child language acquisition are most likely to belong. (p. 89)

Kaplan shows that measuring a child’s linguistic ability based solely on how many words they say while a researcher is around is perhaps wrong-headed. Different cultures and social groups place different restrictions on how much children are allowed to talk around adults/strangers/researchers. Likewise, researchers from different socio-economic and cultural groups may place value on objects and experiences that are unfamiliar to children from different groups. The study of language is not as straightforward as it seems. Kaplan again shows a good insight when she writes about our biases and problems in language studies:

The point here is not that Hart and Risley had it backwards, that the parenting practices they thought were good are actually bad. Rather, the point is that any time we try to study parents and children – including their use of language – our research is inevitably influenced by culture-specific assumptions about the kinds of things parents and children ought to do. It’s all too easy to study parents and children in our own culture and conclude that we’ve learned something about parents and children everywhere. (p. 92)

I was a bit disappointed that a discussion of Motherese was left out of this chapter. Motherese is the idea that the primary caregiver(s) explicitly correct their child’s language mistakes, thus giving instructions on what is acceptable in their language. Motherese was perhaps most famously put forth by Steven Pinker in The Language Instinct. Pinker argued that Motherese is “folklore” and that its non-existence proves that humans have Universal Grammar (Motherese is wrapped up in the Poverty of the Stimulus argument). Kaplan claims that we can answer the question of whether Motherese exists, or whether “parents systematically and explicitly correct their children’s grammar mistakes” with “a resounding ‘no’” (p. 104). She says no study has ever found this, but Sampson (2005) references a study which showed that the speech directed at children by caregivers is more “proper” (i.e. free of grammatical errors) than linguists assume, especially Pinker and other believers in Universal Grammar. I concede that taking on Universal Grammar is a lot to ask of one chapter of one book, but I would have liked to see this debate at least mentioned. Kaplan does address the poverty of the stimulus argument and makes a very pertinent point about how it’s a theory on child language acquisition which was put forth by someone (Chomsky) who is not an expert in child language acquisition. She writes:

The poverty of the stimulus remains a controversial hypothesis, and some linguists have argued that Chomsky (who is not a specialist in child language acquisition) underestimated how much information is in the speech that a young child typically hears. (p. 93, bolding mine)

The shade, it is thrown.

In discussing the speech that children overhear, Kaplan has a very nice side-note which I think anyone who has been around children can appreciate. It shows that this book is also fun: “It seems entirely reasonable that children would pay more attention when they are being spoken to directly, but it’s also clear that children ‘eavesdrop’ as well. (If you doubt this, try swearing within earshot of a two-year-old.)” (p. 91).

Chapter 6 – “Adults can’t learn a new language”

Kaplan’s chapter here does a very good job of discussing the myth that there is a critical period in language learning, or an unspecified age sometime before adulthood after which it is impossible for people to become fluent in a second language. Kaplan frames this question very well, or shows how linguists should frame the myth, by writing:

But our anecdotal impressions may not be accurate; it’s true that many adults struggle with a second language, but it’s also true that many adults become competent and fluent speakers of a language they first learned late in life. In addition, even if children really are better on average at learning a second language than adults are, that fact by itself doesn’t prove that there is a critical period for second-language acquisition: children and adults are different in many ways, and it could be that adults have trouble with new languages for some reason other than just their age. (p. 115)

This explanation is an example of the insightful ways that Kaplan approaches the linguistic myths in the book. And this explanation is especially pertinent here since the critical period myth comes directly from linguists. It is unfortunate, however, that in this chapter Kaplan does not define what “native proficiency” means and does not tell us how the studies mentioned define the term. To reach the proficiency of a native speaker was once the ultimate goal for second-language learners, but that idea has fallen under question since “native” speakers do not always serve as exemplars of their language and since speaking like a native is not desirable in all situations. For example, when two or more non-native English speakers with different first languages are working together, an international variety of English might be preferable.

Chapter 8 – “Women talk more than men”

It is easy to see why Chapter 8 gives the book its title. This chapter, on the myth that women talk more than men, is probably the most insightful chapter in the book, perhaps because the myth is so misleading. For example, Kaplan shows how even if we were to observe that women talked more than men, this would leave us with a host of additional questions and few answers:

Suppose you conducted an experiment and found that women were more likely to say um more than men. Does this mean that women are more insecure than men? Or that they’re more thoughtful and take more time deciding what to say next? How much do the results depend on the design of the experiment? For example, was the data collected in a lab setting, or from a corpus of spontaneous conversation? If it was in a lab setting, could that task have biased the results? Were the subjects discussing a topic that men might traditionally be expected to know more about? Were the subjects giving monologues, conversing in pairs, or talking in small groups? Were they talking with others of the same sex or the opposite sex?

As we will see, factors like these have a huge effect on how men and women speak. (p. 155)

Kaplan explains various ideas from different cultures about how men and women speak. And she astutely points out what is really behind these ideas:

By this point, contemporary western ideas about women’s superior verbal skills are starting to look anomalous. Obviously societies vary in what they believe about women’s speech: according to the medieval song discussed above, women are gossipy and unable to keep secrets; according to Jespersen, women are languid and insipid; according to rural Malagasy communities, women are unskilled and blunt. What all of these beliefs have in common is not the specific characteristics that are attributed to women, but the idea that women are inferior to men. Where assertiveness and directness are highly valued, those behaviors are considered to be characteristic of men; where indirectness and self-effacement are highly valued, those behaviors are attributed to men. (p. 162)

I like that Kaplan discusses the ways that women’s speech is viewed in other places in the world, but I appreciate it even more that this book – which is written in English and is from contemporary western society – shows that the ideas in our own culture about how women speak are deficient. I have a sneaking suspicion that the talk of places and languages in far off lands would fall on deaf ears for general readers, so it is very good that Kaplan contextualizes our own views of language.

Chapter 9 – “Texting makes you illiterate”

This is a myth that linguists have been at pains to debunk in recent years because texting and microblogging have become so popular. Along with the rise of these technologies and platforms has, unfortunately, also come the Chicken Little language commentary, which screams that texting is ruining the English language. The most infamous propagator of such hysteria is perhaps Lynne Truss, author of Eats, Shoots and Leaves, a book which starts off bemoaning the harm caused to English by texting and then goes on moaning for over 100 pages. So it was nice to see that this is one of the best chapters in Kaplan’s book. Kaplan begins by explaining that texting is a form of language unto itself and that there are valid reasons for why it will most likely not influence other forms of language:

When we use some technology to transmit language, its form isn’t neutral: it shapes how we say things, and therefore also potentially what we say. It matters, for example, that writing (but not speech) is permanent, that it can be revised and edited, and that it carries only limited information about tone of voice. Telephone and radio transmit audio but not video; the listener has access to the voice but not the nonverbal cues. Telegrams used to be priced by the word, which encouraged senders to use as few words as possible in what became the classic ‘telegraphic’ style. (p. 190)

Many language commenters do not realize these facts and think that the way people tweet is the way that they will write letters to the editor, or job applications, or whatever. But there is little reason to assume this is the case (and the language commenters rarely present evidence to support such an assumption). In addition, Kaplan makes another important point that is overlooked by people who adhere to this myth: the abbreviations used in texting (and tweeting, chatting, etc.) serve a meaningful sociolinguistic function besides saving space or time. The proof of this is that some of the abbreviations actually take more time and space to compose than writing the words out, and yet people still use them.

Later in the chapter, Kaplan gets to what’s at the heart of this chapter’s myth: people don’t like texting because it’s not proper English. She writes:

Despite the similarity between some types of hieroglyphic writing and some types of text message abbreviations, I have yet to hear a modern commentator decry hieroglyphics with the same fervor that is applied to texting. It’s hard to avoid the impression that these abbreviations are condemned, not because they’re inherently bad, but because they simply do not happen to be part of standard written English. (p. 198)

Well, sure, but Ancient Egypt used hieroglyphics and look what happened to them.



This is one of the best books on language and linguistics that I have ever read. It is wide-ranging and well-written. It offers more in terms of actual data than the usual language books aimed at the general public, but it is not so technical as to be inaccessible to non-linguists. It’s like a peek behind the curtain of linguistics and shows the sticky nature of seemingly simple (but wrong) ideas such as “a dialect is a collection of mistakes”, “the most beautiful language is X” and especially “women talk more than men”. For each myth, Kaplan has built a response based on solid linguistic sources. In each chapter, she also offers a bullet point summary, a list of points for further reflection on the topic, a concise and explanatory list of references for further reading, and a bibliography. If any of the topics covered in this book leave you hoping for more, you will not be let down. I highly recommend reading this book.

You can see other reviews of this book on Stan Carey’s blog Sentence First and Lauren Gawne’s blog Superlinguo. Both of them enjoyed the book. You can also read a blog post by Kaplan on the myths and facts of “uptalk” in English.

Women Talk More than Men …And Other Myths about Language Explained is available in paperback (ISBN: 9781107446908) for $24.99 (UK£15.99) and in hardcover (ISBN: 9781107084926) for $94.99 (UK£59.99). CUP kindly sent me a copy of Kaplan’s book for this review.

As a dictionary of English vocabulary and phrases, the American English Compendium by Marv Rubinstein is satisfactory. It is 500 pages long so it covers a lot of ground. As a book of American English or Americanisms, this book is not what it seems. A brief glance at any of the pages will make you question if the entries really are words or phrases that are exclusive to American English. And a comparison to another source will most likely show that they are not. As a commentary on language, however, this book is terrible.

Cover of American English Compendium by Marv Rubinstein. Published by Rowman & Littlefield. Cover design by Neil Cotterill.

The problems start on the first page of Chapter 1. The author defends the use of the term American English by proclaiming it is better than British English:

Dynamic. Versatile. Imaginative. Capable of capturing fine nuances. All these terms can truthfully be used to describe the American language. “Don’t you mean the ‘English language’?” some readers may ask. No, I mean the American language. Over many years, American English has vastly expanded and changed, a transmutation that has left it only loosely connected to its mother tongue, British English. (p. 3)

Although no one would (or should) argue that American English is a term that needs to be defended, the imaginary readers in this passage come off as more knowledgeable about language than the author. Are we really to believe American English is the only variation of English that is “dynamic” or “imaginative” or “capable of capturing fine nuances”? The problem gets compounded when the author recognizes the influence of American English in England, but seems to suggest that the reverse is not happening:

[W]hile there are numerous localisms [in countries where English is the primary language], more and more the terminology, idioms, slang, and colloquialisms smack of American English. Even in England this is slowly but surely happening. (p. 3)

And it only gets stranger from there. On the next page we are told:

Things have changed so much, and the use of American English in international communications has grown so much, one can now safely say that most English speakers use (to a greater or lesser degree) Americanized English – that is, the American language. And rightly so. The American language is so much richer and more adventurous. British English never stood a chance. (p. 4, emphasis mine)

Excuse me, Mr. Rubinstein, but H. G. Wells, J. K. Rowling, Grant Morrison, Agatha Christie and a thousand other British writers would like a word.

After this “proof” that ‘Murican English is better than British English, readers are given a “microcosm of what is happening” (p. 4) in the world. Rubinstein relates a story from an article by New York Times columnist and economist Thomas Friedman about how a senior Moroccan official is sending his kids to an American school even though he was educated in a French school. Rubinstein uses this story to claim that

There are now several American schools in Casablanca, each with a long waiting list. In addition, English (primarily American English) courses are springing up all over that country. If this is happening in Morocco, a country with long-lasting French connections and traditions, it is undoubtedly happening everywhere. The American language is becoming ubiquitous. (p. 5)

But it needs to be noted that Friedman does not claim that these English-language schools which are supposedly popping up all over Casablanca are teaching American English. Nor are readers given any proof that Casablanca is an example of what is happening around the world. I am very hesitant to believe it is. While it’s a cute story, this kind of claim needs to be backed up with evidence. How do we know that the English being taught in these schools is strictly British or American or some variation of English as an international language? We have to take Rubinstein’s word for it, but as we have seen with his dismissal of British English, he is not to be trusted when it comes to linguistics commentary.

Further down the page, in a section titled The Richness of the American Language, Rubinstein claims that “much of the richness of the American language lies in the fact that it has absorbed words and expressions from at least fifty other languages.” (p. 5) He lists some examples, but completely fails to acknowledge the fact that many of them, such as brogue and orangutan and typhoon, were originally borrowed into British English and then used by Americans.

Rubinstein then presumes readers will ask how the American language differs from other languages, which obviously also use foreign words and phrases. But the answer given is just as confused as the question. The author states that “there is no question that American English has been like a sponge absorbing and modifying words from many other languages” (p. 7) without realizing (or reporting) that this is true of English in general, not American English in particular. This is actually true of languages in general, although English does appear to be particularly greedy when it comes to borrowing words from other languages.

Later, there is a fairly reasonable, but short and undefinitive, discussion of “Black English” (African American Vernacular English). The section unfortunately ends with this quote: “Educated African Americans, of course, use standard American English” (pp. 11–12). Well, good for them.

Things get really bonkers in the section on compounding, which includes this howler:

Compound words exist in almost all languages, but never anywhere near the extent that they do in American English. […] during the last few decades, compounding has reached epidemic proportions. The vast majority of compound words are of relatively recent origin languagewise (p. 15)

This is nonsense. Does the author know how any other languages work? Finnish compounds words much more than English does. In fact, the syntax of Finnish demands it, unlike in English where compounding is very often a matter of style. And how do we know that the “vast majority” of compound words are not old? Let’s say “the last few decades” goes back to 1960. Do you really think words such as outcast, outdoors, outlook, output, overcome, overdoes, overdue, oversee, oddball, goofball, downfall, and downhill (all words supplied by the author) were made compound words after 1960?

Here are some other WTFs in this book along with the thoughts I had after reading them:

In general [the English speakers of Australia, Canada, Guyana, India, Ireland, New Zealand, and South Africa] all understand each other, but, as you have seen in the previous chapter on American and British English, there are substantial differences. The same can be said of the English used in the other countries listed above. With a few exceptions, Canadian English consists of a blending of American and British English, but the other English-speaking countries have all developed their own unique and distinctive expressions (including slang and colloquialisms). (p. 267)

Hahahahaha! Fuck you, Canada! Get your own expressions, eh!


English is an Anglo-Saxon language with roots in Latin, the Romance Languages, and German. [No.] This means that most, if not all, English words are variations of foreign words, and such words have legitimately entered the language. (p. 281)



The Oxford English Dictionary prides itself on keeping up to date, and it does pretty well (but not perfect) with including new words in its latest editions. Unfortunately, libraries with limited budgets these days do not always have the most recent revisions. Your best bet for researching neologisms is probably the Internet – for example, Google. (p. 403)

Because the OED is the only dictionary in the world. I’ve said it before and I’ll say it again: In linguistics research there is only the OED and Google. It’s a wonder we get anything done.


Chairman has become chairperson and has been further reduced to chair. But many gender-based terms remain unresolved. While, for example, policeman easily becomes police officer, other words and phrases resist change. One almost invariably hears expressions such as “Everyone to their own taste.” [What? Who invariably hears this?] Grammatically incorrect [Nope!] but why risk offending potential female customers of advertised products? [Bitches be trippin’, amiright?] However, when a woman mans the controls of an aircraft, should the term be changed even though it denotes action, not identity? What should we now call a “manhole cover”? [Serious questions, you guys.] Note that we no longer have actresses; they all insist on being called actors. [How dare they?!] (p. 13)

Based on the claims about language alone, I would not recommend this book. I don’t know how someone writes a book about language and gets so much wrong. The word and phrase entries may be useful, but any online dictionary will have most if not all of them. Go there instead or get a proper reference book from a respected dictionary.

As the authors state in their foreword (pp. xii-xiii):

This book represents an attempt to defang the slang and crack the code. In writing this, we tried to think back to when we were new to Washington and wishing, like wandering tourists lost in a foreign city, that we had a handy all-in-one-place phrasebook.

I would say they have largely accomplished this. Dog Whistles, Walk-backs & Washington Handshakes is an up-to-date glossary of American political terms. I think that people interested in language and politics would find this book enjoyable for a few reasons. First, the book is well referenced (always a plus). The authors are not trying to discover the first known use of some political code word, but rather to show that politicians from all sides use this type of language and that you are likely to come across it in tomorrow’s newspaper or news broadcast. So their references mostly come from very recent sources, which is refreshing. The foreword and introduction make nuanced points about language and slang, and the authors back up these points with references to reputable sources.

Dog Whistles has appeal for people who follow American politics, since although they are likely to already know some of the terms in here, they will probably find some they don’t know or haven’t thought about. That’s because the book isn’t just made up of eye-catching terms such as Overton window and San Francisco values. Readers will appreciate the care that the authors have taken to explain each term. For example, here is the entry for the seemingly innocent term bold (p. 40):

Bold: A politician’s most common description of their own or their party’s proposals. It manages to be a punchy, optimistic-sounding break with conventional thinking and deliberately vague all at once.

Image copyright ForeEdge and University Press of New England

But the book is not just for language and politics heads. In the introduction (p. ix), the authors recognize the problem that people who do not closely follow politics might have when reading about or listening to their representatives:

For most of the population – let’s call them “regular, normal people” – time spent listening to legislators, operatives, and journalists thrash over public policy on cable or a website can often result in something close to a fugue state, induced by the repeated use of words and phrases that have little if any connection to life as it is lived on planet Earth.

Later (p. 129), the authors explain the importance of their glossary by saying that:

Knowing the meanings of such specialized political terms can help cut through spin meant to obscure what’s really going on in a campaign. When politicians use the cliché, “The only poll that counts is the one on Election Day,” they really mean, “I wouldn’t win if the election were held today.”

I am all for educating people about the intricacies of language, especially when that means explaining the ways that politicians use words and phrases to trick people.

I am, however, not sure that all of the terms deserve a place in this book. I feel like a glossary should include words that are at least nominally used by a group of people. But in their attempt to be current, the authors have included phrases such as hardship porn. This is a phrase coined by Frank Bruni of the New York Times and it only returns two hits on Google News – the July 2015 article in which Bruni coined it and an October 2015 book review in the Missoula Independent. However influential Frank Bruni is, this term has not caught on yet.

This is really nitpicking though (something us academics excel at, thankyouverymuch). I really found this book enjoyable. If you like politics, language, or both, you will probably enjoy it too. You can check out the interactive website here: http://dogwhistlebook.com/ and even suggest your own term.




McCutcheon, Chuck and David Mark. 2014. Dog Whistles, Walk-backs & Washington Handshakes: Decoding the Jargon, Slang, and Bluster of American Political Speech. ForeEdge: New Hampshire.

I assumed that the term dynamic duo must come out of comics. Comic book creators have long been coming up with alliterative epithets for their characters. Superman is the “Big Blue Boy Scout”, Supergirl is the “Maid of Might”, Batman is the “Caped Crusader”, Silver Surfer is the “Sentinel of the Spaceways”, Flash is the “Scarlet Speedster”, Wonder Woman is the “Amazing Amazon”, Spider-man is spectacular, the Avengers are (also) amazing, and the Four are fantastic. You get the point.

But having been around the etymological block a few times, I know that everything in language is older than you think it is. So I thought the phrase dynamic duo might come out of some earlier work. Perhaps it was reappropriated by comic book authors to describe the Dark Knight and the Boy Wonder.

Various dictionaries, however, claim that dynamic duo comes from the famous 1960s Batman TV series starring Adam West and Burt Ward, including NTC’s Dictionary of American Slang and Colloquial Expressions by Richard Spears and the Dictionary of American Slang by Kipfer and Chapman. The Batman TV series premiered in 1966, or 25 years after Robin was created. I found it hard to believe that it would have taken writers that long to come up with dynamic duo, so I decided to dig a little deeper.

Using Google Books, I found a volume of the Michigan Alumnus which includes the phrase. This was written in 1954:

[Image: excerpt from the Michigan Alumnus on Google Books showing the phrase dynamic duo]

So dynamic duo predates the Batman TV show. But when was it first applied to Batman and Robin? For that we have to dive into the comics.

On October 31, 1940, DC Comics published a story called “The Case of the Joker’s Crime Circus” in BATMAN #4. The story was written by Bill Finger and featured Bob Kane, George Roussos and Jerry Robinson on art. On page 7, we see the first time Batman and Robin are referred to as the dynamic duo:


Of course, the term has moved out of the comics and can be applied to any “very special pair of people or things” (Spears 2000). And it may still be true that the Batman TV show is responsible for popularizing the term. But it warms my comic book loving heart to know that Bill Finger came up with dynamic duo.

Holy exciting etymology, Batman! It’s time to update our Bat-tionaries!





Spears, Richard. 2000. NTC’s Dictionary of American Slang and Colloquial Expressions 3rd Edition. NTC Publishing Group: New York. http://www.goodreads.com/book/show/2477414.NTC_s_Dictionary_of_American_Slang_and_Colloquial_Expressions

Kipfer, Barbara Ann and Robert L. Chapman. 2007. Dictionary of American Slang 4th Edition. HarperCollins Publishers. http://www.goodreads.com/book/show/2417989.Dictionary_of_American_Slang

In two recent papers, one by Kloumann et al. (2012) and the other by Dodds et al. (2015), a group of researchers created a corpus to study the positivity of the English language. I looked at some of the problems with those papers here and here. For this post, however, I want to focus on one of the registers in the authors’ corpus – song lyrics. There is a problem with taking language such as lyrics out of context and then judging them based on the positivity of the words in the songs. But first I need to briefly explain what the authors did.

In the two papers, the authors created a corpus based on books, New York Times articles, tweets and song lyrics. They then created a list of the 10,000 most common word types in their corpus and had voluntary respondents rate how positive or negative they felt the words were. They used this information to claim that human language overall (and English) is emotionally positive.

That’s the idea anyway, but song lyrics exist as part of a multimodal genre. There are lyrics and there is music. These two modalities operate simultaneously to convey a message or feeling. This is important for a couple of reasons. First, the other registers in the corpus do not work like song lyrics. Books and news articles are black text on a white background with few or no pictures. And tweets are not always multimodal – it’s possible to include a short video or picture in a tweet, but it’s not necessary (Side note: I would like to know how many tweets in the corpus included pictures and/or videos, but the authors do not report that information).

So if we were to do a linguistic analysis of an artist or a genre of music, we would create a corpus of the lyrics of that artist or genre. We could then study the topics that are brought up in the lyrics, or even common words and expressions (lexical bundles or n-grams) that are used by the artist(s). We could perhaps even look at how the writing style of the artist(s) changed over time.

But if we wanted to perform an analysis of the positivity of the songs in our corpus, we would need to incorporate the music. The lyrics and music go hand in hand – without the music, you only have poetry. To see what I mean, take a look at the following word list. Do the words in this list look particularly positive or negative to you?

[Word list not fully preserved; the surviving entries are “smell” and “sorry”.]

If we combine these words as Rivers Cuomo did in his song “Butterfly”, they average out to a positive score of 5.23. Here are the lyrics to that song.

Yesterday I went outside
With my momma’s mason jar
Caught a lovely Butterfly
When I woke up today
And looked in on my fairy pet
She had withered all away
No more sighing in her breast

I’m sorry for what I did
I did what my body told me to
I didn’t mean to do you harm
But everytime I pin down what I think I want
it slips away – the ghost slips away

I smell you on my hand for days
I can’t wash away your scent
If I’m a dog then you’re a bitch
I guess you’re as real as me
Maybe I can live with that
Maybe I need fantasy
A life of chasing Butterfly

I’m sorry for what I did
I did what my body told me to
I didn’t mean to do you harm
But everytime I pin down what I think I want
it slips away – the ghost slips away

I told you I would return
When the robin makes his nest
But I ain’t never comin’ back
I’m sorry, I’m sorry, I’m sorry

Does this look like a positive text to you? Does it look moderate, neither positive nor negative? I would say not. It seems negative to me, a sad song based on the opera Madame Butterfly, in which a man leaves his wife because he never really cared for her. When we take the music into consideration, the non-positivity of this song is clear.

Let’s take a look at another list. How does this one look?

[Word list not preserved.]

Based on the ratings in the two papers, this list is slightly more positive, with an average happiness rating of 5.46. When the words were used by Trent Reznor, however, they expressed “a deeply personal meditation on self-hatred” (Huxley 1997: 179). Here are the lyrics for “Closer” by Nine Inch Nails:

You let me violate you
You let me desecrate you
You let me penetrate you
You let me complicate you

Help me
I broke apart my insides
Help me
I’ve got no soul to sell
Help me
The only thing that works for me
Help me get away from myself

I want to fuck you like an animal
I want to feel you from the inside
I want to fuck you like an animal
My whole existence is flawed
You get me closer to god

You can have my isolation
You can have the hate that it brings
You can have my absence of faith
You can have my everything

Help me
Tear down my reason
Help me
It’s your sex I can smell
Help me
You make me perfect
Help me become somebody else

I want to fuck you like an animal
I want to feel you from the inside
I want to fuck you like an animal
My whole existence is flawed
You get me closer to god

Through every forest above the trees
Within my stomach scraped off my knees
I drink the honey inside your hive
You are the reason I stay alive

As Reznor (the songwriter and lyricist) sees it, “Closer” is “supernegative and superhateful” and the song’s message is “I am a piece of shit and I am declaring that” (Huxley 1997: 179). You can see what he means when you listen to the song (minor NSFW warning for the imagery in the video). [1]

Nine Inch Nails: Closer (Uncensored) (1994) from Nine Inch Nails on Vimeo.

Then again, meaning is relative. Tommy Lee has said that “Closer” is “the all-time fuck song. Those are pure fuck beats – Trent Reznor knew what he was doing. You can fuck to it, you can dance to it and you can break shit to it.” And Tommy Lee should know. He played in the studio for NIИ and he is arguably more famous for fucking than he is for playing drums.

Nevertheless, the problem with the positivity rating of songs keeps popping up. The song “Mad World” was a pop hit for Tears for Fears, then reinterpreted in a more somber tone by Gary Jules and Michael Andrews. But it is rated a positive 5.39. Gotye’s global hit about failed relationships, “Somebody That I Used To Know”, is rated a positive 5.33. The anti-war and protest ballad “Eve of Destruction”, made famous by Barry McGuire, rates just barely on the negative side at 4.93. I guess there should have been more depressing references besides bodies floating, funeral processions, and race riots if the songwriter really wanted to drive home the point.

For the song “Milkshake”, Kelis has said that it “means whatever people want it to” and that the milkshake referred to in the song is “the thing that makes women special […] what gives us our confidence and what makes us exciting”. It is rated less positive than “Mad World” at 5.24. That makes me want to doubt the authors’ commitment to Sparkle Motion.

Another upbeat jam that the kids listen to is the Ramones’ “Blitzkrieg Bop”. This is the energetic and exciting anthem of punk rock. It’s rated a negative 4.82. I wonder if we should even look at “Pinhead”.

Then there’s the old American folk classic “Where did you sleep last night”, which Nirvana performed a haunting version of on their album MTV Unplugged in New York. The song (also known as “In the Pines” and “Black Girl”) was first made famous by Lead Belly and it includes such catchy lines as

My girl, my girl, don’t lie to me
Tell me where did you sleep last night
In the pines, in the pines
Where the sun don’t ever shine
I would shiver the whole night through


Her husband was a hard working man
Just about a mile from here
His head was found in a driving wheel
But his body never was found

This song is rated a positive 5.24. I don’t know about you but neither the Lead Belly version, nor the Nirvana cover would give me that impression.

Even Pharrell Williams’ hit song “Happy” rates only 5.70. That’s a song so goddamn positive that it’s called “Happy”. But it’s only 0.03 points more positive than Eric Clapton’s “Tears in Heaven”, which is a song about the death of Clapton’s four-year-old son. Harry Chapin’s “Cat’s in the Cradle” was voted the fourth saddest song of all time by readers of Rolling Stone but it’s rated 5.55, while Willie Nelson’s “Always on My Mind” rates 5.63. So they are both sadder than “Happy”, but not by much. How many lyrics must a man research, before his corpus is questioned?

Corpus linguistics is not just gathering a bunch of words and calling it a day. The fact that the same “word” can have several meanings (known as polysemy) is a major feature of language. So before you ask people to rate a word’s positivity, you will want to make sure they at least know which meaning is being referred to. On top of that, words do not work in isolation. Spacing is an arbitrary construct in written language (remember that song lyrics are mostly heard, not read). The back in the Ramones’ lines “Piling in the back seat” and “Pulsating to the back beat” does not refer to a body part. The Weezer song “Butterfly” uses the word mason, but it’s part of the compound noun mason jar, not a reference to a bricklayer. Words are also conditioned by the words around them. A word like eve may normally be considered positive as it brings to mind Christmas Eve and New Year’s Eve, but when used in a phrase like “the eve of destruction” our judgment of it is likely to change. In the corpus under discussion here, eat is rated 7.04, but that doesn’t consider what’s being eaten and so cannot account for lines like “Eat your next door neighbor” (from “Eve of Destruction”).
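To make the point about compounds concrete, here is a minimal Python sketch of what treating mason jar or back seat as one unit could look like. This is my own illustration, not anything the authors of the papers did: the `COMPOUNDS` set and the `merge_compounds` function are hypothetical, and a real study would need a proper lexicon of multi-word expressions rather than a hand-picked list.

```python
# Hypothetical list of two-word compounds, hand-picked from the lyrics
# discussed in this post; a real study would need a proper lexicon.
COMPOUNDS = {("mason", "jar"), ("back", "seat"), ("back", "beat")}


def merge_compounds(tokens, compounds=COMPOUNDS):
    """Join adjacent tokens that form a known compound into a single unit."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in compounds:
            # Keep the compound together so it can be rated as one item.
            merged.append(" ".join(pair))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


line = "with my momma's mason jar".split()
print(merge_compounds(line))
```

A rating survey built on units like these would at least ask people about mason jar rather than about mason and jar separately, although it still would not solve polysemy or the effect of the surrounding words.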

We could go on and on like this. The point is that the authors of both of the papers didn’t do enough work with their data before drawing conclusions. And they didn’t consider that some of the language in their corpus is part of a multimodal genre where there are other things affecting the meaning of the language used (though technically no language use is devoid of context). Whether or not the lyrics of a song are “positive” or “negative”, the style of singing and the music that they are sung to will strongly affect a person’s interpretation of the lyrics’ meaning and emotion. That’s just the way that music works.

This doesn’t mean that any of these songs are positive or negative based on their rating; it means that the system used by the authors of the two papers to rate the positivity or negativity of language seems to be flawed. I would have guessed that a rating system which took words out of context would be fundamentally flawed, but viewing the ratings of the songs in this post is a good way to visualize that. The fact that the two papers were published in reputable journals and picked up by reputable publications, such as the Atlantic and the New York Times, only adds insult to injury for the field of linguistics.

You can see a table of the songs I looked at for this post below and a spreadsheet with the ratings of the lyrics is here. I calculated the positivity ratings by averaging the scores for the word tokens in each song, rather than the types.

(By the way, Tupac is rated 4.76. It’s a good thing his attitude was fuck it ‘cause motherfuckers love it.)

Song Positivity score (1–9)
“Happy” by Pharrell Williams 5.70
“Tears in Heaven” by Eric Clapton 5.67
“Always on My Mind” by Willie Nelson 5.63
“Cat’s in the Cradle” by Harry Chapin 5.55
“Closer” by NIN 5.46
“Mad World” by Gary Jules and Michael Andrews 5.39
“Somebody that I Used to Know” by Gotye feat. Kimbra 5.33
“Waitin’ for a Superman” by The Flaming Lips 5.28
“Milkshake” by Kelis 5.24
“Where Did You Sleep Last Night” by Nirvana 5.24
“Butterfly” by Weezer 5.23
“Eve of Destruction” by Barry McGuire 4.93
“Blitzkrieg Bop” by The Ramones 4.82
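For anyone wondering what averaging over tokens rather than types means in practice, here is a minimal Python sketch. The ratings in `RATINGS` are invented for illustration (they are not the values from the papers), and the function names are my own; the only thing the sketch is meant to show is that a token average counts a repeated word every time it occurs, while a type average counts it once.

```python
import re

# Invented ratings on the papers' 1-9 scale, for illustration only.
RATINGS = {"sorry": 3.0, "butterfly": 7.0, "away": 5.0}


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z'’]+", text.lower())


def token_average(text, ratings):
    """Average over every rated occurrence, so repeated words count each time."""
    scores = [ratings[w] for w in tokenize(text) if w in ratings]
    return sum(scores) / len(scores) if scores else None


def type_average(text, ratings):
    """Average over each distinct rated word once, however often it occurs."""
    types = {w for w in tokenize(text) if w in ratings}
    return sum(ratings[w] for w in types) / len(types) if types else None


lyric = "I'm sorry, I'm sorry, I'm sorry butterfly"
print(token_average(lyric, RATINGS))  # three "sorry" tokens pull the score down
print(type_average(lyric, RATINGS))   # "sorry" counts only once here
```

Words without a rating are simply skipped in this sketch, which is itself a design choice that changes the score.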



[1] Also, be aware that listening to these songs while watching their music videos has an effect on the way you interpret them.


Kloumann, Isabel M., Christopher M. Danforth, Kameron Decker Harris, Catherine A. Bliss, and Peter Sheridan Dodds. 2012. “Positivity of the English Language”. PLoS ONE. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029484

Dodds, Peter Sheridan, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, and Christopher M. Danforth. 2015. “Human language reveals a universal positivity bias”. PNAS 112:8. http://www.pnas.org/content/112/8/2389

Huxley, Martin. 1997. Nine Inch Nails. New York: St. Martin’s Griffin.

Last week I wrote a post called “If you’re not a linguist, don’t do linguistics”. This got shared around Twitter quite a bit and made it to the front page of r/linguistics, so a lot of people saw it. Pretty much everyone had good insight on the topic and it generated some great discussion. I thought it would be good to write a follow-up to flesh out my main concerns in a more serious manner (this time sans emoticons!) and to address the concerns some people had with my reasoning.

The paper in question is by Dodds et al. (2015) and it is called “Human language reveals a universal positivity bias”. The certainty of that title is important since I’m going to try to show in this post that the authors make too many assumptions to reliably make any claims about all human language. I’m going to focus on the English data because that is what I am familiar with. But if anyone who is familiar with the data in other languages would like to weigh in, please do so in the comments.

The first assumption made by the authors is that it is possible to make universal claims about language using only written data. This is not a minor issue. The differences between spoken and written language are many and major (Linell 2005). But dealing with spoken data is difficult – it takes much more time and effort to collect and analyze than written data. We can argue, however, that even in highly literate societies, the majority of language use is spoken – and spoken language does not work like written language. Treating written data as a stand-in for all language is an assumption that no scholar should ever make. So any research which makes claims about all human language will have to include some form of spoken data. But the data set that the authors draw from (called their corpus) is made from tweets, song lyrics, New York Times articles and the Google Books project. Tweets and song lyrics, let alone news articles or books, do not mimic spoken language in an accurate way. For example, these registers may include the same words as human speech, but certainly not in the same proportion. Written language does not include false starts, nor does it include repetition or elision in anything like the way that spoken language does. Anyone who has done any transcription work will tell you this.

The next assumption made by the authors is that their data is representative of all human language. Representativeness is a major issue in corpus linguistics. When linguists want to investigate a register or variety of language, they build a corpus which is representative of that register or variety by taking a large enough and balanced sample of texts from that register. What is important here, however, is that most linguists do not have a problem with a set of data representing a larger register – so long as that larger register isn’t all human language. For example, if we wanted to research modern English journalism (quite a large register), we would build a corpus of journalism texts from English-speaking countries and we would be careful to include various kinds of journalism – op-eds, sports reporting, financial news, etc. We would not build a corpus of articles from the Podunk Free Press and make claims about all English journalism. But representativeness is a tricky issue. The larger the language variety you are trying to investigate, the more data from that variety you will need in your corpus. Baker (2010: 7) notes that a corpus analysis of one novel is “unlikely to be representative of all language use, or all novels, or even the general writing style of that author”. The English sub-corpora in Dodds et al. exist somewhere in between a fully non-representative corpus of English (one novel) and a fully representative corpus of English (all human speech and writing in English). In fact, in another paper (Dodds et al. 2011), the representativeness of the Twitter corpus is explained as follows: “First, in terms of basic sampling, tweets allocated to data feeds by Twitter were effectively chosen at random from all tweets. Our observation of this apparent absence of bias in no way dismisses the far stronger issue that the full collection of tweets is a non-uniform subsampling of all utterances made by a non-representative subpopulation of all people.
While the demographic profile of individual Twitter users does not match that of, say, the United States, where the majority of users currently reside, our interest is in finding suggestions of universal patterns.” What I think that doozy of a sentence in the middle is saying is that the tweets come from an unrepresentative sample of the population but that the language in them may be suggestive of universal English usage. Does that mean we can assume that the English sub-corpora (specifically the Twitter data) in Dodds et al. are representative of all human communication in English?

Another assumption the authors make is that they have sampled their data correctly. The decisions on what texts will be sampled, as Tognini-Bonelli (2001: 59) points out, “will have a direct effect on the insights yielded by the corpus”. Following Biber (see Tognini-Bonelli 2001: 59), linguists can classify texts into various channels in order to ensure that their sample texts will be representative of a certain population of people and/or variety of language. They can start with general “channels” of the language (written texts, spoken data, scripted data, electronic communication) and move on to whether the language is private or published. Linguists can then sample language based on what type of person created it (their age, sex, gender, socio-economic situation, etc.). For example, if we made a corpus of the English articles on Wikipedia, we would have a massive amount of linguistic data. Literally billions of words. But 87% of it will have been written by men and 59% of it will have been written by people under the age of 40. Would you feel comfortable making claims about all human language based on that data? How about just all English language encyclopedias?

The next assumption made by the authors is that the relative positive or negative nature of the words in a text is indicative of how positive that text is. But words can have various and sometimes even opposing meanings. Texts are also likely to contain words that are written the same but have different meanings. For example, the word fine in the Dodds et al. corpus, like the rest of the words in the corpus, is just a four-letter word – free of context and naked as a jaybird. Is it an adjective that means “good, acceptable, or satisfactory”, which Merriam-Webster says is sometimes “used in an ironic way to refer to things that are not good or acceptable”? Or does it refer to that little piece of paper that the Philadelphia Parking Authority is so (in)famous for? We don’t know. All we know is that it has been rated 6.74 on the positivity scale by the respondents in Dodds et al. Can we assume that all the uses of fine in the New York Times are that positive? Can we assume that the use of fine on Twitter is always or even mostly non-ironic? On top of that, some of the most common words in English also tend to have the most meanings. There are 15 entries for get in the Macmillan Dictionary, including “kill/attack/punish” and “annoy”. Get in Dodds et al. is ranked on the positive side of things at 5.92. Can we assume that this rating carries across all the uses of get in the corpus? The authors found approximately 230 million unique “words” in their Twitter corpus (they counted all forms of a word separately, so banana, bananas, b-a-n-a-n-a-s! would be separate “words”; and they counted URLs as words). So they used the 50,000 most frequent ones to estimate the information content of texts. Can we assume that it is possible to make an accurate claim about how positive or negative a text is based on nothing but the words taken out of context?
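To see how counting every surface form separately inflates the number of “words”, here is a small Python sketch. It is my guess at the general idea, not the authors’ actual pipeline: the example tweets and the `surface_form_counts` function are made up, and the only library call used is `collections.Counter` from the standard library.

```python
from collections import Counter


def surface_form_counts(texts):
    """Count whitespace-separated strings exactly as written (lowercased)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


# Made-up tweets for illustration.
tweets = [
    "banana bananas b-a-n-a-n-a-s! http://example.com",
    "banana banana",
]

counts = surface_form_counts(tweets)
print(len(counts))            # number of distinct "words", URL included
print(counts.most_common(2))  # the analogue of keeping only the top N forms
```

Nothing in this kind of count knows that banana and bananas are related, or that a URL is not a word at all, which is exactly the problem.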

Another assumption that the authors make is that the respondents in their survey can speak for the entire population. The authors used Amazon’s Mechanical Turk to crowdsource evaluations for the words in their sub-corpus. 60% of the American people on Mechanical Turk are women and 83.5% of them are white. The authors used respondents located in the United States and India. Can we assume that these respondents have opinions about the words in the corpus that are representative of the entire population of English speakers? Here are the ratings for the various ways of writing laughter in the authors’ corpus:

Laughter tokens Rating
ha 6
hah 5.92
haha 7.64
hahah 7.3
hahaha 7.94
hahahah 7.24
hahahaha 7.86
hahahahaha 7.7
ha 6
hee 5.4
heh 5.98
hehe 6.48
hehehe 7.06

And here is a picture of a character expressing laughter:

Pictured: Good times. Credit: Batman #36, DC Comics, Scott Snyder (wr), Greg Capullo (p), Danny Miki (i), Fco Plascenia (c), Steve Wands (l).


Can we assume that the textual representation of laughter is always as positive as the respondents rated it? Can we assume that everyone or most people on Twitter use the various textual representations of laughter in a positive way – that they are laughing with someone and not at someone?

Finally, let’s compare some data. The good people at the Corpus of Contemporary American English (COCA) have created a word list based on their 450 million word corpus. The COCA corpus is specifically designed to be large and balanced (although the problem of dealing with spoken language might still remain). In addition, each word in their corpus is annotated for its part of speech, so they can recognize when a word like state is either a verb or a noun. This last point is something that Dodds et al. did not do – all forms of words that are spelled the same are collapsed into being one word. The compilers of the COCA list note that “there are more than 140 words that occur both as a noun and as a verb at least 10,000 times in COCA”. This is the type/token issue that came up in my previous post. A corpus that tags each word for its part of speech can tell the difference between different types of the “same” word (state as a verb vs. state as a noun), while an untagged corpus treats all occurrences of state as tokens of a single type. If we compare the 10,000 most common words in Dodds et al. to a sample of the 10,000 most common words in COCA, we see that there are 121 words on the COCA list but not the Dodds et al. list (Here is the spreadsheet from the Dodds et al. paper with the COCA data – pnas.1411678112.sd01 – Dodds et al corpus with COCA). And that’s just a sample of the COCA list. How many more differences would there be if we compared the Dodds et al. list to the whole COCA list?
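Both halves of this comparison can be sketched in a few lines of Python. Everything here is a toy: the tagged tokens, the tag names and the two short word lists are invented for illustration and do not come from COCA or from Dodds et al., and `only_in_first` is just a name I made up for a set difference.

```python
from collections import Counter

# Invented (word, part-of-speech) pairs standing in for a tagged corpus.
tagged_tokens = [
    ("state", "NOUN"),
    ("state", "VERB"),
    ("state", "NOUN"),
    ("fine", "ADJ"),
    ("fine", "NOUN"),
]

# An untagged corpus only sees the spelling, so every "state" is one type.
untagged_counts = Counter(word for word, _ in tagged_tokens)

# A tagged corpus keeps state-as-noun and state-as-verb apart.
tagged_counts = Counter(tagged_tokens)


def only_in_first(first, second):
    """Return the words that appear in the first list but not in the second."""
    return sorted(set(first) - set(second))


# Toy stand-ins for two frequency lists.
list_a = ["state", "fine", "get"]
list_b = ["fine", "haha"]

print(untagged_counts["state"], tagged_counts[("state", "NOUN")])
print(only_in_first(list_a, list_b))
```

The set difference is the same kind of comparison that produced the 121 words above, only on a much smaller scale.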

To sum up, the authors use their corpus of tweets, New York Times articles, song lyrics and books and ask us to assume (1) that they can make universal claims about language despite using only written data; (2) that their data is representative of all human language despite including only four registers; (3) that they have sampled their data correctly despite not knowing what types of people created the linguistic data and only including certain channels of published language; (4) that the relative positive or negative nature of the words in a text is indicative of how positive that text is despite the obvious fact that words can be spelled the same and still have wildly different meanings; (5) that the respondents in their survey can speak for the entire population despite the English-speaking respondents being from only two subsets of two English-speaking populations (USA and India); and (6) that their list of the 10,000 most common words in their corpus (which they used to rate all human language) is representative despite being uncomfortably dissimilar to a well-balanced list that can differentiate between different types of words.

I don’t mean to sound like a Negative Nancy and I don’t want to trivialize the work of the authors in this paper. The corpus that they have built is nothing short of amazing. The amount of feedback they got from human respondents on language is also impressive (to say the least). I am merely trying to point out what we can and cannot say based on the data. It would be nice to make universal claims about all human language, but the fact is that even with millions and billions of data points, we still are not able to do so unless the data is representative and sampled correctly. That means it has to include spoken data (preferably a lot of it) and it has to be sampled from all socio-economic human backgrounds.

Hat tip to the commenters on the last post and the redditors over at r/linguistics.


Dodds, Peter Sheridan, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, and Christopher M. Danforth. 2015. “Human language reveals a universal positivity bias”. PNAS 112:8. http://www.pnas.org/content/112/8/2389

Dodds, Peter Sheridan, Kameron Decker Harris, Isabel M. Kloumann, Catherine A. Bliss, and Christopher M. Danforth. 2011. “Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter”. PLoS ONE. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0026752#abstract0

Baker, Paul. 2010. Sociolinguistics and Corpus Linguistics. Edinburgh: Edinburgh University Press. http://www.ling.lancs.ac.uk/staff/paulb/socioling.htm

Linell, Per. 2005. The Written Language Bias in Linguistics. Oxon: Routledge.

Mair, Christian. 2015. “Responses to Davies and Fuchs”. English World-Wide 36:1, 29–33. doi: 10.1075/eww.36.1.02mai

Tognini-Bonelli, Elena. 2001. Studies in Corpus Linguistics, Volume 6: Corpus Linguistics at Work. John Benjamins. https://benjamins.com/#catalog/books/scl.6/main

A paper recently published in PNAS claims that human language tends to be positive. This was news enough to make the New York Times. But there are a few fundamental problems with the paper.

Linguistics – Now with less linguists!

The first thing you might notice about the paper is that it was written by mathematicians and computer scientists. I can understand the temptation to research and report on language. We all use it and we feel like masters of it. But that’s what makes language a tricky thing. You never hear people complain about math when they only have a high-school-level education in the subject. The “authorities” on language, however, are legion. My body has, like, a bunch of cells in it, but you don’t see me writing papers on biology. So it’s not surprising that the authors of this paper make some pretty basic errors in doing linguistic research. They should have been caught by the reviewers, but they weren’t. And the editor is a professor of demography and statistics, so that doesn’t help.

Too many claims and not enough data

The article is titled “Human language reveals a universal positivity bias” but what the authors really mean is “10 varieties of languages might reveal something about the human condition if we had more data”. That’s because the authors studied data in 10 different languages and they are making claims about ALL human languages. You can’t do that. There are some 6,000 languages in the world. If you’re going to make a claim about how every language works, you’re going to have to do a lot more than look at only 10 of them. Linguists know this, mathematicians apparently do not.

On top of that, the authors don’t even look at that much linguistic data. They extracted 5,000–10,000 of the most common words from larger corpora. Their combined corpora contain the 100,000 most common words in each of their sub-corpora. That is woefully inadequate. The Brown corpus contains 1 million words and it was made in the 1960s. In this paper, the authors claim that 20,000 words are representative of English. That is, not 20,000 different words, but the 5,000 most common words in each of their English sub-corpora. So 5,000 words each from Twitter, the New York Times, music lyrics, and the Google Books Project are supposed to represent the entire English language. This is shocking… to a linguist. Not so much to mathematicians, who don’t do linguistic research. It’s pretty frustrating, but this paper is a whole lotta ¯\_(ツ)_/¯.

To complete the trifecta of missing linguistic data, take a look at the sources for the English corpora:

Corpus Word count
English: Twitter 5,000
English: Google Books Project 5,000
English: The New York Times 5,000
English: Music lyrics 5,000

If you want to make a general claim about a language, you need to have data that is representative of that language. 5,000 words from Twitter, the New York Times, some books and music lyrics does not cut it. There are hundreds of other ways that language is used, such as recipes, academic writing, blogging, magazines, advertising, student essays, and stereo instructions. Linguists use the terms register and genre to refer to these and they know that you need more than four if you want your data to be representative of the language as a whole. I’m not even going to ask why the authors didn’t make use of publicly available corpora (such as COCA for English). Maybe they didn’t know about them. ¯\_(ツ)_/¯

Say what?

Speaking of registers, the overwhelmingly most common way that language is used is speech. Humans talking to other humans. No matter how many written texts you have, your analysis of ALL HUMAN LANGUAGE is not going to be complete until you address spoken language. But studying speech is difficult, especially if you’re not a linguist, so… ¯\_(ツ)_/¯

The fact of the matter is that you simply cannot make a sweeping claim about human language without studying human speech. It’s like doing math without the numeral 0. It doesn’t work. There are various ways to go about analyzing human speech, and there are ways of including spoken data in your materials in order to make claims about a language. But to not perform any kind of analysis of spoken data in an article about Language is incredibly disingenuous.

Same same but different

The authors claim their data set includes “global coverage of linguistically and culturally diverse languages” but that isn’t really true. Of the 10 languages that they analyze, 6 are Indo-European (English, Portuguese, Russian, German, Spanish, and French). Besides, what does “diverse” mean? We’re not told. And how are the cultures diverse? Because they speak different languages and/or because they live in different parts of the world? ¯\_(ツ)_/¯

The authors also had native speakers judge how positive, negative or neutral each word in their data set was. A word like “happy” would presumably be given the most positive rating, while a word like “frown” would be on the negative end of the scale, and a word like “the” would be rated neutral (neither positive nor negative). The people rating the words, however, were “restricted to certain regions or countries”. So, not only are 14,000 words supposed to represent the entire Portuguese language, but residents of Brazil are rating them and are therefore supposed to be representative of all Portuguese speakers. Or, perhaps that should be residents of Brazil with internet access.

[Update 2, March 2: In the following paragraph, I made some mistakes. I should not have said that ALL linguists believe that rating language is a notoriously poor way of doing an analysis. Obviously I can’t speak for all the linguists everywhere. That would be overgeneralizing, which is kind of what I’m criticizing the original paper for. Oops! :O I also shouldn’t have tied the ratings used in the paper to grammaticality judgments. Grammaticality judgments have been shown to be very, very consistent for English sentences. I am not aware of whether people tend to be as consistent when rating words for how positive, negative, or neutral they are (but if you are, feel free to post in the comments). So I think the criticism still stands. Some say that 384 English-speaking participants are more than enough to rate a word’s positivity. If people rate words as consistently as they do sentences, then this is true. I’m not as convinced that people do that (until I see some research on it), but I’ll revoke my claim anyway. Either way, the point still stands – the positivity of language does not lie in the relative positive or negative nature of the words in a text (the next point I make below). Thanks to u/rusoved, u/EvM and u/noahpoah on reddit for pointing this out to me.]

There are a couple of problems with this, but the main one is that having people rate language is a notoriously poor way of analyzing language (notorious to linguists, that is). If you ask ten people to rate the grammaticality of a sentence on a scale from 1 to 10, you will get ten different answers. I understand that the authors are taking averages of the answers their participants gave, but they only had 384 participants rating the English words. I wouldn’t call that representative of the language. The number of participants for the other languages goes down from there.

A loss for words

A further complication with this article is in how it rates the relative positive nature of words rather than sentences. Obviously words have meaning, but they are not really how humans communicate. Consider the sentence “Happiness is a warm gun”. Two of the words in that sentence are positive (“happiness” and “warm”), while only one is negative (“gun”). This does not mean it’s a positive sentence. That depends on your view of guns (and possibly Beatles songs). So it is potentially problematic to look at how positive or negative the words in a text are and then say that the text as a whole (or the corpus) presents a positive view of things.
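Here is a rough sketch of what goes wrong when you score a sentence by averaging the scores of its words. The scores, the 1 to 9 scale with 5 as neutral, and the function `average_word_score` are all assumptions made up for this illustration; they are not the ratings or the procedure from the paper.

```python
# Hypothetical word ratings on an assumed 1-9 scale (5 = neutral).
# These numbers are invented for illustration only.
WORD_SCORES = {
    "happiness": 8.4,
    "is": 5.0,
    "a": 5.0,
    "warm": 7.0,
    "gun": 2.5,
}

NEUTRAL = 5.0


def average_word_score(sentence, scores, neutral=NEUTRAL):
    """Average the per-word scores, treating unknown words as neutral."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return sum(scores.get(w, neutral) for w in words) / len(words)


score = average_word_score("Happiness is a warm gun.", WORD_SCORES)
print(score > NEUTRAL)  # the bag-of-words average lands on the positive side
```

With these made-up numbers the average comes out above neutral, so the method would call the sentence "positive" no matter what a reader actually takes it to mean. The averaging never looks at how the words combine.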

Lost in Google’s Translation

The last problem I’ll mention concerns the authors’ use of Google Translate. They write:

We now examine how individual words themselves vary in their average happiness score between languages. Owing to the scale of our corpora, we were compelled to use an online service, choosing Google Translate. For each of the 45 language pairs, we translated isolated words from one language to the other and then back. We then found all word pairs that (i) were translationally stable, meaning the forward and back translation returns the original word, and (ii) appeared in our corpora in each language.

This is ridiculous. As good as Google Translate may be in helping you understand a menu in another country, it is not a good translator. Asya Pereltsvaig writes that “Google Translate/Conversation do not translate. They match. More specifically, they match (bits of) the original text with best translations, where ‘best’ means most frequently found in a large corpus such as the World Wide Web.” And she has caught Google Translate using English as an intermediate language when translating from one language to another. That means that when going between two languages that are not English (say French and Russian), Google Translate will first translate the word into English and then into the target language. This represents a methodological problem for the article in that using the online Google Translate actually makes their analysis untrustworthy.
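To see how a "translationally stable" filter and an English pivot can interact, here is a minimal sketch using toy dictionaries. Everything in it is an assumption for illustration: the word lists are invented, the function names `is_round_trip_stable` and `translate_via_pivot` are hypothetical, and none of it is actual Google Translate output or the authors' code.

```python
# Toy one-way dictionaries invented for illustration.
# These are NOT Google Translate output.
FR_TO_EN = {"chaud": "hot", "heureux": "happy", "tu": "you", "vous": "you"}
EN_TO_FR = {"hot": "chaud", "happy": "content", "you": "vous"}
EN_TO_RU = {"hot": "горячий", "happy": "счастливый", "you": "вы"}


def is_round_trip_stable(word, forward, backward):
    """True if translating forward and then back returns the original word."""
    translated = forward.get(word)
    return translated is not None and backward.get(translated) == word


def translate_via_pivot(word, to_pivot, from_pivot):
    """Translate through an intermediate language; None if either step fails."""
    pivot_word = to_pivot.get(word)
    return from_pivot.get(pivot_word) if pivot_word is not None else None


# Keep only the French words that survive French -> English -> French.
stable = [w for w in FR_TO_EN if is_round_trip_stable(w, FR_TO_EN, EN_TO_FR)]
print(stable)

# French -> Russian by way of English: the tu/vous distinction is lost.
print(translate_via_pivot("tu", FR_TO_EN, EN_TO_RU))
print(translate_via_pivot("vous", FR_TO_EN, EN_TO_RU))
```

In this toy setup "heureux" is thrown out of the comparison only because the return trip picks a different synonym, and "tu" and "vous" both come out as the same Russian word because English has only "you". Which words count as "stable" depends on the matching behavior of the tool, not just on the languages.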


It’s unfortunate that this paper made it through to publication and it’s a shame that it was (positively) reported on by the New York Times. The paper should either be heavily edited or withdrawn. I’m doubtful that will happen.


Update: In the fourth paragraph of this post (the one which starts “On top of that…”), there was some type/token confusion concerning the corpora analyzed. I’ve made some minor edits to it to clear things up. Hat tip to Ben Zimmer on Twitter for pointing this out to me.

Update (March 17, 2015): I wrote a more detailed post (more references, less emoticons) on my problems with the article in question. You can find that here.
