Every kind of language has rules

Yes, even the ones that you don’t like. Here’s a quote from Spoken Soul by John Russell Rickford and Russell John Rickford (2000: 92). It’s perfect in expressing the point that all language varieties have rules:

Every human language studied to date – whether loved or hated, prestigious or not – has regularities or rules of this type [i.e. conventional and systematic ways of pronouncing, modifying, and combining words]. A moment’s reflection would show why this is so. Without regularities, a language variety could not be successfully acquired or used in everyday life, and this applies to Spoken Soul, or Ebonics, as much as to the “Received Pronunciation,” or “BBC English,” of the British upper crust. Characterizations of the former as careless or lazy, and of the latter as careful or refined, are subjective social and political evaluations that reflect prejudices and preconceptions about the people who usually speak each variety.

That is so good. The book that it appears in is about Black English (also called African American Vernacular English), so of course Rickford and Rickford had to address the (uninformed) idea that Black English is just “English without rules.” It’s not and it never was.

You don’t get to claim that some specific group(s) of people don’t have any rules to the way they speak. Because if you claim that, it will say more about your judgment of those people than it will about your assessment of their language. (Well, it will also say that you’re not very good at making assessments about language.)

Every language variety follows systematic rules. Every single one. Not some. Not most. All of them. They may not follow the same rules as each other, but they follow rules nonetheless.

Call them what they want

There was an op-ed by Abigail Shrier in the Wall Street Journal (it’s paywalled, but no need to click, I’ve copied the relevant bits below) on August 29, 2018. It’s about what a terrible thing it is to make public employees use the preferred pronouns of the public individuals that they are serving. Basically, it’s about how people should be able to call others “he” or “she” even if the person that they are talking to prefers a different pronoun, such as “they”.

I only want to point out two problems with this person’s argument. First, the writer says:

Typically, in America, when groups disagree, we leave them to employ the vocabularies that reflect their values. My “affirmative action” is your “racial preferences.” One person’s “fetus” is another’s “baby boy.” This is as it should be; an entire worldview is packed into the word “fetus.” Another is contained in the reference to one person as “them” or “they.” For those with a religious conviction that sex is both biological and binary, God’s purposeful creation, denial of this involves sacrilege no less than bowing to idols in the town square. When the state compels such denial among religious people, it clobbers the Constitution’s guarantee of free exercise of religion, lending government power to a contemporary variant on forced conversion.

But that’s not how it works. If you work for the government and you want to use slurs to refer to people, too bad. You can’t do that. I’m sure Richard Spencer (the head racist du jour) or David Duke (your parents’ racist du jour) would argue that it is part of their “worldview” to call black and brown and gay people all the horrible things that they call them. But fuck that nonsense. We don’t let them use the words that they want. We shun them for it. And if they work for the government, we penalize them for it (yeah, I know, things are pretty bad right now, but if you’re arguing that the racists currently in the US government should be allowed to keep on being racist, then you’re wrong).

Second, the writer backs up a linguistic argument* by referencing Locke. Philip Locke, you ask? The linguist who co-wrote English Grammar: A University Course? Haha. No. John Locke, the [checks notes] philosopher from the [checks notes again] 17th century. I wonder if anything has changed in linguistics since then. Guess not!

Here’s what it boils down to: you don’t get to call anyone anything you want without any repercussions. Sorry! (not sorry) Can’t bring yourself to use a person’s preferred pronoun because of your bigoted worldview? Change your worldview. Or just call them by their name FFS. This isn’t that difficult and you don’t need to write an op-ed about it, Abigail.

Ok, one final point. The writer says:

In most contexts, I would have no problem addressing others in any manner they chose.

That sounds an awful lot like “I’m not a racist, but…”

 

*Sure, the argument is about culture and worldviews and society – but wrapped up in all of that is language. And the article is specifically about words.

Wonder Woman speaks a creole

So, what language does Wonder Woman speak as an L1? What’s the language she spoke growing up on Paradise Island, aka Themyscira? That’s a good question and The World’s Greatest Detective™ is on the case.

Patriotic grammar scolds puh-lease

A recent article (blog post?) by Mary Wilson in Slate discusses the language used by the Russian trolls who were indicted for subverting the 2016 US presidential election. But perhaps unsurprisingly in an online article about grammar, the writer gets grammar totally wrong. Let’s take a look at the grammar “mistakes” that the writer points out.

One political ad placed online by the Russians apparently read, “Hillary is a Satan, and her crimes and lies had proved just how evil she is.” Just a Satan, not the? Is there a class of Satans of which Hillary was just one example? If so, why capitalize the S? [italics original]

1. “a Satan”. Fine, but Mary Wilson suggests using “the Satan”. Sorry, I meant to write the Mary Wilson suggests using “the Satan”. See how weird that sounds? That’s because proper nouns do not usually take any articles. In fact, adding the definite article is what would make this construction seem like there is a class of satans. Compare: That’s not the Satan I was referring to. Maybe that’s what Wilson was going for, but I doubt it.

In one email to a Trump campaign official, a disguised Russian agent reportedly wrote: “We gained a huge lot of followers and decided to somehow help Mr. Trump get elected.” Is a huge lot a Walmart-size amount? Costco? Not to mention the awkwardly deployed somehow.

2. I agree huge lot is not a common construction, but what is grammatically wrong with it? Not to mention “the awkwardly deployed somehow” has nothing to do with grammar.

As noted in the Washington Post last year, “A revealing characteristic of the Russian language, the absence of the definite and indefinite article, is evident in statements such as ‘out of cemetery’ and ‘burqa is a security risk.’” But, the article goes on to say, these mistakes are harder to take notice of given how sloppily written the average social media discourse is.

3. This whole paragraph. A revealing characteristic of the Russian language is the Russian language. The sentence should read “a revealing characteristic of English mistakes made by people whose L1 does not have articles is the misuse of articles in English. Russian is one such language, but there are thousands more.” This one is also on the Washington Post. The next sentence describes social media discourse as “sloppily written”. This is a bunch of shit. Language written online isn’t supposed to follow standard English norms. That’s part of what makes it funner than standard written English. People know that they don’t have to follow the rules of standard English when they write online, so they don’t. But somehow – somehow! – they are still understood. Could it be that the rules of standard English aren’t as important to clarity and understanding as grammar scolds would have us believe?

The Mary Wilson tells us that these grammar “mistakes” imply “that we were wrong to ever let it become uncool to fixate on bad grammar and slack syntax, no matter what the venue”. If it’s uncool to fixate on bad grammar, that’s because many of the grammar scolds don’t know what the hell they’re talking about. They’ve commandeered the word grammar to mean “any stylistic feature that I internalized in high school, in either speech or writing, and have decided to apply system-wide across the language”. It’s a catch-all condemnation for people who want to point out their superiority. Don’t believe the hype.

Wilson ends the post by saying that paying attention to sentence fragments and dangling participles is “patriotic”. I wonder why she didn’t mention sentence fragments and dangling participles in her scolding of the Russian trolls. Is it because sentence fragments and dangling participles are not part of grammar? It is.

How do I like this language stuff?

A funny story about the Labov book, Dialect Diversity in America, which I just reviewed. When my parents were visiting last summer, my mom was looking through my bookshelf for something to read. I first gave her The 1% and the Rest of Us (or maybe Nickel and Dimed) and she burned right through it because she’s even more of a socialist than I am. Then I gave her Labov’s book. I know she has a passing interest in language and that she has read other linguistics books aimed at the general public. But while I was in the other room, and she was sitting at the table with my wife, this is what I hear:

My mom: I don’t know how Joseph likes this stuff. Phonetics…?

My wife: Yeah, but he loves it.

My mom: Yes, he does. It’s tough to get through though.

Me: I CAN HEAR YOU!

My wife and my mom: We know!

🙂

My mom read the whole book and she ended up enjoying the final chapters – which deal with history, politics, and social change – more than she did the earlier ones. She couldn’t stop talking about those chapters. I was like, “But Mom, what about that (ING) variable and that Northern Cities Shift?!”

Linguistics is for shimonrikbliks

This morning, my six-year-old decided he would come up with a new word. The word he invented was shimonrikblik [ ʃimoʊnɹɪkblɪk ]. I asked him what it meant and he thought for a while and said, “It’s another word for linguistics.” I love him. Then he told me I have to tell all my “linguistics people”, so I put it out on Twitter.

https://twitter.com/EvilJoeMcVeigh/status/964084153560989696

I started thinking about the definition he gave shimonrikblik. We already have a word for “linguistics” (hint: it’s “linguistics”). But language being what it is, this does not mean that we can’t have two words for the same thing. Many stuffy and prescriptive style guides will tell you that you shouldn’t use neologisms (new words) for things that we already have words for. Fortunately, few people read these guides and even fewer follow this particular piece of advice. What the guides usually forget to mention is that different words for the same concept can be used in different variations and registers of one language. For example:

soccer and football

regardless and irregardless

you (plural) and y’all, youse, yinz, etc.

The first of these is pretty much a difference between national variations: UK people use football, US and Australian(?) people use soccer, but there are variations within these countries. The second is a difference in register, with regardless being used in formal registers (and edited writing) and irregardless being used in informal registers (and casual spoken language). The last of these is again a difference in register, probably again mostly between formal/informal speech and writing, but this time there are many words used to address a group of people directly with a pronoun. Your choice of pronoun will depend on your regional dialect and the situation. You probably have at least two of these pronouns in your vocabulary. And I’m just talking about English. Other languages will do things differently.

For example, here’s a really interesting thing about synonyms in some of the languages of Australia.

In most [Aboriginal societies of Australia], an individual’s name would not be used after their death. Furthermore, in many of them, words which sounded similar to that individual’s name were also prohibited. This practice would clearly present many inconveniences if there were not some way of replacing the banned vocabulary. The usual practice, resting on the widespread multilingualism that was a standard feature of traditional Aboriginal society in Australia, was to adopt the translational equivalent of the prohibited word from a neighboring language, and to use it until the old word became reusable (Introducing Semantics by Nick Riemer, 2010, p. 154).

The words above are just what I was able to come up with off the top of my head. There are many more and the degree to which a pair of words is synonymous will depend on things such as context and senses. For example, football is not entirely synonymous with soccer because there is a sense of football which means the game that is played in America (FLY EAGLES FLY!!!). But maybe shimonrikblik has a chance of taking off? It’s the word children use for linguistics? It’s how linguists casually refer to their (awesome) field? It’s linguistics, but with a child-like innocence?

What kind of dialect do you drive?

On the Vocal Fries podcast, Professor Carmen Fought made a wonderful analogy about accents. Prof. Fought said:

Everybody who speaks a language speaks a dialect of that language. So you speak a dialect, I speak a dialect; a dialect is not a bad thing, it’s something you can’t help. It’s like the make and model of a car: like, you have a Honda, but then it has to have a model like a Civic or an Accord. You can’t just say, “Oh no, no, no, I just have a Honda. It doesn’t have a model.” It’s the same thing. You can’t say “I speak a language. I don’t speak a dialect.” No. Everyone speaks a dialect.

I really like this analogy and I’m going to use it in the classroom. You should go listen to the whole episode (and all the other episodes!) here: https://vocalfriespod.fireside.fm/9. The episode’s topic is the Chicano English dialect. The analogy comes at about the 14:30 mark.

What it really sounds like to be American: A response to NPR’s Code Switch

NPR’s Code Switch did an interview about language a few months ago and it stayed on my mind because of how bad it was. I gave it a re-listen and I’d like to point out just why it’s so bad. You can listen to the episode below. It’s episode 42 and it’s called “Not-So-Simple Questions From Code Switch Listeners”. The interview in question starts at the 14:47 mark. The hosts, Gene Demby and Shereen Marisol Meraji, talk to Brent Blair about what it sounds like to be American. I couldn’t find a transcript of the interview, so I made my own, which you can find here. I’ll summarize Blair’s points below and briefly point out why they are wrong. The linguistics behind each of the topics that I discuss below is complex, but I will try to keep things simple in order to keep things short.

1. We understand this quote unquote “American dialect” or “Received American Pronunciation” based on culture and media: what sells.

No, we don’t. We (I mean linguists, people who study dialects) understand American dialects (plural) based on how the dialects sound. Non-linguists (and linguists when they’re not studying dialects) understand dialects through an array of socio-economic and linguistic factors.

“Received American Pronunciation” is not a thing. Blair is mixing up General American and Received Pronunciation, the accents with the highest prestige in the US and the UK, respectively. Many national newscasters in the US use General American on air (for example, Brian Williams). In the UK, Received Pronunciation is used by the Royal Family and members of parliament (with exceptions, of course). Mixing up the names of these two dialects is such an incredibly basic mistake that it’s hard to believe someone would make it. It’s like someone talking about the Boston Yankees baseball team. Or the band Led Sabbath. Or President Abraham E. Lee. The term General American is not without its problems.

2. What we understand as the American dialect comes from the West Coast, specifically Hollywood, and what Hollywood has considered the standard American dialect. This dialect is “vanilla” – its features do not include “twisty or harsh R sounds or twangy stuff or dropped AH” (quotes from Blair).

It’s probably not surprising that a theater professor would think that Hollywood is responsible for our thoughts on American dialects. Blair is almost correct on this – the dialect used in many popular movies is indeed General American. It doesn’t come from Hollywood, though. The dialect known as General American comes from the eastern part of the US, and it is often considered the dialect of the Midwestern region of the United States, not California. General American is believed to not have any regional or ethnic features, but obviously this is nonsense. It is a mish-mash of various dialects. It’s also (as far as I can tell) not really used in dialect studies anymore.

Map of the dialects of North America. From The Atlas of North American English by Labov, Ash and Boberg (2006; Map 11.15).

The terms “vanilla”, “twisty”, “harsh R”, “twangy”, and “dropped AH” are not used in dialect studies. These terms are problematic. For example, the dialect that Blair is calling standard, the one from Hollywood, uses an R sound. This is one of the ways that linguists describe dialects: whether they include a post-vocalic R or not. Linguists use the term rhotic to describe dialects which pronounce the R when it comes after a vowel, and the term non-rhotic to describe dialects which do not pronounce post-vocalic Rs. The Boston dialect is classically non-rhotic, with Hahvahd Yahd (Harvard Yard) being a common term used by people imitating the dialect. (Notice that the Boston dialect doesn’t drop all of its Rs, just the ones which come after a vowel and before a consonant. No one in Boston goes to watch the Pat_iots or B_uins play.) So, do rhotic dialects have “harsh R sounds”? I don’t know because I don’t know what the hell that means. What does “twangy” mean? What dialect sounds “twangy”? Does Nelly sound “twangy” (he’s from St. Louis)? Does Taylor Swift (she’s from eastern Pennsylvania)? Can I say that this whole interview sounds “twangy” or should I use the more technical term: shitty?

3. Regionalisms in dialects are disappearing rapidly. Today a person from Atlanta, Georgia, sounds like a person from California. You can’t tell the difference between people from Houston, Chicago and New York. On the contrary, dialects in rural areas are still diverse.

Blair couldn’t be more wrong about this. Literally the first page of William Labov’s Dialect Diversity in America says “People tend to believe that dialect differences in American English are disappearing, especially given our exposure to a fairly uniform broadcast standard in the mass media. One can find this point of view in almost any discussion of American dialects […] This overwhelming common opinion is simply and jarringly wrong.” THE FIRST GODDAMN PAGE. Of a book that is sure to turn up in any Amazon or Google search on dialects in America. There is no way that Blair’s name showed up in a Google search of dialects in America.

Even though the Code Switch hosts didn’t need to read past the second page of Labov’s book to get better info than Blair gave them, if they had made it to page 35, they would have read “The dialects of Chicago, Philadelphia, Pittsburgh, and Los Angeles are now more different from each other than they were 50 or 100 years ago […] On the other hand, dialects of many smaller cities have receded in favor of the new regional patterns.” Again, exactly the opposite of what Blair told them. Labov also does something which Blair does not: he backs up his claims with (decades of) research. I guess they do linguistics differently in the field of theater studies.

As if that wasn’t enough, here’s a story from NPR about dialects NOT disappearing!

4. Globalization, commercialism, and our careers have made us say “We all want to sound the same”.

K.

5. This “vanilla” Californian dialect, or this blending of dialects, and/or the disappearance of regionalisms is not due to class or race, but access and power. (It’s hard to tell what they are talking about here. They use the term “placeless”.)

Things kind of break down around point 5. Blair has dug himself into a hole and he can’t get out. He talks about how people of color are only allowed to use the Vanilla-fornian dialect based on the culture that is employing them and their relationship to systems of power, but it is unclear what he means and he is unable to explain. He only offers an immediate anecdote – the interviewer Meraji is able to say “Latino” with a Puerto Rican accent on NPR, so maybe she would allow herself to use more Spanish on air in the future. But Spanish isn’t a dialect. Meraji would allow herself to speak Spanish on NPR if she knew her audience would understand her. Blair wraps it all up with something truly bizarre when he says, “So for me, when we’re accent stereotyping, it just means we haven’t fallen in love enough with that community to understand its diversity and its complexity”. I don’t know what the hell this guy is talking about.

Pointing fingers

So who’s at fault here? I think partial blame falls on both sides.

First, Blair should be blamed for not saying no to the interview. If NPR called me up and asked me to talk about theater studies, I would say no. Because I’m not a theater scholar or professional. If someone called you up and said “Hey, we want to talk about theoretical mathematics on the radio,” would you say “Sure! I took math in high school. Let’s do this.”? No, of course you wouldn’t. But they called Blair up and he said, “Ummmm, I speak a language. Get me on the phone!” And then he proved that he knows about as much about language and dialects as I do about theater studies. It’s not that Blair can’t know anything about dialects in America, it’s that he showed he doesn’t know anything about dialects in America. If he had gotten everything right, I wouldn’t be writing this blog post.

Some of the blame also goes to the people at Code Switch though. If they wanted to talk about language and dialects, why didn’t they call a linguist? Why did they think calling a theater professor, who as far as I can tell has not written anything on language, would be ok? In an earlier part of this episode, the hosts have a discussion about the magical negro and they talk to Ebony Elizabeth Thomas, a professor and researcher who has published on representations of people of color in various media. Thomas is at the University of Pennsylvania, the same university as Labov, who I quoted above. She literally could have transferred them over to his office. Or they could have talked to Walt Wolfram or Natalie Schilling or John Baugh. Any of these people would have been far better than Blair.

Ok, I’ve been pretty hard on everyone in this interview. You may be thinking, jeez, this guy just doesn’t like it when people talk about language. That’s not the case. I don’t like it when prominent news organizations talk about language and get it so wrong (I see you, The New Yorker). If you want to hear a really great interview on language and linguistics, go listen to this Top of Mind interview (download it here). The host, Julie Rose, and the guests talk about filler words (um, uh, you know, etc.), which is – like dialects – a linguistic topic with a divide between what the public thinks and what linguists have discovered. To discuss this topic, the host invited two linguists who have researched filler words, Alexandra D’Arcy and Jena Barchas-Lichtenstein. I hope other interviewers listen to this and learn how to discuss language on air.

If you are interested in learning more about dialects in America and/or dialect discrimination, follow the links behind the researchers’ names in the previous two paragraphs. Most of them have written books and articles aimed at the general public. Walt Wolfram even has a movie about African American speech coming out and it sounds amazing. I’m not saying that all of the things you will read are going to be positive – discrimination based on language happens and it is terrible. But the research put out by these and other linguists is fascinating and it can actually do what the NPR Code Switch interview attempted to do: make you more informed about language.

Hat tip to Nicole Holliday on Twitter for pointing me to this Code Switch episode. Holliday would also have been good for this interview.

Update 14 June 2017:

Almost immediately after I posted this article and shared it on Twitter, Gene Demby reached out. Gene is one of the hosts of NPR’s Code Switch. According to him, this episode “was the source of much consternation”. Gene wanted to talk to a linguist but was overruled by an editor. He has also said that Code Switch will do better in the future and that they have an episode about African American Vernacular English (AAVE) coming up. I’d like to thank Gene for clearing things up and I look forward to that episode.

Also related to this post, Kevin Calcamp reached out to say that Blair’s views are not representative of the study of linguistics in theater and performance studies. Kevin says that theater/performance scholars have a good understanding of linguistics. I believe him. He also pointed out the complicated nature and the various ways of incorporating dialects into theater/performance studies (follow the tweet below to see more). Thanks, Kevin, for explaining things.

Book Review: American English Compendium by Marv Rubinstein

As a dictionary of English vocabulary and phrases, the American English Compendium by Marv Rubinstein is satisfactory. It is 500 pages long so it covers a lot of ground. As a book of American English or Americanisms, this book is not what it seems. A brief glance at any of the pages will make you question if the entries really are words or phrases that are exclusive to American English. And a comparison to another source will most likely show that they are not. As a commentary on language, however, this book is terrible.

Cover of American English Compendium by Marv Rubinstein. Published by Rowman & Littlefield. Cover design by Neil Cotterill.

The problems start on the first page of Chapter 1. The author defends the use of the term American English by proclaiming it is better than British English:

Dynamic. Versatile. Imaginative. Capable of capturing fine nuances. All these terms can truthfully be used to describe the American language. “Don’t you mean the ‘English language’?” some readers may ask. No, I mean the American language. Over many years, American English has vastly expanded and changed, a transmutation that has left it only loosely connected to its mother tongue, British English. (p. 3)

Although no one would (or should) argue that American English is a term that needs to be defended, the imaginary readers in this passage come off as more knowledgeable about language than the author. Are we really to believe American English is the only variation of English that is “dynamic” or “imaginative” or “capable of capturing fine nuances”? The problem gets compounded when the author recognizes the influence of American English in England, but seems to suggest that the reverse is not happening:

[W]hile there are numerous localisms [in countries where English is the primary language], more and more the terminology, idioms, slang, and colloquialisms smack of American English. Even in England this is slowly but surely happening. (p. 3)

And it only gets stranger from there. On the next page we are told:

Things have changed so much, and the use of American English in international communications has grown so much, one can now safely say that most English speakers use (to a greater or lesser degree) Americanized English – that is, the American language. And rightly so. The American language is so much richer and more adventurous. British English never stood a chance. (p. 4, emphasis mine)

Excuse me, Mr. Rubinstein, but H. G. Wells, J. K. Rowling, Grant Morrison, Agatha Christie and a thousand other British writers would like a word.

After this “proof” that ‘Murican English is better than British English, readers are given a “microcosm of what is happening” (p. 4) in the world. Rubinstein relates a story from an article by New York Times columnist and economist Thomas Friedman about how a senior Moroccan official is sending his kids to an American school even though he was educated in a French school. Rubinstein uses this story to claim that

There are now several American schools in Casablanca, each with a long waiting list. In addition, English (primarily American English) courses are springing up all over that country. If this is happening in Morocco, a country with long-lasting French connections and traditions, it is undoubtedly happening everywhere. The American language is becoming ubiquitous. (p. 5)

But it needs to be noted that Friedman does not claim that these English-language schools which are supposedly popping up all over Casablanca are teaching American English. Nor are readers given any proof that Casablanca is an example of what is happening around the world. I am very hesitant to believe it is. While it’s a cute story, this kind of claim needs to be backed up with evidence. How do we know that the English being taught in these schools is strictly British or American or some variation of English as an international language? We have to take Rubinstein’s word for it, but as we have seen with his dismissal of British English, he is not to be trusted when it comes to linguistics commentary.

Further down the page, in a section titled The Richness of the American Language, Rubinstein claims that “much of the richness of the American language lies in the fact that it has absorbed words and expressions from at least fifty other languages.” (p. 5) He lists some examples, but completely fails to acknowledge the fact that many of them, such as brogue and orangutan and typhoon, were originally borrowed into British English and then used by Americans.

Rubinstein then presumes readers will ask how the American language differs from other languages, which obviously also use foreign words and phrases. But the answer given is just as confused as the question. The author states that “there is no question that American English has been like a sponge absorbing and modifying words from many other languages” (p. 7) without realizing (or reporting) that this is true of English in general, not American English in particular. This is actually true of languages in general, although English does appear to be particularly greedy when it comes to borrowing words from other languages.

Later, there is a fairly reasonable, but short and undefinitive, discussion of “Black English” (African American Vernacular English). The section unfortunately ends with this quote: “Educated African Americans, of course, use standard American English” (pp. 11–12). Well, good for them.

Things get really bonkers in the section on compounding, which includes this howler:

Compound words exist in almost all languages, but never anywhere near the extent that they do in American English. […] during the last few decades, compounding has reached epidemic proportions. The vast majority of compound words are of relatively recent origin languagewise (p. 15)

This is nonsense. Does the author know how any other languages work? Finnish compounds words much more than English does. In fact, the syntax of Finnish demands it, unlike in English where compounding is very often a matter of style. And how do we know that the “vast majority” of compound words are not old? Let’s say “the last few decades” goes back to 1960. Do you really think words such as outcast, outdoors, outlook, output, overcome, overdoes, overdue, oversee, oddball, goofball, downfall, and downhill (all words supplied by the author) were made compound words after 1960?

Here are some other WTFs in this book along with the thoughts I had after reading them:

In general [the English speakers of Australia, Canada, Guyana, India, Ireland, New Zealand, and South Africa] all understand each other, but, as you have seen in the previous chapter on American and British English, there are substantial differences. The same can be said of the English used in the other countries listed above. With a few exceptions, Canadian English consists of a blending of American and British English, but the other English-speaking countries have all developed their own unique and distinctive expressions (including slang and colloquialisms). (p. 267)

Hahahahaha! Fuck you, Canada! Get your own expressions, eh!

English is an Anglo-Saxon language with roots in Latin, the Romance Languages, and German. [No.] This means that most, if not all, English words are variations of foreign words, and such words have legitimately entered the language. (p. 281)

WHAT THE FUCK DOES THIS MEAN?!

The Oxford English Dictionary prides itself on keeping up to date, and it does pretty well (but not perfect) with including new words in its latest editions. Unfortunately, libraries with limited budgets these days do not always have the most recent revisions. Your best bet for researching neologisms is probably the Internet – for example, Google. (p. 403)

Because the OED is the only dictionary in the world. I’ve said it before and I’ll say it again: In linguistics research there is only the OED and Google. It’s a wonder we get anything done.

Chairman has become chairperson and has been further reduced to chair. But many gender-based terms remain unresolved. While, for example, policeman easily becomes police officer, other words and phrases resist change. One almost invariably hears expressions such as “Everyone to their own taste.” [What? Who invariably hears this?] Grammatically incorrect [Nope!] but why risk offending potential female customers of advertised products? [Bitches be trippin’, amiright?] However, when a woman mans the controls of an aircraft, should the term be changed even though it denotes action, not identity? What should we now call a “manhole cover”? [Serious questions, you guys.] Note that we no longer have actresses; they all insist on being called actors. [How dare they?!] (p. 13)

Based on the claims about language alone, I would not recommend this book. I don’t know how someone writes a book about language and gets so much wrong. The word and phrase entries may be useful, but any online dictionary will have most if not all of them. Go there instead or get a proper reference book from a respected dictionary.

My Corpus Brings All the Boys to the Yard

In two recent papers, one by Kloumann et al. (2012) and the other by Dodds et al. (2015), a group of researchers created a corpus to study the positivity of the English language. I looked at some of the problems with those papers here and here. For this post, however, I want to focus on one of the registers in the authors’ corpus – song lyrics. There is a problem with taking language such as lyrics out of context and then judging them based on the positivity of the words in the songs. But first I need to briefly explain what the authors did.

In the two papers, the authors created a corpus based on books, New York Times articles, tweets and song lyrics. They then created a list of the 10,000 most common word types in their corpus and had voluntary respondents rate how positive or negative they felt the words were. They used this information to claim that human language overall (and English) is emotionally positive.

That’s the idea anyway, but song lyrics exist as part of a multimodal genre. There are lyrics and there is music. These two modalities operate simultaneously to convey a message or feeling. This is important for a couple of reasons. First, the other registers in the corpus do not work like song lyrics. Books and news articles are black text on a white background with few or no pictures. And tweets are not always multimodal – it’s possible to include a short video or picture in a tweet, but it’s not necessary (Side note: I would like to know how many tweets in the corpus included pictures and/or videos, but the authors do not report that information).

So if we were to do a linguistic analysis of an artist or a genre of music, we would create a corpus of the lyrics of that artist or genre. We could then study the topics that are brought up in the lyrics, or even common words and expressions (lexical bundles or n-grams) that are used by the artist(s). We could perhaps even look at how the writing style of the artist(s) changed over time.
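As a rough illustration of the n-gram idea, here is a minimal Python sketch that counts two-word sequences (bigrams) in a few lines of lyrics. The lines are taken from the chorus of “Butterfly”, which is quoted in full further down; the function name `ngrams` and the whitespace tokenization are my own assumptions, and a real study would load a whole corpus and tokenize more carefully.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Three lines from the chorus of "Butterfly", lowercased by hand.
lines = [
    "i'm sorry for what i did",
    "i did what my body told me to",
    "i didn't mean to do you harm",
]

counts = Counter()
for line in lines:
    # Count per line so that no bigram spans a line break.
    counts.update(ngrams(line.split(), 2))

# Show the most frequent bigram and how often it occurs.
print(counts.most_common(1))
```

Counting line by line is a design choice: it keeps the end of one line from pairing up with the start of the next, which would create sequences nobody actually sings as a unit.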

But if we wanted to perform an analysis of the positivity of the songs in our corpus, we would need to incorporate the music. The lyrics and music go hand in hand – without the music, you only have poetry. To see what I mean, take a look at the following word list. Do the words in this list look particularly positive or negative to you?

a

ain’t

all

and

as

away

back

bitch

body

breast

but

butterfly

can

can’t

caught

chasing

comin’

days

did

didn’t

do

dog

down

everytime

fairy

fantasy

for

ghost

guess

had

hand

harm

her

his

i

i’m

if

in

it

looked

lovely

jar

makes

mason

life

live

maybe

me

mean

momma’s

more

my

need

nest

never

no

of

on

outside

pet

pin

real

return

robin

scent

she

sighing

slips

smell

sorry

that

the

then

think

to

today

told

up

want

wash

went

what

when

with

withered

woke

would

yesterday

you

you’re

your

If we combine these words as Rivers Cuomo did in his song “Butterfly”, they average out to a positive score of 5.23. Here are the lyrics to that song.

Yesterday I went outside
With my momma’s mason jar
Caught a lovely Butterfly
When I woke up today
And looked in on my fairy pet
She had withered all away
No more sighing in her breast

I’m sorry for what I did
I did what my body told me to
I didn’t mean to do you harm
But everytime I pin down what I think I want
it slips away – the ghost slips away

I smell you on my hand for days
I can’t wash away your scent
If I’m a dog then you’re a bitch
I guess you’re as real as me
Maybe I can live with that
Maybe I need fantasy
A life of chasing Butterfly

I’m sorry for what I did
I did what my body told me to
I didn’t mean to do you harm
But everytime I pin down what I think I want
it slips away – the ghost slips away

I told you I would return
When the robin makes his nest
But I ain’t never comin’ back
I’m sorry, I’m sorry, I’m sorry

Does this look like a positive text to you? Does it look moderate, neither positive nor negative? I would say not. It seems negative to me, a sad song based on the opera Madame Butterfly, in which a man leaves his wife because he never really cared for her. When we take the music into consideration, the non-positivity of this song is clear.

[youtube https://www.youtube.com/watch?v=rCoGkMlfz9I]

Let’s take a look at another list. How does this one look?

above

absence

alive

an

animal

apart

are

away

become

brings

broke

can

closer

complicate

desecrate

down

drink

else

every

everything

existence

faith

feel

flawed

for

forest

from

fuck

get

god

got

hate

have

help

hive

honey

i

i’ve

inside

insides

is

isolation

it

it’s

knees

let

like

make

me

my

myself

no

of

off

only

penetrate

perfect

reason

scraped

sell

sex

smell

somebody

soul

stay

stomach

tear

that

the

thing

through

to

trees

violate

want

whole

within

works

you

your

Based on the ratings in the two papers, this list is slightly more positive, with an average happiness rating of 5.46. When the words were used by Trent Reznor, however, they expressed “a deeply personal meditation on self-hatred” (Huxley 1997: 179). Here are the lyrics for “Closer” by Nine Inch Nails:

You let me violate you
You let me desecrate you
You let me penetrate you
You let me complicate you

Help me
I broke apart my insides
Help me
I’ve got no soul to sell
Help me
The only thing that works for me
Help me get away from myself

I want to fuck you like an animal
I want to feel you from the inside
I want to fuck you like an animal
My whole existence is flawed
You get me closer to god

You can have my isolation
You can have the hate that it brings
You can have my absence of faith
You can have my everything

Help me
Tear down my reason
Help me
It’s your sex I can smell
Help me
You make me perfect
Help me become somebody else

I want to fuck you like an animal
I want to feel you from the inside
I want to fuck you like an animal
My whole existence is flawed
You get me closer to god

Through every forest above the trees
Within my stomach scraped off my knees
I drink the honey inside your hive
You are the reason I stay alive

As Reznor (the songwriter and lyricist) sees it, “Closer” is “supernegative and superhateful” and the song’s message is “I am a piece of shit and I am declaring that” (Huxley 1997: 179). You can see what he means when you listen to the song (minor NSFW warning for the imagery in the video). [1]

[vimeo 3554226 w=500 h=377]

Then again, meaning is relative. Tommy Lee has said that “Closer” is “the all-time fuck song. Those are pure fuck beats – Trent Reznor knew what he was doing. You can fuck to it, you can dance to it and you can break shit to it.” And Tommy Lee should know. He played in the studio for NIИ and he is arguably more famous for fucking than he is for playing drums.

Nevertheless, the problem with the positivity rating of songs keeps popping up. The song “Mad World” was a pop hit for Tears for Fears, then reinterpreted in a more somber tone by Gary Jules and Michael Andrews. But it is rated a positive 5.39. Gotye’s global hit about failed relationships, “Somebody That I Used To Know”, is rated a positive 5.33. The anti-war and protest ballad “Eve of Destruction”, made famous by Barry McGuire, rates just barely on the negative side at 4.93. I guess there should have been more depressing references besides bodies floating, funeral processions, and race riots if the songwriter really wanted to drive home the point.

For the song “Milkshake”, Kelis has said that it “means whatever people want it to” and that the milkshake referred to in the song is “the thing that makes women special […] what gives us our confidence and what makes us exciting”. It is rated less positive than “Mad World” at 5.24. That makes me want to doubt the authors’ commitment to Sparkle Motion.

Another upbeat jam that the kids listen to is the Ramones’ “Blitzkrieg Bop”. This is the energetic and exciting anthem of punk rock. It’s rated a negative 4.82. I wonder if we should even look at “Pinhead”.

Then there’s the old American folk classic “Where Did You Sleep Last Night”, which Nirvana performed a haunting version of on their album MTV Unplugged in New York. The song (also known as “In the Pines” and “Black Girl”) was first made famous by Lead Belly and it includes such catchy lines as

My girl, my girl, don’t lie to me
Tell me where did you sleep last night
In the pines, in the pines
Where the sun don’t ever shine
I would shiver the whole night through

And

Her husband was a hard working man
Just about a mile from here
His head was found in a driving wheel
But his body never was found

This song is rated a positive 5.24. I don’t know about you, but neither the Lead Belly version nor the Nirvana cover would give me that impression.

Even Pharrell Williams’ hit song “Happy” rates only 5.70. That’s a song so goddamn positive that it’s called “Happy”. But it’s only 0.03 points more positive than Eric Clapton’s “Tears in Heaven”, which is a song about the death of Clapton’s four-year-old son. Harry Chapin’s “Cat’s in the Cradle” was voted the fourth saddest song of all time by readers of Rolling Stone but it’s rated 5.55, while Willie Nelson’s “Always on My Mind” rates 5.63. So they are both sadder than “Happy”, but not by much. How many lyrics must a man research, before his corpus is questioned?

Corpus linguistics is not just gathering a bunch of words and calling it a day. The fact that the same “word” can have several meanings (known as polysemy) is a major feature of language. So before you ask people to rate a word’s positivity, you will want to make sure they at least know which meaning is being referred to. On top of that, words do not work in isolation. Spacing is an arbitrary construct in written language (remember that song lyrics are mostly heard, not read). The back used in the Ramones’ lines “Piling in the back seat” and “Pulsating to the back beat” is not about a body part. The Weezer song “Butterfly” uses the word mason, but it’s part of the compound noun mason jar, not a reference to a bricklayer. Words are also conditioned by the words around them. A word like eve may normally be considered positive as it brings to mind Christmas Eve and New Year’s Eve, but when used in a phrase like “the eve of destruction” our judgment of it is likely to change. In the corpus under discussion here, eat is rated 7.04, but that doesn’t consider what’s being eaten and so cannot account for lines like “Eat your next door neighbor” (from “Eve of Destruction”).
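To show what treating compounds as single units could look like, here is a small Python sketch that joins known two-word units before any rating happens. This is not what the authors of the two papers did, and it is nowhere near a full fix: the `MULTIWORD` set and the function name `merge_multiword` are hypothetical, the list only covers examples mentioned in this post, and it does nothing about polysemy or about phrases like “the eve of destruction”.

```python
# Hypothetical list of two-word units, limited to examples from this post.
# A real list would have to come from a lexicon or from manual annotation.
MULTIWORD = {("mason", "jar"), ("back", "seat"), ("back", "beat")}

def merge_multiword(tokens, units=MULTIWORD):
    """Join adjacent tokens that form a known two-word unit into one token."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in units:
            # Treat the compound as one item, e.g. "mason jar".
            merged.append(" ".join(pair))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

print(merge_multiword("with my momma's mason jar".split()))
# ['with', 'my', "momma's", 'mason jar']
```

With the compound merged, a rater would be asked about mason jar rather than about mason on its own, which is at least the right question.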

We could go on and on like this. The point is that the authors of both of the papers didn’t do enough work with their data before drawing conclusions. And they didn’t consider that some of the language in their corpus is part of a multimodal genre where there are other things affecting the meaning of the language used (though technically no language use is devoid of context). Whether or not the lyrics of a song are “positive” or “negative”, the style of singing and the music that they are sung to will strongly affect a person’s interpretation of the lyrics’ meaning and emotion. That’s just the way that music works.

This doesn’t mean that any of these songs are positive or negative based on their rating; it means that the system used by the authors of the two papers to rate the positivity or negativity of language seems to be flawed. I would have guessed that a rating system which took words out of context would be fundamentally flawed, but viewing the ratings of the songs in this post is a good way to visualize that. The fact that the two papers were published in reputable journals and picked up by reputable publications, such as the Atlantic and the New York Times, only adds insult to injury for the field of linguistics.

You can see a table of the songs I looked at for this post below, and a spreadsheet with the ratings of the lyrics is here. I calculated the positivity ratings by averaging the scores for the word tokens in each song, rather than the types.
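In case the difference between tokens and types is unclear, here is a minimal Python sketch of the two ways of averaging. The scores in `ratings` are invented placeholders (they are not values from the papers), the example string is a toy, and the tokenizer and function names are my own assumptions, so read it as a sketch of the idea rather than the exact procedure behind the spreadsheet.

```python
import re

# Hypothetical happiness scores on the 1-9 scale. These are placeholders,
# NOT the ratings from Kloumann et al. (2012) or Dodds et al. (2015).
ratings = {"help": 7.0, "me": 6.0, "hate": 2.0}

def tokenize(text):
    """Lowercase the text and split it into words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def token_average(text, ratings):
    """Average over every rated token, so a repeated word counts each time."""
    scores = [ratings[w] for w in tokenize(text) if w in ratings]
    return sum(scores) / len(scores) if scores else None

def type_average(text, ratings):
    """Average over each distinct rated word (type) exactly once."""
    scores = [ratings[w] for w in set(tokenize(text)) if w in ratings]
    return sum(scores) / len(scores) if scores else None

toy_lyric = "Help me help me help me hate"
print(token_average(toy_lyric, ratings))  # repetition of "help me" raises this
print(type_average(toy_lyric, ratings))   # each word counted once
```

Averaging over tokens means a chorus that repeats the same few words weighs heavily on a song’s score, which seems closer to how a song is actually heard than counting each word once.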

(By the way, Tupac is rated 4.76. It’s a good thing his attitude was fuck it ‘cause motherfuckers love it.)

Song | Positivity score (1–9)
“Happy” by Pharrell Williams | 5.70
“Tears in Heaven” by Eric Clapton | 5.67
“Always on My Mind” by Willie Nelson | 5.63
“Cat’s in the Cradle” by Harry Chapin | 5.55
“Closer” by Nine Inch Nails | 5.46
“Mad World” by Gary Jules and Michael Andrews | 5.39
“Somebody That I Used to Know” by Gotye feat. Kimbra | 5.33
“Waitin’ for a Superman” by The Flaming Lips | 5.28
“Milkshake” by Kelis | 5.24
“Where Did You Sleep Last Night” by Nirvana | 5.24
“Butterfly” by Weezer | 5.23
“Eve of Destruction” by Barry McGuire | 4.93
“Blitzkrieg Bop” by The Ramones | 4.82

Footnotes

[1] Also, be aware that listening to these songs while watching their music videos has an effect on the way you interpret them.

References

Kloumann, Isabel M., Christopher M. Danforth, Kameron Decker Harris, Catherine A. Bliss, and Peter Sheridan Dodds. 2012. “Positivity of the English Language”. PLoS ONE. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029484

Dodds, Peter Sheridan, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, and Christopher M. Danforth. 2015. “Human language reveals a universal positivity bias”. PNAS 112:8. http://www.pnas.org/content/112/8/2389

Huxley, Martin. 1997. Nine Inch Nails. New York: St. Martin’s Griffin.