Archive for the ‘linguistics’ Category

NPR’s Code Switch did an interview about language a few months ago and it stayed on my mind because of how bad it was. I gave it a re-listen and I’d like to point out just why it’s so bad. You can listen to the episode below. It’s episode 42 and it’s called “Not-So-Simple Questions From Code Switch Listeners”. The interview in question starts at the 14:47 mark. The hosts, Gene Demby and Shereen Marisol Meraji, talk to Brent Blair about what it sounds like to be American. I couldn’t find a transcript of the interview, so I made my own, which you can find here. I’ll summarize Blair’s points below and briefly point out why they are wrong. The linguistics behind each of the topics that I discuss below is complex, but I will try to keep things simple in order to keep things short.

1. We understand this quote unquote “American dialect” or “Received American Pronunciation” based on culture and media: what sells.

No, we don’t. We (I mean linguists, people who study dialects) understand American dialects (plural) based on how the dialects sound. Non-linguists (and linguists when they’re not studying dialects) understand dialects through an array of socio-economic and linguistic factors.

“Received American Pronunciation” is not a thing. Blair is mixing up General American and Received Pronunciation, the accents with the highest prestige in the US and the UK, respectively. Many national newscasters in the US use General American on air (for example, Brian Williams). In the UK, Received Pronunciation is used by the Royal Family and members of parliament (with exceptions, of course). Mixing up the names of these two dialects is such a basic mistake that it’s hard to believe someone would make it. It’s like someone talking about the Boston Yankees baseball team. Or the band Led Sabbath. Or President Abraham E. Lee. That said, the term General American is not without its problems.

2. What we understand as the American dialect comes from the West Coast, specifically Hollywood, and what Hollywood has considered the standard American dialect. This dialect is “vanilla” – its features do not include “twisty or harsh R sounds or twangy stuff or dropped AH” (quotes from Blair).

It’s probably not surprising that a theater professor would think that Hollywood is responsible for our thoughts on American dialects. Blair is almost correct on this – the dialect used in many popular movies is indeed General American. It doesn’t come from Hollywood, though. The dialect known as General American comes from the eastern part of the US, and it is often considered the dialect of the Midwestern region of the United States, not California. General American is believed to not have any regional or ethnic features, but obviously this is nonsense. It is a mish-mash of various dialects. It’s also (as far as I can tell) not really used in dialect studies anymore.

Map of the dialects of North America. From The Atlas of North American English by Labov, Ash and Boberg (2006; Map 11.15).

The terms “vanilla”, “twisty”, “harsh R”, “twangy”, and “dropped AH” are not used in dialect studies. These terms are problematic. For example, the dialect that Blair is calling standard, the one from Hollywood, uses an R sound. This is one of the ways that linguists describe dialects: whether they include a post-vocalic R or not. Linguists use the terms rhotic to describe dialects which pronounce the R when it comes after a vowel, and non-rhotic to describe dialects which do not pronounce post-vocalic Rs. The Boston dialect is classically non-rhotic, with Hahvahd Yahd (Harvard Yard) being a common term used by people imitating the dialect (Notice that the Boston dialect doesn’t drop all of its Rs, just the ones which come after a vowel and before a consonant. No one in Boston goes to watch the Pat_iots or B_uins play). So, do rhotic dialects have “harsh R sounds”? I don’t know because I don’t know what the hell that means. What does “twangy” mean? What dialect sounds “twangy”? Does Nelly sound “Twangy” (he’s from St. Louis)? Does Taylor Swift (she’s from eastern Pennsylvania)? Can I say that this whole interview sounds “twangy” or should I use the more technical term: shitty?
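For the programmers in the audience, the non-rhotic rule described above can be sketched as a toy Python function. This is only an illustration under loud assumptions: it works on spelling rather than on actual sounds, the function name `drop_postvocalic_r` is made up for this example, and writing the dropped R as "h" just mimics the "Hahvahd Yahd" spelling used by people imitating the dialect.

```python
import re

# Toy, spelling-based approximation (an assumption, not a phonological model):
# an "r" is dropped only when it follows a vowel letter and comes before a
# consonant letter. "r" and "y" are left out of the consonant set to avoid
# mangling spellings like "carry" and "very".
POSTVOCALIC_R = re.compile(r"(?<=[aeiou])r(?=[b-df-hj-np-qs-tv-xz])", re.IGNORECASE)


def drop_postvocalic_r(text: str) -> str:
    """Replace each post-vocalic, pre-consonantal 'r' with 'h' (keeping case)."""
    return POSTVOCALIC_R.sub(lambda m: "H" if m.group().isupper() else "h", text)


if __name__ == "__main__":
    for phrase in ["Harvard Yard", "Patriots", "Bruins"]:
        print(phrase, "->", drop_postvocalic_r(phrase))
```

The Rs in Patriots and Bruins survive because they follow a consonant, which is the same point made above: the dialect does not drop every R, only the ones in a particular position.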

3. Regionalisms in dialects are disappearing rapidly. Today a person from Atlanta, Georgia, sounds like a person from California. You can’t tell the difference between people from Houston, Chicago and New York. On the contrary, dialects in rural areas are still diverse.

Blair couldn’t be more wrong about this. Literally the first page of William Labov’s Dialect Diversity in America says “People tend to believe that dialect differences in American English are disappearing, especially given our exposure to a fairly uniform broadcast standard in the mass media. One can find this point of view in almost any discussion of American dialects […] This overwhelming common opinion is simply and jarringly wrong.” THE FIRST GODDAMN PAGE. Of a book that is sure to turn up in any Amazon or Google search on dialects in America. There is no way that Blair’s name showed up in a Google search of dialects in America.

Even though the Code Switch hosts didn’t need to read past the second page of Labov’s book to get better info than Blair gave them, if they had made it to page 35, they would have read “The dialects of Chicago, Philadelphia, Pittsburgh, and Los Angeles are now more different from each other than they were 50 or 100 years ago […] On the other hand, dialects of many smaller cities have receded in favor of the new regional patterns.” Again, exactly the opposite of what Blair told them. Labov also does something which Blair does not: he backs up his claims with (decades of) research. I guess they do linguistics differently in the field of theater studies.

As if that wasn’t enough, here’s a story from NPR about dialects NOT disappearing!

4. Globalization, commercialism, and our careers have made us say “We all want to sound the same”.

K.

5. This “vanilla” Californian dialect, or this blending of dialects, and/or the disappearance of regionalisms is not due to class or race, but access and power. (It’s hard to tell what they are talking about here. They use the term “placeless”.)

Things kind of break down around point 5. Blair has dug himself into a hole and he can’t get out. He talks about how people of color are only allowed to use the Vanilla-fornian dialect based on the culture that is employing them and their relationship to systems of power, but it is unclear what he means and he is unable to explain. He only offers an immediate anecdote – the interviewer Meraji is able to say “Latino” with a Puerto Rican accent on NPR, so maybe she would allow herself to use more Spanish on air in the future. But Spanish isn’t a dialect. Meraji would allow herself to speak Spanish on NPR if she knew her audience would understand her. Blair wraps it all up with something truly bizarre when he says, “So for me, when we’re accent stereotyping, it just means we haven’t fallen in love enough with that community to understand its diversity and its complexity”. I don’t know what the hell this guy is talking about.

Pointing fingers

So who’s at fault here? I think partial blame falls on both sides.

First, Blair should be blamed for not saying no to the interview. If NPR called me up and asked me to talk about theater studies, I would say no. Because I’m not a theater scholar or professional. If someone called you up and said “Hey, we want to talk about theoretical mathematics on the radio,” would you say “Sure! I took math in high school. Let’s do this.”? No, of course you wouldn’t. But they called Blair up and he said, “Ummmm, I speak a language. Get me on the phone!” And then he proved that he knows about as much about language and dialects as I do about theater studies. It’s not that Blair can’t know anything about dialects in America, it’s that he showed he doesn’t know anything about dialects in America. If he had gotten everything right, I wouldn’t be writing this blog post.

Some of the blame also goes to the people at Code Switch though. If they wanted to talk about language and dialects, why didn’t they call a linguist? Why did they think calling a theater professor, who as far as I can tell has not written anything on language, would be ok? In an earlier part of this episode, the hosts have a discussion about the magical negro and they talk to Ebony Elizabeth Thomas, a professor and researcher who has published on representations of people of color in various media. Thomas is at the University of Pennsylvania, the same university as Labov, who I quoted above. She literally could have transferred them over to his office. Or they could have talked to Walt Wolfram or Natalie Schilling or John Baugh. Any of these people would have been far better than Blair.

Ok, I’ve been pretty hard on everyone in this interview. You may be thinking, jeez, this guy just doesn’t like it when people talk about language. That’s not the case. I don’t like it when prominent news organizations talk about language and get it so wrong (I see you, The New Yorker). If you want to hear a really great interview on language and linguistics, go listen to this Top of Mind interview (download it here). The host, Julie Rose, and the guests talk about filler words (um, uh, you know, etc.), which is – like dialects – a linguistic topic with a divide between what the public thinks and what linguists have discovered. To discuss this topic, the host invited two linguists who have researched filler words, Alexandra D’Arcy and Jena Barchas-Lichtenstein. I hope other interviewers listen to this and learn how to discuss language on air.

If you are interested in learning more about dialects in America and/or dialect discrimination, follow the links behind the researchers’ names in the previous two paragraphs. Most of them have written books and articles aimed at the general public. Walt Wolfram even has a movie about African American speech coming out and it sounds amazing. I’m not saying that all of the things you will read are going to be positive – discrimination based on language happens and it is terrible. But the research put out by these and other linguists is fascinating and it can actually do what the NPR Code Switch interview attempted to do: make you more informed about language.

Hat tip to Nicole Holliday on Twitter for pointing me to this Code Switch episode. Holliday would also have been good for this interview.

Update 14 June 2017:

Almost immediately after posting this article and sharing it on Twitter, Gene Demby reached out. Gene is one of the hosts of NPR’s Code Switch. According to him, this episode “was the source of much consternation”. Gene wanted to talk to a linguist but was overruled by an editor. He has also said that Code Switch will do better in the future and that they have an episode about African American Vernacular English (AAVE) coming up. I’d like to thank Gene for clearing things up and I look forward to that episode.

Also related to this post, Kevin Calcamp reached out to say that Blair’s views are not representative of the study of linguistics in theater and performance studies. Kevin says that theater/performance scholars have a good understanding of linguistics. I believe him. He also pointed out the complicated nature and the various ways of incorporating dialects into theater/performance studies (follow the tweet below to see more). Thanks, Kevin, for explaining things.

Read Full Post »

Mary Norris’s book Between You & Me: Confessions of a Comma Queen (2015, Norton) is part autobiography, part style guide. Norris has been an editor at The New Yorker magazine for many years and her voice can be heard through the text, which makes parts of this book an enjoyment to read, especially when she tells stories about her life. She says in the intro that her book is “for all of you who want to feel better about your grammar” (p. 14), which is an unfortunate dedication since the book goes off the rails when Norris discusses grammar and linguistics. In these sections, Norris doesn’t just make herself look bad, but she also ropes in the rest of the editorial staff at The New Yorker.

Between You & Me: Confessions of a Comma Queen by Mary Norris (2015, Norton)

Early on, Norris discusses the importance of dictionaries to editing. She also, however, walks right into a mine field when she discusses her and The New Yorker’s preference for a dictionary published in the 1930s over nearly all others:

If we cannot find something in the Little Red Web [Merriam-Webster’s Collegiate Dictionary 2003], our next resort is Webster’s New International Dictionary (Unabridged), Second Edition, which we call Web II. First published in 1934, it was the Great American Dictionary and is still an object of desire: 3,194 pages long, with leisurely definitions and detailed illustrations. It was supplanted in 1961 by Webster’s Third, whose editors, led by Philip Gove, caused a huge ruckus in the dictionary world by including commonly used words without warning people about which ones would betray their vulgar origins. (p. 18)

Norris is selling Gove and the other editors of Merriam-Webster’s short here. Gove actually wrote that “We must see to it that a mid-twentieth-century dictionary gives evidence of having been written by editors who lived in the twentieth century” (quote from The Story of Ain’t by David Skinner, p. 205) and what Gove did (besides dropping sick burns) was help systematize the way that dictionaries qualified words for their “vulgar” natures. Gove also saw to it that the quotes used to illustrate the meanings of words were neither archaic nor unnatural, i.e. contemporary quotes rather than contrived sentences written by the dictionary makers. But Gove’s actions caused a lot of uptight social commentators to get their knickers in a bunch, as Norris briefly explains:

On the publication of this dictionary, which we call Web 3, a seismic shift occurred between prescriptivists (who tell you what to do) and descriptivists (who describe what people say, without judging it). In March of 1962, The New Yorker, a bastion of prescriptivism, published an essay by Dwight MacDonald [who was not a linguist, nor a language scholar – JM] that attacked the dictionary and its linguistic principles: ‘The objection is not to recording the facts of actual usage. It is to failing to give the information that would enable the reader to decide which usage he wants to adopt.’ (p. 18)

It is no more surprising that Norris sticks by MacDonald’s essay than it is that MacDonald went to The New Yorker to voice his complaint. But romanticizing the fact that Norris and her fellow editors use a dictionary from the 1930s (Webster’s Second) over more modern ones doesn’t look prescriptivist, it looks downright foolish. Norris drives the point home:

Since the great dictionary war of the early sixties, there has been an institutional distrust of Web 3. It’s good for some scientific terms, we say, patronizingly. Its look is a lot cleaner than that of Web II. Lexicology aside, it is just not as beautiful. I would not haul a Web 3 home. You can even tell by the way it is abbreviated in our offices that it is less distinguished: Webster’s Second gets the Roman numeral, as if it were royalty, but Webster’s Third must make do with a plain old Arabic numeral. (p. 19)

This is nonsense. The editors at The New Yorker are prioritizing a dictionary from 1934 because it “enables the reader to decide which usage he wants to adopt”. Think about that for a second. Who in their right mind wants their writing to sound like it was published in 1934? The New Yorker is not a “bastion of prescriptivism”, it is an ancient ruin of unfounded notions about language.

MacDonald can maybe be excused for the incorrect ideas in his article. They were, after all, popular at the time. But Norris doesn’t get off so easy. She wrote her book in the 2010s, well after the ideas in MacDonald and W2 were shown to be incorrect. Think about what she is doing here. She is using a 50-year-old article with incorrect ideas about language to defend her use of an 80-year-old dictionary. If your doctor recommended that you start smoking Camels because a commercial in the 1950s said they activate your T-zone, you would find another doctor.

Later in the book, Norris visits the offices of Merriam-Webster and says “These people are having far too much fun to be lexicographers” (p. 29). This is perhaps true, and she might even believe it, but I doubt she likes any of the advice that the MW editors give online or in their videos.

Bad Grammar

Every chapter in Norris’s book starts with a personal story and moves into a topic of English grammar or style. In Chapter 2, titled “That witch!”, Norris discusses relative clauses. She gives some OK advice about how to distinguish whether the clause is restrictive or non-restrictive, but then makes some major mistakes on what to do after that:

If the phrase or clause introduced by a relative pronoun – “that” or “which” – is essential to the meaning of the sentence, “that” is preferred, and it is not separated from its antecedent by a comma. (p. 40)

I suppose Norris means that that is preferred in The New Yorker, but it sounds like she means that is preferred across the English language, which simply isn’t true. Anyone who has spent any time hanging out with the English language would know this. Perhaps she means that that is preferred by people (such as editors at The New Yorker?) who wish they could dictate which relative pronoun should be used in all cases across the English language. Norris then gives us a half-baked explanation of what’s going on with that and which in relative clauses:

If people are nervous, they sometimes use “which” when “that” would do. Politicians often say “which” instead of “that”, to sound important. A writer may say “which” instead of “that” – it’s no big deal. It would be much worse to say “that” instead of “which.” Apparently the British use “which” more and do not see anything wrong with it. Americans have agreed to use “that” when the clause is restrictive and to use “which,” set off with commas, when the clause is nonrestrictive. It works pretty well. (p. 41)

What? No. There is so much wrong with this paragraph. First, what the hell does Norris mean by the first two sentences? Is she a professional on spoken English now? The third sentence gives it away – writers don’t “say” things, they write things. But Norris doesn’t realize that she has blurred the line between spoken and written language so much that she’s erased it. This paragraph means that an admittedly prescriptivist editor of written language – who prefers a dictionary from 1934 – can’t tell the obvious difference between spoken and written English and that we are supposed to take for granted her claims about ALL spoken English, based on… something. Another thing that is wrong with this paragraph is that it is demonstrably wrong that Americans have “agreed to use ‘that’” with restrictive relative clauses. This was dictated by copy editors in the beginning of the 20th century! This hope/wish/desire to separate which and that comes from Fowler (1926), who wrote “The two kinds of relative clauses, to one of which that and the other to which which is appropriate, are the defining [restrictive] and the non-defining [non-restrictive]; and if writers would agree to regard that as the defining relative pronoun, and which as the non-defining, there would be much gain both in lucidity and in ease. Some there are who follow this principle now; but it would be idle to pretend that it is practice either of most or of the best writers.” (Fowler’s Dictionary of Modern English Usage, 4th ed., 2015, edited by J. Butterfield, p. 809) Even Fowler gave up on this that/which nonsense. You would think Norris would recognize this because of her preference for early 20th century English reference works. No one cares about this that/which distinction anymore, if they ever did. It wasn’t just the British who saw nothing wrong with using which in restrictive relative clauses. Americans have also never cared about this when they were speaking naturally*.

Norris also has a chapter on pronouns, in which she wastes four pages (pp. 60-63) blabbering about pronouns before we get to the point of the chapter, i.e. the (supposed) problem of English’s (supposed) lack of a gender-neutral third person singular pronoun. The chapter ends with a heartfelt and well written personal story about Norris having to switch the pronouns she used for a family member who transitioned. Norris quite deftly shows how personal our pronouns can be and this part of the chapter is definitely worth reading. What comes before it, however, are a bunch of pronoun howlers.

One of the stranger ones is when Norris claims that “There is only one documented instance of a gender-neutral pronoun springing from actual speech, and that is “yo,” which ‘spontaneously appeared in Baltimore city schools in the early-to-mid 2000s.’” (p. 66) What? Does Norris actually believe this? The research cited on yo is from Stotko and Troyer, but they do not claim that yo is the only documented instance of a gender-neutral pronoun springing from actual speech (Stotko, Elaine M. and Margaret Troyer. 2007. “A New gender-neutral pronoun in Baltimore, Maryland: A preliminary study”. American Speech 82(3): 262–279. https://dx.doi.org/10.1215/00031283-2007-012).

Then Norris drops the bomb:

I hate to say it, but the colloquial use of “their” when you mean “his or her” is just wrong. (p. 69)

Ugh, where to start? Literally right before this sentence, Norris said that having singular you and plural you is fine. But then she says that singular they is not because… reasons? Norris actually tries to claim that the epicene he would be invisible if we didn’t “make such a fuss” about it. Guess what? It isn’t and we do. Does Norris really think that the epicene he is only visible because people complain about it? She has it backwards. The epicene he is complained about because it is so damn visible. And are we really to believe that he would be invisible to Norris? She devoted an entire chapter in her book to pronouns. Also, singular they isn’t colloquial (although I’m willing to bet that the editors of The New Yorker have a different definition of the term “colloquial” – one from the 1930s perhaps). It has been used across all types of texts and registers and first appeared 800 years ago. (Wait, is it possible that singular they SPRINGS FROM ACTUAL SPEECH?! Omg you guys!!1!) Basically, if you have a problem with singular they, maybe it’s time to get over it. Or, if you’re going to complain about singular they, maybe you shouldn’t use it in your writing. That’s right, Norris uses singular they in this book:

A notice from the editor, William Shawn, went up on the bulletin board, saying that anyone whose work was not “essential” could go home. Nobody wanted to think they were not essential. (p. 11)

smh

The discussion of pronoun usage gets more convoluted after this. On the very next page (p. 70), after telling us that a writer was wrong for not using the epicene he, Norris says that a The New Yorker staff writer was correct in using singular they. So what the hell is going on here? I don’t know and I’m starting to not care.

Chapter 4 – “Between you and me”

This might be the most confusing chapter in terms of grammar. Norris writes:

The most important verb is the verb “to be” in all its glory: am, are, is, were, will be, has been. (p. 84)

So will be and has been are part of the verb BE? Uhh… how? And why isn’t being in that list, or (by Norris’ logic) have been? No one knows.

The rest of this chapter goes from bad to worse. Immediately after this quote, Norris discusses nouns, rather than noun phrases, even though she uses noun phrases rather than single-word nouns (such as copy editor and my plumber). In a later admission that there are several copulative verbs, Norris says that “It is because these verbs are copulative and not merely transitive that we say something ‘tastes good’ (an adjective), not ‘well’ (an adverb): the verb is throwing the meaning back onto the noun”. What does this mean? Norris is also incorrect when she says that “nouns are modified by adjectives, not adverbs”. Noun phrases are modified by other noun phrases (a no-frills airline, sign language) as well as adverb phrases (the then President, a through road). Those examples are from Downing & Locke (2006: 436), but from The New Yorker we have “Danny Hartzell backed a Budget rental truck up to a no-frills apartment building…” from a piece called “Empty Wallets” by George Packer in the July 25, 2011 issue, perhaps edited by Norris. But this isn’t even a matter of modification. In Norris’s example (“Something tastes good”), the adjective phrase good does not modify the noun phrase something, but rather functions as a complement in the sentence. Essentially, the subject (which may be a noun phrase or may be something else) requires a complement when a copulative verb is used. And there is no reason that adverb phrases cannot act as complements after copulative verbs (They’re off!, I am through with you, That is quite all right).

In the following paragraph, Norris writes “One might reasonably ask, if we can use the objective for the subjective, as in ‘It’s me again,’ why can’t we use the subjective for the objective?” But again this is confusing and it’s hard to tell whether Norris believes that me is the subject in her example sentence (hint: it’s not, it’s what some grammars call an extraposed subject, but I can see how Norris would be confused – The New Yorker has proven its ineptitude when it comes to describing sentences of this type. See Downing & Locke 2006: 47–48, 261).

In discussing grammar, Norris also tells stories about working at The New Yorker. It’s hard to describe how shocking some of these are, so I’ll let Norris tell it:

Lu Burke once ridiculed a new copy editor who had come from another publication for taking the hyphen out of “pan-fry.” “But it’s in [Webster’s dictionary],” the novice chirped. “What are you even looking in the dictionary for?” Lu said, and I wish there were a way of styling that sentence so that you could see it getting louder and more incredulous toward the end. She spoke it in a crescendo, like Ralph Kramden, on The Honeymooners, saying, “Because I’ve got a BIG MOUTH!” Without the hyphen, “panfry” looks like “pantry.” “Panfree!” Lu guffawed, and said it again. “Panfree!” The copy editor was just following the rules, but Lu said she had no “word sense.” Lu was especially scornful of unnecessary hyphens in adverbs like “feet first” and “head on.” Of course, “head on” is hyphenated as an adjective in front of a noun – “The editors met in a head-on collision” – but in context there is no way of misreading “The editors clashed head on in the hall.” The novice argued that “head on” was ambiguous without the hyphen. Lu was incredulous. “Head on what?” she howled, over and over, as if it were an uproarious punch line. Eventually, that copy editor went back to where she had come from. “It’s as if I tried to become a nun and failed,” she confided. It did sometimes feel as if we belonged to some strange cloistered order, the Sisters of the Holy Humility of Hyphens. (p. 116)

Some strange cloistered order? Jesus Christ, working at The New Yorker sounds fucking miserable. “Pan-fry” needs a hyphen because, what, the readers of The New Yorker are so fucking dumb that they would think it means “panfree”? Probably not, but what a great excuse for one of the editors to be a total dick to an employee, huh? Hahaha, good times!

Here is the sentence in question, from a 1977 issue of The New Yorker:

“It’s heartening to see that a restaurant in a national park is going to take the time to pan-fry some chicken,” I told Tom.

Whoa! Good thing that hyphen was there or I would’ve thought this guy was taking time to panfree some chicken and WHAT THE FUCK WHY WOULD I THINK THAT.

Incredibly, the hits keep on coming in the next paragraph:

The writer-editor Veronica Geng once physically restrained me from looking in the dictionary for the word “hairpiece,” because she was afraid that the dictionary would make it two words and that I would follow it blindly. As soon as she left the office, I did look it up, and it was two words, but I respected her word sense and left it alone. (p. 117)

Ok, now respect the word sense of writers who use(d) singular they.

And if you’re wondering why The New Yorker still writes “teen-ager”:

Not everyone at The New Yorker is devoted to the diaeresis [the two little dots that The New Yorker – and only The New Yorker – places over the word cooperate]. Some have wondered why it’s still hanging around. Style does change sometimes. […]

Lu Burke used to pester the style editor Hobie Weekes, who had been at the magazine since 1928, to get rid of the diaeresis. Like Mr. Hyphen, Lu was a modern independent-minded reader, and she didn’t need to have her vowels micromanaged. Once, in the elevator, Weekes seemed to be weakening. He told her he was on the verge of changing that style and would be sending out a memo soon. And then he died.

This was in 1978. No one has had the nerve to raise the subject since. (pp. 123–124)

Kee-rist, I’m surprised they don’t write “base-ball” and “to-morrow” and “bull-shit”.

A chapter about pencil sharpening. Seriously.

Chapter 10 (“Ballad of a Pencil Junkie”) is some sort of dime store pencil porn as Norris describes pencils in such detail that only an actual pencil would find it interesting. I kept thinking that I would rather have pencils in my eyes, but then I came across the best line in the entire book:

David Rees specializes in the artisanal sharpening of No. 2 pencils: for a fee (at first, it was fifteen dollars, but like everything else, the price of sharpening pencils has gone up), he will hand-sharpen your pencil and return it to you (along with the shavings), its point sheathed in vinyl tubing. (p. 182)

Dafuq?

Conclusion

The New Yorker hardly needs help in showing people that it has a very tenuous grasp of English grammar [links to LangLog and Arnold Zwicky]. (They demonstrate that in their pages whenever the topic of grammar comes up.) Apparently, decades of publishing some of the greatest writers has not helped anyone at the magazine to learn how English grammar works. Unfortunately, Norris’s book does nothing to help The New Yorker’s reputation when it comes to grammar. On top of that, some of the stories she tells about working at The New Yorker are pretty horrifying. If you are able to separate or skip over the discussions of grammar, this book may be enjoyable for you. It’s an easy read, but I couldn’t force myself to like it.

 

Footnotes:

* Not to mention Norris doesn’t even follow her own advice –

p. 15: “It is one of those words which defy the old “i before e except after c” rule”

p. 54: “The piece also had numbers in it – that is, numerals – which I instinctively didn’t touch”

And she quotes A. A. Milne doing it: “If the English language had been properly organized … there would be a word which meant both ‘he’ and ‘she’” (p. 64)

And Henry James: “Poor Catherine was conscious of her freshness; it gave her a feeling about the future which rather added to the weight upon her mind.” (p. 143)

And Mark Twain: “It was what I thought when I stood before ‘The Last Supper’ and heard men apostrophizing wonders and beauties and perfections which had faded out of the picture and gone a hundred years before they were born.” (pp. 147-48)

You could argue that these are all old/dead writers and that no one should write like that anymore, but again, The New Yorker magazine, as well as the author of Between You & Me, prefers to use a dictionary from 193fucking4.

Read Full Post »

The following is a sentence on an exam I gave my students this semester. It’s a lyric from the totally awesome band The Go-Go’s (who are too punk rock to care about using your lame apostrophes correctly). Read it and decide which part of speech you think sealed is: verb or adjective?

In the jealous games people play, our lips are sealed.

I first thought that sealed is clearly an adjective and that it functions as the subject complement of the sentence (a subject complement is an element required by copular verbs, such as be and seem, which does not encode a different kind of participant to the subject in the phrase in the way that an object does). But many of my students analyzed it as a verb. This calls for some weekend grammar research (while listening to the Go-Go’s of course)!

On the exam, students had to mark the function (subject, predicate, object, etc.) of each clause in the sentence. In the grammar that we’re using (English Grammar: A University Course, 2nd ed., 2006, by Downing and Locke), only verb phrases can be included in the predicate. This means that if sealed is a verb, the phrase consists of only a subject (Our lips) and a predicate (are sealed).

Two dictionaries list sealed as an adjective: the OED and Macmillan Dictionary. The OED’s citation which mirrors this construction is a bit out of date though. It comes from the 1611 printing of the King James Bible: And the vision of all is become vnto you, as the wordes of a booke that is sealed. Macmillan Dictionary only offers “a sealed box/bag/envelope” as an example. Four other dictionaries (Merriam-Webster’s, Dictionary.com, Oxford Learner’s Dictionary, and Oxford Dictionaries) do not list sealed as an adjective, only as a transitive verb (i.e. it needs an object). Strangely, Oxford Learner’s Dictionary has this example sentence under the second entry for seal as a verb:

The organs are kept in sealed plastic bags.

In this case, sealed is definitely an adjective modifying a noun (plastic bags). This must be an oversight by the editors. More important, though, is the fact that sealed in Our lips are sealed does not have an object. What gives?

Well, sealed is more of a participial adjective than anything else (some grammars use the terms verbal adjective or attributive verb). It’s an adjective that has been derived from a verb. Participial adjectives look like verbs but they function grammatically like adjectives. I know. Welcome to the Twilight Zone. These are the cases which really show that there are not sharp limits between the parts of speech, but rather very hazy boundaries. Sometimes it is easy to tell whether the word in question is a verb or an adjective. For example:

This is the sealed envelope that you mailed. = adjective

I sealed the envelope with a kiss. = verb

Other times – such as the one under discussion here – things are not so clear cut. Downing & Locke (p. 479) say that “past participles may often have either an adjectival or a verbal interpretation. In The flat was furnished, the participle [furnished] may be understood either as part of a passive verb form or as the adjectival subject complement of the copula was.” This means that sealed could be a passive verb that is simply missing its agent (the by-phrase). The agent is presumably missing because we know that the person who owns the lips is the one who seals them, so it would sound ridiculous to say Our lips are sealed by us (although maybe not as ridiculous as the similar phrase My lips are sealed by me).

I want to argue that sealed is definitely an adjective, but like so much else in linguistics, it is hard to be definite about this. The verb analysis works just as well and sealed might be semantically closer to a verb in that we can think about the sealing of lips as resulting from an action taken. If we compare it to Our lips are chapped there isn’t as clear an action present, except maybe the action of the weather. But I don’t like talking about verbs as action words.

For what it’s worth, 19 out of 25 people in my Twitter poll said that sealed is an adjective.

On the exam, I accepted both adjective/subject complement and verb/predicator. This made my students happy. Talking about sealed for 20 minutes in class did not make them so happy.

Read Full Post »

Abby Kaplan begins her book by explaining its two purposes. First, the book is meant for “debunking language myths” such as those about linguistic sex differences and text messaging. Second, Kaplan’s book is about “how to study language”, or to reveal insights on what linguists do (p. 2). This has my interest piqued. There is no shortage of downright nonsense about language in the news, social media, and bookstores and so Kaplan’s book, which is suited to combat that nonsense, is therefore a welcome addition to the shelf.

Consider Kaplan’s thesis for the book:

This book is about two things […] First, it is about popular beliefs about language: the conventional wisdom on topics from linguistic sex differences to the effects of text messaging. Sometimes, of course, popular opinion has things more or less right – but it’s more interesting to examine cases where “what everyone knows” is wrong, and so we will put a special focus on debunking language myths. […] Second, this book is about how to study language – not in the sense that it will train you to do linguistic analysis for yourself, but in the sense that it provides a glimpse of the kinds of things linguists do. (p. 2)

Kaplan’s thesis on “popular beliefs about language” vs. “how to study language” aims to strike a balance between what people think they know about language and how we (or linguists) can figure out what is really going on. Such a thesis may sound heavier than usual for a book aimed at the general public, but Kaplan’s writing makes this book very approachable. In fact, Kaplan’s goal for the book has me hoping that journalists will read it: “The goal is for you to become an informed consumer of social science research with an appreciation of how the scientific process works” (p. 2).

Kaplan picks up this theme of the gibberish that is published about language by claiming “The world is full of self-appointed experts who feel free to make pronouncements on language with little or no supporting evidence” (p. 3). She is certainly correct there. One of the problems that I don’t often see reported is that linguistics is a tricky subject. Everyone speaks a language and many people feel justified in making claims about language. This doesn’t happen with other scientific subjects. No one makes claims about mathematics because they took algebra in high school. But some people who had a strict English teacher, or who got straight A’s in English class, feel it is their right to pass judgement on what is the appropriate use of language and what is not. One of the first assignments that I give my first-year students is to have them write about their linguistic pet peeves because I want them to let go of those notions before they start to learn that studying the modern use of language is not like paleontologists studying a T-rex from its skeleton, but rather like studying a living T-rex up close and without tranquilizer darts. That said, there are people who feel comfortable having learned the “rules” of their language and who do not want to be told different. I’d like to think that there are more people who learned the “rules” but are willing to keep learning more, even though they did not pursue a degree in linguistics. Kaplan’s book is for them.

kaplan-cover

The cover of Women Talk More than Men: … And Other Myths about Language Explained.

I thoroughly enjoyed every chapter in this book, but I want to highlight a few that I thought were especially good.

Chapter 1 – “A dialect is a collection of mistakes”

Perhaps it’s good to start with a discussion of dialects (a topic that everyone seems to have an opinion on) and the Ebonics debate, an occurrence which received an incredible amount of input from people who are neither linguists nor language professionals, a.k.a. people who don’t know what they’re talking about. Kaplan quotes some people who say that African American English (AAE), also called Black English, is a way of speaking in which “you can say pretty much what you please, as long as you’re careful to throw in a lot of ‘bes’ and leave off final consonants” (p. 11). In my opinion, Kaplan is too easy on the writers who spout this nonsense (which is akin to the nonsense on the Urban Dictionary, a source that Kaplan also quotes), but the rest of the chapter is a detailed analysis of why non-standard dialects follow specific rules, just like Standard English does.

Kaplan offers a very good insight in this chapter. She writes:

There is one final point to be made here. Linguists argue that no variety of a language is linguistically superior to any other; every dialect of every language follows regular grammatical rules and is capable of fulfilling the communicative needs of its speakers. This is true even for languages and dialects that are widely thought to be crude or unsophisticated: as soon as linguists start studying what speakers actually do, we discover that these languages are just as rule-governed as any other.

But linguists also recognize that not all dialects of a given language are socially equal. Standard English is no better or worse than AAE in many social situations. Whether we like it or not, it’s a fact of life that a person who speaks Standard English will find it much easier to excel in the academic world or get certain kinds of jobs than a person who speaks only AAE. Thus, there are good pragmatic and ethical arguments for helping speakers of non-standard dialects learn Standard English too, while acknowledging that it’s only by historical accident that this particular variety is the prestigious one. (p. 20)

This is a point that you will not find in most books or articles about language that are written by non-language scholars. There is an idea (a very old idea) that the standard variety of the language is The One True Way™. Kaplan does a good job explaining that this idea, like the idea that a dialect is a collection of mistakes, is misleading. She also notes that it is “a simplification to talk about a single ‘Standard English’” (p. 11), since there are different standards in different English-speaking countries. There are also different standards among different genres of writing and speaking.

Chapter 5 – “Children have to be taught language”

Every chapter of Kaplan’s book starts with a myth about language in the title. Kaplan explains what is behind the myth and gives background information from language studies. She then offers summaries of case studies which have been done to investigate the myth. This chapter on child language acquisition is about how children who receive the most language input, i.e. those who are taught language, are likely to do better in life and it references the celebrated research done by Hart and Risley (1995), which supposedly found that children from low-income families have lower IQ scores because “low-income parents talk to their children much less, and in different ways, than high-income parents do” (p. 83). But Kaplan also highlights an important distinction in studies of this kind:

Look again at the list of things that parents can apparently do to boost their children’s language development: talk a lot, directly to the child; use a large vocabulary; treat the child as a conversational partner and engage with her intensively; ask her lots of questions; use indirect requests instead of giving demands; and so on. This picture looks suspiciously like the western mainstream middle-class model of parenting – which […] is far from universal. Not only that, but this is exactly the social group to which researchers on child language acquisition are most likely to belong. (p. 89)

Kaplan shows that measuring a child’s linguistic ability based solely on how many words they say while a researcher is around is perhaps wrong-headed. Different cultures and social groups place different restrictions on how much children are allowed to talk around adults/strangers/researchers. Likewise, researchers from different socio-economic and cultural groups may place value on objects and experiences that are unfamiliar to children from different groups. The study of language is not as straightforward as it seems. Kaplan again shows a good insight when she writes about our biases and problems in language studies:

The point here is not that Hart and Risley had it backwards, that the parenting practices they thought were good are actually bad. Rather, the point is that any time we try to study parents and children – including their use of language – our research is inevitably influenced by culture-specific assumptions about the kinds of things parents and children ought to do. It’s all too easy to study parents and children in our own culture and conclude that we’ve learned something about parents and children everywhere. (p. 92)

I was a bit disappointed that a discussion of Motherese was left out of this chapter. Motherese is the idea that the primary caregiver(s) explicitly correct their child’s language mistakes, thus giving instructions on what is acceptable in their language. Motherese was perhaps most famously put forth by Steven Pinker in The Language Instinct. Pinker argued that Motherese is “folklore” and that its non-existence proves that humans have Universal Grammar (Motherese is wrapped up in the Poverty of the Stimulus argument). Kaplan claims that we can answer the question of whether Motherese exists, or whether “parents systematically and explicitly correct their children’s grammar mistakes” with “a resounding ‘no’” (p. 104). She says no study has ever found this, but Sampson (2005) references a study which showed that the speech directed at children by caregivers is more “proper” (i.e. free of grammatical errors) than linguists assume, especially Pinker and other believers in Universal Grammar. I concede that taking on Universal Grammar is a lot to ask out of one chapter of one book, but I would have liked to see this debate at least mentioned. Kaplan does address the poverty of the stimulus argument and makes a very pertinent point about how it’s a theory on child language acquisition which was put forth by someone (Chomsky), who is not an expert in child language acquisition. She writes:

The poverty of the stimulus remains a controversial hypothesis, and some linguists have argued that Chomsky (who is not a specialist in child language acquisition) underestimated how much information is in the speech that a young child typically hears. (p. 93, bolding mine)

The shade, it is thrown.

In discussing the speech that children overhear, Kaplan has a very nice side-note which I think anyone who has been around children can appreciate. It shows that this book is also fun: “It seems entirely reasonable that children would pay more attention when they are being spoken to directly, but it’s also clear that children ‘eavesdrop’ as well. (If you doubt this, try swearing within earshot of a two-year-old.)” (p. 91).

Chapter 6 – “Adults can’t learn a new language”

Kaplan’s chapter here does a very good job of discussing the myth that there is a critical period in language learning, or an unspecified age sometime before adulthood after which it is impossible for people to become fluent in a second language. Kaplan frames this question very well, or shows how linguists should frame the myth, by writing:

But our anecdotal impressions may not be accurate; it’s true that many adults struggle with a second language, but it’s also true that many adults become competent and fluent speakers of a language they first learned late in life. In addition, even if children really are better on average at learning a second language than adults are, that fact by itself doesn’t prove that there is a critical period for second-language acquisition: children and adults are different in many ways, and it could be that adults have trouble with new languages for some reason other than just their age. (p. 115)

This explanation is an example of the insightful ways that Kaplan approaches the linguistic myths in the book. And this explanation is especially pertinent here since the critical period myth comes directly from linguists. It is unfortunate, however, that in this chapter Kaplan does not define what “native proficiency” means and does not tell us how the studies mentioned define the term. To reach the proficiency of a native speaker was once the ultimate goal for second-language learners, but that idea has fallen under question since “native” speakers do not always serve as exemplars of their language and since speaking like a native is not desirable in all situations. For example, when two or more non-native English speakers with different first languages are working together, an international variety of English might be preferable.

Chapter 8 – “Women talk more than men”

It is easy to see why Chapter 8 gives the book its title. This chapter, on the myth that women talk more than men, is probably the most insightful chapter in the book, perhaps because the myth is so misleading. For example, Kaplan shows how even if we were to observe that women talked more than men, this would leave us with a host of additional questions and few answers:

Suppose you conducted an experiment and found that women were more likely to say um more than men. Does this mean that women are more insecure than men? Or that they’re more thoughtful and take more time deciding what to say next? How much do the results depend on the design of the experiment? For example, was the data collected in a lab setting, or from a corpus of spontaneous conversation? If it was in a lab setting, could that task have biased the results? Were the subjects discussing a topic that men might traditionally be expected to know more about? Were the subjects giving monologues, conversing in pairs, or talking in small groups? Were they talking with others of the same sex or the opposite sex?

As we will see, factors like these have a huge effect on how men and women speak. (p. 155)

Kaplan explains various ideas from different cultures about how men and women speak. And she astutely points out what is really behind these ideas:

By this point, contemporary western ideas about women’s superior verbal skills are starting to look anomalous. Obviously societies vary in what they believe about women’s speech: according to the medieval song discussed above, women are gossipy and unable to keep secrets; according to Jespersen, women are languid and insipid; according to rural Malagasy communities, women are unskilled and blunt. What all of these beliefs have in common is not the specific characteristics that are attributed to women, but the idea that women are inferior to men. Where assertiveness and directness are highly valued, those behaviors are considered to be characteristic of men; where indirectness and self-effacement are highly valued, those behaviors are attributed to men. (p. 162)

I like that Kaplan discusses the ways that women’s speech is viewed in other places in the world, but I appreciate it even more that this book – which is written in English and is from contemporary western society – shows that the ideas in our own culture about how women speak are deficient. I have a sneaking suspicion that the talk of places and languages in far off lands would fall on deaf ears for general readers, so it is very good that Kaplan contextualizes our own views of language.

Chapter 9 – “Texting makes you illiterate”

This is a myth that linguists have been at pains to debunk in recent years because texting and microblogging have become so popular. Along with the rise of these technologies and platforms has, unfortunately, also come the Chicken Little language commentary, which screams that texting is ruining the English language. The most infamous propagator of such hysteria is perhaps Lynne Truss, author of Eats, Shoots and Leaves, a book which starts off bemoaning the harm caused to English by texting and then goes on moaning for over 100 pages. So it was nice to see that this is one of the best chapters in Kaplan’s book. Kaplan begins by explaining that texting is a form of language unto itself and that there are valid reasons for why it will most likely not influence other forms of language:

When we use some technology to transmit language, its form isn’t neutral: it shapes how we say things, and therefore also potentially what we say. It matters, for example, that writing (but not speech) is permanent, that it can be revised and edited, and that it carries only limited information about tone of voice. Telephone and radio transmit audio but not video; the listener has access to the voice but not the nonverbal cues. Telegrams used to be priced by the word, which encouraged senders to use as few words as possible in what became the classic ‘telegraphic’ style. (p. 190)

Many language commenters do not realize these facts and think that the way people tweet is the way that they will write letters to the editor, or job applications, or whatever. But there is little reason to assume this is the case (and the language commenters rarely present evidence to support such an assumption). In addition, Kaplan makes another important point that is overlooked by people who adhere to this myth: the abbreviations used in texting (and tweeting, chatting, etc.) serve a meaningful sociolinguistic function besides saving space or time. The proof of this is that some of the abbreviations actually take more time and space to compose than writing the words out, and yet people still use them.

Later in the chapter, Kaplan gets to what’s at the heart of this chapter’s myth: people don’t like texting because it’s not proper English. She writes:

Despite the similarity between some types of hieroglyphic writing and some types of text message abbreviations, I have yet to hear a modern commentator decry hieroglyphics with the same fervor that is applied to texting. It’s hard to avoid the impression that these abbreviations are condemned, not because they’re inherently bad, but because they simply do not happen to be part of standard written English. (p. 198)

Well, sure, but Ancient Egypt used hieroglyphics and look what happened to them.

/s

Conclusion

This is one of the best books on language and linguistics that I have ever read. It is wide-ranging and well-written. It offers more in terms of actual data than the usual language books aimed at the general public, but it is not so technical as to be inaccessible to non-linguists. It’s like a peek behind the curtain of linguistics and shows the sticky nature of seemingly simple (but wrong) ideas such as “a dialect is a collection of mistakes”, “the most beautiful language is X” and especially “women talk more than men”. For each myth, Kaplan has built a response based on solid linguistic sources. In each chapter, she also offers a bullet point summary, a list of points for further reflection on the topic, a concise and explanatory list of references for further reading, and a bibliography. If any of the topics covered in this book leave you hoping for more, you will not be let down. I highly recommend reading this book.

You can see other reviews of this book on Stan Carey’s blog Sentence First and Lauren Gawne’s blog Superlinguo. Both of them enjoyed the book. You can also read a blog post by Kaplan on the myths and facts of “uptalk” in English.

Women Talk More than Men …And Other Myths about Language Explained is available in paperback (ISBN: 9781107446908) for $24.99 (UK£15.99) and in hardcover (ISBN: 9781107084926) for $94.99 (UK£59.99). CUP kindly sent me a copy of Kaplan’s book for this review.

Read Full Post »

As a dictionary of English vocabulary and phrases, the American English Compendium by Marv Rubinstein is satisfactory. It is 500 pages long so it covers a lot of ground. As a book of American English or Americanisms, this book is not what it seems. A brief glance at any of the pages will make you question if the entries really are words or phrases that are exclusive to American English. And a comparison to another source will most likely show that they are not. As a commentary on language, however, this book is terrible.

American English Compendium

Cover of American English Compendium by Marv Rubinstein. Published by Rowman & Littlefield. Cover design by Neil Cotterill.


The problems start on the first page of Chapter 1. The author defends the use of the term American English by proclaiming it is better than British English:

Dynamic. Versatile. Imaginative. Capable of capturing fine nuances. All these terms can truthfully be used to describe the American language. “Don’t you mean the ‘English language’?” some readers may ask. No, I mean the American language. Over many years, American English has vastly expanded and changed, a transmutation that has left it only loosely connected to its mother tongue, British English. (p. 3)

Although no one would (or should) argue that American English is a term that needs to be defended, the imaginary readers in this passage come off as more knowledgeable about language than the author. Are we really to believe American English is the only variation of English that is “dynamic” or “imaginative” or “capable of capturing fine nuances”? The problem gets compounded when the author recognizes the influence of American English in England, but seems to suggest that the reverse is not happening:

[W]hile there are numerous localisms [in countries where English is the primary language], more and more the terminology, idioms, slang, and colloquialisms smack of American English. Even in England this is slowly but surely happening. (p. 3)

And it only gets stranger from there. On the next page we are told:

Things have changed so much, and the use of American English in international communications has grown so much, one can now safely say that most English speakers use (to a greater or lesser degree) Americanized English – that is, the American language. And rightly so. The American language is so much richer and more adventurous. British English never stood a chance. (p. 4, emphasis mine)

Excuse me, Mr. Rubinstein, but H. G. Wells, J. K. Rowling, Grant Morrison, Agatha Christie and a thousand other British writers would like a word.

After this “proof” that ‘Murican English is better than British English, readers are given a “microcosm of what is happening” (p. 4) in the world. Rubinstein relates a story from an article by New York Times columnist and economist Thomas Friedman about how a senior Moroccan official is sending his kids to an American school even though he was educated in a French school. Rubinstein uses this story to claim that

There are now several American schools in Casablanca, each with a long waiting list. In addition, English (primarily American English) courses are springing up all over that country. If this is happening in Morocco, a country with long-lasting French connections and traditions, it is undoubtedly happening everywhere. The American language is becoming ubiquitous. (p. 5)

But it needs to be noted that Friedman does not claim that these English-language schools which are supposedly popping up all over Casablanca are teaching American English. Nor are readers given any proof that Casablanca is an example of what is happening around the world. I am very hesitant to believe it is. While it’s a cute story, this kind of claim needs to be backed up with evidence. How do we know whether the English being taught in these schools is strictly British or American or some variation of English as an international language? We have to take Rubinstein’s word for it, but as we have seen with his dismissal of British English, he is not to be trusted when it comes to linguistics commentary.

Further down the page, in a section titled The Richness of the American Language, Rubinstein claims that “much of the richness of the American language lies in the fact that it has absorbed words and expressions from at least fifty other languages.” (p. 5) He lists some examples, but completely fails to acknowledge the fact that many of them, such as brogue and orangutan and typhoon, were originally borrowed into British English and then used by Americans.

Rubinstein then presumes readers will ask how the American language differs from other languages, which obviously also use foreign words and phrases. But the answer given is just as confused as the question. The author states that “there is no question that American English has been like a sponge absorbing and modifying words from many other languages” (p. 7) without realizing (or reporting) that this is true of English in general, not American English in particular. This is actually true of languages in general, although English does appear to be particularly greedy when it comes to borrowing words from other languages.

Later, there is a fairly reasonable, but short and undefinitive, discussion of “Black English” (African American Vernacular English). The section unfortunately ends with this quote: “Educated African Americans, of course, use standard American English” (pp. 11–12). Well, good for them.

Things get really bonkers in the section on compounding, which includes this howler:

Compound words exist in almost all languages, but never anywhere near the extent that they do in American English. […] during the last few decades, compounding has reached epidemic proportions. The vast majority of compound words are of relatively recent origin languagewise (p. 15)

This is nonsense. Does the author know how any other languages work? Finnish compounds words much more than English does. In fact, the syntax of Finnish demands it, unlike in English where compounding is very often a matter of style. And how do we know that the “vast majority” of compound words are not old? Let’s say “the last few decades” goes back to 1960. Do you really think words such as outcast, outdoors, outlook, output, overcome, overdoes, overdue, oversee, oddball, goofball, downfall, and downhill (all words supplied by the author) were made compound words after 1960?

Here are some other WTFs in this book along with the thoughts I had after reading them:

In general [the English speakers of Australia, Canada, Guyana, India, Ireland, New Zealand, and South Africa] all understand each other, but, as you have seen in the previous chapter on American and British English, there are substantial differences. The same can be said of the English used in the other countries listed above. With a few exceptions, Canadian English consists of a blending of American and British English, but the other English-speaking countries have all developed their own unique and distinctive expressions (including slang and colloquialisms). (p. 267)

Hahahahaha! Fuck you, Canada! Get your own expressions, eh!

 

English is an Anglo-Saxon language with roots in Latin, the Romance Languages, and German. [No.] This means that most, if not all, English words are variations of foreign words, and such words have legitimately entered the language. (p. 281)

WHAT THE FUCK DOES THIS MEAN?!

 

The Oxford English Dictionary prides itself on keeping up to date, and it does pretty well (but not perfect) with including new words in its latest editions. Unfortunately, libraries with limited budgets these days do not always have the most recent revisions. Your best bet for researching neologisms is probably the Internet – for example, Google. (p. 403)

Because the OED is the only dictionary in the world. I’ve said it before and I’ll say it again: In linguistics research there is only the OED and Google. It’s a wonder we get anything done.

 

Chairman has become chairperson and has been further reduced to chair. But many gender-based terms remain unresolved. While, for example, policeman easily becomes police officer, other words and phrases resist change. One almost invariably hears expressions such as “Everyone to their own taste.” [What? Who invariably hears this?] Grammatically incorrect [Nope!] but why risk offending potential female customers of advertised products? [Bitches be trippin’, amiright?] However, when a woman mans the controls of an aircraft, should the term be changed even though it denotes action, not identity? What should we now call a “manhole cover”? [Serious questions, you guys.] Note that we no longer have actresses; they all insist on being called actors. [How dare they?!] (p. 13)

Based on the claims about language alone, I would not recommend this book. I don’t know how someone writes a book about language and gets so much wrong. The word and phrase entries may be useful, but any online dictionary will have most if not all of them. Go there instead or get a proper reference book from a respected dictionary.

Read Full Post »

In two recent papers, one by Kloumann et al. (2012) and the other by Dodds et al. (2015), a group of researchers created a corpus to study the positivity of the English language. I looked at some of the problems with those papers here and here. For this post, however, I want to focus on one of the registers in the authors’ corpus – song lyrics. There is a problem with taking language such as lyrics out of context and then judging them based on the positivity of the words in the songs. But first I need to briefly explain what the authors did.

In the two papers, the authors created a corpus based on books, New York Times articles, tweets and song lyrics. They then created a list of the 10,000 most common word types in their corpus and had voluntary respondents rate how positive or negative they felt the words were. They used this information to claim that human language overall (and English) is emotionally positive.

That’s the idea anyway, but song lyrics exist as part of a multimodal genre. There are lyrics and there is music. These two modalities operate simultaneously to convey a message or feeling. This is important for a couple of reasons. First, the other registers in the corpus do not work like song lyrics. Books and news articles are black text on a white background with few or no pictures. And tweets are not always multimodal – it’s possible to include a short video or picture in a tweet, but it’s not necessary (Side note: I would like to know how many tweets in the corpus included pictures and/or videos, but the authors do not report that information).

So if we were to do a linguistic analysis of an artist or a genre of music, we would create a corpus of the lyrics of that artist or genre. We could then study the topics that are brought up in the lyrics, or even common words and expressions (lexical bundles or n-grams) that are used by the artist(s). We could perhaps even look at how the writing style of the artist(s) changed over time.

But if we wanted to perform an analysis of the positivity of the songs in our corpus, we would need to incorporate the music. The lyrics and music go hand in hand – without the music, you only have poetry. To see what I mean, take a look at the following word list. Do the words in this list look particularly positive or negative to you?

a

ain’t

all

and

as

away

back

bitch

body

breast

but

butterfly

can

can’t

caught

chasing

comin’

days

did

didn’t

do

dog

down

everytime

fairy

fantasy

for

ghost

guess

had

hand

harm

her

his

i

i’m

if

in

it

looked

lovely

jar

makes

mason

life

live

maybe

me

mean

momma’s

more

my

need

nest

never

no

of

on

outside

pet

pin

real

return

robin

scent

she

sighing

slips

smell

sorry

that

the

then

think

to

today

told

up

want

wash

went

what

when

with

withered

woke

would

yesterday

you

you’re

your

If we combine these words as Rivers Cuomo did in his song “Butterfly”, they average out to a positive score of 5.23. Here are the lyrics to that song.

Yesterday I went outside
With my momma’s mason jar
Caught a lovely Butterfly
When I woke up today
And looked in on my fairy pet
She had withered all away
No more sighing in her breast

I’m sorry for what I did
I did what my body told me to
I didn’t mean to do you harm
But everytime I pin down what I think I want
it slips away – the ghost slips away

I smell you on my hand for days
I can’t wash away your scent
If I’m a dog then you’re a bitch
I guess you’re as real as me
Maybe I can live with that
Maybe I need fantasy
A life of chasing Butterfly

I’m sorry for what I did
I did what my body told me to
I didn’t mean to do you harm
But everytime I pin down what I think I want
it slips away – the ghost slips away

I told you I would return
When the robin makes his nest
But I ain’t never comin’ back
I’m sorry, I’m sorry, I’m sorry

Does this look like a positive text to you? Does it look moderate, neither positive nor negative? I would say not. It seems negative to me, a sad song based on the opera Madame Butterfly, in which a man leaves his wife because he never really cared for her. When we take the music into consideration, the non-positivity of this song is clear.


Let’s take a look at another list. How does this one look?

above

absence

alive

an

animal

apart

are

away

become

brings

broke

can

closer

complicate

desecrate

down

drink

else

every

everything

existence

faith

feel

flawed

for

forest

from

fuck

get

god

got

hate

have

help

hive

honey

i

i’ve

inside

insides

is

isolation

it

it’s

knees

let

like

make

me

my

myself

no

of

off

only

penetrate

perfect

reason

scraped

sell

sex

smell

somebody

soul

stay

stomach

tear

that

the

thing

through

to

trees

violate

want

whole

within

works

you

your

Based on the ratings in the two papers, this list is slightly more positive, with an average happiness rating of 5.46. When the words were used by Trent Reznor, however, they expressed “a deeply personal meditation on self-hatred” (Huxley 1997: 179). Here are the lyrics for “Closer” by Nine Inch Nails:

You let me violate you
You let me desecrate you
You let me penetrate you
You let me complicate you

Help me
I broke apart my insides
Help me
I’ve got no soul to sell
Help me
The only thing that works for me
Help me get away from myself

I want to fuck you like an animal
I want to feel you from the inside
I want to fuck you like an animal
My whole existence is flawed
You get me closer to god

You can have my isolation
You can have the hate that it brings
You can have my absence of faith
You can have my everything

Help me
Tear down my reason
Help me
It’s your sex I can smell
Help me
You make me perfect
Help me become somebody else

I want to fuck you like an animal
I want to feel you from the inside
I want to fuck you like an animal
My whole existence is flawed
You get me closer to god

Through every forest above the trees
Within my stomach scraped off my knees
I drink the honey inside your hive
You are the reason I stay alive

As Reznor (the songwriter and lyricist) sees it, “Closer” is “supernegative and superhateful” and the song’s message is “I am a piece of shit and I am declaring that” (Huxley 1997: 179). You can see what he means when you listen to the song (minor NSFW warning for the imagery in the video). [1]

Nine Inch Nails: Closer (Uncensored) (1994) from Nine Inch Nails on Vimeo.

Then again, meaning is relative. Tommy Lee has said that “Closer” is “the all-time fuck song. Those are pure fuck beats – Trent Reznor knew what he was doing. You can fuck to it, you can dance to it and you can break shit to it.” And Tommy Lee should know. He played in the studio for NIИ and he is arguably more famous for fucking than he is for playing drums.

Nevertheless, the problem with the positivity rating of songs keeps popping up. The song “Mad World” was a pop hit for Tears for Fears, then reinterpreted in a more somber tone by Gary Jules and Michael Andrews. But it is rated a positive 5.39. Gotye’s global hit about failed relationships, “Somebody That I Used To Know”, is rated a positive 5.33. The anti-war and protest ballad “Eve of Destruction”, made famous by Barry McGuire, rates just barely on the negative side at 4.93. I guess there should have been more depressing references besides bodies floating, funeral processions, and race riots if the song writer really wanted to drive home the point.

For the song “Milkshake”, Kelis has said that it “means whatever people want it to” and that the milkshake referred to in the song is “the thing that makes women special […] what gives us our confidence and what makes us exciting”. It is rated less positive than “Mad World” at 5.24. That makes me want to doubt the authors’ commitment to Sparkle Motion.

Another upbeat jam that the kids listen to is the Ramones’ “Blitzkrieg Bop”. This is the energetic and exciting anthem of punk rock. It’s rated a negative 4.82. I wonder if we should even look at “Pinhead”.

Then there’s the old American folk classic “Where Did You Sleep Last Night”, which Nirvana performed a haunting version of on their album MTV Unplugged in New York. The song (also known as “In the Pines” and “Black Girl”) was first made famous by Lead Belly and it includes such catchy lines as

My girl, my girl, don’t lie to me
Tell me where did you sleep last night
In the pines, in the pines
Where the sun don’t ever shine
I would shiver the whole night through

And

Her husband was a hard working man
Just about a mile from here
His head was found in a driving wheel
But his body never was found

This song is rated a positive 5.24. I don’t know about you, but neither the Lead Belly version nor the Nirvana cover gives me that impression.

Even Pharrell Williams’ hit song “Happy” rates only 5.70. That’s a song so goddamn positive that it’s called “Happy”. But it’s only 0.03 points more positive than Eric Clapton’s “Tears in Heaven”, which is a song about the death of Clapton’s four-year-old son. Harry Chapin’s “Cat’s in the Cradle” was voted the fourth saddest song of all time by readers of Rolling Stone but it’s rated 5.55, while Willie Nelson’s “Always on My Mind” rates 5.63. So they are both sadder than “Happy”, but not by much. How many lyrics must a man research, before his corpus is questioned?

Corpus linguistics is not just gathering a bunch of words and calling it a day. The fact that the same “word” can have several meanings (known as polysemy) is a major feature of language. So before you ask people to rate a word’s positivity, you will want to make sure they at least know which meaning is being referred to. On top of that, words do not work in isolation. Spacing is an arbitrary construct in written language (remember that song lyrics are mostly heard, not read). The back used in the Ramones’ lines “Piling in the back seat” and “Pulsating to the back beat” is not about a body part. The Weezer song “Butterfly” uses the word mason, but it’s part of the compound noun mason jar, not a reference to a bricklayer. Words are also conditioned by the words around them. A word like eve may normally be considered positive as it brings to mind Christmas Eve and New Year’s Eve, but when used in a phrase like “the eve of destruction” our judgment of it is likely to change. In the corpus under discussion here, eat is rated 7.04, but that doesn’t consider what’s being eaten and so cannot account for lines like “Eat your next door neighbor” (from “Eve of Destruction”).
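To make the point about context concrete, here is a minimal sketch in Python of a scorer that looks words up one at a time. The only rating taken from the corpus is eat at 7.04; the function name, the tokenizer and the choice to skip unrated words are my own assumptions for illustration, not the authors’ actual pipeline.

```python
import re

# The only rating taken from the post: "eat" = 7.04 on the 1-9 scale.
# Every other word is left unrated on purpose (an assumption for this sketch).
RATINGS = {"eat": 7.04}


def context_free_score(line, ratings=RATINGS):
    """Average the ratings of the rated words in a line, ignoring word order and context."""
    words = re.findall(r"[a-z]+", line.lower())
    scores = [ratings[w] for w in words if w in ratings]
    # Return None when no word in the line has a rating.
    return sum(scores) / len(scores) if scores else None


# The line as quoted above, next to a harmless made-up line.
grim = context_free_score("Eat your next door neighbor")
harmless = context_free_score("Eat your vegetables")
print(grim, harmless, grim == harmless)
```

Both lines come out with the same score, because a word-by-word lookup has no way of knowing what is being eaten.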

We could go on and on like this. The point is that the authors of both of the papers didn’t do enough work with their data before drawing conclusions. And they didn’t consider that some of the language in their corpus is part of a multimodal genre where there are other things affecting the meaning of the language used (though technically no language use is devoid of context). Whether or not the lyrics of a song are “positive” or “negative”, the style of singing and the music that they are sung to will strongly affect a person’s interpretation of the lyrics’ meaning and emotion. That’s just the way that music works.

This doesn’t mean that any of these songs are positive or negative based on their rating; it means that the system used by the authors of the two papers to rate the positivity or negativity of language seems to be flawed. I would have guessed that a rating system which took words out of context would be fundamentally flawed, but viewing the ratings of the songs in this post is a good way to visualize that. The fact that the two papers were published in reputable journals and picked up by reputable publications, such as the Atlantic and the New York Times, only adds insult to injury for the field of linguistics.

You can see a table of the songs I looked at for this post below and a spreadsheet with the ratings of the lyrics is here. I calculated the positivity ratings by averaging the scores for the word tokens in each song, rather than the types.
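For anyone who wants to see the difference between averaging over tokens and averaging over types, here is a small sketch in Python. The ratings in it are made up for illustration (they are not the values from the two papers), and the function names and the simple tokenizer are likewise my own assumptions.

```python
import re

# Hypothetical ratings on the 1-9 scale, invented for this sketch.
TOY_RATINGS = {"sorry": 3.0, "butterfly": 7.0}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def token_average(text, ratings):
    """Average over every occurrence (token) of a rated word."""
    scores = [ratings[w] for w in tokenize(text) if w in ratings]
    return sum(scores) / len(scores) if scores else None


def type_average(text, ratings):
    """Average over each distinct rated word (type) once."""
    types = {w for w in tokenize(text) if w in ratings}
    return sum(ratings[w] for w in types) / len(types) if types else None


line = "sorry sorry sorry butterfly"
print(token_average(line, TOY_RATINGS))  # (3 + 3 + 3 + 7) / 4
print(type_average(line, TOY_RATINGS))   # (3 + 7) / 2
```

Averaging over tokens lets a repeated word count every time it is sung, which is why I went with tokens for the song scores.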

(By the way, Tupac is rated 4.76. It’s a good thing his attitude was fuck it ‘cause motherfuckers love it.)

| Song | Positivity score (1–9) |
| --- | --- |
| “Happy” by Pharrell Williams | 5.70 |
| “Tears in Heaven” by Eric Clapton | 5.67 |
| “Always on My Mind” by Willie Nelson | 5.63 |
| “Cat’s in the Cradle” by Harry Chapin | 5.55 |
| “Closer” by NIN | 5.46 |
| “Mad World” by Gary Jules and Michael Andrews | 5.39 |
| “Somebody that I Used to Know” by Gotye feat. Kimbra | 5.33 |
| “Waitin’ for a Superman” by The Flaming Lips | 5.28 |
| “Milkshake” by Kelis | 5.24 |
| “Where Did You Sleep Last Night” by Nirvana | 5.24 |
| “Butterfly” by Weezer | 5.23 |
| “Eve of Destruction” by Barry McGuire | 4.93 |
| “Blitzkrieg Bop” by The Ramones | 4.82 |

 

Footnotes

[1] Also, be aware that listening to these songs while watching their music videos has an effect on the way you interpret them.

References

Kloumann, Isabel M., Christopher M. Danforth, Kameron Decker Harris, Catherine A. Bliss, and Peter Sheridan Dodds. 2012. “Positivity of the English Language”. PLoS ONE. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029484

Dodds, Peter Sheridan, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, and Christopher M. Danforth. 2015. “Human language reveals a universal positivity bias”. PNAS 112:8. http://www.pnas.org/content/112/8/2389

Huxley, Martin. 1997. Nine Inch Nails. New York: St. Martin’s Griffin.

Read Full Post »

Last week I wrote a post called “If you’re not a linguist, don’t do linguistics”. This got shared around Twitter quite a bit and made it to the front page of r/linguistics, so a lot of people saw it. Pretty much everyone had good insight on the topic and it generated some great discussion. I thought it would be good to write a follow-up to flesh out my main concerns in a more serious manner (this time sans emoticons!) and to address the concerns some people had with my reasoning.

The paper in question is by Dodds et al. (2015) and it is called “Human language reveals a universal positivity bias”. The certainty of that title is important since I’m going to try to show in this post that the authors make too many assumptions to reliably make any claims about all human language. I’m going to focus on the English data because that is what I am familiar with. But if anyone who is familiar with the data in other languages would like to weigh in, please do so in the comments.

The first assumption made by the authors is that it is possible to make universal claims about language using only written data. This is not a minor issue. The differences between spoken and written language are many and major (Linell 2005). But dealing with spoken data is difficult – it takes much more time and effort to collect and analyze than written data. We can argue, however, that even in highly literate societies, the majority of language use is spoken – and spoken language does not work like written language. Treating written data as a stand-in for all language is an assumption that no scholar should ever make. So any research which makes claims about all human language will have to include some form of spoken data. But the data set that the authors draw from (called their corpus) is made from tweets, song lyrics, New York Times articles and the Google Books project. Tweets and song lyrics, let alone news articles or books, do not mimic spoken language in an accurate way. For example, these registers may include the same words as human speech, but certainly not in the same proportion. Written language does not include false starts, nor does it include repetition or elision in anywhere near the same way that spoken language does. Anyone who has done any transcription work will tell you this.

The next assumption made by the authors is that their data is representative of all human language. Representativeness is a major issue in corpus linguistics. When linguists want to investigate a register or variety of language, they build a corpus which is representative of that register or variety by taking a large enough and balanced sample of texts from that register. What is important here, however, is that most linguists do not have a problem with a set of data representing a larger register – so long as that larger register isn’t all human language. For example, if we wanted to research modern English journalism (quite a large register), we would build a corpus of journalism texts from English-speaking countries and we would be careful to include various kinds of journalism – op-eds, sports reporting, financial news, etc. We would not build a corpus of articles from the Podunk Free Press and make claims about all English journalism. But representativeness is a tricky issue. The larger the language variety you are trying to investigate, the more data from that variety you will need in your corpus. Baker (2010: 7) notes that a corpus analysis of one novel is “unlikely to be representative of all language use, or all novels, or even the general writing style of that author”. The English sub-corpora in Dodds et al. exist somewhere in between a fully non-representative corpus of English (one novel) and a fully representative corpus of English (all human speech and writing in English). In fact, in another paper (Dodds et al. 2011), the representativeness of the Twitter corpus is explained as follows: “First, in terms of basic sampling, tweets allocated to data feeds by Twitter were effectively chosen at random from all tweets. Our observation of this apparent absence of bias in no way dismisses the far stronger issue that the full collection of tweets is a non-uniform subsampling of all utterances made by a non-representative subpopulation of all people. While the demographic profile of individual Twitter users does not match that of, say, the United States, where the majority of users currently reside, our interest is in finding suggestions of universal patterns.” What I think that doozy of a sentence in the middle is saying is that the tweets come from an unrepresentative sample of the population but that the language in them may be suggestive of universal English usage. Does that mean we can assume that the English sub-corpora (specifically the Twitter data) in Dodds et al. are representative of all human communication in English?

Another assumption the authors make is that they have sampled their data correctly. The decisions on what texts will be sampled, as Tognini-Bonelli (2001: 59) points out, “will have a direct effect on the insights yielded by the corpus”. Following Biber (see Tognini-Bonelli 2001: 59), linguists can classify texts into various channels in order to ensure that their sample texts will be representative of a certain population of people and/or variety of language. They can start with general “channels” of the language (written texts, spoken data, scripted data, electronic communication) and move on to whether the language is private or published. Linguists can then sample language based on what type of person created it (their age, sex, gender, socio-economic situation, etc.). For example, if we made a corpus of the English articles on Wikipedia, we would have a massive amount of linguistic data. Literally billions of words. But 87% of it will have been written by men and 59% of it will have been written by people under the age of 40. Would you feel comfortable making claims about all human language based on that data? How about just all English language encyclopedias?

The next assumption made by the authors is that the relative positive or negative nature of the words in a text is indicative of how positive that text is. But words can have various and sometimes even opposing meanings. Texts are also likely to contain words that are written the same but have different meanings. For example, the word fine in the Dodds et al. corpus, like the rest of the words in the corpus, is just a four-letter word – free of context and naked as a jaybird. Is it an adjective that means “good, acceptable, or satisfactory”, which Merriam-Webster says is sometimes “used in an ironic way to refer to things that are not good or acceptable”? Or does it refer to that little piece of paper that the Philadelphia Parking Authority is so (in)famous for? We don’t know. All we know is that it has been rated 6.74 on the positivity scale by the respondents in Dodds et al. Can we assume that all the uses of fine in the New York Times are that positive? Can we assume that the use of fine on Twitter is always or even mostly non-ironic? On top of that, some of the most common words in English also tend to have the most meanings. There are 15 entries for get in the Macmillan Dictionary, including “kill/attack/punish” and “annoy”. Get in Dodds et al. is ranked on the positive side of things at 5.92. Can we assume that this rating carries across all the uses of get in the corpus? The authors found approximately 230 million unique “words” in their Twitter corpus (they counted all forms of a word separately, so banana, bananas, b-a-n-a-n-a-s! would be separate “words”; and they counted URLs as words). So they used the 50,000 most frequent ones to estimate the information content of texts. Can we assume that it is possible to make an accurate claim about how positive or negative a text is based on nothing but the words taken out of context?
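Here is a rough sketch in Python of what counting every written form separately looks like. It assumes a plain split on whitespace, which is my simplification and almost certainly not the authors’ exact procedure; the sample text, the URL and the function name are made up for illustration.

```python
from collections import Counter


def most_frequent_forms(text, k):
    """Count each whitespace-separated form as its own "word" and return the k most frequent."""
    counts = Counter(text.lower().split())
    return counts.most_common(k)


# Made-up sample: three spellings related to "banana" plus a URL.
sample = "banana bananas b-a-n-a-n-a-s! banana http://example.com/banana"

counts = Counter(sample.lower().split())
print(len(counts))                     # number of distinct forms
print(most_frequent_forms(sample, 1))  # the single most frequent form
```

With no lemmatizing and no filtering, the three banana spellings and the URL all count as separate “words”, so only the exact form banana gets credit for being frequent.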

Another assumption that the authors make is that the respondents in their survey can speak for the entire population. The authors used Amazon’s Mechanical Turk to crowdsource evaluations for the words in their sub-corpus. 60% of the American people on Mechanical Turk are women and 83.5% of them are white. The authors used respondents located in the United States and India. Can we assume that these respondents have opinions about the words in the corpus that are representative of the entire population of English speakers? Here are the ratings for the various ways of writing laughter in the authors’ corpus:

| Laughter tokens | Rating |
| --- | --- |
| ha | 6 |
| hah | 5.92 |
| haha | 7.64 |
| hahah | 7.3 |
| hahaha | 7.94 |
| hahahah | 7.24 |
| hahahaha | 7.86 |
| hahahahaha | 7.7 |
| hee | 5.4 |
| heh | 5.98 |
| hehe | 6.48 |
| hehehe | 7.06 |

And here is a picture of a character expressing laughter:

Pictured: Good times. Credit: Batman #36, DC Comics, Scott Snyder (wr), Greg Capullo (p), Danny Miki (i), Fco Plascenia (c), Steve Wands (l).

Can we assume that the textual representation of laughter is always as positive as the respondents rated it? Can we assume that everyone or most people on Twitter use the various textual representations of laughter in a positive way – that they are laughing with someone and not at someone?

Finally, let’s compare some data. The good people at the Corpus of Contemporary American English (COCA) have created a word list based on their 450 million word corpus. The COCA corpus is specifically designed to be large and balanced (although the problem of dealing with spoken language might still remain). In addition, each word in their corpus is annotated for its part of speech, so they can recognize whether a word like state is a verb or a noun. This last point is something that Dodds et al. did not do – all forms of words that are spelled the same are collapsed into being one word. The compilers of the COCA list note that “there are more than 140 words that occur both as a noun and as a verb at least 10,000 times in COCA”. This is the type/token issue that came up in my previous post. A corpus that tags each word for its part of speech can tell the difference between different types of the “same” word (state as a verb vs. state as a noun), while an untagged corpus treats all occurrences of state as tokens of the same type. If we compare the 10,000 most common words in Dodds et al. to a sample of the 10,000 most common words in COCA, we see that there are 121 words on the COCA list but not the Dodds et al. list (Here is the spreadsheet from the Dodds et al. paper with the COCA data – pnas.1411678112.sd01 – Dodds et al corpus with COCA). And that’s just a sample of the COCA list. How many more differences would there be if we compared the Dodds et al. list to the whole COCA list?
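As a rough illustration of the two ideas in that paragraph, here is a short Python sketch. The mini-corpus, its part-of-speech tags and the two tiny word lists are all made up for the example; none of this is the actual COCA or Dodds et al. data.

```python
from collections import Counter

# Hypothetical tagged tokens as (word, part of speech) pairs.
tagged_tokens = [
    ("the", "DET"), ("state", "NOUN"), ("will", "AUX"),
    ("state", "VERB"), ("its", "PRON"), ("case", "NOUN"),
]

# Untagged view: both uses of "state" collapse into a single type.
untagged_types = Counter(word for word, _pos in tagged_tokens)

# Tagged view: "state" as a noun and "state" as a verb are separate types.
tagged_types = Counter(tagged_tokens)

# Comparing two word lists is a set difference.
# Both lists are invented stand-ins for real frequency lists.
balanced_list = {"the", "state", "case"}
twitter_list = {"the", "state", "haha"}
only_in_balanced = sorted(balanced_list - twitter_list)

print(len(untagged_types), len(tagged_types), only_in_balanced)
```

The same set difference, run on the real top-10,000 lists, is the kind of comparison that produces a count like the 121 words mentioned above.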

To sum up, the authors use their corpus of tweets, New York Times articles, song lyrics and books and ask us to assume (1) that they can make universal claims about language despite using only written data; (2) that their data is representative of all human language despite including only four registers; (3) that they have sampled their data correctly despite not knowing what types of people created the linguistic data and only including certain channels of published language; (4) that the relative positive or negative nature of the words in a text is indicative of how positive that text is despite the obvious fact that words can be spelled the same and still have wildly different meanings; (5) that the respondents in their survey can speak for the entire population despite the English-speaking respondents being from only two subsets of two English-speaking populations (USA and India); and (6) that their list of the 10,000 most common words in their corpus (which they used to rate all human language) is representative despite being uncomfortably dissimilar to a well-balanced list that can differentiate between different types of words.

I don’t mean to sound like a Negative Nancy and I don’t want to trivialize the work of the authors in this paper. The corpus that they have built is nothing short of amazing. The amount of feedback they got from human respondents on language is also impressive (to say the least). I am merely trying to point out what we can and can not say based on the data. It would be nice to make universal claims about all human language, but the fact is that even with millions and billions of data points, we still are not able to do so unless the data is representative and sampled correctly. That means it has to include spoken data (preferably a lot of it) and it has to be sampled from all socio-economic human backgrounds.

Hat tip to the commenters on the last post and the redditors over at r/linguistics.

References

Dodds, Peter Sheridan, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, and Christopher M. Danforth. 2015. “Human language reveals a universal positivity bias”. PNAS 112:8. http://www.pnas.org/content/112/8/2389

Dodds, Peter Sheridan, Kameron Decker Harris, Isabel M. Kloumann, Catherine A. Bliss, and Christopher M. Danforth. 2011. “Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter”. PLoS ONE. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0026752#abstract0

Baker, Paul. 2010. Sociolinguistics and Corpus Linguistics. Edinburgh: Edinburgh University Press. http://www.ling.lancs.ac.uk/staff/paulb/socioling.htm

Linell, Per. 2005. The Written Language Bias in Linguistics. Oxon: Routledge.

Mair, Christian. 2015. “Responses to Davies and Fuchs”. English World-Wide 36:1, 29–33. doi: 10.1075/eww.36.1.02mai

Tognini-Bonelli, Elena. 2001. Studies in Corpus Linguistics, Volume 6: Corpus Linguistics as Work. John Benjamins. https://benjamins.com/#catalog/books/scl.6/main

Read Full Post »
