Some strange language claims in Kaplan’s book on monsters

Matt Kaplan’s book Medusa’s Gaze and Vampire’s Bite is about the science behind monsters, or how we can trace the origins of some of our most classic horrible creatures. The book does a good job in that regard, but it also makes some interesting claims about language. One of these seems to be a simple slip-up, while a second follows some unfortunate tropes of describing languages that aren’t in the Germanic or Romance families. The third one is a side note about a claim made by Carl Sagan and it’s very interesting. Let’s look at these in turn.


Strange etymologies are afoot at Psychology Today

Last week I was on the twitters talking about “untranslatable” words. The idea was about Dr. Tim Lomas’ work on “untranslatable words,” or his term for how some languages have words that don’t have exact equivalents in other languages (but usually English). Right around the same time I posted my blog post, Lomas wrote an article in Psychology Today. Let’s have a look at it. If you want to see my thoughts on “untranslatable” words, go see my post on it and then come back.

Lomas claims that many concepts are non-English in origin. What this means is that the words used to describe these concepts are from other languages. I think this is opening a whole can of worms, but I’m willing to go with the idea that concepts can be “from another language”. For a bit. Let’s move on.

To prove his point, Lomas analyzes an article on positive psychology by Seligman and Csikszentmihalyi (2000). He looks for the etymology of every word in the text.

According to Lomas, there are:

1333 distinct lexemes

‘Native’ English words – belonging either to the Germanic language from which English emerged, or originating as neologisms in English itself – comprise only 39.4% of the sample (and 38% of the psychological words). Thus, over 60% of the general words (and 62% of psychological words) are loanwords, borrowed from other languages at some point in the development of English.

First, Lomas has a strange definition for “‘native’ English words”. Which “Germanic language” does he mean? Proto-Germanic? One of the other West Germanic languages? Old English? It’s also strange because Lomas’ definition means that these words are not native English words: they, table, blue, and orange. [Britney Spears gif says “huh?!” Oprah gif says “hrmmm?!”]

Lomas also doesn’t say exactly how he counted the words in the C&S article. He says that there are 1,333 “distinct lexemes”. The term lexeme is used in linguistics to talk about all the inflected forms of a word: singular and plural forms for nouns, present and past tense forms for verbs, etc. So runner and runners would be a part of the same lexeme RUNNER, and run, runs, ran, running are a part of RUN. Lexemes are also sometimes called “lemmas” in linguistics.
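To make the lexeme idea concrete, here is a minimal Python sketch of counting by lexeme rather than by word form. The tiny LEMMA_MAP and the count_lexemes helper are made up for illustration; they are not the Someya list and not how AntConc works internally.

```python
from collections import Counter

# Hypothetical toy lemma list: inflected form -> lexeme.
# (A stand-in for a real lemma list, not the Someya list itself.)
LEMMA_MAP = {
    "run": "RUN", "runs": "RUN", "ran": "RUN", "running": "RUN",
    "runner": "RUNNER", "runners": "RUNNER",
}

def count_lexemes(tokens):
    """Count occurrences per lexeme; unlisted forms count as their own lexeme."""
    return Counter(LEMMA_MAP.get(t.lower(), t.upper()) for t in tokens)

tokens = "The runner ran and the runners run".split()
counts = count_lexemes(tokens)
print(len(tokens), "tokens,", len(counts), "distinct lexemes")
print(counts)
```

Note that ran and run end up under RUN while runner and runners end up under RUNNER, so the seven words collapse into four lexemes.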

If Lomas really went through every single word in the article, then he spent a whole lotta time on this. The C&S article is 8,124 words long (not including the References section). He doesn’t say how he did the work, but I used some corpus linguistics methods and got different results. I checked the C&S article against the Someya lemma list in AntConc and found 1,750 lemmas, or 417 more lexemes than Lomas found. This is a large difference and I’m not sure how to explain it. Maybe Lomas didn’t divide his words based on parts of speech? So he counted ran and runner as part of the same lexeme? I don’t know.

Second, let’s look at counting the words in language. Lomas seems to do a straight count. That means one instance of one form of a lexeme is equal to all the other instances. For Lomas, it doesn’t matter how many times a word occurs. In corpus linguistics, however, frequency is a big deal. I’m not going to go through the theoretical points here, but basically if a word is more frequent then it is more important or worthy of being looked at (hehe, fight me, corpus linguists).

So, Lomas claims that only 39% of the lexemes in the article are “native English words”. I took the lexemes in the article and ranked them based on frequency (using AntConc). Then I went through the 100 most frequent lexemes on the list and looked at their etymology. My numbers look very different from Lomas’. Counting by occurrence, I found that 85% of the instances of the 100 most frequent lexemes are English in origin. That is, the 100 most frequent lexemes occur a total of 4,440 times in the article (so the lexeme the occurs 442 times, the lexeme of occurs 308 times, the lexeme BE occurs 300 times, and so on) and of these occurrences, 3,767 are English words. This isn’t particularly intriguing – you’ll probably find a similar percentage with any text in English. [See the bottom of this post for my data.]

Looking at this from another angle, we could treat each of the 100 most frequent lexemes as equal – forgetting about how often they occur. Then we find that 70 of them are English, while 30 of them come from another language. This is closer to Lomas’ numbers, but still pretty far off: 70 of the 100 most common lexemes in the article are still English words.
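The two ways of counting can be put side by side in a few lines of Python. This is only a sketch using the totals reported in this post; the variable names are mine.

```python
# Totals reported in this post for the 100 most frequent lexemes.
top_lexemes = 100        # distinct lexemes examined
english_lexemes = 70     # of those, English in origin
top_occurrences = 4440   # total occurrences of those 100 lexemes
english_occurrences = 3767

# Share by type: every lexeme counts once, however often it occurs.
share_by_type = english_lexemes / top_lexemes
# Share by token: every occurrence counts.
share_by_token = english_occurrences / top_occurrences

print(f"English share by lexeme:     {share_by_type:.1%}")
print(f"English share by occurrence: {share_by_token:.1%}")
```

The gap between the two figures is the whole point: frequent function words are overwhelmingly English, so weighting by occurrence pushes the English share up.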

Of course, words in language do not really occur in the way that we’re looking at them. The most common word is the with 442 instances, but the first 442 words of the article are not all the. The word the is sprinkled around the article (you know, where the grammar of English calls for it). I’m not sure how to get to Lomas’ numbers. Even if we assume that every lexeme outside the 100 most frequent is non-English, that only gets us down to 46% of the words in the article being English (3,767 of 8,124). Lomas’ ratio was 40% English to 60% non-English.
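The 46% floor is just a division, sketched here in Python with the figures from this post. It assumes, very generously to Lomas, that no word outside the top 100 lexemes is English.

```python
# Figures reported in this post.
article_words = 8124                 # running words, References excluded
english_occurrences_top100 = 3767    # English occurrences among the top 100 lexemes

# Worst case for English: treat everything outside the top 100 as non-English.
lower_bound = english_occurrences_top100 / article_words
print(f"Minimum English share of running words: {lower_bound:.1%}")
```

Any English word outside the top 100 only raises this number, so a straight count by occurrence cannot land at 40%.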

Later in the article, Lomas says that 234 words were treated as English in origin in his analysis. But this means that only about 18% of the words in his counting are English in origin (234/1,333≈0.18). What’s going on here? If 39.4% of the lexemes in the article are English in origin, and there are 1,333 total lexemes in the article (according to Lomas), then there should be 525 English words. Where he gets 234, I don’t know. Let’s move on.
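Here is the same consistency check as a short Python sketch. The three inputs are the numbers Lomas reports; the variable names are mine.

```python
# Numbers reported by Lomas.
total_lexemes = 1333
stated_english_share = 0.394   # "39.4% of the sample"
stated_english_count = 234     # words treated as English in origin

# What the stated count implies as a share, and what the stated share implies as a count.
implied_share = stated_english_count / total_lexemes
implied_count = round(stated_english_share * total_lexemes)

print(f"234 of 1,333 is {implied_share:.1%}")
print(f"39.4% of 1,333 is about {implied_count} lexemes")
```

The two stated figures cannot both describe the same 1,333 lexemes, which is why I’d like to see the underlying list.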

Lomas includes two graphs to visualize his findings but they’re pretty weird. The graph below “shows the influx of words according to the language of origin (with the century in which they entered English as stacks within them)”. Look at the third column.

[Image: Lomas’ first graph from the Psychology Today article]

English words entered English? I don’t get it. Or Germanic words from before the 12th century are not English words? What’s going on here? I guess in Lomas’ counting, Germanic and English lexemes are English lexemes, but then he splits them up in the graph? Are the words me, myself and I not English words? It seems very strange to me to cut things up like this and I would like to see his list of etymologies, or his rationale for doing so.

Agree to disagree?

But there are places that I can agree with Lomas. At the end of the article, he writes:

In these ways does our understanding of life become complexified and enriched. In that respect, one can make the case that English-speaking psychology would do well to more consciously and actively engage with other languages and cultures. Its understanding of the mind has benefited greatly from English incorporating loanwords over the centuries. If one accepts that premise, it follows that psychology would continue to develop from this kind of cross-cultural engagement and borrowing – including, of course, through collaboration with scholars from non-English speaking cultures themselves. One such way in which the field might develop is through inquiring into untranslatable words, since these constitute clear candidates for borrowing (given that they lack an exact equivalent in English). I myself have sought to promote this kind of endeavor, with my ongoing creation of a cross-cultural lexicography of untranslatable words relating to well-being.

I definitely agree with the first part of this. We should engage with speakers of other languages and people from other cultures (although Lomas’ wording seems to present all English speakers as a monolithic culture). I find it hard to imagine anyone not accepting the premise that English (not just “English-speaking psychology”) has benefited greatly from incorporating loanwords. That’s kind of just a fact of language – borrowing words is one of the things that living languages do and so English is still a living language partly for this reason. But I totally agree that people should collaborate with people from different cultures (although again, Lomas’ wording blurs the distinction between language and culture too much for me and again presents English speakers as one culture).

When Lomas goes into the sales pitch in the second to last sentence, I can’t sign on, particularly based on what I’ve seen of his research into “untranslatable” words (in my last post and in this one and in a later one to come).

Lomas’ claims are true – we should reach out to people who speak other languages. But he should perhaps recognize that the reason that English has so many words from Latin and Ancient Greek is that these were once prestigious languages (and to a large extent still are in academia). It wasn’t because the Latin-speaking or Greek-speaking cultures had anything more special than other cultures, but it was believed that by using these languages people would be more civilized. Of course, we know what happened to the Latin-speaking and (Ancient) Greek-speaking cultures. They dead.

We in English-speaking cultures could just as easily have adopted Finnish words to use in the fields of psychology and linguistics, but Finnish was never considered a prestigious language. Or consider German: once German raised its standing, we got words from German to describe abstract concepts because the texts describing them were written in German and people were supposed to know German to engage in the debate.

There’s more to say about all this and I’ll be back at cha with a later post. I’ll link to it when I write it.

 

Data

Spreadsheet with my analysis. The first sheet is the Someya lemma list analysis. I counted words from Anglo-Norman as not being English. I’m including the 3rd person plural pronouns (they, them, their, themselves) as being English. Illness counts as English. The second sheet uses AntConc’s Word List tool, so it’s not a lexeme/lemma analysis, it treats every “word” as separate (that is, was, am, and is are separate words, not part of the lexeme BE).

Link to download the C&S article as a plain text file (.txt) which was used with AntConc in the analysis. The References section is excluded. And here’s a link to download a POS-tagged version of the article (using CLAWS7).

25 Words That Do Mean What You Think They Do

An article in Mental Floss called “25 Words That Don’t Mean What You Think They Do” attempts to educate readers on the One True Meaning™ of words in a listicle. It’s written by Paul Anthony Jones, who runs the Haggard Hawks account on Twitter and has written several books on language. I like Haggard Hawks and I enjoyed Jones’s interview on BBC’s Radio 4. That’s what makes this article so puzzling. It takes a prescriptivist stance on the meaning of words, claiming for the most part that what the words in the list originally meant is what they mean now. I find this position wrongheaded and contradictory. Words change meaning, which I’m sure Jones has no problem acknowledging, but to insist that their original meaning (or some former meaning) is the only one that’s correct is like claiming that women shouldn’t have the right to vote because, well… they used to not have the right to vote. Things change, and you either change with them or you will be left out. Language is no different in this regard.

What’s especially strange about this position (and the Mental Floss article) is that the history of English undermines the argument itself. For example, is there a certain date we can look back to when a word’s meaning was “correct”? The word deer originally meant any animal that was hunted. Are we using it wrong when we refer to what everyone knows as a deer? No. Likewise, the word nice originally meant foolish. Now it means nice. There are scores more words like this in English. So why do some words deserve a place on lists like the one in the Mental Floss article while others do not?

Speaking of undermining the argument, the Mental Floss article references dictionaries which directly undermine the article’s claims. Lexicographers today use corpora (databases of language) to determine the meaning of words. When there are several meanings, dictionaries usually list them in descending order of how frequently each is used. Not every dictionary does this, but Merriam-Webster does and that’s the one that the Mental Floss article references. (Macmillan does too.)

Let’s take a look at the words in the list and see what’s going on. To be perfectly clear, this article claims that “in the dictionary […] there are plenty of words being misused and misinterpreted”. Dictionaries are written by lexicographers and their first job is to discover what words mean. So this article is basically saying that lexicographers aren’t doing their job. The first salvo made in the article is an attack on the figurative use of literally. It’s not on the listicle (thankfully), so I’m not going to cover it. You can see how the “misuse” of figurative literally has been confirmed to death here and here.

I won’t go through the whole listicle. Some of the entries on it are correct. For example, the first item on the listicle says “barter doesn’t mean haggle”, which it doesn’t, but it’s still unclear who is using barter to mean haggle. The numbers below refer to the numbers from the listicle.

2. Bemused doesn’t mean amused

Strictly speaking, bemused and amused don’t mean the same thing. Although the use of bemused to mean “wryly amused” is so widespread nowadays that it has found its way into the dictionary, bemused actually means “dazed,” “bewildered,” or “addled.”

Here we see the article contradicting itself by linking to a dictionary which defines bemused as “having or showing feelings of wry amusement especially from something that is surprising or perplexing”.

3. Depreciate doesn’t mean “deprecate”

Here the Mental Floss article acknowledges that self-deprecating = self-depreciating, but it links to a site called Grammarist and claims that self-deprecating is 40 times more common than self-depreciating. I couldn’t find out who runs Grammarist and they do not say where they get their figures from. But in the Corpus of Contemporary American English, the ratio of self-deprecating to self-depreciating is 512:2. That makes it 256 times more common.

4. Dilemma doesn’t mean quandary

Ugh, why do we have to do this? The writer claims that dilemma must be a choice between only two alternatives because di– means “two”. This is nonsense and MW even says so. The word disperse has the same di– prefix, so must it mean spreading things into only TWO directions? No. Prefixes from other dead languages do not determine today’s meaning of a word. That shit is bananas.

5. Disinterested does not mean “uninterested”

Except sometimes it totally does.

6. Electrocute does not mean “to get an electric shock”

Lol wut?

** By the way, the meaning “give an electric shock to” was recorded the year after the original meaning was recorded (source: OED). This word couldn’t even hold on to its meaning for a year. Sad!

9. Flaunt does not mean “flout”

K.

15. Nonplussed does not mean “not bothered”

“Many people use nonplussed to mean ‘unperturbed’ or ‘unaffected’”.

Well, that settles it then. I guess you better update your lexicon or you’re going to be left out of the conversation because the people using nonplussed to mean “not bothered” are not going to get the memo. Unless you already know that nonplussed can be used to mean “not bothered”… Wait, you do? Well, then I guess everything is sorted.

16. Oblivious doesn’t mean “unaware”

Or at least, it didn’t originally.

Aaaaand we’re back to appealing to antiquity, that old etymological fallacy. Now please explain what deer, nice, silly, and A THOUSAND OTHER WORDS mean.

17. Peruse doesn’t mean “browse”

perusing something actually means studying it in great detail.

Technically, peruse originally meant “to use up”, so you’re both wrong. If we’re going to be pedantic, why not go all the way?

As the OED notes, peruse has been used as a “broad synonym for read” since the goddamn 16th Century! (curse word mine, but it’s totally implied by the OED):

“Modern dictionaries and usage guides, perhaps influenced by the word’s earlier history in English, have sometimes claimed that the only ‘correct’ usage is in reference to reading closely or thoroughly (cf. senses 4a, 4b). However, peruse has been a broad synonym for read since the 16th cent., encompassing both careful and cursory reading; Johnson defined and used it as such. The implication of leisureliness, cursoriness, or haste is therefore not a recent development, although it is usually found in less formal contexts and is less frequent in earlier use (see quot. 1589 for an early example). The specific sense of browsing or skimming emerged relatively recently, generally in ironic or humorous inversion of the formal sense of thoroughness.” (OED, peruse)

You should definitely peruse this Mental Floss article and not take in the details.

18. Plethora doesn’t mean “a lot of”

Forgive me, El Guapo. I know that I, Jefe, do not have your superior intellect and education. But could it be that once again, you are angry at something else, and are looking to take it out on me?

//

I don’t have any more time for this. Remember when I said that it was unclear who is using barter to mean haggle? Well, that’s one way that language changes. If enough people use barter to mean haggle – and everyone understands what is meant – then barter means haggle. Just like how enough people use(d) literally as an intensifier like really and now literally is an intensifier, in addition to its other meanings. Lexicographers are just doing their job by updating the dictionary to include new meanings.

Stop trying to force people into using words the way you think they should be used, especially if you know what people mean when they use those words! Instead, let’s celebrate that we are witnessing language change happen.

Holy History, Batman! The Origin of Dynamic Duo!

I assumed that the term dynamic duo must come out of comics. Comic book creators have long been coming up with alliterative epithets for their characters. Superman is the “Big Blue Boy Scout”, Supergirl is the “Maid of Might”, Batman is the “Caped Crusader”, Silver Surfer is the “Sentinel of the Spaceways”, Flash is the “Scarlet Speedster”, Wonder Woman is the “Amazing Amazon”, Spider-Man is spectacular, the Avengers are (also) amazing, and the Four are fantastic. You get the point.

But having been around the etymological block a few times, I know that everything in language is older than you think it is. So I thought the phrase dynamic duo might come out of some earlier work. Perhaps it was reappropriated by comic book authors to describe the Dark Knight and the Boy Wonder.

Various dictionaries, however, including NTC’s Dictionary of American Slang and Colloquial Expressions by Richard Spears and the Dictionary of American Slang by Kipfer and Chapman, claim that dynamic duo comes from the famous 1960s Batman TV series starring Adam West and Burt Ward. The Batman TV series premiered in 1966, or 26 years after Robin was created. I found it hard to believe that it would have taken writers that long to come up with dynamic duo, so I decided to dig a little deeper.

Using Google Books, I found a volume of the Michigan Alumnus which includes the phrase. This was written in 1954:

[Image: Google Books excerpt from the Michigan Alumnus showing “dynamic duo”]

So dynamic duo predates the Batman TV show. But when was it first applied to Batman and Robin? For that we have to dive into the comics.

On October 31, 1940, DC Comics published a story called “The Case of the Joker’s Crime Circus” in BATMAN #4. The story was written by Bill Finger and featured Bob Kane, George Roussos and Jerry Robinson on art. On page 7, we see the first time Batman and Robin are referred to as the dynamic duo:

[Image: panel from BATMAN #4 (1940) showing “dynamic duo”]

Of course, the term has moved out of the comics and can be applied to any “very special pair of people or things” (Spears 2000). And it may still be true that the Batman TV show is responsible for popularizing the term. But it warms my comic book loving heart to know that Bill Finger came up with dynamic duo.

Holy exciting etymology, Batman! It’s time to update our Bat-tionaries!

 

[Update – Jan. 8, 2018] Holy cats, Batman! Foiled again! Thanks to commenter Jack Smith below, we now know that the term dynamic duo goes back to at least 1910. I don’t know how I missed this one. It was used in an article (or op-ed?) titled “Who’s Who – And Why” in the Saturday Evening Post, vol. 183, is. 2. I couldn’t find out the writer’s name, but the article is from thirty years before Bill Finger used dynamic duo in the pages of BATMAN. Heck, it’s from four years before Bill Finger was even born! You can see a screenshot of the page below, but you should really go check out the article here. It’s fascinating. As it turns out, the original dynamic duo were politicians Theodore Roosevelt and Chase Osborn. So there you have it – proof that the 26th president of the United States and the 27th governor of Michigan swung from rooftops at night and brought justice to the city. Fact.

The Saturday Evening Post article from 1910 which uses “dynamic duo”.

 

References

Spears, Richard. 2000. NTC’s Dictionary of American Slang and Colloquial Expressions 3rd Edition. NTC Publishing Group: New York. http://www.goodreads.com/book/show/2477414.NTC_s_Dictionary_of_American_Slang_and_Colloquial_Expressions

Kipfer, Barbara Ann and Robert L. Chapman. 2007. Dictionary of American Slang 4th Edition. HarperCollins Publishers. http://www.goodreads.com/book/show/2417989.Dictionary_of_American_Slang