Last week I was on the twitters talking about “untranslatable” words. The idea was about Dr. Tim Lomas’ work on “untranslatable words,” or his term for how some languages have words that don’t have exact equivalents in other languages (but usually English). Right around the same time I posted my blog post, Lomas wrote an article in Psychology Today. Let’s have a look at it. If you want to see my thoughts on “untranslatable” words, go see my post on it and then come back.

Lomas claims that many concepts are non-English in origin. What this means is that the words used to describe these concepts are from other languages. I think this is opening a whole can of worms, but I’m willing to go with the idea that concepts can be “from another language”. For a bit. Let’s move on.

To prove his point, Lomas analyzes an article on positive psychology by Seligman and Csikszentmihalyi (2000). He looks for the etymology of every word in the text.

According to Lomas, there are:

1333 distinct lexemes

‘Native’ English words – belonging either to the Germanic language from which English emerged, or originating as neologisms in English itself – comprise only 39.4% of the sample (and 38% of the psychological words). Thus, over 60% of the general words (and 62% of psychological words) are loanwords, borrowed from other languages at some point in the development of English.

First, Lomas has a strange definition for “‘native’ English words”. Which “Germanic language” does he mean? Proto-Germanic? One of the other West Germanic languages? Old English? It’s also strange because Lomas’ definition means that these words are not native English words: they, table, blue, and orange. [Britney Spears gif says “huh?!” Oprah gif says “hrmmm?!”]

Lomas also doesn’t say exactly how he counted the words in the Seligman and Csikszentmihalyi article. He says that there are 1,333 “distinct lexemes”. The term lexeme is used in linguistics to talk about all the inflected forms of a word: singular and plural forms for nouns, present and past tense forms for verbs, etc. So runner and runners would be a part of the same lexeme RUNNER, and run, runs, ran, running are a part of RUN. Lexemes are also sometimes called “lemmas” in linguistics.

If Lomas really went through every single word in the article, then he spent a whole lotta time on this. The Seligman and Csikszentmihalyi article is 8,124 words long (not including the References section). He doesn’t say how he did the work, but I used some corpus linguistics methods and got different results. I checked the article against the Someya lemma list in AntConc and found 1,750 lemmas, or 417 more lexemes than Lomas found. This is a large difference and I’m not sure how to explain it. Maybe Lomas didn’t divide his words based on parts of speech? So he counted ran and runner as part of the same lexeme? I don’t know.
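For anyone who wants to see what counting by lemma looks like in practice, here is a minimal sketch in Python. It is not what AntConc does internally and it does not use the Someya list; the tiny `LEMMA_LIST` dictionary and the sample sentence are made up for illustration, and any word form missing from the list is simply treated as its own lemma.

```python
import re
from collections import Counter

# Toy lemma list (hypothetical; a real lemma list has many thousands of entries).
LEMMA_LIST = {
    "be": ["am", "is", "are", "was", "were", "been", "being"],
    "run": ["runs", "ran", "running"],
    "runner": ["runners"],
}

# Invert the list so that each inflected form points to its lemma.
FORM_TO_LEMMA = {
    form: lemma for lemma, forms in LEMMA_LIST.items() for form in forms
}

def tokenize(text):
    """Lowercase the text and pull out alphabetic word forms."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())

def lemma_counts(text):
    """Count lemmas; forms not in the list count as their own lemma."""
    return Counter(FORM_TO_LEMMA.get(tok, tok) for tok in tokenize(text))

sample = "The runner ran. The runners were running and the race was run."
word_forms = Counter(tokenize(sample))   # every form counted separately
lemmas = lemma_counts(sample)            # forms grouped under their lemma

print(len(word_forms), "distinct word forms")
print(len(lemmas), "distinct lemmas")
print(lemmas.most_common(3))
```

Note that ran goes under RUN and runners under RUNNER, so the two stay separate lexemes. Merging them would lower the lemma count, which is one way two people could end up with different totals for the same text.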

Second, let’s look at counting the words in language. Lomas seems to do a straight count. That means one instance of one form of a lexeme is equal to all the other instances. For Lomas, it doesn’t matter how many times a word occurs. In corpus linguistics, however, frequency is a big deal. I’m not going to go through the theoretical points here, but basically if a word is more frequent then it is more important or worthy of being looked at (hehe, fight me, corpus linguists).

So, Lomas claims that only 39% of the lexemes in the article are “native English words”. I took the lexemes in the article and ranked them based on frequency (using AntConc). Then I went through the 100 most frequent lexemes on the list and looked at their etymology. My numbers look much different than Lomas’. I found that 85% of the occurrences of the 100 most frequent lexemes are English in origin. That is, the 100 most frequent lexemes occur a total of 4,440 times in the article (so the lexeme the occurs 442 times, the lexeme of occurs 308 times, the lexeme BE occurs 300 times, and so on) and of these occurrences, 3,767 are English words. This isn’t particularly intriguing – you’ll probably find a similar percentage with any text in English. [See the bottom of this post for my data.]

Looking at this from another angle, we could treat each of the 100 most frequent lexemes as equal – forgetting about how often they occur. Then we find that 70 of them are English, while 30 of them come from another language. This is closer to Lomas’ numbers, but still pretty far off: 70 of the 100 most common lexemes in the article are still English words.
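The two ways of counting can be put side by side in a few lines of Python. This is only a sketch of the arithmetic using the totals reported in this post; the variable names are my own.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

# Totals for the 100 most frequent lexemes, as reported in this post.
top100_occurrences = 4440          # all occurrences of those lexemes
top100_english_occurrences = 3767  # occurrences of the English-origin ones
top100_lexemes = 100               # each lexeme counted once
top100_english_lexemes = 70        # English-origin lexemes, counted once

# Weighted by frequency: every occurrence counts.
token_share = percent(top100_english_occurrences, top100_occurrences)

# Unweighted: every lexeme counts once, however often it occurs.
type_share = percent(top100_english_lexemes, top100_lexemes)

print(f"English share by occurrence: {token_share:.1f}%")
print(f"English share by lexeme: {type_share:.1f}%")
```

The gap between the two figures comes from the fact that the most frequent lexemes (the, of, BE) are English in origin and occur hundreds of times each.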

Of course, words in language do not really occur in the way that we’re looking at them. The most common word is the with 442 instances, but the first 442 words of the article are not all the. The word the is sprinkled around the article (you know, where the grammar of English calls for it). I’m not sure how to get to Lomas’ numbers. We could assume that every lexeme outside the 100 most frequent were non-English, but that only gets us down to 46% of the words in the article as being English lexemes. Lomas’ ratio was 40% English to 60% non-English.
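As a quick check on that worst-case figure, here is the same assumption written out in Python. It is a sketch using the counts from this post: everything outside the 100 most frequent lexemes is treated as non-English, which is certainly too pessimistic.

```python
article_words = 8124              # running words, excluding the References
english_in_top100 = 3767          # English occurrences among the top 100 lexemes

# Worst case: no word outside the top 100 lexemes is English in origin.
lower_bound = 100 * english_in_top100 / article_words

print(f"English share is at least {lower_bound:.0f}% of the running words")
```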

Later in the article, Lomas says that 234 words were treated as English in origin in his analysis. But this means that only about 18% of the words in his counting are English in origin (234/1,333 ≈ 0.176). What’s going on here? If 39.4% of the lexemes in the article are English in origin, and there are 1,333 total lexemes in the article (according to Lomas), then there should be 525 English words. Where he gets 234, I don’t know. Let’s move on.
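The mismatch is easy to verify. The snippet below is just the arithmetic on Lomas’ own reported figures, with variable names of my choosing.

```python
total_lexemes = 1333      # "distinct lexemes" reported by Lomas
reported_english = 234    # words he says were treated as English in origin
reported_share = 0.394    # the 39.4% "native" share he reports

# Share implied by the 234 figure.
implied_share = reported_english / total_lexemes

# Number of English lexemes implied by the 39.4% figure.
implied_count = round(reported_share * total_lexemes)

print(f"234 of 1,333 is {implied_share:.1%}")
print(f"39.4% of 1,333 is about {implied_count} lexemes")
```

The two reported figures cannot both describe the same set of 1,333 lexemes.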

Lomas includes two graphs to visualize his findings but they’re pretty weird. The graph below “shows the influx of words according to the language of origin (with the century in which they entered English as stacks within them)”. Look at the third column.


English words entered English? I don’t get it. Or Germanic words from before the 12th century are not English words? What’s going on here? I guess in Lomas’ counting, Germanic and English lexemes are English lexemes, but then he splits them up in the graph? Are the words me, myself and I not English words? It seems very strange to me to cut things up like this and I would like to see his list of etymologies, or his rationale for doing so.

Agree to disagree?

But there are places that I can agree with Lomas. At the end of the article, he writes:

In these ways does our understanding of life become complexified and enriched. In that respect, one can make the case that English-speaking psychology would do well to more consciously and actively engage with other languages and cultures. Its understanding of the mind has benefited greatly from English incorporating loanwords over the centuries. If one accepts that premise, it follows that psychology would continue to develop from this kind of cross-cultural engagement and borrowing – including, of course, through collaboration with scholars from non-English speaking cultures themselves. One such way in which the field might develop is through inquiring into untranslatable words, since these constitute clear candidates for borrowing (given that they lack an exact equivalent in English). I myself have sought to promote this kind of endeavor, with my ongoing creation of a cross-cultural lexicography of untranslatable words relating to well-being.

I definitely agree with the first part of this. We should engage with speakers of other languages and people from other cultures (although Lomas’ wording seems to present all English speakers as a monolithic culture). I find it hard to see how anyone could reject the premise that English (not just “English-speaking psychology”) has benefited greatly from incorporating loanwords. That’s kind of just a fact of language – borrowing words is one of the things that living languages do, and English is still a living language partly for this reason. But I totally agree that people should collaborate with people from different cultures (although again, Lomas’ wording blurs the distinction between language and culture too much for me and again presents English speakers as one culture).

When Lomas goes into the sales pitch in the second to last sentence, I can’t sign on, particularly based on what I’ve seen of his research into “untranslatable” words (in my last post and in this one and in a later one to come).

Lomas’ claims are true – we should reach out to people who speak other languages. But he should perhaps recognize that the reason that English has so many words from Latin and Ancient Greek is because these were once prestigious languages (and to a large extent still are in academia). It wasn’t because the Latin-speaking or Greek-speaking cultures had anything more special than other cultures, but it was believed that by using these languages people would be more civilized. Of course, we know what happened to the Latin-speaking and (Ancient) Greek-speaking cultures. They dead.

We in English-speaking cultures could just as easily have adopted Finnish words to use in the fields of psychology and linguistics, but Finnish was never considered a prestigious language. Or consider German: once German raised its standing, we got words from German to describe abstract concepts because the texts describing them were written in German and people were supposed to know German to engage in the debate.

Spreadsheet with my analysis. The first sheet is the Someya lemma list analysis. I counted words from Anglo-Norman as not being English. I’m including the 3rd person plural pronouns (they, them, their, themselves) as being English. Illness counts as English. The second sheet uses AntConc’s Word List tool, so it’s not a lexeme/lemma analysis, it treats every “word” as separate (that is, was, am, and is are separate words, not part of the lexeme BE).

Direct object or prepositional object?

This sentence is in the exercises for one of my grammar classes:

My wife always has a good cry over a wedding.

For the assignment, students need to analyze the syntactic elements of the sentence (subject, predicator, objects, etc.). The answer key has Subject(My wife) Adverbial(always) Predicator(has) Direct object(a good cry) Locative complement(over a wedding). But recently a student analyzed the last phrase (over a wedding) as a prepositional object. This got me interested. It turns out the answer key is wrong (maybe you already knew that), but the student might be right. Here’s why.

Who cares about Latin plurals?

Apparently a lot of people do. You know this. You’ve probably heard something along the lines of what is said in the following tweet:

Mike Pope had a nice response:

But this got me thinking: It’s a bit of a slippery slope to say that we have to follow the pluralization rules for Latin with (some) Latin words. Why stop with Latin? English has taken words from other languages as well. And why stop at pluralization? Latin has endings for when a word was used as a subject or object (if my rudimentary Latin is correct). So why not bring those along too? I wrote a joking response to point this out:

As fate would have it, James Harbeck published an article on this very topic on the very same day that these tweets appeared. And Mike Pope published a similar blog post a while ago. I’m not going to restate what they say – you should go read their posts. Instead, I’d like to second what Dr Sarah Shulist responded with and add to it:

The reason that we are told to follow Latin’s pluralization rules for words from Latin is that Latin has long been held in high prestige by educators and others who wield power in society and language learning. That’s it. If Finnish was held in as high regard as Latin, then we would have people saying it’s incorrect to use saunas because the plural form in Finnish is saunat. But Finnish is not held in the same regard as Latin. Same goes for almost every other language.

But when you think about it, requiring people to use Latin plurals is actually pretty… lazy. We’re talking about noun morphology and in English there are really only a few things we can do to words that are nouns. I know I’m oversimplifying things here, but stay with me. We can:

  • make nouns plural (hero >> heroes)
  • add a genitive marker (hero >> hero’s)
  • add prefixes and suffixes (superhero, heroism, etc.)

Is anyone arguing for applying the Latin genitive to words from Latin? Of course not. Because the prescription that you must use Latin plurals with words from Latin isn’t about grammar at all. It’s about language policing and linguistic discrimination. It’s about putting other people down for following English grammar instead of Latin grammar WHEN THEY’RE SPEAKING ENGLISH. And like most forms of discrimination, it’s lazy thinking. It is only one aspect of noun morphology applied to only some words from pretty much only one language.

To be clear: I’m not saying that it’s discriminatory to use a word from another language and not follow the morphology of that language. It’s kind of the opposite of that. To say that people must follow the pluralization morphology of Latin when they use a word from Latin is classist. When people are speaking English, there is nothing wrong with them using plain old English morphology to pluralize nouns. And, yes, that holds for words from Latin too. It’s possible that people don’t realize that they’re practicing linguistic discrimination when they play the pedant card with words from Latin, but that’s not an excuse. Maybe next time point out that the hill they are dying on isn’t so much a mighty mountain as it is a puny pismire hill.

Anyway, by far the most pragmatic reply was from Marie Georghiou:

Marie wins.

Dialect Surveys of American English and World Englishes

In my review of Joshua Katz’s book Speaking American, I mentioned that a new dialect survey was up. Much of the data in Katz’s book was drawn from an online dialect survey done by Bert Vaux and Scott Golder. Here’s Ben Zimmer giving credit where credit’s due.

Vaux is now conducting the Cambridge Online Survey of World Englishes with Marius L. Jøhndal. If you’re interested in world Englishes, head on over to that site, where you can also see the results without taking the survey.

Vaux also has a new survey of American English dialects available. The survey takes about 10 minutes, depending on how many questions you choose to answer and how long you spend looking at the heat maps it shows you. There are some very fun questions in there.

This ain’t your family member’s thing

I know of the phrase This ain’t your [family member]’s X, but I’m not sure where it came from and who the family member should be. Your grandma? Your daddy? Your granddaddy? I decided to do a quick DuckDuckGo search on some of these that sounded natural. Take what you will from the search results.

“this ain’t your daddy’s”

this ain’t your daddy’s big band

this ain’t your daddy’s Eagles

this ain’t your daddy’s !Q

these ain’t your daddy’s “This ain’t your daddy’s” jokes

“this ain’t your mama’s”

this ain’t your mama’s peach pie recipe

this ain’t your mama (church)’s church

this ain’t your mama’s recipe

“this ain’t your grandma’s”

this ain’t your grandma’s artwork

this ain’t your grandma’s ‘dick’

this ain’t your grandma’s teddy bear

this ain’t your grandma’s postum

this ain’t your grandma’s soap anymore – or is it?

this ain’t your grandma’s bingo

this ain’t your grandma’s SETI

“this ain’t your grandpa’s”

this ain’t your grandpa’s AR-15

this ain’t your grandpa’s DHEA

this ain’t your grandpa’s ceramic bong

this ain’t your grandpa’s laptop

this ain’t your grandpa’s AKIDO

this ain’t your grandpa’s sex toy

If anyone knows where this phrase comes from, please leave a comment below. The OED has an example of it from 2000 under the entry for hot-rodding (“This ain’t your granddad’s classic car book.”), but it must be older than that. COHA has hits for “this ain’t your”, but none followed by a word for a family member. Google Ngrams is no help (surprise!). Each of my searches used a parent or grandparent, so I guess the family member referred to has to be one that is necessarily older in order for the phrase to sound natural. But I bet variations could be used depending on what the “thing” is – “This ain’t your kid’s cartoon” could be used for animated shows and movies that are aimed strictly at adults, such as Big Mouth and Sausage Party. But what sounds natural to you?

I, me and Oxford Dictionaries

I’m sure I’ve tweeted about this already, but the Oxford Dictionaries’ advice on the usage of pronouns just came across my interwebs again (they sent out this quiz in their email newsletter). It’s hard to imagine how a dictionary’s website gets this so wrong, but let’s go through it to see what’s up.

In their advice article “‘I’ or ‘me’?”, Oxford Dictionaries claims that in coordinated constructions where a pronoun and a proper name form the subject of a sentence, the pronoun used must be the subjective form of the pronoun (also called the nominative form). What this means is that in a sentence like “John and I went to the GWAR concert”, it is incorrect to use me instead of I. Let’s leave aside the fact that everyone everywhere naturally uses me in sentences like this. Let’s instead think about the advice that Oxford Dictionaries is giving. We’ll use the sentence that they use: Clare and I are going for a coffee. According to Oxford, it’s not just the subjective pronoun I that must be used in this sentence; only subjective pronouns may be used whenever the pronoun helps form the subject of a sentence. But how does this work? See if any of the sentences below sound odd to you.

  1. Clare and I are going for a coffee
  2. Clare and me are going for a coffee
  3. Clare and you are going for a coffee
  4. Clare and you are going for a coffee
  5. Clare and she are going for a coffee OR Clare and he are going for a coffee
  6. Clare and her are going for a coffee OR Clare and him are going for a coffee
  7. Clare and they are going for a coffee
  8. Clare and them are going for a coffee

If you’re like me, the first four sound fine (obviously, there’s no difference between the subjective and objective form of the 2nd person pronoun, they’re both you). The fifth one, however, sounds a bit stuffy compared to the sixth one (stuffy is a totally legit linguistics term). And the seventh is bordering on unacceptable. Does Oxford really think that Clare and they are going for a coffee is correct, while Clare and them are going for a coffee is not? Maybe? They didn’t use that sentence as an example. They focused instead on the 1st person pronoun – where there is more variation.

This topic boils down to a few things. First, English tends to favor me as the default pronoun in all cases except for when the pronoun stands alone as the subject. There is such a strong tendency to use me in all cases that this form is sometimes referred to as the oblique form, meaning that in addition to being the object, it fulfills other roles in sentences. And so English quite naturally uses the me form in coordinated structures, or phrases where there’s a pronoun and something else joined together with the word and:

John and me went to the GWAR concert.

Me and the bouncer got into an arm wrestling match.

Me and this other guy partied with GWAR after the show.

Second, using the subjective pronoun I in coordinated constructions isn’t wrong. English allows for both constructions and the choice of which one to use usually breaks along the formality of the occasion – John and I seems more formal, while John and me seems more informal. But there is evidence of both structures throughout history in many different styles of writing. The John and I form is dictated by prescriptivist grammarians (and apparently some dictionaries), while the John and me form is proscribed, despite being used by everyone. In constructions with the first person singular pronoun, you can’t go grammatically wrong choosing I or me. Notice, however, that me is more versatile in where it can be placed:

Clare and me are going for a coffee

Me and Clare are going for a coffee

Clare and I are going for a coffee

*I and Clare are going for a coffee

As we have seen, in constructions with the 3rd person pronouns, things are potentially more cut and dry. With the 3rd person singular, it seems we should use the objective forms (him, her) for all but the most formal registers. With the 3rd person plural, however, it seems we should always use the objective form them.

Finally, there is a piece of advice out there that I’ve seen in a lot of places. It goes like this:

In coordinated constructions (noun + pronoun), take out the noun and leave the pronoun. This will show you which case you want.

This advice is dumb. Why would I take something out of a sentence to decide how I should say the rest of the sentence after I put that thing back in the sentence?! This makes no sense at all. This advice is only given with coordinated subjects because it makes it seem like the subjective pronoun is always correct. Here’s Oxford using it at the end of their article:

An easy way of making sure you’ve chosen the right pronoun is to see whether the sentence reads properly if you remove the additional pronoun:

I am going for a coffee. ✗ Me am going for a coffee.

And here’s the Purdue Online Writing Lab:

In compound structures, where there are two pronouns or a noun and a pronoun, drop the other noun for a moment. Then you can see which case you want.

Not: Bob and me travel a good deal.
(Would you say, “me travel”?)

But what happens when I take the pronoun out of the sentence? I’m left with Bob travel a good deal. 😐

Y U NO give better advice, grammer peeple?

Ok, I’m being awful hard on Oxford Dictionaries. The thing is, their advice column could have been cleared up with a line that explained they were talking about Standard English only. Or that outside of standard written and spoken English, people are more likely to come across the form X and me. The X and me construction is so common in informal written and spoken English that using X and I may be out of place. Non-standard and informal English are the default forms of the language, whether they are written or spoken, so users of English will hear/read these forms most often in day to day circumstances. The split in choosing I or me along formal/informal or standard/non-standard lines isn’t a lot of linguistic knowledge for people to understand. They shouldn’t be forced into thinking there is only One True Way to use pronouns in English.

I might post more on this later and include the advice given by other style guides, grammars and dictionaries. If you want to see some of them backing up my claims right now, check out:

  • Merriam Webster’s Dictionary of English Usage, page 778
  • Fowler’s Modern English Usage 4th edition (edited by Butterfield), page 509
  • A Student’s Introduction to English Grammar by Huddleston and Pullum, page 107

A few less countable nouns

While everyone was worrying about whether less or fewer was correct in 10 items or less, another construction has been flying under the radar: a few less. I haven’t seen any style guides make remarks about this phrase, but it is an interesting one. It’s hard to search for online because there’s an Australian movie called A Few Less Men, which dominates the search results. I was able to find a WordReference forum about a few less, but it’s not much help. So let’s go to some corpora to see how a few less is used.

There are 36 hits for a few less in the Corpus of Contemporary American English (COCA), which means it’s not very common (for comparison, there are 4,875 hits for a few more). All of the hits for a few less pre-modify countable nouns.

Year:Genre Concordances – link to search:
West Twenties, one step up from a housing project, which meant a few less elevators chronically out of commission
But if we all drove just a few less times in the entire year, that is progress in an automobile-dependent metropolis like Atlanta
Fox: The Five
They may make a few less dollars, and they should do it.
And it could be that those other services continue on – maybe with a few less people, or maybe some people will cross over.
Move family outerwear out and add a few less flimsy hangers inside.
And how does one cure a sequence consisting of ‘a few less atoms every day’?
If (Nu) had a few less zeros, only a short-lived miniature universe could exist. No creatures could grow larger
five, over a ten-year period, maybe a few more, maybe a few less, I don’t know, several times.

If you redo the search, it looks like there are 40 hits but the following do not fit the construction:

  • “Some health plans don’t cover Zyban, but a few less than forthcoming smokers have gotten around that by asking doctors to diagnose them with depression”. It’s more a few less-than-forthcoming smokers.
  • “Only a few less accessible villages have so far been spared of tourists”. This is also a case where less is modifying the following adjective and could be rewritten as a few less-accessible villages.
  • “there are always a few less visible non-tariff barriers which arise which will need to be smoothed out.” This again is a few less-visible non-tariff barriers.

There is also the concordance “Twenty years since our first date. A few less than that since I helped her pick out her first grown-up road bike”. In this construction, I would say that less is a noun and few is an adjective.
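Sorting hits like these can be partly mechanized. Below is a rough sketch in Python that pulls out whatever word follows a few less, so the doubtful cases are easy to spot. It is not how the COCA interface works; the `next_words` helper is my own, and the lines are a handful of the concordances quoted above. Deciding whether the following word is a noun or an adjective still has to be done by hand (or with a part-of-speech tagger, which this sketch does not use).

```python
import re

# Match "a few less" and capture the next word, if one follows directly.
PATTERN = re.compile(r"\ba few less\b[\s-]*(\w+)?", re.IGNORECASE)

def next_words(lines):
    """Return the word following 'a few less' for every hit (None if no word follows)."""
    found = []
    for line in lines:
        for match in PATTERN.finditer(line):
            word = match.group(1)
            found.append(word.lower() if word else None)
    return found

lines = [
    "They may make a few less dollars, and they should do it.",
    "but a few less than forthcoming smokers have gotten around that",
    "Only a few less accessible villages have so far been spared of tourists",
    "maybe a few more, maybe a few less, I don't know, several times.",
]

followers = next_words(lines)
print(followers)

# Hits followed by "than" are the first ones to check by hand.
needs_checking = [w for w in followers if w == "than"]
print(len(needs_checking), "hit(s) followed by 'than'")
```

A hit followed by than, or by an adjective such as accessible, is a candidate for the less-than-forthcoming type of reading rather than the a few less NOUNs construction.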

In the Corpus of Global Web-Based English (GloWbE), the US, UK and Australia seem to use this construction most often, although the frequency per million words (the PER MIL column) is not that different between the US, Canada, UK, Ireland, Australia, New Zealand, and the Philippines (see the image below). The concordances also appear to show that a few less is a modifier for a countable noun, although I did not go through all of the 328 hits in GloWbE. You can redo my search on GloWbE by following this link.

GloWbE - a few less

The way I see it, there are two ways to analyze this construction. First, in a few less NOUNs, the words a few make up a non-exact indefinite quantifying determiner and less is an adjective modifying the noun phrase. What you have is this:

A few less NOUNs = a few (indefinite determiner), less (adjective / head of AdjP), NOUNs

Second, I suppose it’s possible to treat few as an adjective too (modifying the adjective less) and leave a to be the single-word determiner. So you would have something like this:

A few less NOUNs = a (determiner), few (adjective / modifier), less (adjective / head of AdjP), NOUNs

But I wouldn’t go for this analysis because the Longman Student Grammar also treats a few as a quantifying determiner which denotes a small amount (p. 75).

The interesting thing about a few less is that it easily – and quite unremarkably – modifies count nouns. People have a problem with ten less items/dollars/miles/people, but no one seems to raise a fuss about a few less items. Of course, there’s nothing wrong with using less with countable nouns, especially ones that are units of measurement and money. But I don’t think people have considered that if less really can’t modify count nouns – and that fewer needs to be used with count nouns – then the construction we would be forced to use is a few fewer items. And no one wants that.


