But this time it’s… on purpose? What?!
Yesterday, Benji Smith became the main character on Writer Twitter. It turns out that Mr. Smith has created a database of novels that he obtained through probably illegal means. Smith used this database in his Prosecraft project, which published statistics about each novel, such as its word count, its number of adverbs, and something called the “vividness” of the writing style (I’m not really sure what that means and Smith doesn’t provide a good definition). He was also using this database to promote his word processor program Shaxpir 4, which is why he’s almost certainly breaking the law.
But one of the other things that he claims to analyze is how many passive verbs are in the novels. And Smith has a very interesting (aka “bad”) definition of “passive voice”.
To measure passive voice […] we measure the total number of helping verbs (be, am, is, are, was, were, etc).
Big yikes! That is not what “passive voice” means. First of all, you can’t reliably measure the number of passive voice constructions by counting the number of helping verbs. Second, the examples given by Smith are all forms of one verb: BE. What about the other helper verbs, HAVE and DO? Does Smith count those as well in looking for passive voice constructions? Because I sure hope he doesn’t since those verbs are most definitely NOT used in forming the passive.
Welllllllll, judging by the analysis he adds, it sure seems that he does. Smith gives us a passage from Nick Hornby’s Juliet, Naked, in which he highlights modal verbs (might and could) as well as have:
What’s even stranger is that some of the verbs highlighted in this passage are not even helping verbs. All instances of was in the passage, as well as the second instance of were, are lexical verbs. So it seems that Smith’s definition of “helping verb” is just as bad as his definition of “passive voice”. That can’t be right though, can it?
Indeed it is. In a post called “Thoughts on Passive Voice” (lol), Smith expands on his interesting definitions of passive voice and helping verbs. He says that he got this question from an author:
Is there a reason you are just measuring helping verbs rather than the true grammatical definition of passive voice: the subject of the sentence being acted upon by the object?
Smith shows us that he knows as much about these as he does about copyright law.
Ok, that’s not the definition of passive voice either. So before going on, let’s clear this up. Passive voice occurs when the direct object in an active clause is moved to the subject position. The main verb phrase is then changed so that it includes a form of the auxiliary verb BE and the past participle of the lexical verb. For example:
Active: The spider bit me.
Passive: I was bitten by the spider.
You’ll often hear that in active voice the subject is the one performing the action of the verb and that in passive voice the subject is being acted upon by the object (as we saw in that question posed to Smith above). But that’s nonsense so don’t believe it. I don’t have time to clear it up here, but you can read these posts about it. For now, just remember that there are two things you need for a passive: a form of the auxiliary verb BE and the past participle of a lexical verb (also called the –en form). So was + bitten = passive voice in the sentence I was bitten.
[Full disclosure: There are other ways to form the passive, including the frequent get-passive, but we don’t have time to get into all that. Just know that if you pick up a linguistics textbook, they cover this stuff in like 10 pages tops. So it’s not even that much to read.]
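To make the difference concrete, here is a minimal Python sketch that compares the two ways of counting. To be clear, this is not Prosecraft's code (I have no idea what that looks like). It is just my guess at what "count the helping verbs" amounts to, next to a crude check for a form of BE followed directly by a past participle. Every name in it is made up for the illustration, and the list of irregular participles is a tiny hand-picked one.

```python
import re

# Forms of BE, the auxiliary used in the standard English passive.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}

# Other words a "helping verb" count would sweep up (HAVE, DO, two modals).
# Assumption: a short illustrative list, not a complete inventory.
OTHER_HELPERS = {"have", "has", "had", "do", "does", "did", "might", "could"}

# Assumption: a tiny hand-made list of irregular past participles.
IRREGULAR_PARTICIPLES = {"bitten", "broken", "shot", "done", "seen", "taken"}


def tokenize(sentence):
    """Lowercase the sentence and return its alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def looks_like_participle(word):
    """Crude test: ends in -ed, or appears in the irregular list."""
    return word in IRREGULAR_PARTICIPLES or word.endswith("ed")


def count_helping_verbs(sentence):
    """Count every BE/HAVE/DO/modal form, whatever it is doing."""
    helpers = BE_FORMS | OTHER_HELPERS
    return sum(1 for word in tokenize(sentence) if word in helpers)


def count_passives(sentence):
    """Count a form of BE immediately followed by a likely past participle."""
    words = tokenize(sentence)
    return sum(
        1
        for first, second in zip(words, words[1:])
        if first in BE_FORMS and looks_like_participle(second)
    )


if __name__ == "__main__":
    examples = [
        "The spider bit me.",
        "I was bitten by the spider.",
        "It was a spider.",
        "The spider has bitten me.",
    ]
    for sentence in examples:
        print(sentence, count_helping_verbs(sentence), count_passives(sentence))
```

On these four sentences the helping-verb count comes out above zero for three of them, while only I was bitten by the spider has BE plus a past participle. The second function is still a blunt instrument: it will flag adjectives that end in -ed after BE, it misses passives with an adverb in the middle (was badly bitten), and it ignores get-passives entirely. A real analysis would need a proper parser.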
Ok, let’s get back to Smith’s answer. It’s a doozy! After somehow miraculously identifying the only passive in the Hornby passage, Smith digs his heels in and says:
The reason we avoid passive voice in the first place is that it describes a state-of-being, rather than the act-of-doing. Eliminating passive voice lets us directly join the action, rather than observing the side-effects of the action. (Emphasis Smith’s)
This is not why we use the passive. I mean, I guess it could be one reason that we use it, but it certainly isn’t the only one. Sometimes we use the passive because we want to focus on one element over another. So we say The president was shot yesterday because the president is more important than whoever shot them. Other times we use the passive because the Agent (the one performing the verb) is obvious. So we say Coffee and tea are being served in the lounge and The victim was rushed to the hospital because it is obvious that the staff is serving coffee and tea and that paramedics rushed the victim to the hospital (these examples also work by focusing on the element that matters more to the listener/reader). And still other times we use the passive because the Agent is unknown. So we say The safe was broken into last night because we don’t know who broke into it. It would seem redundant to say Thieves broke into the safe last night.
Smith next shows us that he doesn’t understand what “helping verb” means. Helping verbs are usually called auxiliary verbs in linguistics. These verbs are used to show things like aspect and voice, as well as to provide a finite verb in questions. Merriam-Webster has a pretty simple rundown of them. The key thing here, though, is that auxiliary verbs (aka “helping verbs”) cannot be used on their own. Verbs that can stand on their own are called “lexical verbs” and they are what you’re thinking of when you think of “verbs” (Don’t lie to me. You do sometimes think about verbs. You’re this far into a post about the passive voice. You definitely think about verbs. And that’s ok. One of us…. One of us….)
This gets a little confusing because the primary auxiliary verbs in English are BE, HAVE and DO – and each one of these can also be a lexical verb.
BE as auxiliary: The spider was biting me.
BE as lexical verb: It was a spider.
HAVE as auxiliary: The spider has bitten me.
HAVE as lexical verb: I have a spider bite.
DO as auxiliary: Did the spider bite me?
DO as lexical verb: The spider does this all the time.
Smith is clearly confused by this because he writes:
Those aren’t auxiliary verbs! In the sentence Jane is a student, the verb is is a lexical verb. See it hanging out there on its own? It needs no help, nor is it helping any other verb. Smith makes the same mistaken analysis two more times:
What’s actually going on here is that the verb BE is a copular verb. It links the subject to what comes after it. That means that the thing before the verb BE and the thing after it are essentially the same. Semantically, they are referencing the same entity in the real world. So in the sentence Jane is a student, both Jane and a student are describing the same entity. In the sentence John was exhausted, both John and exhausted are describing the same entity. And in the sentence Jane is in the Navy, both Jane and in the Navy are describing the same entity.
Smith says that “All four of these constructions use auxiliary verbs to impose a layer of indirection between the reader and the true action of the story.” But none of his example sentences use auxiliary verbs! And here’s the thing: sometimes we need to describe a state of being. Sometimes an action-packed sentence doesn’t tell us what we need to know. Just look at Smith’s own examples:
Jane stuffed the textbook into her already-overcrowded backpack and started the long walk to school.
Is Jane a student or a teacher in that sentence? We don’t know!
Smith criticizes the sentence Jane might have been in the Navy for having an intensified degree of detachment because of the stacked helping verbs. But sometimes we need to express doubt or speculation in our speech and writing. That’s why we have modal verbs like might. This is not difficult. You literally learned this stuff naturally before you needed to start shaving. Smith is confusing people with this nonsense. Sometimes we just need to state things.
And that is part of the larger problem with Smith and his database, the one that authors were especially upset about. Smith was using his analyses to train an AI program to analyze writing in real time. And then selling that service. This creates a feedback loop. The A.V. Club explains it succinctly:
This may seem like a niche situation that impacts one community, but it’s just going to get worse until we as a society stop people and companies from doing stupid shit like this to art and culture. If not, it’s just a matter of time before every movie is a regurgitated series of “vivid” scenes from other movies written by an A.I. “writer” and starring A.I. “actors,” every book is just a repeat of the same plot points from other book, and every news article is just based around a list of words that people like to click on.
What’s even worse is that Smith was essentially mistraining his program, or training his program to be incorrect. His analyses are wrong and he was teaching his machine to also make the wrong analyses. That’s why they say that AI isn’t intelligent – because it is being made by bad faith actors who don’t know what they’re doing. And according to his “apology”, he’s keeping the dataset in order to use it – and profit off of it – again.