Comma-counting across the centuries
By Deborah Yaffe, Jun 5 2014 01:00PM
I’m a punctuation geek, I must admit. A fellow reporter on my college newspaper once claimed to have heard me declare, with a passion usually reserved for Colin Firth movies or extra-thick coffee milkshakes, “I love commas.”
So imagine my joy at stumbling across this compilation of kind-of-fascinating-if-you’re-really-geeky stats about sentence length, word choice and–yes–comma usage in seven popular novels published between 1811 and 2005. Naturally, Sense and Sensibility is the 1811 representative, stacking up against later works by Dickens, Twain, Tolkien, C.S. Lewis, J.K. Rowling and Stephenie Meyer.
The scientific validity of this exercise seems dubious, since the books in question have been chosen for their sales figures, rather than their historical typicality or other, more relevant kinds of equivalence. It seems particularly problematic to draw conclusions about sentence length and vocabulary usage when half your sample consists of books self-consciously aimed at children and teenagers.
But who cares about accuracy when you can have entertainment? The analysis is brought to us by Tyler Vigen, a Harvard law student who produces a highly diverting blog laying out completely random correlations between obviously unrelated data points. I’m especially partial to the eerily perfect fit between the divorce rate in Maine and the per capita consumption of margarine. What could those divorced Mainers be doing with all that vegetable fat? (No, don’t tell me. I don’t think I want to know.)
Vigen’s latest effort reveals some things we might have guessed–Jane Austen uses longer sentences and far more semicolons than Stephenie Meyer, for instance–and provides quantitative verification of some much-remarked aspects of Austen’s style. It’s not startling to learn that Austen, known for her balanced, economical prose and her sparing use of description, employs fewer adjectives and refers to colors far less often than do the later writers.
On the other hand, some of Vigen’s findings are more surprising. Would you have guessed that half the words in S&S are among the one hundred most common in the English language? Or that Austen uses exclamation points three times more often than Meyer?
Myself, I blame Anne Steele for the exclamation-point thing (“Poor little creature! . . . What a charming man he is! . . . Oh! dear! one never thinks of married men’s being beaux”–and that’s just in one chapter). But verifying that hypothesis would certainly require another study.