A really accomplished young computer
Jane Austen’s novels have provided the inspiration for any number of musical treatments, from operas and a classical instrumental suite to both original and jukebox musical theater. But only recently, it seems, has anyone had the idea of training a computer to write songs by feeding it the text of Emma.
Yes, that’s what they’re up to over at Google, as we learned last week from an article made available online by two of the company’s researchers, Pablo Samuel Castro and Maria Attarian.
Apparently, getting artificial intelligence to produce decent poetry and song lyrics is an especially challenging computer-science task. Castro and Attarian attempted to improve on previous efforts by training their neural network on two datasets, one consisting of the lyrics of songs in a variety of genres (excluding hip-hop, “to reduce the amount of profanity” in the results), and one drawn from sixteen popular books, all but one of them fiction, available via Project Gutenberg.
Yes, just as Mr. Darcy would dictate, it was not enough for the computers to have a thorough knowledge of music and singing; they also had to improve their minds by extensive reading.
Among the items in the second dataset, which was designed to expand the computer system’s vocabulary, were Emma and Pride and Prejudice. Austen, Dickens, and Twain were the only authors with two books on the list, which also included works by writers ranging from Machiavelli to Charlotte Perkins Gilman.
Alas, however, getting a computer to write poetry is even harder than turning a nineteenth-century heroine into Mr. Darcy’s idea of a truly accomplished young lady. The experiment’s results, though of interest to computer scientists, were hardly toe-tapping Top 40 hits. (Sample result: “come on, uh/ you remember the voice of the widow/i love the girl of the age/i have a regard for the whole/i have no doubt of the kind/i am sitting in the corner of the mantelpiece.”)
In future, the researchers say, they hope to refine the training process by including more books, as well as books employing a more modern lingo. Since Project Gutenberg primarily includes out-of-copyright works, the list of books used in the experiment is heavy on nineteenth- and early twentieth-century novels, whose vocabularies are not exactly typical for today’s songs. (In the resulting lyrics, Austen’s influence can perhaps be detected in the presence of words like “estate,” and “fancy” used as a noun.)
Personally, however, I’d rather nip this whole thing in the bud right now. Since I make my living, such as it is, as a writer, I’m all in favor of teaching machines as little as possible about writing. Keep the computers ignorant, I say!