I Don't Know. Did They Play Texas Hold 'Em in the Mid-1800s?

On How to Find the Frequency and Existence of Words and Phrases


Lordy, Lordy, Buckeroos, and a happy day to you!

Today's theme is secrets. Let me share a secret with you, but don't share our secret with anyone I've blocked. That would be bad. But unlike a priest who admonishes a child not to share their secret lest the child's family be cursed, I shall merely be sad.

It's called the Google Ngram Viewer. Ever wonder about variations on a word--for instance, symmetric versus symmetrical? The dictionary says they are both fine, but what do folks use in practice? The Ngram Viewer will tell you. It's not perfect, but it's the best there is to date. And you can search date ranges, which is especially helpful for historical fiction (so you don't mistakenly use "home run" [mid-1800s] or "all in" [1903] in a book set in the early 1800s):


Example of Ngram Viewer graph plotting the frequency of the term "home run."

Check out Google's Ngram Viewer

Get more details about Ngram Viewer

Note, if you will, that the current Ngram Viewer corpus dates to 2012, so frequencies tend to tail off around 2010 or so, since Google is still scraping recent books into its database. A falloff at the right edge of the graph does not necessarily mean that a word is actually falling out of use.

Beyond whether a word existed in a certain period, one can get a decent idea of when it originated. Perhaps more crucial, especially in dialogue, is that one can discern when one spelling or phrasing overtook another. For example, you can see when the currently (slightly) dominant phrasing "worse comes to worst" took the reins from the traditional "worst comes to worst" (the understood meaning back then was "when worst becomes the worst"):

Comparing frequency of "worst comes to worst" and "worse comes to worst" usage with Ngram Viewer.
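
For the Buckeroos who like to tinker: a comparison like this can be bookmarked or shared as a plain link. Below is a minimal Python sketch that builds one. Fair warning: the helper name `ngram_url` is my own invention, and the query parameters (`content`, `year_start`, `year_end`, `smoothing`) are simply what shows up in the viewer's address bar, not a documented API, so treat them as assumptions that may change.

```python
from urllib.parse import urlencode

# Base address of the Ngram Viewer graph page.
NGRAM_GRAPH_URL = "https://books.google.com/ngrams/graph"


def ngram_url(phrases, year_start=1800, year_end=2000, smoothing=3):
    """Build a link that plots the given phrases over a date range.

    Hypothetical helper: the parameter names mirror what appears in the
    viewer's address bar and are an assumption, not an official API.
    """
    params = {
        # The viewer takes comma-separated phrases to compare.
        "content": ",".join(phrases),
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    return f"{NGRAM_GRAPH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(ngram_url(["worst comes to worst", "worse comes to worst"]))
```

Paste the printed link into a browser and, if the assumptions hold, you should land on the same kind of comparison graph. Narrow `year_start` and `year_end` to the period your story is set in.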
Stay sexy, Buckeroos.
