Eggcorns and Fuzzy Spots
The term eggcorn was coined in 2003 by linguist-bloggers Geoffrey Pullum and Mark Liberman, and has spawned something of a cottage industry of eggcorn-hunters on the Web. An eggcorn is defined as "an idiosyncratic substitution of a word or phrase for a word or words that sound similar or identical in the speaker's dialect [which] introduces a meaning that is different from the original, but plausible in the same context." The eponymous eggcorn, for example, is used by some speakers in place of acorn. Though not standard, the substitution is easy to understand: acorns are seeds (corns) that are sort of shaped like eggs. On web sites such as The Eggcorn Database and Eggcorn Forum, Wikipedia, and various blogs, scores of amateur lexicographers (including yours truly) track and discuss the production and use of these lexical innovations.
Recently Eggcorn Forum contributor Kem Luther, noting uses of both brute and blunt in place of brunt, wrote:
I'm beginning to think that people have large fuzzy spots in their brains where the semantic contents of the words "brute," "blunt," "brunt," and "butt" are stored. Speakers do not have a good handle on the meaning of these terms.
I think he is quite right. Furthermore, Kem's observation about "large fuzzy spots" has led me to create a half-baked theory about the role of word frequency in eggcorn formation.
If, as some linguists suggest (e.g. Bybee 1995, Clark 1987, Tomasello 2005), language acquisition and language change are sensitive to frequency, then it is unsurprising that lower-frequency words should be less clearly differentiated even in the minds of adult speakers. In other words, structures that are heard more frequently will be learned faster and "better," in the sense that the hearer can use them in the standard way that other speakers do; words that are heard less frequently will be learned less well. (Such claims can be somewhat controversial for morphology and syntax, but I think much less so for words.)
Add to low frequency a competition with phonologically similar words, and certain forms seem doomed to reside in "large fuzzy spots" of the mental lexicon. Low-frequency near-homonyms are not heard often enough to be remembered clearly, and they shade into one another in the hearer's memory.
None of the words brute, blunt, brunt or butt appears on Kilgarriff's lemmatized BNC word frequency list, which means that none of them occurs more than 800 times in the hundred-million-word British National Corpus.* It is therefore unsurprising that this complex is confounded for many speakers.
By contrast, the words and, ant, and amp are not confounded despite their phonological similarity, since they are the 4th, 5539th, and 5960th most frequent words in the BNC, respectively. Conversely, I would not expect moan to be part of such a complex even though it is only the 6309th most frequent word, since it has few near-homophones. (Of course, having written that, I fully expect someone to find an eggcorn of moan within a few days. Comments are open.)
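To make the two ingredients of this half-baked theory a little more concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the ranks are the ones quoted above, with None standing for "not on the list"; spelling edit distance is a crude stand-in for phonological similarity; the cutoff of two edits is arbitrary; and the names BNC_RANK, edit_distance, neighbors, and fuzzy_pairs are made up for the example.

```python
from itertools import combinations

# Ranks as quoted in the post. None marks a word that is absent from the
# frequency list (an assumption about how to encode "not on the list").
BNC_RANK = {
    "and": 4, "ant": 5539, "amp": 5960, "moan": 6309,
    "brute": None, "blunt": None, "brunt": None, "butt": None,
}


def edit_distance(a, b):
    """Levenshtein distance between two spellings (a rough proxy for sound)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character from a
                curr[j - 1] + 1,           # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def neighbors(word, ranks, max_distance=2):
    """Other words in the toy lexicon within max_distance edits of word."""
    return sorted(w for w in ranks
                  if w != word and edit_distance(word, w) <= max_distance)


def fuzzy_pairs(ranks, max_distance=2):
    """Pairs of similar-looking words that are BOTH absent from the list."""
    rare = sorted(w for w, rank in ranks.items() if rank is None)
    return [(a, b) for a, b in combinations(rare, 2)
            if edit_distance(a, b) <= max_distance]


if __name__ == "__main__":
    for a, b in fuzzy_pairs(BNC_RANK):
        print(a, "~", b)
    print("neighbors of moan:", neighbors("moan", BNC_RANK))
```

On this tiny lexicon the sketch pairs brunt with each of blunt, brute, and butt, leaves and/ant/amp alone because they are on the list, and finds no neighbors at all for moan. A serious version would need real frequency counts and a pronunciation-based similarity measure rather than spelling.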
Bybee, Joan. 1995. Regular morphology and the lexicon. Language and Cognitive Processes 10, 425-455.
Clark, Eve. 1987. The principle of contrast: a constraint on language acquisition. In B. MacWhinney (ed.), Mechanisms of Language Acquisition, 1-33. Mahwah, NJ: Lawrence Erlbaum Associates.
Tomasello, Michael. 2005. Constructing a Language. Cambridge, MA: Harvard University Press.
* Even if the word butt is relatively common in speech, it probably occurs most frequently in the anatomical sense. A Google search for "butt of" (as in "butt of the joke" etc.) returns more than a million and a half raw hits, but "my butt" returns nearly four and a half million. On the other hand, the conjunction but is the 23rd most frequent word on the Kilgarriff list; I suspect that it is frequent enough to avoid conflation with its homophones. Speaking of homophones, by the way, Merriam-Webster's 10th Collegiate Dictionary lists six separate headwords for butt and five for but.