Linguistic Anthropology: Quantifying lexical change

Two recent papers in the journal Nature deal with rates of language change. (Thanks are due to Dave for expressing interest in the topic.)

Coverage of these studies in the news media has generally been pretty good. For example, the Independent says, "Common words 'less likely to change'" and Telegraph.co.uk reports, "Scientists chart how words are changing."

As the popular press points out, both recent studies show that frequently used words are relatively resistant to change, compared to words that are used less frequently. Thus, for instance, while many of the irregular verbs of Old English have gradually become regular (so that healp/halp/holp became help/helped/helped), the most frequent verbs remain irregular (begin/began/begun).

What might not be clear from these news reports is that linguists have noted this link between frequency and regularity for many years. As early as 1935, George Zipf suggested that there is an inverse relationship between frequency and complexity.

What the new studies accomplish is a far more sophisticated analysis of the regularities in language change that earlier scholars noted or theorized.

Lieberman et al. identify 177 irregular verbs in Old English (the language of Beowulf, spoken about 1000-1500 years ago), and compare them with Middle English (the language of Chaucer's Canterbury Tales, spoken about 500-1000 years ago) and Modern English. They divide the verbs into six classes, based on how frequently each word appears in a corpus of Modern English, and note that less frequent verbs are more likely to have become regular over the past millennium or so. Moreover, this tendency follows a fairly regular pattern, so that the rate of regularization is inversely proportional to the square root of a verb's frequency. In other words, since words such as beat, freeze and melt occur about 100 times more frequently than words such as blend, fret and milk, the latter are about 10 times as likely to be regular. (All six were irregular in Old English.) The authors go so far as to predict the next irregular verb to regularize: wed, the least frequent irregular verb in their sample, may soon be completely regular.
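
The square-root arithmetic in that example can be made concrete with a small sketch. It is only an illustration, not the authors' model: it assumes, as in the example above, that a verb used 100 times less often is about 10 times as likely to regularize, and the function name is made up for this post.

```python
import math


def relative_regularization_rate(frequency_ratio: float) -> float:
    """Return how many times as likely a rarer verb is to regularize.

    `frequency_ratio` is how many times MORE frequent the common verb is
    than the rare one. Assumption (hypothetical helper, not from the paper's
    code): the rate of regularization scales as 1 / sqrt(frequency), so the
    rarer verb's relative rate is sqrt(frequency_ratio).
    """
    if frequency_ratio <= 0:
        raise ValueError("frequency_ratio must be positive")
    return math.sqrt(frequency_ratio)


if __name__ == "__main__":
    # A few illustrative frequency ratios, including the 100x case from the text.
    for ratio in (1, 4, 100, 10_000):
        rate = relative_regularization_rate(ratio)
        print(f"{ratio:>6}x less frequent -> about {rate:g}x as likely to regularize")
```

The point of the square root is that the effect of frequency is real but damped: a hundredfold gap in usage shrinks to a tenfold gap in the likelihood of regularization.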

Even more interesting is work by Pagel et al. on cognates in four Indo-European languages. Most languages spoken in Europe, as well as many spoken in central and southern Asia, are part of a large family of related languages called Indo-European. Cognates, similar-sounding words with the same meaning such as Spanish dos, Russian dva, Greek dio and English two, are one source of evidence that all of these languages descend from a common ancestor.

Pagel and his colleagues calculate the frequency of 200 words in English, Spanish, Russian and Greek. The frequency of these words is similar in each language - an interesting finding on its own. But wait, it gets better. The "rate of lexical replacement" (roughly, the likelihood of a new word replacing a cognate) varies inversely with a word's frequency. Thus, frequently used words like two are more likely to be cognate in the related languages, while less frequent words tend to differ from language to language. The relationship between frequency and rate of lexical replacement is strikingly similar for each of the four languages analyzed.

The authors suggest two possible explanations for this relationship. It may be that words that occur very frequently are learned relatively easily. Fewer 'errors' therefore lead to less variation, and less likelihood that a new word will replace the old one. Alternatively, words that are used more frequently may be more useful, and users may be more conservative in producing them so as not to be misunderstood.

These papers may prove inspiring to cognitive scientists, evolutionary anthropologists, sociolinguists, linguistic anthropologists and others interested in how languages (or by extension, other elements of culture or group practice) change over time.

Lieberman, E., J. Michel, J. Jackson, T. Tang & M.A. Nowak. 2007. Quantifying the evolutionary dynamics of language. Nature 449, 713-716.

Pagel, M., Q.D. Atkinson & A. Meade. 2007. Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature 449, 717-720.

Zipf, G. 1935. The Psycho-Biology of Language. Boston: Houghton Mifflin.

Labels: language change

Tuesday, October 23, 2007