Quantitative Comparative Linguistics

Statistical methods have been used in comparative linguistics since at least the 1950s (see Swadesh list). Since about the year 2000, there has been renewed interest in the topic, based on the application of methods from computational phylogenetics and cladistics to define an optimal tree (or network) that represents a hypothesis about the evolutionary ancestry of a group of languages and perhaps their contacts. The probability that languages are related can be quantified, and the proto-languages can sometimes be approximately dated. The topic came to the attention of the popular press in 2003 after the publication of a short study on Indo-European in Nature (Gray and Atkinson 2003). A volume of articles, Phylogenetic Methods and the Prehistory of Languages, was published in 2006 as the result of a conference held in Cambridge in 2004.

A goal of comparative historical linguistics is to identify instances of genetic relatedness amongst languages. The steps in quantitative analysis are: (i) to devise a procedure based on theoretical grounds, on a particular model, or on past experience; (ii) to verify the procedure by applying it to data for which a large body of linguistic opinion exists for comparison (this may lead to a revision of the procedure of stage (i) or, at the extreme, to its total abandonment); and (iii) to apply the procedure to data where linguistic opinions have not yet been produced, have not yet been firmly established, or are perhaps even in conflict.

Applying phylogenetic methods to languages is a multi-stage process: (a) the encoding stage - getting from real languages to some expression of the relationships between them in the form of numerical or state data, so that those data can then be used as input to phylogenetic methods; (b) the representation stage - applying phylogenetic methods to extract from those numerical and/or state data a signal that is converted into some useful form of representation, usually a two-dimensional graphical one such as a tree or network, which synthesises and "collapses" what are often highly complex multi-dimensional relationships in the signal; (c) the interpretation stage - assessing those tree and network representations to extract from them what they actually mean for real languages and their relationships through time.
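
The encoding stage and the first step of the representation stage can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the language names (LangA, LangB, LangC), the meanings, and the cognate class labels are invented toy data, and the function names encode_binary and distance_matrix are hypothetical. The sketch turns cognate judgements into binary state data and then into pairwise distances; it is one simple possibility, not the procedure of any particular study.

```python
from itertools import combinations

# Invented toy data (not real languages): for each meaning, the cognate
# class that each language's word has been assigned to.
cognate_sets = {
    "hand":  {"LangA": 1, "LangB": 1, "LangC": 2},
    "water": {"LangA": 1, "LangB": 2, "LangC": 2},
    "stone": {"LangA": 1, "LangB": 1, "LangC": 1},
    "fire":  {"LangA": 1, "LangB": 1, "LangC": 2},
}


def encode_binary(cognate_sets):
    """Encoding stage: one binary character per (meaning, cognate class)."""
    languages = sorted({lang for classes in cognate_sets.values()
                        for lang in classes})
    characters = sorted({(meaning, cls)
                         for meaning, classes in cognate_sets.items()
                         for cls in classes.values()})
    # A language scores 1 for a character if its word for that meaning
    # belongs to that cognate class, otherwise 0.
    matrix = {
        lang: [1 if cognate_sets[meaning].get(lang) == cls else 0
               for meaning, cls in characters]
        for lang in languages
    }
    return languages, characters, matrix


def distance_matrix(cognate_sets):
    """Pairwise distance: share of meanings with non-cognate words."""
    languages = sorted({lang for classes in cognate_sets.values()
                        for lang in classes})
    dist = {}
    for a, b in combinations(languages, 2):
        # Only meanings attested in both languages are compared.
        shared = [m for m, classes in cognate_sets.items()
                  if a in classes and b in classes]
        if not shared:
            continue  # no comparable data for this pair
        differing = sum(1 for m in shared
                        if cognate_sets[m][a] != cognate_sets[m][b])
        dist[(a, b)] = differing / len(shared)
    return dist


if __name__ == "__main__":
    languages, characters, matrix = encode_binary(cognate_sets)
    for lang in languages:
        print(lang, matrix[lang])
    dist = distance_matrix(cognate_sets)
    for pair, d in sorted(dist.items()):
        print(pair, d)
    # The closest pair is a natural first merge for a distance-based tree.
    print("closest pair:", min(dist, key=dist.get))
```

The binary matrix is the kind of state data that character-based phylogenetic methods take as input, while the distance table is the kind of numerical data used by distance-based tree or network methods. The interpretation stage is not something code can supply: whether the closest pair reflects shared ancestry or borrowing still has to be judged on linguistic grounds.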
