Quantitative Comparative Linguistics - Studies Comparing Methods

Nakhleh et al. compared six analysis methods using an Indo-European (IE) database: UPGMA, NJ, MP, MC, WMC and GA. The PAUP software package was used for UPGMA, NJ, and MC, as well as for computing the majority consensus trees. The RWT database was used, but 40 characters were removed because they showed evidence of polymorphism. A screened database was then produced by excluding all characters that clearly exhibited parallel development, which eliminated 38 features. The trees were evaluated on the number of incompatible characters and on agreement with established sub-grouping results. UPGMA was clearly the worst, but there was little difference between the other methods, and the results depended on the data set used. Weighting the characters proved important, and this requires linguistic judgement.
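The "number of incompatible characters" criterion can be illustrated with a minimal sketch. It assumes a rooted binary tree written as nested 2-tuples of leaf names, one state per language per character, and no missing data or polymorphism. A character is counted as incompatible when the Fitch parsimony count of changes exceeds the number of observed states minus one, meaning that some state must have arisen more than once. The function names and the toy data are hypothetical, and this is not claimed to be the procedure used in the study.

```python
def fitch_changes(tree, states):
    """Return (candidate_states, changes) for a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its state for one character
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_changes(left, states)
    right_set, right_changes = fitch_changes(right, states)
    common = left_set & right_set
    if common:
        # The subtrees can share a state, so no extra change is needed here.
        return common, left_changes + right_changes
    # No shared state: at least one change on the branches below this node.
    return left_set | right_set, left_changes + right_changes + 1


def count_incompatible(tree, characters):
    """Count characters that cannot evolve on the tree without homoplasy."""
    incompatible = 0
    for states in characters:
        _, changes = fitch_changes(tree, states)
        n_states = len(set(states.values()))
        if changes > n_states - 1:
            incompatible += 1
    return incompatible


# Hypothetical four-language tree and three toy characters.
tree = (("A", "B"), ("C", "D"))
characters = [
    {"A": 0, "B": 0, "C": 1, "D": 1},  # fits the tree with one change
    {"A": 0, "B": 1, "C": 0, "D": 1},  # needs two changes for two states
    {"A": 0, "B": 1, "C": 2, "D": 2},  # three states, two changes
]
print(count_incompatible(tree, characters))
```

Under this criterion a lower count indicates a tree that explains more of the data without appeal to borrowing or parallel development.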

Rexova et al. carried out a comparison of coding methods. They created a reduced data set from the Dyen database, with the addition of Hittite. First, they produced a standard multistate matrix in which the 141 character states correspond to individual cognate classes, allowing polymorphism. Second, they joined some cognate classes to reduce subjectivity, and in this matrix polymorphic states were not allowed. Lastly, they produced a binary matrix in which each class of words was treated as a separate character. The matrices were analysed with PAUP. Using the binary matrix produced changes near the root of the tree.
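The difference between multistate and binary coding can be shown with a small sketch. It assumes a hypothetical input layout in which each language has, for every meaning, the set of cognate classes attested for it (a set with more than one member represents polymorphism). Each (meaning, cognate class) pair then becomes one presence/absence character. The function name and the toy data are illustrative only and are not taken from the study.

```python
def multistate_to_binary(matrix):
    """Recode a multistate cognate matrix as binary presence/absence characters.

    matrix -- dict mapping language name to a list with one set of cognate
              class labels per meaning
    Returns (columns, binary): columns is a list of (meaning_index, class)
    pairs, and binary maps each language to a list of 0/1 values.
    """
    languages = sorted(matrix)
    n_meanings = len(next(iter(matrix.values())))
    columns = []
    for meaning in range(n_meanings):
        # Every cognate class attested for this meaning becomes a character.
        classes = set().union(*(matrix[lang][meaning] for lang in languages))
        for cognate_class in sorted(classes):
            columns.append((meaning, cognate_class))
    binary = {
        lang: [1 if c in matrix[lang][m] else 0 for m, c in columns]
        for lang in languages
    }
    return columns, binary


# Hypothetical data: two meanings, three languages; L2 is polymorphic
# for the first meaning.
matrix = {
    "L1": [{"a"}, {"x"}],
    "L2": [{"a", "b"}, {"y"}],
    "L3": [{"b"}, {"x"}],
}
columns, binary = multistate_to_binary(matrix)
for lang in sorted(binary):
    print(lang, binary[lang])
```

Binary coding handles polymorphism naturally, because a language can simply score 1 in several columns for the same meaning, but it also turns one meaning into several characters that are not independent of each other.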

Barbancon et al. studied various tree reconstruction methods using simulated data. Their simulated data varied in the number of contact edges, the degree of homoplasy, the deviation from a lexical clock, and the deviation from the rates-across-sites assumption. The accuracy of the unweighted methods (MP, NJ, UPGMA, and GA) was consistent across all the conditions studied, with MP being the best. The accuracy of the two weighted methods (WMC and WMP) depended on the appropriateness of the weighting scheme. With low homoplasy the weighted methods generally produced the more accurate results, but inappropriate weighting could make them worse than MP or GA under moderate or high homoplasy levels.

McMahon and McMahon used three PHYLIP programs (NJ, Fitch and Kitsch) on the DKB dataset and found that the results were very similar. Bootstrapping was used to test the robustness of each part of the tree. Later they used subsets of the data to assess its retentiveness and reconstructability. The outputs showed topological differences, which were attributed to borrowing. They then also used Network, Split Decomposition, Neighbor-net and SplitsTree on several data sets. Significant differences were found between the latter two methods. Neighbor-net was considered optimal for discerning language contact.
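The idea behind bootstrapping can be sketched as follows: characters are resampled with replacement, the analysis is repeated on each replicate, and the share of replicates that recover a given grouping is taken as its support. To stay short, the sketch below swaps in a much simpler criterion than a full tree search: a pair of languages counts as recovered when it is the unique closest pair by Hamming distance. That stand-in, the function names and the toy data are all assumptions for illustration and do not reproduce the PHYLIP workflow.

```python
import random
from itertools import combinations


def hamming(a, b, cols):
    """Number of resampled character columns in which two languages differ."""
    return sum(1 for c in cols if a[c] != b[c])


def bootstrap_pair_support(data, pair, replicates=200, seed=0):
    """Share of bootstrap replicates in which `pair` is the unique closest pair.

    data -- dict mapping language name to a list of character states
    pair -- tuple of two language names
    """
    rng = random.Random(seed)
    n_chars = len(next(iter(data.values())))
    target = frozenset(pair)
    hits = 0
    for _ in range(replicates):
        # Resample character columns with replacement.
        cols = [rng.randrange(n_chars) for _ in range(n_chars)]
        dists = {
            frozenset(p): hamming(data[p[0]], data[p[1]], cols)
            for p in combinations(sorted(data), 2)
        }
        best = min(dists.values())
        winners = [p for p, d in dists.items() if d == best]
        # Ties are treated as a failure to recover the pair.
        if winners == [target]:
            hits += 1
    return hits / replicates


# Hypothetical data: A and B agree on most characters.
data = {
    "A": [1, 1, 1, 1, 2, 1],
    "B": [1, 1, 1, 1, 1, 1],
    "C": [2, 2, 1, 2, 2, 2],
    "D": [3, 2, 3, 3, 3, 1],
}
print(bootstrap_pair_support(data, ("A", "B")))
```

In a real analysis the closest-pair check would be replaced by a full tree reconstruction on each replicate, with support reported for every internal branch.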

Cysouw et al. compared Holm's original method with NJ, Fitch, MP and SD. They found Holm's method to be less accurate than the others.

Saunders compared NJ, MP, GA and Neighbor-Net on a combination of lexical and typological data. He recommended use of the GA method, but Nichols and Warnow have raised some concerns about the study methodology.
