Quantitative Comparative Linguistics - Studies Comparing Methods

Studies Comparing Methods

Nakhleh et al. compared six analysis methods using an Indo-European (IE) database. The methods compared were UPGMA, NJ, MP, MC, WMC and GA. The PAUP software package was used for UPGMA, NJ, and MC, as well as for computing the majority consensus trees. The RWT database was used, but 40 characters were removed due to evidence of polymorphism. A screened database was then produced by excluding all characters that clearly exhibited parallel development, which eliminated 38 features. The trees were evaluated on the number of incompatible characters and on agreement with established sub-grouping results. UPGMA was clearly the worst, but there was little difference between the other methods. The results depended on the data set used. Weighting the characters was found to be important, and this requires linguistic judgement.
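The "number of incompatible characters" criterion can be illustrated with a minimal sketch. It assumes binary characters with no missing data and a rooted tree written as nested tuples of language names; the study itself used multistate characters, so this is a simplification, and the names clades and count_incompatible as well as the toy tree are hypothetical. A binary character is counted as compatible when the set of languages sharing one of its states forms a clade, or when the character is uninformative.

```python
def clades(tree):
    """Return (leaf set, set of all clades) for a tree given as nested tuples."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, {leaf}
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def count_incompatible(tree, characters):
    """Count binary characters that cannot evolve on the tree without homoplasy.

    Each character is a dict mapping language name -> 0 or 1 (assumed complete).
    """
    all_leaves, clade_set = clades(tree)
    incompatible = 0
    for states in characters:
        ones = frozenset(lang for lang in all_leaves if states[lang] == 1)
        zeros = all_leaves - ones
        # Characters with at most one language on either side fit any tree.
        if min(len(ones), len(zeros)) <= 1:
            continue
        # Compatible if either state marks out a clade (one edge of the tree).
        if ones in clade_set or zeros in clade_set:
            continue
        incompatible += 1
    return incompatible


# Hypothetical toy tree and characters, for illustration only.
toy_tree = ((("A", "B"), "C"), ("D", "E"))
toy_characters = [
    {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0},  # A and B form a clade
    {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0},  # A and C do not form a clade
    {"A": 0, "B": 0, "C": 1, "D": 1, "E": 1},  # the complement A, B is a clade
    {"A": 1, "B": 0, "C": 0, "D": 0, "E": 0},  # uninformative
]
print(count_incompatible(toy_tree, toy_characters))
```

A lower count means the tree explains more of the characters without invoking borrowing or parallel development.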

A comparison of coding methods was carried out by Rexova et al. They created a reduced data set from the Dyen database, with the addition of Hittite. They produced a standard multistate matrix in which the 141 character states correspond to individual cognate classes, allowing polymorphism. In a second version they joined some cognate classes to reduce subjectivity, and polymorphic states were not allowed. Lastly, they produced a binary matrix in which each class of words was treated as a separate character. The matrices were analysed with PAUP. Using the binary matrix was found to produce changes near the root of the tree.
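The difference between multistate and binary coding can be shown with a small sketch. It is not the procedure used in the study, only an illustration under assumptions: each language maps to a list with one entry per meaning, each entry is the set of cognate classes attested for that meaning (more than one class means polymorphism), and the function name to_binary_matrix and the toy data are hypothetical.

```python
def to_binary_matrix(multistate):
    """Recode a multistate cognate matrix as presence/absence characters.

    multistate: dict mapping language -> list of sets of cognate-class labels,
    one set per meaning. Returns (columns, binary), where columns lists the
    (meaning index, cognate class) pair behind each binary character.
    """
    languages = sorted(multistate)
    n_meanings = len(multistate[languages[0]])
    columns = []
    for meaning in range(n_meanings):
        classes = sorted({c for lang in languages for c in multistate[lang][meaning]})
        columns.extend((meaning, c) for c in classes)
    binary = {
        lang: [1 if c in multistate[lang][meaning] else 0 for meaning, c in columns]
        for lang in languages
    }
    return columns, binary


# Hypothetical toy data: two meanings, three languages, B is polymorphic.
toy = {
    "A": [{"x"}, {"p"}],
    "B": [{"x", "y"}, {"q"}],
    "C": [{"y"}, {"q"}],
}
columns, binary = to_binary_matrix(toy)
print(columns)
for lang, row in binary.items():
    print(lang, row)
```

In the binary coding, one meaning with several cognate classes becomes several characters, so meanings with many classes carry more weight than in the multistate coding. The sketch also treats an empty set as absence of every class, which is an assumption; real data sets usually mark missing entries separately.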

Barbancon et al. studied various tree reconstruction methods using simulated data. Their simulated data varied in the number of contact edges, the degree of homoplasy, the deviation from a lexical clock, and the deviation from the rates-across-sites assumption. The accuracy of the unweighted methods (MP, NJ, UPGMA, and GA) was found to be consistent across all the conditions studied, with MP being the best. The accuracy of the two weighted methods (WMC and WMP) depended on the appropriateness of the weighting scheme. With low homoplasy the weighted methods generally produced the more accurate results, but inappropriate weighting could make them worse than MP or GA under moderate or high homoplasy levels.

McMahon and McMahon used three PHYLIP programs (NJ, Fitch and Kitch) on the DKB dataset. They found that the results produced were very similar. Bootstrapping was used to test the robustness of any part of the tree. Later they used subsets of the data to assess its retentiveness and reconstructability. The outputs showed topological differences which were attributed to borrowing. They then also used Network, Split Decomposition, Neighbor-net and Splitstree on several data sets. Significant differences were found between the latter two methods. Neighbor-net was considered optimal for discerning language contact.
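The idea behind bootstrapping can be sketched as follows. This is a simplified stand-in and not the PHYLIP workflow: instead of rebuilding a full tree with NJ or Fitch for every replicate, it only asks how often a given pair of languages remains the closest pair when the characters are resampled with replacement. The Hamming distance, the function names and the toy matrix are all assumptions made for illustration.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of characters on which two languages differ."""
    return sum(x != y for x, y in zip(a, b))


def bootstrap_replicate(matrix, rng):
    """Resample character columns with replacement, keeping rows aligned."""
    n_chars = len(next(iter(matrix.values())))
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {lang: [row[i] for i in picks] for lang, row in matrix.items()}


def closest_pair(matrix):
    """Return the pair of languages with the smallest Hamming distance."""
    pairs = combinations(sorted(matrix), 2)
    return min(pairs, key=lambda p: hamming(matrix[p[0]], matrix[p[1]]))


def pair_support(matrix, pair, n_replicates=200, seed=0):
    """Fraction of bootstrap replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    hits = sum(
        closest_pair(bootstrap_replicate(matrix, rng)) == target
        for _ in range(n_replicates)
    )
    return hits / n_replicates


# Hypothetical binary cognate matrix, for illustration only.
toy_matrix = {
    "A": [1, 1, 0, 0, 1, 0],
    "B": [1, 1, 0, 0, 1, 0],
    "C": [0, 0, 1, 1, 0, 1],
}
print(pair_support(toy_matrix, ("A", "B")))
```

A grouping that survives in most replicates does not depend on a handful of characters, whereas low support suggests that part of the tree is fragile.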

Cysouw et al. compared Holm's original method with NJ, Fitch, MP and SD. They found Holm's method to be less accurate than the others.

Saunders compared NJ, MP, GA and Neighbor-Net on a combination of lexical and typological data. He recommended use of the GA method but Nichols and Warnow have some concerns about the study methodology.
