Robust Statistics - Example: Speed of Light Data


Gelman et al., in Bayesian Data Analysis (2004), consider a data set of speed-of-light measurements made by Simon Newcomb. The data sets for that book can be found via the Classic data sets page, and the book's website contains more information on the data.

Although the bulk of the data look to be more or less normally distributed, there are two obvious outliers. These outliers have a large effect on the mean, dragging it towards them, and away from the center of the bulk of the data. Thus, if the mean is intended as a measure of the location of the center of the data, it is, in a sense, biased when outliers are present.
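The pull that outliers exert on the mean can be illustrated with a small sketch. The values below are hypothetical and chosen only for illustration; they are not Newcomb's measurements. The helper `trimmed_mean` and the trimming fraction are likewise assumptions made for this example.

```python
import statistics

# Hypothetical illustrative values, NOT Newcomb's actual measurements.
bulk = [24, 25, 26, 26, 27, 27, 27, 28, 28, 29, 29, 30, 31, 32]
outliers = [-40, -5]  # two made-up low outliers
data = bulk + outliers


def trimmed_mean(values, fraction):
    """Mean after dropping `fraction` of the points from each end."""
    ordered = sorted(values)
    k = int(fraction * len(ordered))  # number of points cut per tail
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return sum(kept) / len(kept)


mean_bulk = statistics.mean(bulk)        # center of the bulk alone
mean_all = statistics.mean(data)         # dragged towards the outliers
median_all = statistics.median(data)     # barely moves
trimmed_all = trimmed_mean(data, 0.125)  # drops 2 points from each tail

print("mean of bulk only:   ", round(mean_bulk, 2))
print("mean with outliers:  ", round(mean_all, 2))
print("median with outliers:", median_all)
print("12.5% trimmed mean:  ", round(trimmed_all, 2))
```

In this sketch the two outliers shift the mean by several units, while the median and the trimmed mean stay close to the center of the bulk.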

Also, the distribution of the mean is known to be asymptotically normal due to the central limit theorem. However, outliers can make the distribution of the mean non-normal even for fairly large data sets. Besides this non-normality, the mean is also inefficient in the presence of outliers, and less variable measures of location are available.
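The efficiency claim can be checked with a rough simulation. The sketch below assumes a simple contamination model that is not taken from the source: each observation comes from a standard normal distribution, except that with probability 0.05 it comes from a normal distribution with standard deviation 10. The function names, sample size, and replicate count are all illustrative choices.

```python
import random
import statistics


def contaminated_sample(rng, n, outlier_prob=0.05, outlier_sd=10.0):
    """Draw n points: mostly N(0, 1), occasionally N(0, outlier_sd^2).

    The contamination model is an assumption made for this sketch.
    """
    sample = []
    for _ in range(n):
        sd = outlier_sd if rng.random() < outlier_prob else 1.0
        sample.append(rng.gauss(0.0, sd))
    return sample


def simulate(n=50, replicates=2000, seed=0):
    """Return the sampling variances of the mean and of the median."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(replicates):
        sample = contaminated_sample(rng, n)
        means.append(sum(sample) / len(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(means), statistics.variance(medians)


var_mean, var_median = simulate()
print("variance of the sample mean:  ", var_mean)
print("variance of the sample median:", var_median)
```

Under this assumed model the sample median varies noticeably less from replicate to replicate than the sample mean, which is what it means for the mean to be inefficient when outliers are present.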

