Classic Data Sets
Several classic data sets have been used extensively in the statistical literature:
- Iris flower data set - multivariate data set introduced by Ronald Fisher (1936).
- Categorical data analysis - Data sets used in the book, An Introduction to Categorical Data Analysis, by Agresti are provided on-line by StatLib.
- Robust statistics - Data sets used in Robust Regression and Outlier Detection (Rousseeuw and Leroy, 1987) are provided on-line at the University of Cologne.
- Time series - Data used in Chatfield's book, The Analysis of Time Series, are provided on-line by StatLib.
- Extreme values - Data used in the book, An Introduction to the Statistical Modeling of Extreme Values, are provided on-line by Stuart Coles, the book's author.
- Bayesian Data Analysis - Data used in the book are provided on-line by Andrew Gelman, one of the book's authors.
- BUPA liver data - used in several papers in the machine learning (data mining) literature.
- Anscombe's quartet - small data set illustrating the importance of graphing the data to avoid statistical fallacies.
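The point made by Anscombe's quartet can be checked numerically. The sketch below is a minimal illustration, not a reference implementation: it assumes the four x/y series as published by Anscombe (1973), and the helper name `summarize` is hypothetical. It uses only the Python standard library and computes the usual summary statistics for each of the four sets.

```python
from statistics import mean, variance

# Anscombe's quartet (values as published in Anscombe, 1973).
# Sets I-III share the same x values; set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return summary statistics and the least-squares line for one x/y set.

    Hypothetical helper written for this illustration.
    """
    mx, my = mean(x), mean(y)
    vx, vy = variance(x), variance(y)  # sample variances (n - 1 denominator)
    # Sample covariance, computed by hand to avoid version-specific helpers.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    slope = cov / vx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": vx,
        "var_y": vy,
        "corr": cov / (vx * vy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }


results = {name: summarize(x, y) for name, (x, y) in quartet.items()}

for name, stats in results.items():
    line = ", ".join(f"{key}={value:.3f}" for key, value in stats.items())
    print(f"Set {name}: {line}")
```

The printed statistics agree across the four sets to about two decimal places, yet a scatter plot of each set looks entirely different (a roughly linear cloud, a curve, a line with one outlier, and a vertical cluster with one leverage point). The summary numbers alone cannot distinguish them, which is why plotting comes first.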