You are working with the text-only light edition of "H. Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".
The following table lists all data sets supplied with Teach/Me - Data
Analysis. Most of them are real data obtained from various sources (see
the reference section at the end of this page). A few are simulated data
sets which have been generated with a background story in mind. The file
names of the simulated data sets are displayed in brown.
Filename | Description | Ref. |
ALCOHOL | Subset of the data set WINE containing only the alcohol content of two brands. | [7] |
BENZ | Spectroscopic data (NMR) on various brands of gasoline, and the relative octane number. | [10] |
BODYFAT | Percentage of body fat, age, weight, height, and ten body circumference measurements (e.g., abdomen) are recorded for 252 men. Body fat, a measure of health, is estimated through an underwater weighing technique. Fitting body fat to the other measurements using multiple regression provides a convenient way of estimating body fat for men using only a scale and a measuring tape. | [1] |
BOILPTS | Boiling points and topological descriptors of 185 chemical substances. | [3,13] |
CANCER | Number of intestinal cancer cases in West Germany in the period between 1955 and 1995. | [24] |
CIGART | Artificial data set for classification, created by INSPECT. The data points are arranged so that only non-linear methods are able to classify them correctly. | [2] |
COINS | Weight of 114 coins (Austrian 1 Schilling pieces) of different age. | [5] |
ETHANOL | NOx concentration in the exhaust gases of an experimental ethanol motor. | [25] |
EXMPL-A | Artificial data set which shows a few simple relationships among variables. | - |
FISH1SPECIES | Subset of data set FISHCATCH showing the relationship between length and weight of fish. | [22] |
FISHCATCH | Body measurements of different species of perch. | [22] |
FLURIEDW | This data set comprises geometric measures of 100 authentic and 100 counterfeit bank notes. | [12] |
FREEFALL | Simulated data to show variability in data. A steel ball is released from different heights; for each height the experiment is repeated 100 times. | - |
HENRYSEM | Henry's constant of chemical substances together with molecular descriptors. The physical data has been obtained from [17]; the molecular descriptors have been calculated using TOPIX [18]. | [17,18] |
HUMIDIT2 | Average relative humidity (%) at 264 places in the USA. The data set contains morning and afternoon values for June and September. In addition, the annual averages are in the last two columns. | [8] |
IRIS | Three types of iris plants. The plants are described by four variables. | [14] |
METHANE | This data set contains the concentration of atmospheric methane measured monthly during the period from September 1980 to September 1988. | [15] |
MINWATER | Chemical analysis of different brands of mineral water. | [20] |
MINWATER2 | Subset of MINWATER. | [20] |
MOTE9603 | Climate data obtained from Mote weather station, Florida, USA. Data set contains measurements of 9 meteorological variables over a period of ten days in March 1996. | [4] |
MOTETIDES | Water level at the Mote weather station, Florida, USA, during July 1998. Data was obtained every 15 minutes. | [4] |
MULTIEST | Artificial data used in an interactive example on multidimensional models. | - |
POLYFIT | Artificial data showing a polynomial relationship of the third order. | - |
PRECIPITATION | Normal monthly precipitation (inches) in the period 1961-90. | [8] |
RABBITS | Fluctuations of a rabbit population. | [21] |
REACTTEST | The reaction times to visual stimuli were recorded for 9 persons. The experiment was repeated on two different days; one series was obtained before a two-hour lecture, the other series after a two-hour lecture. | [9] |
STRONTIUM | Simulated data to show two-sample t-test. | - |
SUNSPOTS | Average monthly sunspot areas between 1874 and 1998. | [19] |
TERPBIC | Data set containing two classes of chemical substances described by two spectral parameters. This data set cannot be treated by linear methods. | [11] |
TRAIN | Simulated data to show a skewed distribution. | - |
TWOCLASS | Artificial data set containing two classes of observations. | - |
WATERRESID | Subset of MINWATER. | [20] |
WINE | Chemical analysis of three kinds of Italian red wines (Barolo, Grignolino, Barbera). | [7] |
WINEGER | Chemical analysis of various kinds of German wines. | [23] |
WORLDPOP | Demographic, sociological, and economic data on the world's nations (1988). | [6] |
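
The BODYFAT entry in the table mentions fitting body fat to the other measurements by multiple regression. The following is a minimal sketch of that idea, not the procedure used in the book. It assumes ordinary least squares solved through the normal equations, uses only two hypothetical predictors (weight and abdomen circumference), and works on invented numbers rather than on the actual BODYFAT values. All function and variable names are illustrative.

```python
# Minimal multiple-regression sketch (ordinary least squares via the normal equations).
# All numbers below are invented for illustration; they are NOT taken from BODYFAT.


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations also update the right-hand side.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Choose the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] minimising the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # prepend a column of ones
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Evaluate the fitted linear model for one set of predictor values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


# Hypothetical predictors: (weight in kg, abdomen circumference in cm).
measurements = [
    (70.0, 85.0), (82.0, 95.0), (90.0, 104.0),
    (65.0, 78.0), (77.0, 92.0), (100.0, 110.0),
]
# Hypothetical response built from an invented linear rule without noise,
# so the fit should reproduce that rule up to rounding error.
body_fat = [-30.0 - 0.1 * w + 0.6 * a for w, a in measurements]

coeffs = fit_multiple_regression(measurements, body_fat)
print("intercept and slopes:", coeffs)
print("prediction for (80 kg, 90 cm):", predict(coeffs, (80.0, 90.0)))
```

With real data the response contains noise, so the fitted coefficients are estimates only. The normal equations can also become ill-conditioned when many correlated circumference measurements are used, in which case a QR-based least-squares routine from a numerical library is the more robust choice.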
References to the sources of the data sets:
[1] | K. Penrose, A. Nelson, A.G. Fisher
Generalized Body Composition Prediction Equation for Men Using Simple Measurement Techniques. Medicine and Science in Sports and Exercise 17(2) (1985) 189. Data set by courtesy of Garth Fisher |
[2] | H. Lohninger
INSPECT - A program system for scientific and engineering data analysis. Springer, Berlin, Heidelberg, New York 1996 |
[3] | H. Lohninger
Evaluation of Neural Networks Based on Radial Basis Functions and Their Application to the Prediction of Boiling Points from Structural Parameters. J. Chem. Inf. Comput. Sci. 33 (1993) 736-744 |
[4] | Mote Weather Station, Florida, USA
Data by courtesy of Don Hayward Mote Marine Laboratory 1600 Ken Thompson Parkway Sarasota, FL 34236, USA http://www.mote.org/ |
[5] | Coins have been collected and weighed by H. Lohninger and A. Satzinger, Vienna University of Technology, Vienna, Austria |
[6] | This data set has been compiled from a variety of public sources, including the United Nations (http://www.un.org/), the World Bank (http://www.worldbank.org/), and the CIA Factbook (http://www.odci.gov/cia/publications/factbook/). |
[7] | M. Forina, E. Tiscornia
Ann. Chim. 72 (1982) 143 Data set courtesy of M. Forina, Università di Genova, Italy |
[8] | The data has been published by the National Climatic Data Center on their Web site: http://www.ncdc.noaa.gov/ |
[9] | H. Lohninger
Reaction measurements to visual stimuli. Vienna University of Technology, 1998 |
[10] | R. Meusinger, R. Moros
Application of Genetic Algorithms and Neural Networks in Analysis of Multicomponent Mixtures by NMR-Spectroscopy, in J. Gasteiger (Ed.) "Software Development in Chemistry, 10", Gesellschaft Deutscher Chemiker, Frankfurt 1996, p. 209 Data set courtesy of R. Meusinger. |
[11] | H. Lohninger
Data has been computed from mass spectral data by means of MSLIB (http://www.lohninger.com/mslib.html) |
[12] | B. Flury, H. Riedwyl
Angewandte multivariate Statistik. G. Fischer Verlag, Stuttgart 1983. Data set by courtesy of H. Riedwyl, Bern, Switzerland |
[13] | A.T. Balaban, L.B. Kier, N. Joshi
Correlations between chemical structure and normal boiling points of acyclic ethers, peroxides, acetals and their sulfur analogues. J. Chem. Inf. Comput. Sci. 32 (1992) 237-244 |
[14] | R.A. Fisher
The use of multiple measurements in taxonomic problems. Annals of Eugenics 7 (1936), Part II, 179-188 |
[15] | M.A.K. Khalil, R.A. Rasmussen
Atmospheric Methane: Recent Global Trends Environ. Sci. Technol. 1990, 24, 549-553 |
[16] | H. Lohninger
Estimation of Soil Partition Coefficients of Pesticides from their Chemical Structure Chemosphere 29 (1994) 1611 |
[17] | J. Hine, P.K. Mookerjee
The intrinsic hydrophilic character of organic compounds. Correlations in terms of structural contributions J. Org. Chem. 40 (1975) 292-298 |
[18] | D. Svozil, H. Lohninger
TOPIX - A program to calculate topological indices. http://www.lohninger.com/topix.html |
[19] | Royal Greenwich Observatory/USAF/NOAA
NASA/Marshall Space Flight Center Data set by courtesy of David H. Hathaway http://science.nasa.gov/ssl/pad/solar/ |
[20] | H. Lohninger
Data on different brands of mineral water has been collected by the author from the labels of the water bottles. |
[21] | Rabbits |
[22] | P. Brofeldt
Bidrag till kaennedom on fiskbestondet i vaara sjoear. Laengelmaevesi. in T.H.Jaervi: Finlands Fiskeriet Band 4, Meddelanden utgivna av fiskerifoereningen i Finland. Helsingfors 1917 |
[23] | G. Thiel, K. Danzer
Direct analysis of mineral components in wine by inductively coupled plasma optical emission spectrometry (ICP-OES). Fresenius J Anal Chem. 1997; 357: 553-557. Data set courtesy of Klaus Danzer, Friedrich-Schiller Universität Jena, Germany |
[24] | N. Becker, J. Wahrendorf
Atlas of cancer mortality in the Federal Republic of Germany. Springer Berlin Heidelberg 1998 |
[25] | N.D. Brinkman
Ethanol Fuel--A Single-cylinder engine study of efficiency and exhaust emissions. SAE Transactions 90 (1981), No. 810345, 1410-1424. |
Last Update: 2005-Jan-25