DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO
References on data mining and analytics
To keep current with what is happening in the world of data mining, subscribe to the free KDnuggets newsletter.
For a business perspective on data mining and analytics, without technical detail, see
Competing on Analytics: The New Science of Winning by Thomas H. Davenport and Jeanne G. Harris.
For the table of contents see http://www.amazon.com/gp/reader/1422103323/
Machine learning is the name of the principal research area underlying
data mining. The best undergraduate-level textbook in this area
is Introduction to Machine Learning by Ethem Alpaydin. For the detailed table of contents, see here.
The best graduate-level textbook on machine learning is
Pattern Recognition and Machine Learning by Christopher M. Bishop. For the table of contents see http://www.amazon.com/gp/reader/0387310738
Web Analytics: An Hour a Day by
Avinash Kaushik is a best-seller on the fastest-growing
application area of data mining, namely data mining applied to web
sites. For the table of contents see http://www.amazon.com/gp/reader/0470130652
Understanding Complex Datasets: Data Mining with Matrix Decompositions by
David Skillicorn is a good specialized book on a growing technical
subfield, namely matrix methods applied to modeling two-dimensional
data. For the table of contents see http://www.amazon.com/gp/reader/1584888326/
It is conventional wisdom that 80% of the effort in a data mining
project is devoted to data acquisition and cleaning. A highly
recommended book on this topic is Data Preparation for Data Mining
by Dorian Pyle. Here is the table of contents.
Most recently updated on March 17, 2009 by Charles Elkan, elkan@cs.ucsd.edu