Data Mining
4.0 credits
Data mining is a relatively new term in the academic and business worlds, often associated with the development and quantitative analysis of very large databases. Its definition covers a wide spectrum of analytic and information technology topics, such as machine learning, artificial intelligence, statistical modeling, and efficient database development. This course will review these broad topics and cover specific analytic and modeling techniques. Students will learn the foundations of data visualization, classification, regression, clustering, and dimensionality reduction. Although some of the mathematics underlying these techniques will be discussed, the focus will be on applying the techniques to real data and interpreting the results. Because computation is essential when “mining” large amounts of data, we will make substantial use of software tools to learn the techniques and analyze datasets. In particular, students will program in Python and use Jupyter Notebooks during lectures, for homework, and on exams.
Recommended Course Background: EN.550.413, EN.550.420, AS.171.205, EN.550.112