Semester.ly

Johns Hopkins University | EN.600.479

Representation Learning

3.0 credits

Average Course Rating: 4.2

Often the success of a machine learning project depends on the choice of features used. Machine learning has made great progress in training classification, regression and recognition systems when "good" representations, or features, of input data are available. However, much human effort is spent on designing good features which are usually knowledge-based and engineered by domain experts over years of trial and error. A natural question to ask then is "Can we automate the learning of useful features from raw data?" Representation learning algorithms such as principal component analysis aim at discovering better representations of inputs by learning transformations of data that disentangle factors of variation in data while retaining most of the information. The success of such data-driven approaches to feature learning depends not only on how much data we can process but also on how well the features that we learn correlate with the underlying unknown labels (semantic content in the data). This course will focus on scalable machine learning approaches for learning representations from large amounts of unlabeled, multi-modal, and heterogeneous data. We will cover topics including deep learning, multi-view learning, dimensionality reduction, similarity-based learning, and spectral learning. Students may receive credit for 600.479 or 600.679 but not both. [Analysis or Applications] Required course background: machine learning or basic probability and linear algebra.
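To make the representation-learning idea above concrete, the sketch below shows the simplest algorithm mentioned in the description, principal component analysis: a linear transformation of centered data onto the directions of largest variance. This is an illustrative sketch only, not course material; the function name and the synthetic data are assumptions made here for demonstration.

```python
import numpy as np

def pca(X, k):
    """Project data onto its top-k principal directions.

    X : (n_samples, n_features) data matrix
    k : number of components to keep
    Returns the k-dimensional representation and the component matrix.
    """
    # Center the data so the principal directions capture variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]               # top-k directions of maximal variance
    Z = X_centered @ components.T     # learned low-dimensional representation
    return Z, components

# Example (synthetic, for illustration): reduce 50-dimensional data to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
Z, W = pca(X, k=2)
print(Z.shape)  # (200, 2)
```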

Fall 2014

Professor: Raman Arora (4.2)

Students praised this course for the opportunity to work independently on projects. Most students disliked that the course seemed to lack structure and that there was not enough feedback on their work. Suggested improvements included more direction from the instructor, in particular more hands-on experience, and clearer lectures in which each lecture built on the last. Prospective students should know that reviewers found it important to have some background in ‘big data’, such as experience with distributed systems and parallel programming.