Instructor
Note: If you do not satisfy prerequisites for this course and want to take it with me, please read my announcements regarding SPN/Prerequisite overrides. I will post instructions ahead of each semester.
Spring 2018 Instructions are here.
Course Description
An in-depth study of supervised methods for machine learning, to impart an understanding of the major topics in this area, the capabilities and limitations of existing methods, and research topics in this field.
Topics
Inductive learning, including decision trees, Bayesian methods, computational learning theory, instance-based learning, explanation-based learning, reinforcement learning, nearest-neighbor methods, PAC learning, kernel methods, graphical models, regression modeling, and deep models.
Expected Work
Regular readings; mini-projects; in-class presentations; midterm and a final course project.
Course Policies and Procedures
Important, if perhaps boring, details. Please read them carefully.
Schedule
Topic # | Title | Readings |
---|---|---|
1 | Introduction to Supervised Learning | FML Ch 1 PRML Ch 1.1 - 1.4 MLPP Ch 1.1 - 1.3 DL Ch 5.1 ML Ch 1 |
2 | Overview of linear algebra and probability | PRML Ch 2 MLPP Ch 2 |
3 | Overview of optimization; Gradient Descent; Second Order Methods | DL Ch 4 |
4 | Linear Regression; Overfitting and Ridge Regression; Bias-Variance Decomposition; Risk Minimization | PRML Ch 3.1 - 3.2 MLPP Ch 6.1 - 6.5, 7.1 - 7.5 ESL Ch 3 - 4 |
5 | Decision Theory; Generative Classification Models; Linear Discriminant Analysis; Naive Bayes | PRML Ch 1.5, Ch. 4.1 - 4.2 MLPP Ch 3, Ch 4.1 - 4.2 |
6 | Design and Analysis of Machine Learning Experiments; Model Assessment | I2ML Ch 19 MLPP Ch. 7 |
7 | Discriminative Classification Models; Logistic Regression; Bias-Variance Decomposition in Classification | PRML Ch. 4.3 MLPP Ch. 8.1 - 8.3 |
8 | Bayesian Learning; Bayesian Linear Regression & Bayesian Logistic Regression; Generalized Linear Models | MLPP Ch. 5, 7.6, 8.4, 9 PRML Ch. 3.3 - 3.4, 4.4 - 4.5 |
9 | Sparse Models and Feature Selection | MLPP Ch. 13 |
10 | Kernel Models; RBF Networks; Kernel Trick | PRML Ch. 6.1 - 6.3 MLPP Ch. 14.1 - 14.2, 14.4 |
11 | Support Vector Machines; Relevance Vector Machine | PRML Ch. 7 MLPP Ch. 14.5 - 14.7, 14.3 |
12 | Gaussian Process Models | PRML Ch. 6.4 MLPP Ch. 15 |
13 | Adaptive Basis Models; Decision and Regression Trees | MLPP Ch. 16.1-16.3 PRML Ch. 14.4 ESL Ch. 9 |
14 | Ensemble Models; Boosting; Stacking; Mixtures of Models | MLPP Ch. 16.4, 16.6 PRML Ch. 14.1 - 14.3 ESL Ch. 10 |
15 | Neural Networks; Feedforward Networks; Gradient Learning; Backpropagation | PRML Ch. 5 MLPP Ch. 16.5 DL Ch. 6 |
16 | Deep Generative Models; Deep Neural Networks | MLPP Ch. 28.1 - 28.3 |
17 | Regularization and Optimization in Deep Models | DL Ch. 7 - 8 |
18 | Convolutional Networks | DL Ch. 9 |
19 | Structured Prediction; Conditional Random Fields; Structured SVMs; Prediction on Graphs | MLPP Ch. 19 |
20 | Sequential Deep Models; Recurrent Neural Networks | DL Ch. 10 |
21 | Reinforcement Learning; Deep Reinforcement Learning | https://web.mst.edu/~gosavia/tutorial.pdf http://hunch.net/~jl/projects/RL/RLTheoryTutorial.pdf http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf |
Textbooks
Abbreviation | Textbook Title | Author | Publisher | Year |
---|---|---|---|---|
PRML | Pattern Recognition and Machine Learning | Bishop, Christopher M. | Springer | 2006 |
MLPP | Machine Learning: A Probabilistic Perspective | Murphy, Kevin P. | MIT Press | 2012 |
CVMLI | Computer Vision: Models, Learning, and Inference | Prince, Simon J. D. | Cambridge University Press | 2012 |
DL | Deep Learning | Goodfellow, Ian and Bengio, Yoshua and Courville, Aaron | MIT Press | 2016 |
FML | Foundations of Machine Learning | Mohri, Mehryar and Rostamizadeh, Afshin and Talwalkar, Ameet | MIT Press | 2012 |
DHS | Pattern Classification, 2nd ed | Duda, Richard O. and Hart, Peter E. and Stork, David G. | Wiley Interscience | 2004 |
ML | Machine Learning | Mitchell, Tom | McGraw Hill | 1997 |
I2ML | Introduction to Machine Learning, 2nd ed | Alpaydin, Ethem | MIT Press | 2012 |
MLAP | Machine Learning: An Algorithmic Perspective | Marsland, Stephen | CRC Press | 2009 |
PTPR | A Probabilistic Theory of Pattern Recognition | Devroye, Luc and Gyorfi, Laszlo and Lugosi, Gabor | Springer | 1997 |
ESL | The Elements of Statistical Learning: Data Mining, Inference, and Prediction | Hastie, T and Tibshirani, R and Friedman, J | Springer | 2009 |
NAPR | Netlab: Algorithms for Pattern Recognition | Nabney, Ian | Springer | 2002 |
DMPMLTT | Data Mining: Practical Machine Learning Tools and Techniques | Witten, Ian H and Frank, Eibe | Morgan Kaufmann | 2005 |
LAA | Linear Algebra and Its Applications | Strang, Gilbert | Elsevier Science | 2014 |
MC | Matrix Computations, 4th ed | Golub, Gene H and Van Loan, Charles F | JHU Press | 2013 |
CO | Convex Optimization | Boyd, Stephen P and Vandenberghe, Lieven | Cambridge University Press | 2004 |
ILCO | Introductory Lectures on Convex Optimization: A Basic Course | Nesterov, Yurii | Springer | 2004 |
GPML | Gaussian Processes for Machine Learning | Rasmussen, Carl Edward and Williams, Christopher K. I. | MIT Press | 2006 |
ITILA | Information Theory, Inference, and Learning Algorithms | MacKay, David | Cambridge University Press | 2003 |
HOML | Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems | Géron, Aurélien | O’Reilly Media, Inc. | 2019 |
PMLI | Probabilistic Machine Learning: An Introduction | Murphy, Kevin P. | MIT Press | 2022 |
PMLA | Probabilistic Machine Learning: Advanced Topics | Murphy, Kevin P. | MIT Press | 2022 |
Software
We will use Python and MATLAB extensively!