**Instructor**

*Note: If you do not satisfy the prerequisites for this course and want to take it with me, please read my announcements regarding SPN/prerequisite overrides. I will post instructions ahead of each semester.*

Spring 2018 Instructions are here.

**Course Description**

An in-depth study of **supervised methods** for machine learning, to impart an understanding of the major topics in this area, the capabilities and limitations of existing methods, and research topics in this field.

**Topics**

Inductive learning, including decision trees, Bayesian methods, computational learning theory, instance-based learning, explanation-based learning, reinforcement learning, nearest-neighbor methods, PAC learning, kernel methods, graphical models, regression modeling, and deep models.

**Expected Work**

Regular readings, mini-projects, in-class presentations, a midterm, and a final course project.

**Course Policies and Procedures**

Important, if perhaps boring, details. Please read them carefully.

**Schedule**

Topic # | Title | Text |
---|---|---|
1 | Introduction to Supervised Learning | FML Ch. 1; PRML Ch. 1.1-1.4; MLPP Ch. 1.1-1.3; DL Ch. 5.1; ML Ch. 1 |
2 | Overview of Linear Algebra and Probability | PRML Ch. 2; MLPP Ch. 2 |
3 | Overview of Optimization; Gradient Descent; Second-Order Methods | DL Ch. 4 |
4 | Linear Regression; Overfitting and Ridge Regression; Bias-Variance Decomposition; Risk Minimization | PRML Ch. 3.1-3.2; MLPP Ch. 6.1-6.5, 7.1-7.5; ESL Ch. 3-4 |
5 | Decision Theory; Generative Classification Models; Linear Discriminant Analysis; Naive Bayes | PRML Ch. 1.5, 4.1-4.2; MLPP Ch. 3, 4.1-4.2 |
6 | Design and Analysis of Machine Learning Experiments; Model Assessment | I2ML Ch. 19; MLPP Ch. 7 |
7 | Discriminative Classification Models; Logistic Regression; Bias-Variance Decomposition in Classification | PRML Ch. 4.3; MLPP Ch. 8.1-8.3 |
8 | Bayesian Learning; Bayesian Linear Regression and Bayesian Logistic Regression; Generalized Linear Models | MLPP Ch. 5, 7.6, 8.4, 9; PRML Ch. 3.3-3.4, 4.4-4.5 |
9 | Sparse Models and Feature Selection | MLPP Ch. 13 |
10 | Kernel Models; RBF Networks; Kernel Trick | PRML Ch. 6.1-6.3; MLPP Ch. 14.1-14.2, 14.4 |
11 | Support Vector Machines; Relevance Vector Machines | PRML Ch. 7; MLPP Ch. 14.5-14.7, 14.3 |
12 | Gaussian Process Models | PRML Ch. 6.4; MLPP Ch. 15 |
13 | Adaptive Basis Models; Decision and Regression Trees | MLPP Ch. 16.1-16.3; PRML Ch. 14.4; ESL Ch. 9 |
14 | Ensemble Models; Boosting; Stacking; Mixtures of Models | MLPP Ch. 16.4, 16.6; PRML Ch. 14.1-14.3; ESL Ch. 10 |
15 | Neural Networks; Feedforward Networks; Gradient Learning; Backpropagation | PRML Ch. 5; MLPP Ch. 16.5; DL Ch. 6 |
16 | Deep Generative Models; Deep Neural Networks | MLPP Ch. 28.1-28.3 |
17 | Regularization and Optimization in Deep Models | DL Ch. 7-8 |
18 | Convolutional Networks | DL Ch. 9 |
19 | Structured Prediction; Conditional Random Fields; Structured SVMs; Prediction on Graphs | MLPP Ch. 19 |
20 | Sequential Deep Models; Recurrent Neural Networks | DL Ch. 10 |
21 | Reinforcement Learning; Deep Reinforcement Learning | https://web.mst.edu/~gosavia/tutorial.pdf http://hunch.net/~jl/projects/RL/RLTheoryTutorial.pdf http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf |

**Textbooks**

Abbreviation | Textbook Title | Author | Publisher | Year |
---|---|---|---|---|
PRML | Pattern Recognition and Machine Learning | Bishop, Christopher M. | Springer | 2006 |
MLPP | Machine Learning: A Probabilistic Perspective | Murphy, Kevin P. | MIT Press | 2012 |
CVMLI | Computer Vision: Models, Learning, and Inference | Prince, Simon J. D. | Cambridge University Press | 2012 |
DL | Deep Learning | Goodfellow, Ian and Bengio, Yoshua and Courville, Aaron | MIT Press | 2016 |
FML | Foundations of Machine Learning | Mohri, Mehryar and Rostamizadeh, Afshin and Talwalkar, Ameet | MIT Press | 2012 |
DHS | Pattern Classification, 2nd ed | Duda, Richard O. and Hart, Peter E. and Stork, David G. | Wiley-Interscience | 2004 |
ML | Machine Learning | Mitchell, Tom | McGraw Hill | 1997 |
I2ML | Introduction to Machine Learning, 2nd ed | Alpaydin, Ethem | MIT Press | 2012 |
MLAP | Machine Learning: An Algorithmic Perspective | Marsland, Stephen | CRC Press | 2009 |
PTPR | A Probabilistic Theory of Pattern Recognition | Devroye, Luc and Gyorfi, Laszlo and Lugosi, Gabor | Springer | 1997 |
ESL | The Elements of Statistical Learning: Data Mining, Inference, and Prediction | Friedman, J and Hastie, T and Tibshirani, R | Springer | 2009 |
NAPR | Netlab: Algorithms for Pattern Recognition | Nabney, Ian | Springer | 2002 |
DMPMLTT | Data Mining: Practical Machine Learning Tools and Techniques | Witten, Ian H and Frank, Eibe | Morgan Kaufmann | 2005 |
LAA | Linear Algebra and Its Applications | Strang, Gilbert | Elsevier Science | 2014 |
MC | Matrix Computations, 4th ed | Golub, Gene H and Van Loan, Charles F | JHU Press | 2013 |
CO | Convex Optimization | Boyd, Stephen P and Vandenberghe, Lieven | Cambridge University Press | 2004 |
ILCO | Introductory Lectures on Convex Optimization: A Basic Course | Nesterov, Yurii | Springer | 2004 |
GPML | Gaussian Processes for Machine Learning | Rasmussen, Carl Edward and Williams, Christopher K. I. | MIT Press | 2006 |
ITILA | Information Theory, Inference, and Learning Algorithms | MacKay, David | Cambridge University Press | 2003 |

**Software**

We will use Python and MATLAB extensively!