cs535 – Pattern Recognition


Vladimir Pavlovic

Course Description

Pattern Recognition (a.k.a. Machine Learning I) course focuses Unsupervised Learning methods for data analysis.  In particular, we cover topics such as:

  • Association rules
  • Density modeling
  • Clustering (k-means, k-medoids, hierarchical models, spectral methods)
  • Mixture models
  • Distance metric learning
  • Factor analysis (PCA, CCA, etc.) and other latent variable models
  • Probabilistic topic models
  • Matrix factorization
  • Tensor models
  • Clustering evaluation

Expected Work

Regular readings; mini-projects; in-class presentations; midterm and/or a final course project.

Course Schedule

Lec. #TopicReadings
1Introduction & Overview_      http://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf
_      M.I.Jordan and T.M.Mitchell, "Machine learning: Trends, perspectives, and prospects," Science, 17 July 2015, Vol 349 Issue 6245
1aProbability & Linear Algebra_      https://content.sakai.rutgers.edu/access/content/group/c626f8f2-9099-4059-b4b9-5794524e759d/algebra.pdf
_      https://content.sakai.rutgers.edu/access/content/group/c626f8f2-9099-4059-b4b9-5794524e759d/probability.pdf
2Association Rules & _      Chapter 14.1 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
Frequent Itemsets_      Chapter 14.2-14.2.3 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
3_      Chapter 6 of http://www.mmds.org/#book
4Density Estimation_      Chapters 6.6 through 6.9 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      http://ned.ipac.caltech.edu/level5/March02/Silverman/Silver_contents.html
5_       Chapters 4.1 - 4.4 of Duda, Hart, and Stork
Homework #1 assigned
Project proposals and in-class pitches assigned
6K -means_      Chapters 9.1 and 9.2 of http://robotics.stanford.edu/~nilsson/MLBOOK.pdf
_      Chapters 14.3.1 through 14.3.6 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      Chapter 7 of http://www.mmds.org/#book
_      https://www.cs.rutgers.edu/~mlittman/courses/lightai03/jain99data.pdf
_      Chapters 13.1 and 13.2 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_       Chapters 10.1 – 10.4 and 10.7 of Duda, Hart, and Stork
7Gaussian Mixtures & Expectation Maximization & Factor Analysis_      Mixture of Gaussians: http://cs229.stanford.edu/notes/cs229-notes7b.pdf
_      The EM Algorithm: http://cs229.stanford.edu/notes/cs229-notes8.pdf
_      Factor Analysis: http://cs229.stanford.edu/notes/cs229-notes9.pdf
_      Chapters 14.3.7 through 14.3.9 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
8K -medoids & Hierarchical Clustering_      Chapter 14.3.10 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      Chapter 14.3.12 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      Chapter 9.3 of http://robotics.stanford.edu/~nilsson/MLBOOK.pdf
_       Chapter 10.9 of Duda, Hart, and Stork
9Evaluation Metrics & Practical Issues_      http://web.itu.edu.tr/sgunduz/courses/verimaden/paper/validity_survey.pdf
_      Chapter 14.3.11 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
Homework #1 due at 11:59 PM Eastern
10Distance/Similarity Measures & Metric Learning_      http://web.cse.ohio-state.edu/~kulis/pubs/ftml_metric_learning.pdf
_       Check out the Encyclopedia of Distances on this course’s Sakai site (under Resources).
Homework #2 assigned
11Principal Component Analysis (PCA) & Singular Value Decomposition (SVD)_      Chapter 14.5 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      Chapter 11 of http://www.mmds.org/#book
12Spectral Clustering & Graph Clustering_      http://ai.stanford.edu/~ang/papers/nips01-spectral.pdf
_      http://www.cs.columbia.edu/~jebara/4772/papers/Luxburg07_tutorial.pdf
_      [Optional] http://arxiv.org/pdf/0906.0612.pdf
Homework #1 gradedNote: Warning grades will be issued by Fri Oct 16.
13Kernel Principal Components & Independent Component Analysis (ICA) & Canonical Correlation Analysis (CCA) & PageRank_      Chapter 14.5 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      ICA: Chapter 14.7 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      CCA: https://www.cs.cmu.edu/~tom/10701_sp11/slides/CCA_tutorial.pdf
_       PageRank:
o   Chapter 5 of http://www.mmds.org/#book
o   Chapter 14.10 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
14o   https://www.cs.purdue.edu/homes/dgleich/publications/Gleich%202015%20-%20prbeyond.pdf
Homework #2 due at 11:59 PM Eastern
Two-page project proposals due at 11:59 PM Eastern
15In-class project pitches
16Recommendation Systems_      http://infolab.stanford.edu/~ullman/mmds/ch9.pdf
_      http://eliassi.org/papers/chaney-recsys15.pdf
17Midterm exam
Project proposals & pitches graded
Project presentations and reports assigned
18Latent Variable Models & _      http://research.microsoft.com/pubs/67187/bishop-latent-erice-99.pdf
Probabilistic Topic Models_      http://www.cs.columbia.edu/~blei/papers/Blei2012.pdf
_      http://www.cs.princeton.edu/~blei/papers/Blei2011.pdf
19_      http://www.cs.columbia.edu/~blei/papers/BleiLafferty2009.pdf
Homework #2 graded
20Latent Variable Models & _      http://www.cs.berkeley.edu/~jordan/papers/variational-intro.pdf
Probabilistic Topic Models (continued)_      http://www.cs.ubc.ca/~arnaud/andrieu_defreitas_doucet_jordan_intromontecarlomachinelearning.pdf
21_      https://www.ee.washington.edu/techsite/papers/documents/UWEETR-2010-0006.pdf
22Matrix Factorization_      Chapter 14.6 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
_      http://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization.pdf
23aTensor Factorization_      http://www.sandia.gov/~tgkolda/pubs/pubfiles/TensorReview.pdf
23bMidterm exam graded and returned at the end of lecture
Thanksgiving Holiday – no class
24Sequence Models
25Model Selection_      Chapter 7 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
26Theory of Clustering_      http://www.cs.cornell.edu/home/kleinber/nips15.pdf
_      http://papers.nips.cc/paper/3491-measures-of-clustering-quality-a-working-set-of-axioms-for-clustering.pdf
27In-class project presentations
27In-class project presentations
Last day of classes
Project presentations graded
Project reports due at 11:59 PM Eastern
Project reports graded and final grades released.


AbbreviationTextbook TitleAuthorPublisherYear
PRMLPattern Recognition and Machine LearningChristopher C. BishopSpringer2006
MLPPMachine Learning: A Probabilistic PerspectiveKevin P. MurphyMIT Press2012
CVMLIComputer vision: models, learning and inferencePrince, Simon J DCambridge University Press2012
DLDeep LearningGoodfellow, Ian and Bengio, Yoshua and Courville, AaronMIT Press2016
FMLFoundations of Machine LearningMohri, Mehryar and Rostamizadeh, Afshin and Talwalkar, AmeetMIT Press2012
DHSPattern Classification, 2nd edDuda, Richard O. and Hart, Peter E. and Stork, David G.Wiley Interscience2004
MLMachine LearningMitchell, TomMcGraw Hill1997
I2MLIntroduction to Machine Learning, 2nd edAlpaydin, EthemMIT Press2012
MLAPMachine Learning: An Algorithmic perspectiveMarsland, StephenCRC press2009
PTPRA Probabilistic Theory of Pattern RecognitionDevroye, Luc and Gyorfi, Laszlo and Lugosi, GaborSpringer1997
ESLThe elements of statistical learning: Data mining, inference, and predictionFriedman, J and Hastie, T and Tibshirani, RSpringer2009
NAPRNetlab: Algorithms for Pattern RecognitionNabney, IanSpringer2002
DMPMLTTData Mining: Practical Machine Learning Tools and TechniquesWitten, Ian H and Frank, EibeMorgan Kaufmann2005
LAALinear Algebra and Its ApplicationsStrang, GilbertElsevier Science2014
MCMatrix computations, 4th edGolub, Gene H and Van Loan, Charles FJHU Press2013
COConvex OptimizationBoyd, Steven P and Vandenberghe, LievenCambridge University Press2004
ILCOIntroductory lectures on convex optimization: a basic course_Nestorov, YuriiSpringer2004
GPMLGaussian Processes for Machine LearningRasmussen, Carl Edward and Williams, Christopher K. I. MIT Press2006
ITILAInformation Theory, Inference, and Learning AlgorithmsMacKay, DavidCambridge University Press2003