cs535 – Pattern Recognition

Instructor

Vladimir Pavlovic

Course Description

The Pattern Recognition (a.k.a. Machine Learning I) course focuses on unsupervised learning methods for data analysis. In particular, we cover topics such as:

  • Association rules
  • Density modeling
  • Clustering (k-means, k-medoids, hierarchical models, spectral methods)
  • Mixture models
  • Distance metric learning
  • Factor analysis (PCA, CCA, etc.) and other latent variable models
  • Probabilistic topic models
  • Matrix factorization
  • Tensor models
  • Clustering evaluation

Expected Work

Regular readings; mini-projects; in-class presentations; midterm and/or a final course project.

You can find examples of some of the past projects here.

Course Schedule

Lec. # | Date | Topic | Readings
1 | 2020-09-02 | Introduction & Overview | http://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf
 |  |  | M. I. Jordan and T. M. Mitchell, "Machine learning: Trends, perspectives, and prospects," Science, 17 July 2015, Vol. 349, Issue 6245
2 | 2020-09-02 | Probability & Linear Algebra | https://content.sakai.rutgers.edu/access/content/group/c626f8f2-9099-4059-b4b9-5794524e759d/algebra.pdf
 |  |  | https://content.sakai.rutgers.edu/access/content/group/c626f8f2-9099-4059-b4b9-5794524e759d/probability.pdf
3 | 2020-09-09 | Association Rules & Frequent Itemsets | Chapter 14.1 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | Chapter 14.2-14.2.3 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | Chapter 6 of http://www.mmds.org/#book
4 | 2020-09-16 | Density Estimation | Chapters 6.6 through 6.9 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | http://ned.ipac.caltech.edu/level5/March02/Silverman/Silver_contents.html
 |  |  | Chapters 4.1 - 4.4 of Duda, Hart, and Stork
 | 2020-09-16 | Homework #1 assigned |
 |  | Project proposals and in-class pitches assigned |
5 | 2020-09-23 | K-means | Chapters 9.1 and 9.2 of http://robotics.stanford.edu/~nilsson/MLBOOK.pdf
 |  |  | Chapters 14.3.1 through 14.3.6 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | Chapter 7 of http://www.mmds.org/#book
 |  |  | https://www.cs.rutgers.edu/~mlittman/courses/lightai03/jain99data.pdf
 |  |  | Chapters 13.1 and 13.2 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | Chapters 10.1 – 10.4 and 10.7 of Duda, Hart, and Stork
6 | 2020-09-23 | Gaussian Mixtures & Expectation Maximization & Factor Analysis | Mixture of Gaussians: http://cs229.stanford.edu/notes/cs229-notes7b.pdf
 |  |  | The EM Algorithm: http://cs229.stanford.edu/notes/cs229-notes8.pdf
 |  |  | Factor Analysis: http://cs229.stanford.edu/notes/cs229-notes9.pdf
 |  |  | Chapters 14.3.7 through 14.3.9 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
7 | 2020-09-30 | K-medoids & Hierarchical Clustering | Chapter 14.3.10 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | Chapter 14.3.12 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | Chapter 9.3 of http://robotics.stanford.edu/~nilsson/MLBOOK.pdf
 |  |  | Chapter 10.9 of Duda, Hart, and Stork
8 | 2020-09-30 | Evaluation Metrics & Practical Issues | http://web.itu.edu.tr/sgunduz/courses/verimaden/paper/validity_survey.pdf
 |  |  | Chapter 14.3.11 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 | 2020-09-30 | Homework #1 due at 11:59 PM Eastern |
9 | 2020-10-07 | Distance/Similarity Measures & Metric Learning | http://web.cse.ohio-state.edu/~kulis/pubs/ftml_metric_learning.pdf
 |  |  | Check out the Encyclopedia of Distances on this course’s Sakai site (under Resources).
 | 2020-10-07 | Homework #2 assigned |
10 | 2020-10-07 | Principal Component Analysis (PCA) & Singular Value Decomposition (SVD) | Chapter 14.5 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | Chapter 11 of http://www.mmds.org/#book
11 | 2020-10-14 | Spectral Clustering & Graph Clustering | http://ai.stanford.edu/~ang/papers/nips01-spectral.pdf
 |  |  | http://www.cs.columbia.edu/~jebara/4772/papers/Luxburg07_tutorial.pdf
 |  |  | [Optional] http://arxiv.org/pdf/0906.0612.pdf
 | 2020-10-14 | Homework #1 graded |
12 | 2020-10-21 | Kernel Principal Components & Independent Component Analysis (ICA) & Canonical Correlation Analysis (CCA) & PageRank | Chapter 14.5 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | ICA: Chapter 14.7 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | CCA: https://www.cs.cmu.edu/~tom/10701_sp11/slides/CCA_tutorial.pdf
 |  |  | PageRank: Chapter 5 of http://www.mmds.org/#book
 |  |  | PageRank: Chapter 14.10 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | PageRank: https://www.cs.purdue.edu/homes/dgleich/publications/Gleich%202015%20-%20prbeyond.pdf
 | 2020-10-21 | Homework #2 due at 11:59 PM Eastern |
 | TBD | Two-page project proposals due at 11:59 PM Eastern |
 | TBD | In-class project pitches |
13 | 2020-10-28 | Recommendation Systems Retrieval models | http://infolab.stanford.edu/~ullman/mmds/ch9.pdf
 |  |  | http://eliassi.org/papers/chaney-recsys15.pdf
 | TBD | Midterm exam |
 | TBD | Project proposals & pitches graded |
 |  | Project presentations and reports assigned |
14 | 2020-11-04 | Latent Variable Models & Probabilistic Topic Models | http://research.microsoft.com/pubs/67187/bishop-latent-erice-99.pdf
 |  |  | http://www.cs.columbia.edu/~blei/papers/Blei2012.pdf
 |  |  | http://www.cs.princeton.edu/~blei/papers/Blei2011.pdf
 |  |  | http://www.cs.columbia.edu/~blei/papers/BleiLafferty2009.pdf
15 | 2020-11-04 | Latent Variable Models & Probabilistic Topic Models (continued) | http://www.cs.berkeley.edu/~jordan/papers/variational-intro.pdf
 |  |  | http://www.cs.ubc.ca/~arnaud/andrieu_defreitas_doucet_jordan_intromontecarlomachinelearning.pdf
 |  |  | https://www.ee.washington.edu/techsite/papers/documents/UWEETR-2010-0006.pdf
 | 1900-01-13 | Homework #2 graded |
16 | 2020-11-11 | Matrix Factorization | Chapter 14.6 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
 |  |  | http://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization.pdf
17 | 2020-11-11 | Tensor Factorization | http://www.sandia.gov/~tgkolda/pubs/pubfiles/TensorReview.pdf
 | TBD | Midterm exam graded and returned at the end of lecture |
18 | 2020-11-18 | Sequence Models - Non IID |
 | 2020-11-25 | Thanksgiving Holiday – no class (Friday classes) |
19 | 2020-12-02 | Model Selection | Chapter 7 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
26 | 2020-12-02 | Theory of Clustering | http://www.cs.cornell.edu/home/kleinber/nips15.pdf
 |  |  | http://papers.nips.cc/paper/3491-measures-of-clustering-quality-a-working-set-of-axioms-for-clustering.pdf
27 | TBD | Project presentations |
 | 2020-12-10 | Last day of classes |
 | TBD | Project presentations graded |
 | TBD | Project reports due at 11:59 PM Eastern |
 | TBD | Project reports graded and final grades released. |

Textbooks

Abbreviation | Textbook Title | Author | Publisher | Year
PRML | Pattern Recognition and Machine Learning | Christopher M. Bishop | Springer | 2006
MLPP | Machine Learning: A Probabilistic Perspective | Kevin P. Murphy | MIT Press | 2012
CVMLI | Computer vision: models, learning and inference | Prince, Simon J D | Cambridge University Press | 2012
DL | Deep Learning | Goodfellow, Ian and Bengio, Yoshua and Courville, Aaron | MIT Press | 2016
FML | Foundations of Machine Learning | Mohri, Mehryar and Rostamizadeh, Afshin and Talwalkar, Ameet | MIT Press | 2012
DHS | Pattern Classification, 2nd ed | Duda, Richard O. and Hart, Peter E. and Stork, David G. | Wiley Interscience | 2004
ML | Machine Learning | Mitchell, Tom | McGraw Hill | 1997
I2ML | Introduction to Machine Learning, 2nd ed | Alpaydin, Ethem | MIT Press | 2012
MLAP | Machine Learning: An Algorithmic Perspective | Marsland, Stephen | CRC Press | 2009
PTPR | A Probabilistic Theory of Pattern Recognition | Devroye, Luc and Gyorfi, Laszlo and Lugosi, Gabor | Springer | 1997
ESL | The elements of statistical learning: Data mining, inference, and prediction | Friedman, J and Hastie, T and Tibshirani, R | Springer | 2009
NAPR | Netlab: Algorithms for Pattern Recognition | Nabney, Ian | Springer | 2002
DMPMLTT | Data Mining: Practical Machine Learning Tools and Techniques | Witten, Ian H and Frank, Eibe | Morgan Kaufmann | 2005
LAA | Linear Algebra and Its Applications | Strang, Gilbert | Elsevier Science | 2014
MC | Matrix computations, 4th ed | Golub, Gene H and Van Loan, Charles F | JHU Press | 2013
CO | Convex Optimization | Boyd, Stephen P and Vandenberghe, Lieven | Cambridge University Press | 2004
ILCO | Introductory lectures on convex optimization: a basic course | Nesterov, Yurii | Springer | 2004
GPML | Gaussian Processes for Machine Learning | Rasmussen, Carl Edward and Williams, Christopher K. I. | MIT Press | 2006
ITILA | Information Theory, Inference, and Learning Algorithms | MacKay, David | Cambridge University Press | 2003