**Machine Learning and Uncertainty Quantification for Data Science**

MA59800-550, Fall 2016

Office hours: TR 04:20-05:00pm (tentative), or by appointment

Email: guanglin [at] purdue.edu

--------------------------------------------------------------------------------------------------------

**Lectures Time and Location**

TR 03:00-04:15pm

Classroom: UNIV 103

--------------------------------------------------------------------------------------------------------

**Syllabus**

--------------------------------------------------------------------------------------------------------

**Graduate Course Description**

This introductory course covers core concepts, models, algorithms, and Python code for machine learning and uncertainty quantification in data science. Topics include classical supervised learning (e.g., regression and classification), unsupervised learning (e.g., principal component analysis and K-means), uncertainty quantification algorithms (e.g., importance sampling and Markov chain Monte Carlo), and recent developments in machine learning and uncertainty quantification such as deep learning, variational Bayes, and Gaussian processes. While the course gives students the basic ideas, intuition, and hands-on practice behind modern machine learning and uncertainty quantification methods, its underlying theme is probabilistic inference for data science.
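
To give a taste of the hands-on style, here is a minimal sketch (illustrative only, not taken from the course repository) of principal component analysis with plain NumPy; the Jupyter notebooks linked below are the authoritative course examples.

```python
# Minimal PCA sketch via the SVD (illustrative; assumes only NumPy).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # 200 samples, 5 features (toy data)
Xc = X - X.mean(axis=0)                  # center each feature
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:2]                      # top-2 principal directions
scores = Xc @ components.T               # project the data onto them
explained = s[:2] ** 2 / np.sum(s ** 2)  # fraction of variance explained
print(explained)
```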

**Tentative Topics**

1. Review of basic concepts in information theory and probability distributions

2. Unsupervised machine learning algorithms

a. Principal component analysis

3. Supervised machine learning algorithms

a. Active subspace algorithm

b. Sliced inverse regression

c. Localized sliced inverse regression

4. Dimension reduction algorithms and data compression

5. Compressive sensing algorithm

6. Linear regression and classification

7. Bayesian inference

8. Clustering analysis (K-means clustering, mixture models, and expectation maximization)

9. Stochastic gradient descent algorithm

10. Random forest

11. Hidden Markov models

12. Deterministic approximate inference: variational Bayes and expectation propagation

13. Support vector machine regression and classification

14. Gaussian process regression and Gaussian process classification

15. Uncertainty quantification algorithms (a short sampling sketch follows this list)

a. Monte Carlo

b. Latin hypercube sampling

c. Importance sampling

d. Polynomial chaos method

e. Gaussian process regression

f. Compressive sensing

16. Data assimilation (Kalman filter, particle filter)
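
As a flavor of topic 15, the sketch below (illustrative only, not course code; assumes NumPy and SciPy) compares plain Monte Carlo with importance sampling for a rare-event probability, where reweighting samples from a shifted proposal gives a far lower-variance estimate.

```python
# Plain Monte Carlo vs. importance sampling for P(X > 4), X ~ N(0, 1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 100_000
true_p = stats.norm.sf(4.0)               # exact tail probability for reference

# Plain Monte Carlo: almost no samples land in the far tail.
x = rng.normal(size=n)
mc_est = np.mean(x > 4.0)

# Importance sampling: draw from a proposal centered in the tail, N(4, 1),
# and reweight each sample by the likelihood ratio p(y)/q(y).
y = rng.normal(loc=4.0, size=n)
w = stats.norm.pdf(y) / stats.norm.pdf(y, loc=4.0)
is_est = np.mean((y > 4.0) * w)

print(true_p, mc_est, is_est)
```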

--------------------------------------------------------------------------------------------------------

**Prerequisites**

Basic linear algebra,
calculus, and probability, or permission of instructor.

**Textbooks**

Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006

Gaussian Processes for Machine Learning, Carl Edward Rasmussen and Christopher K. I. Williams, 2006

**Lecture Notes in Interactive Python Jupyter Notebooks:**

**https://github.com/PredictiveModelingMachineLearningLab/MA598**

**Assignments**

- Homework (links will be activated as homework is assigned). Copying will not be tolerated.
- Homework will be submitted through Blackboard.
- Review of recent research: Students will choose a subtopic of machine learning research, select three recent conference papers on that topic, write a 2-page report outlining the papers' main ideas and relating them to the course material, and give an in-class presentation on those ideas.
- Final project: You are required to complete a class project. The choice of topic is up to you, as long as it clearly pertains to the course material. To ensure that you are on the right track, you must submit a one-paragraph description of your project one month before the project is due. You are encouraged to collaborate on the project, but we expect a four-page write-up that clearly describes the project goal, methods, and results. Each group should submit only one copy of the write-up and include the names of all group members; the page limit grows with group size (a two-person group has 6 pages, a three-person group has 8 pages, and so on).

**Grading**

- 10% Participation
- 15% Review of recent research
- 40% Homework
- 35% Final Project