Machine Learning and Uncertainty Quantification for Data Science

MA59800-550, Fall 2016

Office: MATH 410, 150 N. University Street, West Lafayette, IN, 47907

Office hours: TR 04:20-05:00pm (tentative) or by appointment

Email: guanglin [at] purdue.edu

--------------------------------------------------------------------------------------------------------

Lecture Time and Location

 

TR 03:00-04:15pm

Classroom: UNIV 103

--------------------------------------------------------------------------------------------------------

Syllabus


--------------------------------------------------------------------------------------------------------

Graduate Course Description

This introductory course covers core concepts, models, algorithms, and Python code for machine learning and uncertainty quantification in data science. Topics include classical supervised learning (e.g., regression and classification), unsupervised learning (e.g., principal component analysis and K-means), uncertainty quantification algorithms (e.g., importance sampling and Markov chain Monte Carlo), and recent developments in machine learning and uncertainty quantification such as deep learning, variational Bayes, and Gaussian processes. While this course gives students the basic ideas, intuition, and hands-on practice behind modern machine learning and uncertainty quantification methods, its underlying theme is probabilistic inference for data science.
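As a small taste of the probabilistic-inference theme, the sketch below (illustrative only, not course material; the flip counts are made up) computes a Bayesian posterior for a coin's heads probability using the conjugate Beta-Bernoulli model:

```python
# Illustrative sketch: Bayesian inference for a coin's heads probability
# with a conjugate Beta prior. Observing `heads` and `tails` Bernoulli
# outcomes updates Beta(a, b) to Beta(a + heads, b + tails).
a, b = 1.0, 1.0                          # uniform Beta(1, 1) prior
heads, tails = 7, 3                      # hypothetical observed flips
a_post, b_post = a + heads, b + tails    # conjugate posterior: Beta(8, 4)

posterior_mean = a_post / (a_post + b_post)   # = 8 / 12
print(f"Posterior: Beta({a_post:.0f}, {b_post:.0f}), mean = {posterior_mean:.3f}")
```

Conjugacy keeps the update to two additions; the same inference done numerically (e.g., by Markov chain Monte Carlo) is one of the topics below.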

--------------------------------------------------------------------------------------------------------

Tentative Topics

1. Review of basic concepts in information theory and probability distributions

2. Unsupervised machine learning algorithms

a. Principal component analysis

3. Supervised machine learning algorithms

a. Active subspace algorithm

b. Sliced inverse regression

c. Localized sliced inverse regression

4. Dimension reduction algorithms and data compression

5. Compressive sensing

6. Linear regression and classification

7. Bayesian inference

8. Cluster analysis (K-means clustering, mixture models, and expectation maximization)

9. Stochastic gradient descent

10. Random forests

11. Hidden Markov models

12. Deterministic approximate inference: variational Bayes and expectation propagation

13. Support vector machine regression and classification

14. Gaussian process regression and classification

15. Uncertainty quantification algorithms

a. Monte Carlo sampling

b. Latin hypercube sampling

c. Importance sampling

d. Polynomial chaos method

e. Gaussian process regression

f. Compressive sensing

16. Data assimilation (Kalman filter, particle filter)
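To illustrate the simplest of the uncertainty quantification algorithms above, the hedged sketch below (assumed model f(x) = x², chosen for illustration) propagates an uncertain input through a model by plain Monte Carlo sampling and reports the estimate with its standard error:

```python
import numpy as np

# Illustrative sketch of plain Monte Carlo uncertainty quantification:
# propagate an uncertain input x ~ N(0, 1) through a model f(x) = x**2
# and estimate the output mean E[f(x)] (exact value: 1).
rng = np.random.default_rng(0)
n = 100_000
x = rng.standard_normal(n)            # samples of the uncertain input
y = x**2                              # corresponding model outputs

mean = y.mean()                       # Monte Carlo estimate of E[f(x)]
stderr = y.std(ddof=1) / np.sqrt(n)   # sampling error shrinks like 1/sqrt(n)
print(f"E[f(x)] estimate: {mean:.3f} +/- {stderr:.3f}")
```

The slow 1/sqrt(n) convergence of this estimator is what motivates the alternatives in topic 15: Latin hypercube and importance sampling reduce variance, while polynomial chaos and Gaussian process surrogates replace the expensive model itself.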


--------------------------------------------------------------------------------------------------------

Prerequisites

Basic linear algebra, calculus, and probability, or permission of instructor.

--------------------------------------------------------------------------------------------------------

Textbooks

Pattern Recognition and Machine Learning, Christopher M. Bishop, 2007

Gaussian processes for machine learning, Carl Edward Rasmussen and Christopher K. I. Williams, 2005

--------------------------------------------------------------------------------------------------------

Lecture Notes (interactive Python Jupyter notebooks):

https://github.com/PredictiveModelingMachineLearningLab/MA598

 

--------------------------------------------------------------------------------------------------------

Assignments

--------------------------------------------------------------------------------------------------------

Grading