Spring 2019:
MA 59800 Mathematical Aspects of Neural Networks
Instructor: Greg Buzzard
Assignments:
Note: The lowest 4 presentation summaries will be dropped (i.e., 2 weeks of summaries at 2 per week). Also, presenters do not need to prepare summaries for the papers they are presenting.
Week 13 (April 8-12):
- Read at least Part 6 of An Outsider's Tour of Reinforcement Learning.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- Programming Assignment: Due Tuesday, April 16 at 9pm. Here are the templates:
- Part 1: Python notebook
- Part 2: Python notebook
- Read Goodfellow, Ch 14 and Chollet, Chapter 8.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- Read Goodfellow, Ch 14 and Chollet, Section 8.4.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- Read Goodfellow, Ch 14 and Chollet, Section 8.4.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- Finish Programming assignment 3, due Thursday, March 21 at 9pm.
- Read Goodfellow, Ch 10 and Chollet, Chapter 6 and Section 7.1.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- UPDATE: Programming assignment, due March 21 at 9pm.
Implement two forms of word embedding.
- Use a dense embedding to implement CBOW. Follow the detailed instructions in this Jupyter notebook.
- Use LSTM to implement Skip Gram. Follow the detailed instructions in this Jupyter notebook.
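The detailed instructions are in the linked notebooks; as a rough sketch of the CBOW idea only (all sizes here are hypothetical, and plain NumPy stands in for the Keras layers used in the notebook):

```python
import numpy as np

# Toy CBOW forward pass: predict a center word from the average of its
# context-word embeddings. Vocabulary and embedding sizes are illustrative.
rng = np.random.default_rng(0)
vocab_size, embed_dim = 50, 8

E = rng.normal(size=(vocab_size, embed_dim))   # dense embedding matrix
W = rng.normal(size=(embed_dim, vocab_size))   # output projection

def cbow_logits(context_ids):
    """Average the context embeddings, then project to vocabulary logits."""
    h = E[context_ids].mean(axis=0)            # shape (embed_dim,)
    return h @ W                               # shape (vocab_size,)

def softmax(z):
    z = z - z.max()                            # stabilize before exponentiating
    p = np.exp(z)
    return p / p.sum()

probs = softmax(cbow_logits([3, 17, 20, 41]))  # a 4-word context window
print(probs.shape)
```

Training would then adjust E and W so that the true center word gets high probability; in the notebook this is done with Keras layers rather than raw matrices.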
- Read Goodfellow, Ch 10 and Chollet, Chapter 6 and Section 7.1.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- Read Goodfellow, Ch 10 and Chollet, Chapter 6 and Section 7.1.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- Update: Here is a notebook with code for the skip connection and the time encoding. Programming assignment, due Monday, February 25 at 8pm (template below). Note: Here are code samples for accessing files on Google Drive and for saving and loading models.
Use this template for both models below.
- Model 1: Modify the model described in Chollet, Section 6.3.6 (GRU plus recurrent dropout) as follows: (a) use a skip connection that adds the baseline prediction of the same temperature 24 hours ago to the output of the GRU, (b) use only 16 nodes in the GRU instead of 32, and (c) use a lookback of 432 (3 days).
- Model 2: Encode the date using 2 floating point numbers of the form (cos(2πt/365), sin(2πt/365)), where t is the number of days since January 1. Likewise, encode the time of day using a similar format. Include these values as input to the GRU + skip connection from Model 1.
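The cyclic encoding for Model 2 can be sketched as follows (the helper name and the example values are illustrative; the assignment's exact conventions are in the linked notebook):

```python
import math

def cyclic_encode(value, period):
    """Encode a periodic quantity as a point on the unit circle,
    so the end of the period is adjacent to the start
    (e.g. December 31 ends up next to January 1)."""
    angle = 2.0 * math.pi * value / period
    return (math.cos(angle), math.sin(angle))

# Day of year: t days since January 1, period 365.
day_features = cyclic_encode(31, 365)               # February 1
# Time of day: seconds since midnight, period 24 hours.
time_features = cyclic_encode(6 * 3600, 24 * 3600)  # 6:00 am
```

The point of the two-number form is that a raw day index makes December 31 (t = 364) and January 1 (t = 0) look maximally far apart, while on the circle they are neighbors.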
- Update: Use this notebook as a template for your submissions and follow the directions given there. Submit your work on Gradescope. Printing to pdf from Google Colaboratory works better on Chrome than other browsers.
Here are instructions about submitting on Gradescope.
Programming assignment, due Monday, February 18, 8pm. Details of the assignment are in this Jupyter Notebook. You must work in teams of 2 or 3 and identify your submission using the 598 IDs for all members of the team. You must do your work using a Jupyter notebook, and the first text box must include the 598 IDs for all team members. You will submit your work by printing to pdf. Submission details will be given later.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday. Note that presenters do not need to prepare summaries for the papers they are presenting.
- Read Chapter 9 in Goodfellow and Chapter 5 in Chollet.
- Step through the code for Chapter 5 of Chollet.
- Read the papers being presented on Wednesday.
- Prepare summaries for Friday.
- Read Chapter 6 in Goodfellow and Chapter 4 in Chollet.
- Step through the code for Chapter 4 of Chollet.
- Read Chapter 5 in Goodfellow and Chapter 3 in Chollet by Friday.
- Read both papers that will be presented on Wednesday. The schedule for paper presentations with links to the papers is here.
- Wednesday: please go to WTHR 320 for the seminar.
- Friday: Paper summaries will be due in class as in week 2. Also, step through each of the code examples for Chollet Chapter 3 before Friday.
For future reference, when you give a presentation or submit a summary of a paper, you should clearly address each item in this list of questions for papers/presentations, in addition to anything else needed to explain the paper.
Presentation schedule:
Click here for the presentation schedule.
- Read Chapters 1-3 in Goodfellow and Chapters 1-2 in Chollet by Friday.
- Read both papers that will be presented on Wednesday. The schedule for paper presentations with links to the papers is here.
- Wednesday: please go to WTHR 320 for the seminar.
- Friday: Paper summaries will be due in class - one for each paper presented on Wednesday. Each summary should be at most one page (12-point font, 1-inch margins) and should address each of the main points described in this document. Try to be short and to the point. Put your personal code (sent in a separate email) at the top of the paper - do not put your name on it. Papers will be graded on a scale of 0, 1, 2, 3.
- 0 = no work or essentially no work
- 1 = some work, but important items missing
- 2 = reasonable summary of paper highlights, all important items present
- 3 = requirements for 2 plus description of something important in the paper but not in the presentation
- Read Chapters 1 and 2 of the Goodfellow book and Chapters 1 and 2 of the Chollet book.
- Set up to use a Jupyter notebook (see notes on the course web page) and load the example code 2.1. In Colaboratory, choose 'Runtime' along the top, then 'Change runtime type', and under 'Hardware accelerator' choose 'GPU'.
- After the line
network.add(layers.Dense(10, activation='softmax'))
add a new line
network.summary()
Then step through the code from the top (put your mouse in a code box, click the arrow at the left, read the output and any text, and repeat for the next code box). Try to understand as much of the code and explanation as possible.
- Once you've done this, choose 'Run all' under 'Runtime'. Then try changing the network to use multiple layers (but leave the final layer the same).
- Can you find a network using only Dense layers with relu activation that does better than the original network? Can you do it without increasing the number of trainable parameters or the number of training epochs? Note that you have to run from the beginning each time you make a change in order to avoid cheating by training for more than 5 epochs.
- If you find a better network, print out the model summary, the training record, and the test accuracy. This is not a graded exercise, so don't get stuck doing this when you should be doing other things.
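The "without increasing the number of trainable parameters" budget can be checked by hand before running anything; a small helper (a sketch, independent of Keras - network.summary() reports the same counts, and the alternative architecture below is just a made-up example):

```python
def dense_params(layer_sizes):
    """Trainable parameters of a stack of fully connected (Dense) layers:
    each layer contributes fan_in * fan_out weights plus fan_out biases."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Original MNIST network from Chollet's example 2.1: 784 -> 512 -> 10.
original = dense_params([784, 512, 10])

# A hypothetical two-hidden-layer alternative: 784 -> 384 -> 256 -> 10.
alternative = dense_params([784, 384, 256, 10])

print(original, alternative)  # the alternative stays under the budget
```

Checking candidate architectures this way is much faster than building each one in Keras just to read off its summary.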
- Find a partner for a presentation. You can use the Piazza forum if needed.
- Look through the papers for presentations and select your top 5.