Time and Location. Mondays 11:10–12:30 in Evans-740.
Course description.
The course will survey modern developments in (applied) Deep Learning for an audience of (pure) mathematicians. Each lecture will give a thin slice of a major
topic in deep learning, any one of which could fill a full course in the CS department. The imagined audience
has high mathematical sophistication, but little actual familiarity with the mathematics relevant to machine learning (e.g., statistics and optimization)
and relatively little computer science background.
(However, mastery of analysis, abstract linear algebra, and probability theory will be assumed. Enrolled students should also be proficient with Python.)
Lectures will take a theoretical perspective, supplemented by student presentations and implementation projects.
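To give a sense of the expected Python proficiency, here is a minimal sketch (an illustration only, not an actual course assignment; the projects themselves are specified in the Projects document): a two-layer neural network fit to a toy regression problem, written from scratch in NumPy with the gradient computed by hand.

```python
# Illustrative sketch: a two-layer network trained by plain gradient descent.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = sin(x) on [-pi, pi].
X = rng.uniform(-np.pi, np.pi, size=(256, 1))
y = np.sin(X)

# Two-layer network R -> R^32 -> R with a tanh nonlinearity.
W1 = rng.normal(0.0, 1.0, size=(1, 32))
b1 = np.zeros(32)
W2 = rng.normal(0.0, 1.0 / np.sqrt(32), size=(32, 1))
b2 = np.zeros(1)

lr = 0.1
for step in range(2000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)      # hidden activations, shape (256, 32)
    pred = h @ W2 + b2            # predictions, shape (256, 1)
    resid = pred - y
    loss = np.mean(resid ** 2)    # mean-squared-error loss

    # Backward pass: the chain rule, written out by hand.
    grad_pred = 2.0 * resid / len(X)       # dL/dpred
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T
    grad_z = grad_h * (1.0 - h ** 2)       # tanh'(z) = 1 - tanh(z)^2
    grad_W1 = X.T @ grad_z
    grad_b1 = grad_z.sum(axis=0)

    # Plain gradient-descent step.
    for param, grad in ((W1, grad_W1), (b1, grad_b1),
                        (W2, grad_W2), (b2, grad_b2)):
        param -= lr * grad

print(f"final training loss: {loss:.4f}")
```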
Course References. The instructor will distribute course notes here. Beyond these, the class will not follow any single existing reference, but here are some generally useful resources.
Specific references for each lecture will be indicated in the Schedule document.
- Bishop and Bishop, Deep Learning — Foundations and Concepts (2024). A well-written and modern textbook, which surveys a selection of topics similar to this course.
- Kaplan, Notes on Contemporary Machine Learning for Physicists. Idiosyncratic notes from the perspective of a physicist (and later cofounder of Anthropic).
- Distill.pub. An online journal focused on exposition, with many high-quality articles on select topics.
- colah's blog. A blog with thoughtful and well-illustrated posts, similar to Distill articles.
- Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning. A comprehensive textbook covering mathematical and statistical background for classical machine learning.
- MacKay, Information Theory, Inference, and Learning Algorithms. A charming and insightful textbook, if somewhat dated.
Other course documents (e.g., Syllabus) are posted on bCourses. For the convenience of auditors, the
Schedule and Projects are posted here.
Course schedule. Lecture notes (in progress).
- Week 1: Introduction to neural networks (Sep 8).
- Week 2: Information theory (Sep 15).
- Week 3: Statistical inference (Sep 22).
- Week 4: Optimization (Sep 29).
- Week 5: Convolutional neural networks (Oct 6).
- Week 6: Recurrent neural networks (Oct 13).
- Week 7: Transformers (Oct 20).
- Week 8: Large language models (Oct 27).
- Week 9: Generative adversarial networks and variational autoencoders (Nov 3).
- Week 10: Diffusion models (Nov 10).
- Week 11: No class (Nov 17).
- Week 12: Reinforcement learning I: value function optimization (Nov 24).
- Week 13: Reinforcement learning II: policy optimization (Dec 1).