Introduction
Lecture 1: Introduction
In this lecture we review the basic concepts and discuss some introductory examples; a small code sketch of the examples follows the topic list.
What we will learn
Introduction
Examples: Linear Regression, Logistic Regression, SVM
Convex Functions
Convex Sets
Convex Problems
Scope and Summary
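
As a quick illustration of these examples (a sketch, not part of the lecture; the toy data and function names below are assumptions), each of the three models can be written as a convex empirical loss in the weight vector w:

import numpy as np

def linear_regression_loss(w, X, y):
    # Average squared loss (1/n) * sum_i (<x_i, w> - y_i)^2; convex in w.
    return np.mean((X @ w - y) ** 2)

def logistic_loss(w, X, y):
    # Average logistic loss with labels y_i in {-1, +1}; convex in w.
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

def hinge_loss(w, X, y):
    # Average SVM hinge loss max(0, 1 - y_i * <x_i, w>); convex in w.
    return np.mean(np.maximum(0.0, 1.0 - y * (X @ w)))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                            # toy data, for illustration only
y = np.sign(X @ np.ones(5) + 0.1 * rng.normal(size=100))
w = np.zeros(5)
print(linear_regression_loss(w, X, y), logistic_loss(w, X, y), hinge_loss(w, X, y))
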
Introduction
Lecture 2: Convex Optimization
In this lecture we review the basic concepts and provide preliminary results; a short sketch of the first-order oracle appears after the topic list.
What we will learn
Basic notations and definitions.
Separation Theorems.
(Sub)gradient.
First-order optimality conditions.
Convex problems.
First-order oracle.
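
A minimal sketch of the first-order oracle model (illustrative only; the hinge loss is an assumed example): the oracle returns the function value together with one subgradient, which is the only access the later algorithms have to the objective:

import numpy as np

def hinge_oracle(w, x, y):
    # First-order oracle for f(w) = max(0, 1 - y * <x, w>): returns the value
    # and one subgradient (the loss is convex but not differentiable at the
    # kink, so any subgradient is acceptable).
    margin = y * np.dot(x, w)
    value = max(0.0, 1.0 - margin)
    if margin < 1.0:
        g = -y * x             # gradient on the linear piece
    else:
        g = np.zeros_like(w)   # 0 is a valid subgradient where the loss is flat
    return value, g

w = np.array([0.5, -0.2])
value, g = hinge_oracle(w, x=np.array([1.0, 2.0]), y=1.0)
print(value, g)
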
Optimization
Lecture 3: Cutting Plane Methods
In this lecture we review cutting plane methods, which are first-order optimization methods; a sketch of the ellipsoid method appears after the topic list.
What we will learn
Center of Gravity.
Separation oracle.
Linear programs.
Semi-definite programs.
Ellipsoid Method.
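
A minimal sketch of the central-cut ellipsoid method (illustrative, not taken from the lecture; the objective, initial radius and iteration budget are assumptions, and the update requires n >= 2):

import numpy as np

def ellipsoid_minimize(oracle, n, radius, num_iters):
    # Central-cut ellipsoid method: the current ellipsoid is
    # {x : (x - c)^T P^{-1} (x - c) <= 1}.  Each step queries a subgradient at
    # the center and keeps the half-ellipsoid where smaller values may still lie.
    c = np.zeros(n)
    P = (radius ** 2) * np.eye(n)
    best_x, best_val = c.copy(), oracle(c)[0]
    for _ in range(num_iters):
        val, g = oracle(c)
        if val < best_val:
            best_x, best_val = c.copy(), val
        if np.allclose(g, 0):
            return c, val                          # center is already optimal
        g_tilde = g / np.sqrt(g @ P @ g)           # normalize in the P metric
        c = c - (P @ g_tilde) / (n + 1)
        P = (n ** 2 / (n ** 2 - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(P @ g_tilde, P @ g_tilde))
    return best_x, best_val

# Toy run: minimize ||x - x_star||^2 in two dimensions.
x_star = np.array([1.0, -2.0])
oracle = lambda x: (float(np.sum((x - x_star) ** 2)), 2.0 * (x - x_star))
print(ellipsoid_minimize(oracle, n=2, radius=10.0, num_iters=100))
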
Optimization
Lecture 4: Gradient Descent
In this lecture we introduce the gradient descent method and analyze it; a sketch of the projected variant follows the topic list.
What we will learn
Linear projections
Lipschitzness
Projected (sub)Gradient Descent
GD Analysis
General case
Strong convexity
Smooth functions
Well-conditioned functions
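
A sketch of projected subgradient descent under the assumptions of this lecture, convex and Lipschitz objectives over a bounded domain (the ball domain, the 1-Lipschitz example and the step size eta = R / (L * sqrt(T)) are illustrative choices):

import numpy as np

def project_onto_ball(x, radius):
    # Euclidean projection onto the ball {x : ||x|| <= radius}.
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x

def projected_subgradient_descent(subgrad, x0, radius, lipschitz, num_steps):
    # Projected subgradient descent with the standard step size
    # eta = R / (L * sqrt(T)) and averaging of the iterates, which yields the
    # O(R * L / sqrt(T)) guarantee for convex Lipschitz functions.
    eta = radius / (lipschitz * np.sqrt(num_steps))
    x = x0.copy()
    iterates = []
    for _ in range(num_steps):
        iterates.append(x.copy())
        x = project_onto_ball(x - eta * subgrad(x), radius)
    return np.mean(iterates, axis=0)   # average iterate

# Example: minimize the 1-Lipschitz convex function f(x) = ||x - target||.
target = np.array([0.5, -0.5])
subgrad = lambda x: (x - target) / max(np.linalg.norm(x - target), 1e-12)
x_bar = projected_subgradient_descent(subgrad, np.zeros(2), radius=1.0, lipschitz=1.0, num_steps=1000)
print(x_bar)
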
Generalization
Lecture 5: Stochastic Optimization
In this lecture we begin to discuss the setting of stochastic convex optimization, or learning of convex functions; a small sketch of the setup follows the topic list.
What we will learn
General setup
Empirical risk
Hoeffding's inequality
Uniform convergence
Empirical risk minimization
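
A small sketch of the setup (illustrative; the Bernoulli loss and all constants are assumptions): for a single fixed predictor with a [0, 1]-bounded loss, Hoeffding's inequality bounds the probability that the empirical risk deviates from the population risk:

import numpy as np

rng = np.random.default_rng(0)

def loss_samples(n):
    # i.i.d. draws of a [0, 1]-bounded loss (a Bernoulli(0.3) loss, for illustration).
    return (rng.random(n) < 0.3).astype(float)

n, t = 1000, 0.05
population_risk = 0.3
repeats = 2000
emp_risks = np.array([loss_samples(n).mean() for _ in range(repeats)])
observed = np.mean(np.abs(emp_risks - population_risk) > t)
hoeffding = 2 * np.exp(-2 * n * t ** 2)    # Hoeffding: P(|emp - pop| > t) <= 2 exp(-2 n t^2)
print(f"observed deviation prob ~ {observed:.4f}, Hoeffding bound = {hoeffding:.4f}")
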
Generalization
Lecture 6: Uniform Convergence Bounds
In this lecture we discuss generalization and present the first technique for proving it: uniform convergence bounds. A sketch estimating a Rademacher complexity follows the topic list.
What we will learn
Covering Numbers.
Union Bound.
ε-approximation.
General Linear Models.
Rademacher Complexity.
McDiarmid's Inequality.
Dimension independent generalization bounds.
Regularization.
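
As one concrete instance of these tools (a sketch, not from the lecture; the sample, the norm bound and the number of sign draws are assumptions), the empirical Rademacher complexity of a norm-bounded linear class can be estimated by Monte Carlo:

import numpy as np

def empirical_rademacher_linear(X, B, num_draws=2000, seed=0):
    # Monte Carlo estimate of the empirical Rademacher complexity of the class
    # {x -> <w, x> : ||w||_2 <= B} on the sample X:
    # R_S = (B / n) * E_sigma || sum_i sigma_i x_i ||_2,
    # since the supremum over the class aligns w with the signed sum.
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(num_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)
        vals.append(np.linalg.norm(sigma @ X))
    return B * np.mean(vals) / n

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))   # the standard bound gives R_S <= B * max_i ||x_i|| / sqrt(n)
print(empirical_rademacher_linear(X, B=1.0))
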
Generalization
Lecture 7: Stability
In this lecture we discuss stability, an alternative technique for proving generalization bounds; a small numerical sketch follows the topic list.
What we will learn
Uniform Stability
Analysis of uniformly stable algorithms
Regularization (revisited)
Uniform convergence lower bounds
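
A small numerical sketch of the stability viewpoint (illustrative; ridge regression stands in here for regularized ERM and all constants are assumed): replacing a single training example moves the regularized ERM solution only slightly, and the regularization strength controls by how much:

import numpy as np

def ridge_erm(X, y, lam):
    # Regularized ERM (ridge regression) in closed form:
    # argmin_w (1/n) * ||Xw - y||^2 + lam * ||w||^2.
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 0.1
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + 0.1 * rng.normal(size=n)

w = ridge_erm(X, y, lam)
X_prime, y_prime = X.copy(), y.copy()
X_prime[0], y_prime[0] = rng.normal(size=d), rng.normal()   # neighboring dataset: one example replaced
w_prime = ridge_erm(X_prime, y_prime, lam)
print("parameter change after replacing one example:", np.linalg.norm(w - w_prime))
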
Generalization
Lecture 8: Stochastic Gradient Descent
In this lecture we introduce and analyze stochastic gradient descent; a short sketch of the algorithm follows the topic list.
What we will learn
Noisy first-order oracle.
SGD algorithm.
Azuma's Inequality.
Fast rates.
Generalization lower bounds.
Identifying a biased coin.
Pinsker's Inequality.
Optimality of SGD.
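
A minimal sketch of SGD under the noisy first-order oracle model (the objective, domain radius, step size and iteration count below are illustrative assumptions): each step uses an unbiased stochastic gradient, and the analysis controls the average iterate:

import numpy as np

def sgd(noisy_subgrad, x0, radius, eta, num_steps, rng):
    # SGD sketch: query the noisy first-order oracle, take a step, project back
    # onto the ball domain, and return the average iterate.
    x = x0.copy()
    iterates = []
    for _ in range(num_steps):
        iterates.append(x.copy())
        g = noisy_subgrad(x, rng)
        x = x - eta * g
        norm = np.linalg.norm(x)
        if norm > radius:                 # Euclidean projection onto the ball
            x *= radius / norm
    return np.mean(iterates, axis=0)

# Example: F(x) = E_z ||x - z||^2 with z ~ N(target, I); the gradient 2 * (x - z)
# computed from a single sample z is an unbiased estimate of the true gradient.
target = np.array([0.3, -0.4])
noisy_subgrad = lambda x, rng: 2.0 * (x - (target + rng.normal(size=2)))
rng = np.random.default_rng(0)
x_bar = sgd(noisy_subgrad, np.zeros(2), radius=1.0, eta=0.01, num_steps=5000, rng=rng)
print(x_bar)   # close to `target`, the minimizer of the population risk
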
Algorithms
Lecture 9: Stability Analysis for GD
In this lecture we provide an algorithm-dependent generalization analysis for GD; a small numerical sketch follows the topic list.
What we will learn
Stability by proximity, and analysis for strongly convex functions
Expansiveness, and analysis for smooth functions
Stability for general convex functions
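
A sketch mirroring the proximity/expansiveness argument empirically (illustrative; least squares, the step size and the data are assumptions): run full-batch GD on two datasets that differ in one example and track how far the two trajectories drift apart:

import numpy as np

def gd_iterates(X, y, eta, num_steps):
    # Plain full-batch GD on the smooth least-squares objective (1/n) * ||Xw - y||^2,
    # returning all iterates.
    n, d = X.shape
    w = np.zeros(d)
    ws = [w.copy()]
    for _ in range(num_steps):
        w = w - eta * (2.0 / n) * X.T @ (X @ w - y)
        ws.append(w.copy())
    return ws

rng = np.random.default_rng(0)
n, d, eta, T = 200, 5, 0.05, 100
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + 0.1 * rng.normal(size=n)
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.normal(size=d), rng.normal()    # neighboring dataset: one example replaced

dists = [np.linalg.norm(w - w2) for w, w2 in zip(gd_iterates(X, y, eta, T),
                                                 gd_iterates(X2, y2, eta, T))]
print("iterate distance at steps 10, 50, 100:", dists[10], dists[50], dists[100])
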
Algorithms
Lecture 10: Generalization Lower Bounds
In this lecture we provide generalization lower bounds that are algorithm-specific.
What we will learn
Generalization lower bound for smooth functions.
Generalization lower bound for general functions.
Revisiting the role of regularization.
Full batch methods.