Focus
This direction studies curvature-aware algorithms that make modern machine learning
more efficient, reliable, and theoretically grounded. The work spans Newton and
quasi-Newton methods, cubic regularization, variance-reduced stochastic methods,
higher-order methods, Hessian sketches, and communication-efficient second-order algorithms.
Typical Questions
- When does curvature information give real gains over first-order training?
- How can Newton, cubic Newton, and quasi-Newton methods scale to large models?
- How can stochastic recursive gradients reduce variance without expensive full gradients?
- How should second-order information be compressed, sketched, or distributed?
- Can we obtain clean global convergence rates while keeping the methods practical to implement?
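
As a toy illustration of the first question, the sketch below compares gradient descent with a single Newton step on an ill-conditioned quadratic. The objective, the curvature values in `h`, the starting point, and the step count are all assumptions chosen for illustration; they are not taken from any particular result in this direction.

```python
# Illustrative sketch: gradient descent vs. one Newton step on the
# diagonal quadratic f(x) = 0.5 * sum(h_i * x_i**2), minimized at x = 0.
# All numbers below are assumptions picked to give condition number 100.

def grad(x, h):
    # Gradient of f: component i is h_i * x_i.
    return [hi * xi for hi, xi in zip(h, x)]

def gradient_descent(x, h, steps):
    # Fixed step size 1/L, where L = max(h) is the largest curvature.
    step_size = 1.0 / max(h)
    for _ in range(steps):
        g = grad(x, h)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x

def newton_step(x, h):
    # The Hessian is diag(h), so the Newton direction divides each
    # gradient component by its own curvature.
    g = grad(x, h)
    return [xi - gi / hi for xi, gi, hi in zip(x, g, h)]

h = [1.0, 100.0]   # curvatures along the two axes
x0 = [1.0, 1.0]    # starting point

x_gd = gradient_descent(x0, h, steps=100)
x_newton = newton_step(x0, h)

print("gradient descent after 100 steps:", x_gd)
print("Newton after 1 step:", x_newton)
```

On this quadratic the Newton step reaches the minimizer exactly, while gradient descent still has a large error along the low-curvature axis after 100 steps. The example sidesteps the hard part: for large models the Hessian is neither diagonal nor cheap to form or invert, which is what the scaling, sketching, and distribution questions above are about.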