Focus

This direction studies curvature-aware algorithms that make modern machine learning more efficient, reliable, and theoretically grounded. The work spans Newton and quasi-Newton methods, cubic regularization, variance-reduced stochastic methods, higher-order methods, Hessian sketches, and communication-efficient second-order algorithms.
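
As a minimal illustration of what curvature information can buy, the sketch below compares gradient descent with a single Newton step on an ill-conditioned quadratic. The toy problem, the function names, and the step-size choice are assumptions made for this example; they are not taken from any specific paper in this direction.

```python
import numpy as np

def gradient_descent(A, b, x0, steps):
    """Minimize 0.5 * x^T A x - b^T x with step size 1/L (L = largest eigenvalue of A)."""
    L = np.linalg.eigvalsh(A).max()
    x = x0.copy()
    for _ in range(steps):
        x = x - (A @ x - b) / L  # first-order step, ignores curvature
    return x

def newton_step(A, b, x):
    """One Newton step: rescale the gradient by the inverse Hessian (here A)."""
    grad = A @ x - b
    return x - np.linalg.solve(A, grad)

# Hypothetical ill-conditioned problem: condition number 1000.
A = np.diag([1.0, 1000.0])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)  # exact minimizer
x0 = np.zeros(2)

gd_err = np.linalg.norm(gradient_descent(A, b, x0, 100) - x_star)
newton_err = np.linalg.norm(newton_step(A, b, x0) - x_star)
```

On this toy problem a single Newton step lands on the minimizer, while 100 gradient steps still leave most of the error along the flat direction. The example is deliberately favorable to Newton: the objective is exactly quadratic and the Hessian is cheap to invert, which is not the case for large models.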

Typical Questions

  • When does curvature information give real gains over first-order training?
  • How can Newton, cubic Newton, and quasi-Newton methods scale to large models?
  • How can stochastic recursive gradients reduce variance without expensive full gradients?
  • How should second-order information be compressed, sketched, or distributed?
  • Can we obtain clean global rates while preserving practical implementability?
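
The recursive gradient estimator behind the SARAH question above can be sketched as follows. This is a simplified version of plain SARAH on a hypothetical least-squares problem: the function names, the data, the step size, and the choice to return the last iterate are assumptions made for illustration. Plain SARAH still computes one full gradient per outer loop; variants that relax this requirement are not shown.

```python
import numpy as np

def sarah(A, y, w0, lr, outer_iters, inner_iters, rng):
    """Simplified SARAH for the least-squares loss (1/2n) * ||A w - y||^2."""
    n = A.shape[0]
    w = w0.copy()
    for _ in range(outer_iters):
        # One full gradient per outer loop anchors the estimator.
        v = A.T @ (A @ w - y) / n
        w_prev, w = w, w - lr * v
        for _ in range(inner_iters):
            i = rng.integers(n)  # sample one data point uniformly
            a = A[i]
            # Recursive update: v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1}
            v = a * (a @ w - y[i]) - a * (a @ w_prev - y[i]) + v
            w_prev, w = w, w - lr * v
    return w  # last iterate (an assumption; other output rules are possible)

def loss(A, y, w):
    r = A @ w - y
    return 0.5 * np.mean(r ** 2)

# Hypothetical noiseless regression problem.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y = A @ w_true

# Conservative step size based on the largest per-sample smoothness constant.
lr = 0.5 / np.max(np.sum(A ** 2, axis=1))
w0 = np.zeros(5)
w_hat = sarah(A, y, w0, lr, outer_iters=5, inner_iters=400, rng=rng)

initial_loss = loss(A, y, w0)
final_loss = loss(A, y, w_hat)
```

Unlike SVRG-style estimators, which compare against a fixed snapshot point, the update above reuses the previous estimate v and the previous iterate, so each inner step needs only two per-sample gradients.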

Selected papers

Second-Order Optimization Highlights

3 papers

Variance reduction

SARAH and AI-SARAH Line

4 papers

Quasi-Newton methods

Scalable Curvature Approximations

4 papers