A STOCHASTIC SUB-GRADIENT METHOD FOR DISTRIBUTIONALLY ROBUST AND RISK-AVERSE LEARNING

MERT GURBUZBALABAN – RUTGERS UNIVERSITY

ABSTRACT

We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to uncertainty in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses semi-deviation risk to quantify uncertainty, allowing us to compute solutions that are robust against perturbations in the population data distribution. We consider a large family of loss functions that can be non-convex and non-smooth, including those arising in deep learning with ReLU activations, and develop an efficient stochastic subgradient method. We prove that it converges to a point satisfying the optimality conditions. To our knowledge, this is the first method with rigorous convergence guarantees for non-convex, non-smooth distributionally robust stochastic optimization. Our method can achieve any desired level of robustness with little extra computational cost compared to population risk minimization, and it admits non-asymptotic convergence and finite-sample guarantees when the loss function is weakly convex. We also illustrate the performance of our algorithm on real datasets arising in convex and non-convex supervised learning problems, including deep learning. Finally, we discuss how risk measures can be used to obtain robustness guarantees for stochastic gradient methods and their accelerated variants. This is joint work with Bugra Can, Andrzej Ruszczyński, and Landi Zhu.
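To make the role of the semi-deviation risk concrete, the Python sketch below shows one way a stochastic subgradient iteration for the mean-upper-semideviation objective rho(x) = E[l(x, xi)] + c * E[(l(x, xi) - E[l(x, xi)])_+], with 0 <= c <= 1, can be organized. It is a simplified illustration under my own assumptions (the callables loss_and_grad and sample_batch, the step sizes, and the single running estimate of the mean loss are hypothetical placeholders), not a verbatim reproduction of the algorithm analyzed in the papers below.

import numpy as np

def robust_subgradient_method(loss_and_grad, sample_batch, x0,
                              c=0.5, steps=1000, lr=1e-2, mean_lr=1e-2):
    # Sketch: stochastic subgradient descent on the mean-upper-semideviation
    # objective rho(x) = E[l(x,xi)] + c * E[(l(x,xi) - E[l(x,xi)])_+],
    # which penalizes losses that exceed their mean and thereby hedges
    # against unfavorable perturbations of the data distribution.
    # loss_and_grad(x, batch) is assumed to return per-example losses (B,)
    # and per-example gradients (B, d); sample_batch() draws a fresh batch.
    x = np.asarray(x0, dtype=float)
    z = 0.0  # running estimate of the mean loss E[l(x, xi)]
    for _ in range(steps):
        batch = sample_batch()
        losses, grads = loss_and_grad(x, batch)
        tail = (losses > z).astype(float)   # samples above the mean estimate
        p_tail = tail.mean()                # estimate of P(l > E[l])
        # Per-sample weights so that the weighted average of gradients
        # approximates (1 - c*P(l > mu)) * E[grad l] + c * E[1{l > mu} * grad l],
        # a subgradient of the mean-semideviation objective.
        weights = (1.0 - c * p_tail) + c * tail
        g = (weights[:, None] * grads).mean(axis=0)
        x = x - lr * g
        z = z + mean_lr * (losses.mean() - z)  # track the mean loss separately
    return x

The parameter c controls the level of robustness: c = 0 recovers ordinary population risk minimization, while larger c puts more weight on above-average ("tail") losses, at essentially the cost of one extra running average per iteration.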

Related papers:

1. Distributionally robust learning with general non-smooth non-convex losses: https://link.springer.com/article/10.1007/s10957-022-02063-6
2. Distributionally robust learning with weakly convex losses: https://arxiv.org/abs/2301.06619
3. Entropic risk measure for robust accelerated stochastic gradient methods: https://arxiv.org/abs/2204.11292