Multi-level stochastic gradient methods for nested composition optimization

Shuoguang Yang, Mengdi Wang, Ethan X. Fang

Research output: Contribution to journal › Article › peer-review

Abstract

Stochastic gradient methods are scalable for solving large-scale optimization problems that involve empirical expectations of loss functions. Existing results mainly apply to optimization problems where the objectives are one- or two-level expectations. In this paper, we consider the multi-level compositional optimization problem, which involves compositions of multi-level component functions and nested expectations over a random path. It finds applications in risk-averse optimization and sequential planning. We propose a class of multi-level stochastic gradient methods motivated by the method of multi-timescale stochastic approximation. First, we propose a basic T-level stochastic compositional gradient algorithm, establish its almost sure convergence, and obtain an n-iteration error bound of O(n^(-1/2^T)). Then we develop accelerated multi-level stochastic gradient methods that use an extrapolation-interpolation scheme to take advantage of the smoothness of the individual component functions. When all component functions are smooth, we show that the convergence rate improves to O(n^(-4/(7+T))) for general objectives and O(n^(-4/(3+T))) for strongly convex objectives. We also provide almost sure convergence and rate-of-convergence results for nonconvex problems. The proposed methods and theoretical results are validated using numerical experiments.
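For concreteness, the T-level nested problem the abstract describes can be written as follows. This is a standard formulation from the stochastic compositional optimization literature; the paper's exact notation and composition ordering may differ:

```latex
\min_{x \in \mathbb{R}^d} \; F(x) = f_1\bigl(f_2(\cdots f_T(x)\cdots)\bigr),
\qquad f_i(y) = \mathbb{E}_{\xi_i}\bigl[f_i(y;\xi_i)\bigr], \quad i = 1,\dots,T,
```

where each level is accessed only through noisy samples of its function values and Jacobians.

Below is a minimal Python sketch of the basic multi-level stochastic compositional gradient idea for T = 3: auxiliary variables track the inner function values via running averages on faster timescales, while the decision variable moves on the slowest timescale, mirroring multi-timescale stochastic approximation. The toy quadratic instance, the noise model, and the stepsize exponents are illustrative assumptions, not the paper's exact algorithm or its analyzed stepsize schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-level instance: F(x) = f1(f2(f3(x))), each f_i = E[f_i(.; xi_i)].
def f3(x):  return x ** 2             # innermost level (elementwise square)
def J3(x):  return np.diag(2.0 * x)   # Jacobian of f3
def f2(y):  return 0.5 * y            # middle level
def J2(y):  return 0.5 * np.eye(y.size)
def g1(z):  return np.ones_like(z)    # gradient of the outer f1(z) = sum(z)

def noisy(val, scale=0.1):
    """Zero-mean noise stands in for sampling xi_i at each query."""
    return val + scale * rng.standard_normal(np.shape(val))

d = 5
x = rng.standard_normal(d)
y3 = noisy(f3(x))        # tracker for f3(x)
y2 = noisy(f2(y3))       # tracker for f2(f3(x))

for k in range(1, 5001):
    alpha = 0.5 / k      # slowest stepsize: decision variable
    b3 = 1.0 / k**0.5    # fastest stepsize: innermost tracker
    b2 = 1.0 / k**0.75   # intermediate stepsize

    # Running averages track the nested inner values with noisy samples.
    y3 = (1 - b3) * y3 + b3 * noisy(f3(x))
    y2 = (1 - b2) * y2 + b2 * noisy(f2(y3))

    # Chain-rule gradient estimate, Jacobians evaluated at the trackers.
    grad = noisy(J3(x)).T @ noisy(J2(y3)).T @ noisy(g1(y2))
    x = x - alpha * grad

print("final x:", np.round(x, 3))    # should approach the minimizer x = 0
```

The key point of this construction is that the Jacobian chain is evaluated at the trackers y3 and y2 rather than at freshly sampled nested inner values, so a single noisy query per level suffices at each iteration; balancing the trackers' averaging errors against the gradient bias is what drives the rates stated in the abstract.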

Original language: English (US)
Journal: Unknown Journal
State: Published - Jan 10 2018
Externally published: Yes

Keywords

  • Convex optimization
  • Sample complexity
  • Simulation
  • Statistical learning
  • Stochastic gradient
  • Stochastic optimization

ASJC Scopus subject areas

  • General
