
Computer Research and Modeling, 2018, Volume 10, Issue 6, Pages 737–753 (Mi crm682)


MATHEMATICAL MODELING AND NUMERICAL SIMULATION

The global rate of convergence for optimal tensor methods in smooth convex optimization

A. V. Gasnikov^{a,b,c}, E. A. Gorbunov^{a}, D. A. Kovalev^{a}, A. Mohammed^{a}, E. O. Chernousova^{a}

a Moscow Institute of Physics and Technology, 9 Institute lane, Dolgoprudny, 141701, Russia
b Institute for Information Transmission Problems RAS, 9 B. Karetny lane, Moscow, 127051, Russia
c Caucasian Mathematical Center, Adygea State University, 208 University st., Republic of Adygea, 352700, Russia

Abstract: In this work we consider the Monteiro – Svaiter accelerated hybrid proximal extragradient (A-HPE) framework and the accelerated Newton proximal extragradient (A-NPE) framework. The latter contains an optimal method for sufficiently smooth convex optimization problems with a second-order oracle. We generalize the A-NPE framework to higher-order derivative oracles (schemes). We replace the Newton-type step in A-NPE, used for the auxiliary problem, with a regularized Newton (tensor) step (Yu. Nesterov, 2018). Moreover, we generalize the large-step A-HPE/A-NPE framework by modifying the Monteiro – Svaiter large-step condition so that the framework works for high-order schemes. The main contribution of the paper is as follows: we propose optimal high-order methods for convex optimization problems. To the best of our knowledge, optimal methods matching the lower bounds have so far existed only for zeroth-, first- and second-order oracles. For higher-order schemes there is a gap between the lower bounds (Arjevani, Shamir, Shiff, 2017) and the existing high-order (tensor) methods (Nesterov – Polyak, 2006; Yu. Nesterov, 2008; M. Baes, 2009; Yu. Nesterov, 2018). Asymptotically, the ratio of the convergence-rate exponents of the best existing methods to the lower bounds is about 1.5. In this work we eliminate this gap and show that the lower bounds are tight. We also consider sufficiently smooth strongly convex optimization problems and show how to generalize the proposed methods to this case. The basic idea is to use the restart technique until the iterates reach the region of quadratic convergence of Newton's method and then switch to Newton's method. One can show that the considered method converges with optimal rates up to a logarithmic factor. Note that the technique proposed in this work can be generalized to the case when the auxiliary problem cannot be solved exactly and, moreover, when the derivatives of the objective cannot be computed exactly. The proposed technique also extends to composite optimization problems and, in particular, to constrained convex optimization problems. We also formulate a list of open questions that arise around the main result of this paper (an optimal universal high-order method, etc.).
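
For orientation, the gap of about 1.5 mentioned in the abstract can be read off from the known complexity bounds; the sketch below uses our own (assumed) notation, where p is the order of the oracle, L_p is the Lipschitz constant of the p-th derivative, R is the distance from the starting point to a solution, and ε is the target accuracy:

\[
  N_{\mathrm{lower}}(\varepsilon) = O\!\left(\left(\frac{L_p R^{p+1}}{\varepsilon}\right)^{\frac{2}{3p+1}}\right),
  \qquad
  N_{\mathrm{tensor}}(\varepsilon) = O\!\left(\left(\frac{L_p R^{p+1}}{\varepsilon}\right)^{\frac{1}{p+1}}\right),
\]

so the ratio of the exponents is \((3p+1)/\bigl(2(p+1)\bigr) \to 3/2\) as \(p \to \infty\) (for p = 1 there is no gap, for p = 2 it is 7/6). The proposed methods attain the lower-bound exponent \(2/(3p+1)\).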

Keywords: Newton method, Hessian matrix, lower bounds, tensor methods, proximal fast gradient descent.

UDC: 519.86

Received: 03.09.2018
Revised: 19.11.2018
Accepted: 03.12.2018

DOI: 10.20537/2076-7633-2018-10-6-737-753
