Abstract:
The article considers minimization of the expectation of a convex function. Problems of this type often arise in machine learning and in a number of other applications. In practice, stochastic gradient descent (SGD) and similar procedures are commonly used to solve such problems. We propose to use Vaidya's cutting-plane method with minibatching, which converges linearly and hence requires significantly fewer iterations than SGD; this is confirmed by our experiments, which are publicly available. The method requires neither smoothness nor strong convexity of the target function to achieve linear convergence. We prove that it reaches an approximate solution with a given probability when using minibatches of size proportional to the inverse square of the desired precision. This enables efficient parallel execution of the algorithm, whereas the possibilities for batch parallelization of SGD are rather limited. Despite its fast convergence, Vaidya's cutting-plane method can make a greater total number of oracle calls than SGD, which works decently with small batches. The complexity is quasi-linear in the dimension of the problem, so the method is best suited to relatively low-dimensional settings.
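To make the scheme described above concrete, the following is a minimal, hypothetical Python sketch of a minibatched cutting-plane loop. It is not the paper's implementation: for brevity, Vaidya's volumetric center is replaced by an approximate analytic center, and all names here (analytic_center, cutting_plane_sgd, stoch_subgrad) are illustrative assumptions. Only the two ingredients stated in the abstract are kept: a minibatch of stochastic subgradients of size proportional to the inverse square of the precision, and a cut through the current center.

import numpy as np

def analytic_center(A, b, x0, steps=200, lr=0.05):
    # Approximate analytic center of {x : A x <= b} by gradient descent
    # on the log-barrier; a simplified stand-in for Vaidya's volumetric
    # center (assumption: the exact centering rule is not reproduced here).
    x = x0.copy()
    for _ in range(steps):
        s = b - A @ x                       # slacks, strictly positive inside
        g = A.T @ (1.0 / s)                 # gradient of -sum(log(slacks))
        step = lr * g / (np.linalg.norm(g) + 1e-12)
        if np.all(b - A @ (x - step) > 0):  # take the step only if it stays
            x = x - step                    # strictly feasible
        else:
            lr *= 0.5                       # otherwise shrink the step size
    return x

def cutting_plane_sgd(stoch_subgrad, d, R=1.0, eps=0.1, iters=50):
    # Hypothetical minibatched cutting-plane loop: at the current center,
    # average m ~ eps^{-2} stochastic subgradients and cut off the
    # halfspace where the convex objective is provably larger.
    m = int(np.ceil(eps ** -2))             # minibatch size ~ eps^{-2}
    A = np.vstack([np.eye(d), -np.eye(d)])  # start from the box [-R, R]^d
    b = np.full(2 * d, R)
    x = np.zeros(d)
    for _ in range(iters):
        x = analytic_center(A, b, x)
        g = np.mean([stoch_subgrad(x) for _ in range(m)], axis=0)
        if np.linalg.norm(g) < 1e-12:       # (near-)zero subgradient: stop
            break
        A = np.vstack([A, g[None, :]])      # add cut g . (y - x) <= slack
        b = np.append(b, g @ x + 1e-9)      # tiny slack keeps x feasible
    return x

A toy usage example under the same assumptions, with a noisy subgradient oracle for f(x) = sum_i |x_i - 0.3|, whose minimizer is 0.3 in every coordinate:

rng = np.random.default_rng(0)
oracle = lambda x: np.sign(x - 0.3) + 0.1 * rng.standard_normal(x.shape)
print(cutting_plane_sgd(oracle, d=3, eps=0.2))

Note that the m independent oracle calls inside each iteration are embarrassingly parallel, which is the parallelization opportunity mentioned in the abstract.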