Abstract:
Many data analysis tasks deal with high-dimensional data, and the curse of dimensionality is an obstacle to the use of many methods for solving them. In many applications, real-world data occupy only a very small part of the high-dimensional observation space, and the intrinsic dimension of this part is substantially lower than the dimension of the space. A popular model for such data is the manifold model, according to which the data lie on or near an unknown low-dimensional data manifold (DM) embedded in an ambient high-dimensional space. Data analysis tasks studied under this assumption are referred to as manifold learning tasks. Their general goal is to discover the low-dimensional structure of high-dimensional manifold-valued data from a given dataset. If the dataset points are sampled according to an unknown probability measure on the DM, we face statistical problems on manifold-valued data. The paper gives a short review of statistical problems regarding high-dimensional manifold-valued data and the methods for solving them.
Keywords: data analysis, mathematical statistics, manifold learning, manifold estimation, density on manifold estimation, regression on manifolds.