K. L. Lu, “How Can We Identify the Sparsity Structure Pattern of High-Dimensional Data: an Elementary Statistical Analysis to Interpretable Machine Learning”, Mat. Zametki, 2022, Volume 112, Issue 2, Pages 223

Papers published in the English version of the journal

How Can We Identify the Sparsity Structure Pattern of High-Dimensional Data: an Elementary Statistical Analysis to Interpretable Machine Learning

K. L. Lu^ab

^a Jiangsu Automation Research Institute, Shanghai, 201210 China
^b School of Urban Railway Transportation, Shanghai University of Engineering Science, Shanghai, 201620 China

Abstract: Machine learning is a key tool for identifying low-dimensional structure patterns in high-dimensional data in the current “Big Data” era. Taking linear regression and supervised binary classification as simple case studies, we present a complete statistical analysis framework and procedure, from formulation to computation, intended as an elementary introduction to interpretable machine learning methods and algorithms such as Lasso and its variants and SVM. The optimality, risk bounds, and complexity of these sparsity structure pattern recognition algorithms are characterized precisely through proved theorems and corollaries. We also discuss the limitations of these algorithms and why deep learning is needed.

Keywords: high-dimensional data, sparsity structure, pattern recognition, statistical analysis, interpretable machine learning.

Received: 22.01.2022

Language: English