SEMINARS
Principle Seminar of the Department of Probability Theory, Moscow State University
Probabilistic methods of feature selection
A. Kozhevin
Lomonosov Moscow State University
Abstract: The thesis is devoted to methods of variable selection. This problem is of theoretical interest and also has a variety of applications; see, for example, Bühlmann, van de Geer (2011) and Bolón-Canedo, Alonso-Betanzos (2018). Chapter 1 presents a modification of the MDR (multifactor dimensionality reduction) method proposed by Ritchie et al. (2001) and developed by Velez et al. (2007), Gui et al. (2011), Bulinsky (2012), Gola et al. (2015) and others. The chapter focuses on the analysis of stratified samples. Strong consistency is proved for the constructed estimates of the error functional. Chapters 2 and 3 develop information-theoretic approaches to identifying significant factors; see, for example, Bennasar et al. (2014) and Vergara, Estévez (2014). Chapter 2 discusses a new estimate of conditional entropy in a mixed model (which includes, in particular, logistic regression), where the vector of predictors has an absolutely continuous distribution and the response variable is discrete. For the proposed estimate, asymptotic unbiasedness and [...] are established. In Chapter 3, an estimate of mutual information in a mixed model is constructed. Asymptotic unbiasedness and [...] are established for it. The consistency of the variable selection procedure based on the introduced estimate of mutual information is proved for the case when the number of significant factors is known. The proofs use conditional expectations, probabilistic inequalities, estimates of the rate of convergence in the central limit theorem, and other techniques. The theoretical results are supplemented by computer simulations. A comparison with the recent papers by Coelho et al. (2016), Gao et al. (2017) and Macedo et al. (2019) is also provided. The thesis consists of 118 pages; the bibliography contains more than 100 items.
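
To make the idea of mutual-information-based variable selection in a mixed model concrete, here is a minimal sketch in Python. It is not the estimator constructed in the thesis: it assumes a simple plug-in estimate obtained by equal-width binning of each continuous predictor, and it ranks predictors one at a time rather than scoring subsets jointly. The function names (`binned_mutual_information`, `select_features`), the bin count, and the synthetic logistic-regression data are all hypothetical choices made for illustration only.

```python
import math
import random
from collections import Counter


def binned_mutual_information(x, y, n_bins=8):
    """Plug-in estimate (in nats) of I(X; Y) for continuous x and discrete y.

    Illustrative assumption: x is discretized into equal-width bins and the
    empirical joint frequencies are plugged into the definition of mutual
    information. This is a stand-in, not the estimate from the thesis.
    """
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if x is constant
    # Map each value to a bin index in 0..n_bins-1 (the maximum goes to the last bin).
    bins = [min(int((v - lo) / width), n_bins - 1) for v in x]
    n = len(x)
    joint = Counter(zip(bins, y))
    p_bin = Counter(bins)
    p_label = Counter(y)
    return sum(
        (c / n) * math.log(c * n / (p_bin[b] * p_label[label]))
        for (b, label), c in joint.items()
    )


def select_features(columns, y, k, n_bins=8):
    """Return the indices of the k predictors with the largest estimated MI."""
    scores = [binned_mutual_information(col, y, n_bins) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Synthetic mixed model (assumed for the demo): logistic regression in which
# only predictors 0 and 3 influence the binary response.
rng = random.Random(0)
n, d = 2000, 5
X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [int(rng.random() < 1 / (1 + math.exp(-(2 * r[0] - 2 * r[3])))) for r in X]
columns = [[r[j] for r in X] for j in range(d)]

selected, scores = select_features(columns, y, k=2)
print("selected predictors:", selected)
print("estimated MI per predictor:", [round(s, 4) for s in scores])
```

The number of significant factors `k` is passed in explicitly, mirroring the setting of the abstract in which that number is assumed known. One-at-a-time ranking is a deliberate simplification: it can miss factors that are informative only jointly, which is one reason to prefer estimates of mutual information for whole subsets of predictors.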