A. Durmus, E. Moulines, A. Naumov, S. Samsonov, “Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation”, Mathematics of Operations Research, 2024, 1–30
S. Samsonov, D. Tiapkin, A. Naumov, E. Moulines, “Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability”, Proceedings of Thirty Seventh Conference on Learning Theory, Proceedings of Machine Learning Research, 247, 2024, 4511–4547, https://proceedings.mlr.press/v247/samsonov24a.html
S. Samsonov, E. Moulines, Qi-Man Shao, Zhuo-Song Zhang, A. Naumov, “Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning”, NeurIPS, 2023