Proc. Mach. Learn. Res. (PMLR), 2024, volume 247, pages 4511–4547 (Mi pmlr3)

Improved high-probability bounds for the temporal difference learning algorithm via exponential stability

Sergey Samsonov (a), Daniil Tiapkin (b, c), Alexey Naumov (a, d), Eric Moulines (b)

a HSE University, Moscow, Russia
b Centre de Mathématiques Appliquées – CNRS – École polytechnique – Institut Polytechnique de Paris, France
c Université Paris-Saclay, CNRS, Laboratoire de mathématiques d'Orsay, France
d Steklov Mathematical Institute of Russian Academy of Sciences, Moscow, Russia

Language of publication: English



© Steklov Mathematical Institute of RAS, 2024