SEMINARS

Main Seminar of the Department of Probability Theory, Moscow State University

Dynamic Data Selection in Large Model Training
Bingyi Jing (Southern University of Science and Technology)

Abstract: Training large models typically requires massive, internet-scale data. Data quality is crucial to model performance, which makes selecting high-quality samples from such vast datasets a critical issue. To address this, we have redesigned the lifecycle of data during training from the ground up, starting with the underlying training framework. However, applying dynamic data filtering at scale within current large-model training systems raises numerous challenges. This talk explores how to tackle them.

Language of the talk: English