Abstract:
Text classification is a fundamental task in natural language processing, and a large body of research has been devoted to it. However, little work has investigated the noise robustness of the resulting approaches. In this work, we bridge this gap by presenting results of noise robustness testing of modern text classification architectures for English and Russian. We benchmark the CharCNN and SentenceCNN models and introduce a new model, RoVe, which we show to be the most robust to noise.
Key words and phrases: word vectors, distributed representations, natural language processing.