Avtomat. i Telemekh., 2020 Issue 12, Pages 153–172 (Mi at15360)

This article is cited in 4 papers

Intellectual Control Systems, Data Analysis

Topical classification of text fragments accounting for their nearest context

A. V. Glazkova

University of Tyumen, Tyumen, Russia

Abstract: We present an approach to the topical classification of biographical text fragments that takes the nearest context of each classified fragment into account by using a neural network with several inputs. The choice of model architecture rests on the assumption that, because texts written in a natural language are consistent and coherent, the context of a passage can serve as additional input data. The model was trained and tested on a biographical corpus that we compiled. The proposed approach outperformed models that do not take the context of the passage into account.
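
As a rough illustration of what "several inputs" could mean at the data level, the sketch below pairs each fragment with its immediate neighbours so that every sample carries three text inputs. It is not the model from the paper: the abstract does not specify how wide the context is or how document boundaries are handled, so the one-fragment window on each side, the empty-string padding, the helper name `with_nearest_context`, and the sample sentences are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed placeholder for a missing neighbour at the start or end of a document.
PAD = ""


def with_nearest_context(fragments: List[str]) -> List[Tuple[str, str, str]]:
    """Return one (previous, target, next) triple per fragment.

    Hypothetical helper: the context window of one fragment on each side
    is an assumption, not a detail taken from the paper.
    """
    triples = []
    for i, target in enumerate(fragments):
        previous = fragments[i - 1] if i > 0 else PAD
        following = fragments[i + 1] if i + 1 < len(fragments) else PAD
        triples.append((previous, target, following))
    return triples


if __name__ == "__main__":
    # Invented sample sentences, used only to show the shape of the output.
    sample = [
        "She was born in a small town.",
        "She studied mathematics at the university.",
        "She later worked as a teacher.",
    ]
    for previous, target, following in with_nearest_context(sample):
        print(repr(previous), "|", repr(target), "|", repr(following))
```

Each element of a triple would then be encoded separately (for example with Word2Vec or BERT vectors, both named in the keywords) and fed to its own input of the classifier, while the label still belongs to the target fragment only.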

Keywords: sentence classification, data mining, recurrent neural networks, natural language processing, biographical texts, context, text corpus, biographical research, Word2Vec, BERT.

Presented by the member of Editorial Board: O. P. Kuznetsov

Received: 08.10.2019
Revised: 30.05.2020
Accepted: 09.07.2020

DOI: 10.31857/S0005231020120090


English version:
Automation and Remote Control, 2020, 81:12, 2262–2276
