Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS), 2025, Volume 37, Issue 2, Pages 217–236 (Mi tisp977)

Use of deep learning and natural language processing techniques for searching named entities in the medical instructions for use of drugs

Yu. P. Titov, N. V. Kilmishkin, D. D. Kubrakov, P. M. Ivanova

Plekhanov Russian State University of Economics

Abstract: In this work, a specialized dictionary was created for finding key terms in the texts of medical instructions for use of drugs, drawing on data from VigiAccess, ICD-10 and rlsnet.ru. The text corpus was cleaned and converted to a single format beforehand to improve the quality of model training. In the future, grls.rosminzdrav.ru is planned as the source of information about registered medicines, since it is more authoritative and complete. To automate data annotation, an algorithm was developed that finds dictionary terms in the text and labels them in BIO (Begin, Inside, Outside) format, providing structured markup for model training. A model based on deep neural networks demonstrated high efficiency in recognizing named entities by taking contextual dependencies into account. A semantic graph of medicines was constructed using algorithms that find connections between named entities. However, automatically identifying deeper connections between graph nodes remains difficult: it requires additional data markup that accounts for complex grammatical structures, which would improve the analysis of interactions described in the texts of medical instructions.
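
The dictionary-driven BIO annotation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy longest-match strategy, the function name bio_tag, the whitespace tokenization, and the three dictionary entries with their labels (DRUG, SYMPTOM, DISEASE) are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def bio_tag(tokens: List[str], dictionary: Dict[Tuple[str, ...], str]) -> List[str]:
    """Label tokens with BIO tags via greedy longest-match dictionary lookup.

    `dictionary` maps a lowercased term (as a tuple of tokens) to an entity label.
    This is an illustrative sketch, not the method from the paper.
    """
    max_len = max((len(term) for term in dictionary), default=0)
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)  # every token starts as Outside
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word terms win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = dictionary.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = "B-" + label          # first token of the term
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label      # remaining tokens of the term
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Hypothetical dictionary entries; a real dictionary would be built from
# sources such as those named in the abstract.
TERMS = {
    ("acetylsalicylic", "acid"): "DRUG",
    ("headache",): "SYMPTOM",
    ("gastric", "ulcer"): "DISEASE",
}

tokens = "Acetylsalicylic acid relieves headache but may aggravate gastric ulcer".split()
tags = bio_tag(tokens, TERMS)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Token and tag pairs produced this way could serve as training data for a sequence-labeling model. Exact matching on lowercased tokens is the simplest possible choice; inflected word forms would likely need lemmatization before lookup, which this sketch omits.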

Keywords: machine learning, deep learning, neural networks, natural language processing, medical drug instructions, semantic graph.

DOI: 10.15514/ISPRAS-2025-37(2)-16


