
Program Systems: Theory and Applications, 2024 Volume 15, Issue 4, Pages 153–181 (Mi ps460)

Medical Informatics

Symptoms extraction and automatic diagnosis prediction from medical clinical records

Yu. P. Serdyuk, N. A. Vlasova, S. R. Momot

Ailamazyan Program Systems Institute of RAS, Ves’kovo, Russia

Abstract: The paper introduces a system that extracts symptoms from clinical records (free text in Russian) and automatically predicts a diagnosis, given as the disease name and its ICD-10 code. The system is designed for a restricted domain of 6 pulmonary diseases (chronic obstructive pulmonary disease, pneumonia, bronchial asthma, etc.) and COVID-19.
Several neural networks perform the symptom extraction by recognizing certain medical entities and the relations between them. A neural-network classifier is responsible for the automatic diagnosis. An annotated corpus of sentences was created to train the neural networks, and the principles and rules of the annotation are described. A corpus of texts is used to train the classifier.
Both subsystems were tested, and the resulting accuracy estimates are reported. The accuracy of diagnosis in the given domain is 88.5%. We also compare our system with related work on symptom extraction from texts in various languages and on automatic diagnosis, including systems such as ChatGPT.

Key words and phrases: clinical decision support systems, symptom extraction, automatic diagnosis prediction, BERT models, ChatGPT-based systems.

UDC: 004.89: 61
BBK: 32.813.5: 53

MSC: Primary 68T50; Secondary 92C50

Received: 03.12.2024
Accepted: 28.12.2024

DOI: 10.25209/2079-3316-2024-15-4-153-181


