
Keldysh Institute preprints, 2023, 046, 15 pp. (Mi ipmp3174)

Automatic construction of the Russian corpus of verbal government

E. S. Klyshinsky, V. A. Ganeeva, E. A. Klykova, O. I. Vasina, A. E. Bogdanova, O. V. Karpik


Abstract: This paper presents further development of the Russian Verb Co-occurrences Corpus. The first version of the corpus was improved using a variety of methods, including the application of frequency thresholds to filter out irrelevant information, identification of parenthetical expressions, adoption of a semantics-based approach to differentiate between arguments and adjuncts, and the clustering of verbs based on their semantic frames.

Keywords: verbal government, word co-occurrences corpus, the Russian language, natural language processing.

DOI: 10.20948/prepr-2023-46


