
Vestn. Astrakhan State Technical Univ. Ser. Management, Computer Sciences and Informatics, 2013 Number 2, Pages 168–174 (Mi vagtu286)

SOCIAL AND ECONOMIC SYSTEMS MANAGEMENT

Analysis of statistical methods of a part-of-speech disambiguation in Russian texts

A. A. Porokhnin

Moscow State Technical University named after N. E. Bauman

Abstract: Ambiguity complicates text processing. For English texts, a number of disambiguation methods based on probabilistic approaches exist and achieve high precision. For Russian texts, the problem involves not only the part-of-speech ambiguity characteristic of English texts but also morphological and lexical ambiguity. Because it is difficult to build a mathematical model of Russian, a language with free word order within a sentence, rule-based disambiguation methods have been developed more extensively. To evaluate support vector machines and the hidden Markov model on part-of-speech disambiguation and full disambiguation of Russian texts, an experiment was set up using the disambiguated subcorpus of the Russian National Corpus. The hidden Markov model is shown to perform better on Russian texts than support vector machines.

Keywords: ambiguity, part-of-speech ambiguity, morphological ambiguity, lexical ambiguity, disambiguation methods, hidden Markov model, support vector machines.

UDC: [004.934:519.21/.24]:[81’322:811.161.1’36]
BBK: [32.973:22.17]:[81.1:81.411.2-21]

Received: 01.06.2013


