D. A. Kocharov, A. P. Menshikova, “Detection of prominent words in Russian texts using linguistic features”, Tr. SPIIRAN, 2017, Issue 55, Pages 216

Algorithms and Software

Detection of prominent words in Russian texts using linguistic features

D. A. Kocharov, A. P. Menshikova

Saint Petersburg State University (SPbSU)

Abstract: The article presents a method of detecting prosodically prominent words, i.e. words that carry most of the information in the utterance. The method relies on lexical, grammatical and syntactic markers of prominence, and can be used in Text-to-Speech synthesis systems to make synthesized speech sound more natural.
Three classification methods were used: Naive Bayes, Maximum Entropy and Conditional Random Fields models. The results of the experiments show that the discriminative models (Maximum Entropy and Conditional Random Fields) provide more balanced values of the performance metrics, while the generative model (Naive Bayes) is potentially more useful for detecting prominent words in the speech signal.
The results of the study are comparable to the performance of similar systems developed for other languages, and in some cases surpass it.
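
As an illustration of the simplest of the three classifiers named above, the following is a minimal sketch of a categorical Naive Bayes model with add-one smoothing, written in plain Python. It is not the authors' implementation: the feature names (`pos`, `position`, `has_particle`), the labels, and the toy training data are all hypothetical and were invented only to show how per-word linguistic features could feed such a classifier.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples, labels):
        self.label_counts = Counter(labels)
        # (label, feature name) -> counts of each observed feature value
        self.feature_counts = defaultdict(Counter)
        # feature name -> set of values seen anywhere in training
        self.values = defaultdict(set)
        for features, label in zip(samples, labels):
            for name, value in features.items():
                self.feature_counts[(label, name)][value] += 1
                self.values[name].add(value)
        return self

    def log_scores(self, features):
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in features.items():
                seen = self.feature_counts[(label, name)]
                vocab = len(self.values[name]) + 1  # extra slot for unseen values
                score += math.log((seen[value] + 1) / (count + vocab))
            scores[label] = score
        return scores

    def predict(self, features):
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Hypothetical toy data: one feature dict per word, invented for illustration.
samples = [
    {"pos": "NOUN", "position": "final", "has_particle": False},
    {"pos": "NOUN", "position": "final", "has_particle": True},
    {"pos": "ADV", "position": "medial", "has_particle": True},
    {"pos": "PREP", "position": "initial", "has_particle": False},
    {"pos": "CONJ", "position": "medial", "has_particle": False},
    {"pos": "VERB", "position": "medial", "has_particle": False},
    {"pos": "PREP", "position": "medial", "has_particle": False},
]
labels = ["prominent", "prominent", "prominent", "plain", "plain", "plain", "plain"]

model = NaiveBayes().fit(samples, labels)
print(model.predict({"pos": "NOUN", "position": "final", "has_particle": True}))
print(model.predict({"pos": "PREP", "position": "initial", "has_particle": False}))
```

Because the model scores each word independently, it ignores the label of neighbouring words; a sequence model such as Conditional Random Fields would score the whole utterance jointly instead.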

Keywords: prosodic prominence; emphasis; prosody; lexical analysis; syntax analysis; Naive Bayes classifier; Maximum Entropy classifier; Conditional Random Fields; Russian language.

UDC: 004.93'1, 004.912, 81'32

DOI: 10.15622/sp.55.9