
Eurasian Journal of Mathematical and Computer Applications, 2017, Volume 5, Issue 4, Pages 70–79 (Mi ejmca93)

The refined identification of beginning-end of speech; the recognition of the voiceless sounds at the beginning-end of speech. On the recognition of the extra-large vocabularies

V. Yu. Shelepov, A. V. Nitsenko

Institute of Artificial Intelligence, 118-b, Artyom St., 83048 Donetsk, Ukraine

Abstract: This paper is part of the diphone DTW-recognition strategy developed by the authors. Voiceless plosives, as well as the energetically weak hard and soft [f], are difficult to recognize when they occur at the beginning or end of speech because they resemble the neighboring stretches of silence. The article opens with a description of refined methods for locating the beginning and the end of a spoken word or phrase. These methods form the basis for the proposed methods of recognizing the above sounds when they begin or conclude a spoken word or phrase. We introduce the concept of a final quasifricative fragment, together with algorithms for selecting it and using it to classify voiceless plosives in the final position. The results obtained, together with an insignificant increase in the number of basic speech units, make it possible to advance in solving the difficult problems of recognizing short speech segments as well as extra-large vocabularies.

Keywords: continuous-speech recognition, speech segmentation, large vocabulary speech recognition, voiceless fragment, diphone, dynamic time warping (DTW).

MSC: 68T10, 68T50

Received: 15.09.2017
Accepted: 09.11.2017

Language: English


