Eurasian Journal of Mathematical and Computer Applications, 2017, Volume 5, Issue 4, Pages 70–79 (Mi ejmca93)
The refined identification of beginning-end of speech; the recognition of the voiceless sounds at the beginning-end of speech. On the recognition of the extra-large vocabularies
Abstract:
The present paper continues the diphone DTW-recognition strategy developed by the authors. Voiceless plosives, as well as the energetically weak hard and soft [f], are a problem for recognition when they occur at the beginning or end of speech, because they resemble the neighboring stretches of silence. The article opens with a description of refined methods for locating the beginning and the end of a spoken word or phrase. These methods form the basis for the proposed methods of recognizing the above sounds when they begin or conclude a spoken word or phrase. We introduce the concept of the final quasifricative fragment, together with algorithms for selecting it and using it to classify voiceless plosives in the final position. The results obtained, together with an insignificant increase in the number of basic speech units, make it possible to advance in solving the difficult problems of recognizing short speech segments as well as extra-large vocabularies.
Keywords: continuous-speech recognition, speech segmentation, large vocabulary speech recognition, voiceless fragment, diphone, dynamic time warping (DTW).