Abstract:
This paper provides a concise review of the most widely used methods in speech recognition. Various transcription principles developed at the Linguistic Data Consortium are discussed. The difficulties of evaluating human-level performance in speech recognition are described, and the typical errors made by humans are analyzed. Transcribers are shown to be highly consistent in both accurate transcription of prepared English speech and fast transcription of conversational telephone speech. It is also shown that the word disagreement rate increases as speech becomes more complex. The results of a comparative analysis of errors made by a speech recognition system and by humans are presented, and the similarities and differences between them are analyzed. Current problems in automatic speech recognition are listed, and the prospects for solving them and the directions of future research are assessed.