Artificial Intelligence and Decision Making, 2021 Issue 1, Pages 75–85 (Mi iipr93)

Analysis of signals, audio and video information

Segmentation of noisy speech signals

A. G. Shishkin, S. D. Protserov

Lomonosov Moscow State University, Moscow, Russia

Abstract: One of the most important problems in digital speech signal processing is determining which parts of an input acoustic signal contain speech and which contain background noise or silence. This problem arises in many practical applications, such as speech analysis in voice command systems, transmission of speech over a network, and automated speech recognition. However, most existing systems for automated speech analysis cannot solve this problem efficiently when the signal-to-noise ratio is too low. Moreover, their parameters have to be tuned separately for different noise levels, which prevents fully automated segmentation of noisy speech signals. In this paper we design a system for automated segmentation of speech signals distorted by additive noise of different types and intensities. Our system is based on three different convolutional neural network models and is capable of efficiently determining speech and silence segments in noisy signals across a wide range of noise intensities and for different noise types.
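The abstract refers to speech distorted by additive noise at different signal-to-noise ratios. As an illustration only, the sketch below shows one common way to mix a noise recording into a clean signal at a target SNR in decibels. It is not taken from the paper: the function names `mix_at_snr` and `measured_snr_db`, the 16 kHz sampling rate, and the sine-wave stand-in for speech are all assumptions made for this example.

```python
import numpy as np

def mix_at_snr(speech, noise, target_snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Assumes (hypothetically) that noise is at least as long as speech.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(speech)]
    p_speech = np.mean(speech ** 2)   # average power of the clean signal
    p_noise = np.mean(noise ** 2)     # average power of the raw noise
    # Solve 10*log10(p_speech / (gain**2 * p_noise)) = target_snr_db for gain.
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    return speech + gain * noise

def measured_snr_db(speech, mixture):
    """SNR of a mixture, given the clean signal it was built from."""
    residual = np.asarray(mixture, dtype=float) - np.asarray(speech, dtype=float)
    return 10.0 * np.log10(np.mean(np.square(speech)) / np.mean(residual ** 2))

# Stand-in data: a 220 Hz tone instead of real speech, white Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = np.sin(2.0 * np.pi * 220.0 * t)
noise = rng.standard_normal(16000)

noisy = mix_at_snr(speech, noise, target_snr_db=5.0)
check = measured_snr_db(speech, noisy)
```

Varying `target_snr_db` and the noise source is a simple way to produce test signals with a range of noise intensities and types; how the authors generated their data is not stated in the abstract.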

Keywords: speech signal, convolutional neural network, segmentation, digital signal processing.

DOI: 10.14357/20718594210107


English version: 2022, 49:5, 356–363
