A. A. Petrovsky, A. A. Petrovsky, “A scalable speech and audio coders based on adaptive time-frequency signal analysis”, Tr. SPIIRAN, 2017, Issue 50, Pages 55

Methods of Information Processing and Management

A scalable speech and audio coders based on adaptive time-frequency signal analysis

A. A. Petrovsky^a, A. A. Petrovsky^b

^a Russian Research Center of Huawei Technologies
^b Belarusian State University of Informatics and Radioelectronics (BSUIR)

Abstract: The paper discusses methods of perceptual sub-band audio signal processing in which the time-frequency map is transformed dynamically using the discrete wavelet packet (WP) transform. The advantage of this approach is that the WP tree grows from the top down, without returning to smaller scale levels of decomposition and without needing to build a complete WP tree, which fits the concept of scalable audio/speech coders implemented in real time. An objective quality assessment of the proposed coders using the PEMO-Q technique is given, together with a comparison against the widespread encoders Opus and Vorbis. The results show that the reconstructed signal complies with ITU-R PEAQ at high compression ratios of up to 18 times or more, contains no artifacts, and has a total noise-to-mask ratio NMR$_{total}$ $\le -9$ dB.

Keywords: scalable audio/speech coder, wavelet packet, matching pursuit.

UDC: 004.032.6

DOI: 10.15622/sp.50.3