
Tr. SPIIRAN, 2014 Issue 36, Pages 128–150 (Mi trspy753)

Training Personal Voice Model of a Speaker with Unified Phonetic Space of Features Using Artificial Neural Network

E. Azarov, A. A. Petrovsky

Belarusian State University of Computer Science and Radioelectronic Engineering

Abstract: The paper investigates the possibility of creating a personal voice model from transcribed speech samples of a specified speaker. It presents a practical way of building such a speech model and some experimental results of applying the model to voice conversion. The model uses an artificial neural network organized as an autoencoder that establishes a correspondence between the space of speech parameters and the space of possible phonetic states, which is unified for any voice.
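
The autoencoder idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single-layer encoder and decoder, the softmax bottleneck standing in for the "phonetic states", the loss terms, the synthetic data, and all names (`PhoneticAutoencoder`, `make_toy_data`, `convert`) are hypothetical. As a simplification, the encoder is trained only against the transcription labels, and the decoder is trained only to reconstruct the speech parameters from the encoder output.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class PhoneticAutoencoder:
    """Toy model (hypothetical): speech parameters -> phonetic-state
    posteriors -> reconstructed speech parameters."""

    def __init__(self, n_params, n_states, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(0.0, 0.1, (n_params, n_states))
        self.b_enc = np.zeros(n_states)
        self.W_dec = rng.normal(0.0, 0.1, (n_states, n_params))
        self.b_dec = np.zeros(n_params)

    def encode(self, x):
        # Each row is a probability distribution over phonetic states.
        return softmax(x @ self.W_enc + self.b_enc)

    def decode(self, s):
        # Linear map from phonetic-state posteriors back to speech parameters.
        return s @ self.W_dec + self.b_dec

    def train_step(self, x, labels, lr=0.1):
        n = x.shape[0]
        s = self.encode(x)
        x_hat = self.decode(s)
        onehot = np.eye(self.b_enc.size)[labels]
        # Cross-entropy against transcription labels plus reconstruction error.
        ce = -np.mean(np.log(s[np.arange(n), labels] + 1e-12))
        mse = np.mean(np.sum((x_hat - x) ** 2, axis=1))
        # Encoder update: gradient of the cross-entropy term only (assumption).
        g_logits = (s - onehot) / n
        self.W_enc -= lr * x.T @ g_logits
        self.b_enc -= lr * g_logits.sum(axis=0)
        # Decoder update: gradient of the reconstruction term, s held fixed.
        g_out = 2.0 * (x_hat - x) / n
        self.W_dec -= lr * s.T @ g_out
        self.b_dec -= lr * g_out.sum(axis=0)
        return ce + mse

def make_toy_data(n_per_state=50, seed=0):
    # Synthetic stand-in for transcribed speech: 3 states, 4 parameters per frame.
    rng = np.random.default_rng(seed)
    means = 2.0 * np.eye(3, 4)
    labels = np.repeat(np.arange(3), n_per_state)
    x = means[labels] + 0.1 * rng.normal(size=(labels.size, 4))
    return x, labels

def convert(source_model, target_model, x):
    # Conversion through the shared phonetic space: encode with the
    # source speaker's model, decode with the target speaker's model.
    return target_model.decode(source_model.encode(x))
```

In this sketch, voice conversion amounts to training one such model per speaker and then calling `convert(source_model, target_model, frames)`. This works only because the bottleneck has the same meaning for every speaker, which is the role the abstract assigns to the unified phonetic space.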

Keywords: Voice Conversion; Speech Synthesis; Artificial Neural Network.

UDC: 004.934

DOI: 10.15622/sp.36.8


