Abstract:
This study investigates the use of neural networks to classify audio signals into ten genres. It examines the peculiarities of processing audio signals in the digital environment, the relationship between the Fourier transform and spectrograms, and the characteristics of audio signals. The networks were trained on the GTZAN dataset, which contains 1000 compositions. Four comparable datasets were derived from it, and three neural network architectures (convolutional, recurrent, and multilayer perceptron) were evaluated on each. The practical significance of the work lies in its potential use for generating music recommendations and organizing music collections. The goal of the study is to develop a classifier that accurately estimates the probability that a composition belongs to each of the ten genres.