Abstract:
In this paper, we consider the application of computer vision and recurrent neural networks to the problem of identifying and classifying actions in video.
The paper describes our approach to analyzing video files.
A recurrent neural network is used as the classifier.
The classifier takes data in a “bag of words” format that describes low-level actions.
The histograms that make up each “bag of words” are formed from sets of descriptors extracted from the video file.
The following algorithms are used to find descriptors: SIFT, ORB, BRISK, and AKAZE. (In Russian).
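As an illustration of the “bag of words” step, the sketch below quantizes a set of descriptors against a small codebook and returns a normalized histogram. It is a minimal example under stated assumptions, not the authors' implementation: the function name `bow_histogram`, the toy 2-D descriptors, and the hand-written codebook are hypothetical. In practice the descriptors would come from a detector such as SIFT, ORB, BRISK, or AKAZE, and the codebook would typically be learned by clustering. Euclidean distance is also an assumption here; binary descriptors such as ORB and BRISK are usually compared with Hamming distance instead.

```python
from math import dist


def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return a normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the codeword closest to this descriptor (Euclidean distance, an assumption).
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalize so that frames with different numbers of keypoints are comparable.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical toy data: three codewords and four 2-D "descriptors" from one frame.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
frame_descriptors = [(0.5, 0.2), (9.0, 11.0), (1.0, 9.5), (0.1, 0.1)]

histogram = bow_histogram(frame_descriptors, codebook)
print(histogram)
```

Each frame, or short window of frames, would yield one such histogram, so that a video becomes a sequence of fixed-length vectors.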
Key words and phrases: computer vision, descriptors, bags of words, deep learning, recurrent neural networks, long short-term memory networks, video analysis.
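To make the role of the recurrent classifier concrete, the following sketch runs a sequence of “bag of words” histograms through a single long short-term memory (LSTM) cell and maps the final hidden state to class probabilities. It is an assumption-laden illustration rather than the network described in the paper: the layer sizes, the random untrained weights, and all names (`lstm_step`, `classify`, `params`, `W_out`) are hypothetical, and a real system would use a deep learning library and train the weights on labeled video.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h, c, params):
    """One LSTM step. params maps gate name -> (W, b); W acts on the concatenation [x, h]."""
    z = x + h
    gates = {}
    for name, (W, b) in params.items():
        pre = [a + bias for a, bias in zip(matvec(W, z), b)]
        # The candidate gate "g" uses tanh; input, forget, and output gates use sigmoid.
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    c_new = [f * ci + i * g for f, ci, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def classify(sequence, params, W_out, hidden_size):
    """Feed the histogram sequence through the LSTM and classify from the last hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(list(x), h, c, params)
    return softmax(matvec(W_out, h))


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = random.Random(0)
vocab_size, hidden_size, num_classes = 3, 4, 2


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


params = {
    name: (rand_matrix(hidden_size, vocab_size + hidden_size), [0.0] * hidden_size)
    for name in "ifog"
}
W_out = rand_matrix(num_classes, hidden_size)

# A toy "video": three histograms over a three-word vocabulary.
video = [[0.5, 0.25, 0.25], [0.2, 0.6, 0.2], [0.0, 0.1, 0.9]]
probs = classify(video, params, W_out, hidden_size)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how a variable-length sequence of histograms is reduced to one prediction per video.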