
Computer Research and Modeling, 2024 Volume 16, Issue 3, Pages 615–631 (Mi crm1181)

MODELS IN PHYSICS AND TECHNOLOGY

Convolutional neural networks of YOLO family for mobile computer vision systems

S. G. Nebaba, N. G. Markov

National Research Tomsk Polytechnic University, 30 Lenina ave., Tomsk, 634050, Russia

Abstract: The paper analyzes known classes of convolutional neural network models and studies promising models selected from them for detecting flying objects in images. Object detection here means detecting flying objects, localizing them in space, and classifying them. The selected models are studied comprehensively in order to identify those most effective for building mobile real-time computer vision systems. It is shown that, given the formulated requirements for mobile real-time computer vision systems, the models most suitable for detecting flying objects in images belong to the YOLO family, and that five models from this family should be considered: YOLOv4, YOLOv4-Tiny, YOLOv4-CSP, YOLOv7 and YOLOv7-Tiny. A dataset was developed for training, validating, and comprehensively studying these models. Each labeled image in the dataset contains one or more flying objects belonging to four classes: “bird”, “aircraft-type unmanned aerial vehicle”, “helicopter-type unmanned aerial vehicle”, and “unknown object” (objects in airspace not included in the first three classes). The study showed that all of the models exceed the specified threshold for the speed of detecting objects in an image; however, only the YOLOv4-CSP and YOLOv7 models partially satisfy the requirements for the accuracy of detecting flying objects. The “bird” class proved the most difficult to detect. The most effective model is YOLOv7, with YOLOv4-CSP in second place. Both models are recommended for use in a mobile real-time computer vision system, provided that they are additionally trained on a larger number of images containing objects of the “bird” class so that they satisfy the accuracy requirement for flying objects of each of the four classes.

Keywords: detection of flying objects in images, convolutional neural network, YOLO, mobile computer vision system

UDC: 004.93’12

Received: 17.10.2023
Revised: 19.02.2024
Accepted: 19.02.2024

DOI: 10.20537/2076-7633-2024-16-3-615-631


