Аннотация:
Sensor fusion is one of the important solutions for the perception problem in self-driving cars, where the main aim is to enhance the perception of the system without losing real-time performance. Therefore, it is a trade-off problem and its often observed that most models that have a high environment perception cannot perform in a real-time manner. Our article is concerned with camera and Lidar data fusion for better environment perception in self-driving cars, considering 3 main classes which are cars, cyclists and pedestrians. We fuse output from the 3D detector model that takes its input from Lidar as well as the output from the 2D detector that take its input from the camera, to give better perception output than any of them separately, ensuring that it is able to work in real-time. We addressed our problem using a 3D detector model (Complex-Yolov3) and a 2D detector model(Yolo-v3), where in we applied the image-based fusion method that could make a fusion between Lidar and camera information with a fast and efficient late fusion technique that is discussed in detail in this article. We used the mean average precision (mAP) metric in order to evaluate our object detection model and to compare the proposed approach with them as well. At the end, we showed the results on the KITTI dataset as well as our real hardware setup, which consists of Lidar velodyne 16 and Leopard USB cameras. We used Python to develop our algorithm and then validated it on the KITTI dataset. We used ros2 along with C++ to verify the algorithm on our dataset obtained from our hardware configurations which proved that our proposed approach could give good results and work efficiently in practical situations in a real-time manner.
Ключевые слова:autonomous vehicles, self-driving cars, sensors fusion, Lidar, camera, late fusion, point cloud, images, KITTI dataset, hardware verification.
УДК:
004.896
Поступила в редакцию: 15.09.2022 Принята в печать: 10.10.2022