XVII International Conference on Systems, Automatic Control and Measurements, SAUM 2024 (pp. 63-66)
АУТОР(И) / AUTHOR(S): Jianxun Cui, Xin Zhou, Marko Milojković, Miroslav Milovanović, Staniša Perić
DOI: 10.46793/SAUM24.063S
САЖЕТАК / ABSTRACT:
As autonomous driving technology advances, feature-level fusion of LiDAR and camera data has emerged as a critical focus for enhancing perception systems. This paper first outlines the strengths and weaknesses of LiDAR and cameras in the context of autonomous vehicles, highlighting the rationale behind their integration, and then explains the concept of feature-level fusion and its significance. Methods for fusing LiDAR and camera data at the feature level are categorized into three main approaches: proposal-based, point-based, and unified-space-based fusion. The proposal-based approach generates 2D or 3D object proposals from images and refines them with point cloud data. The point-based approach projects LiDAR points into the image and uses image segmentation to filter out noise and enhance detection accuracy. The unified-space-based approach fuses features within a common intermediate space, minimizing the information loss incurred when converting between image and point cloud representations. Each methodology brings distinct advantages and innovative solutions that contribute to the advancement of autonomous driving, and each presents challenges that call for continued research and development.
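To make the point-based category concrete, below is a minimal sketch of PointPainting-style point decoration in the spirit of Vora et al. (see REFERENCES), assuming a calibrated LiDAR-camera pair: LiDAR points are projected into the image plane, and every point that lands inside the frame is augmented with the per-pixel class scores produced by an image segmentation network. All names here (`paint_points`, `K`, `seg_scores`) are illustrative assumptions, not code from any of the cited papers.

```python
import numpy as np

def paint_points(points: np.ndarray, K: np.ndarray, seg_scores: np.ndarray) -> np.ndarray:
    """Append per-pixel segmentation scores to LiDAR points (illustrative sketch).

    points:     (N, 3) xyz coordinates in the camera frame
    K:          (3, 4) projection matrix (intrinsics times extrinsics)
    seg_scores: (H, W, C) per-pixel class scores from a segmentation network
    returns:    (M, 3 + C) "painted" points whose projection falls inside the image
    """
    H, W, C = seg_scores.shape
    # Homogeneous coordinates, then perspective projection into the image plane.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])   # (N, 4)
    uvw = pts_h @ K.T                                            # (N, 3)
    in_front = uvw[:, 2] > 0                                     # discard points behind the camera
    uv = uvw[in_front, :2] / uvw[in_front, 2:3]                  # perspective divide
    u = uv[:, 0].astype(int)
    v = uv[:, 1].astype(int)
    # Keep only points whose pixel lies inside the image bounds.
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    kept = points[in_front][valid]
    # Decorate each surviving point with the class scores at its pixel.
    return np.hstack([kept, seg_scores[v[valid], u[valid]]])
```

The painted cloud can then be passed to an otherwise unmodified LiDAR detector, which is why this sequential style of fusion is easy to retrofit onto existing pipelines; the proposal-based and unified-space-based approaches instead change where in the network the two modalities meet.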
КЉУЧНЕ РЕЧИ / KEYWORDS:
autonomous driving, LiDAR, camera, feature-level fusion
ЛИТЕРАТУРА / REFERENCES
- C. R. Qi, W. Liu, C. Wu, H. Su and L. J. Guibas, "Frustum PointNets for 3D Object Detection from RGB-D Data," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 2018, pp. 918-927
- K. Shin, Y. P. Kwon and M. Tomizuka, "RoarNet: A Robust 3D Object Detection based on RegiOn Approximation Refinement," 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France, 2019, pp. 2510-2515
- Wan and T. Zhao, "Enhancing Intra- and Inter-Object Part Features for 3-D Object Detection Through LiDAR–Camera Fusion," IEEE Sensors Journal, vol. 24, no. 16, pp. 27029-27044, 15 Aug. 2024
- J. Yin et al., "IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection," 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2024, pp. 14905-14915
- S. Vora, A. H. Lang, B. Helou and O. Beijbom, "PointPainting: Sequential Fusion for 3D Object Detection," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 4603-4611
- S. Xu, D. Zhou, J. Fang, J. Yin, Z. Bin and L. Zhang, "FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection," 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 2021, pp. 3047-3054
- T. Yin, X. Zhou and P. Krähenbühl, "Multimodal Virtual Point 3D Detection," Advances in Neural Information Processing Systems (NeurIPS), 2021
- L. Xie, C. Xiang, Z. Yu, G. Xu, Z. Yang and D. Cai, "PI-RCNN: An Efficient Multi-sensor 3D Object Detector with Point-based Attentive Cont-conv Fusion Module," Proceedings of the AAAI Conference on Artificial Intelligence, 2020
- T. Huang, Z. Liu, X. Chen and X. Bai, "EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection," European Conference on Computer Vision (ECCV), Springer, Cham, 2020
- X. Bai et al., "TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers," 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 1080-1089
- Z. Liu et al., "BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation," 2023 IEEE International Conference on Robotics and Automation (ICRA), London, United Kingdom, 2023, pp. 2774-2781
- X. Li, B. Fan, J. Tian and H. Fan, "GAFusion: Adaptive Fusing LiDAR and Camera with Multiple Guidance for 3D Object Detection," 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2024, pp. 21209-21218
- X. Hao et al., "MBFusion: A New Multi-modal BEV Feature Fusion Method for HD Map Construction," 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 2024, pp. 15922-15928
- Y. Li, Y. Chen, X. Qi, Z. Li, J. Sun and J. Jia, "UVTR: Unifying Voxel-based Representation with Transformer for 3D Object Detection," Proceedings of the 36th International Conference on Neural Information Processing Systems (NeurIPS), 2022, article no. 1340, pp. 18442-18455
- H. Wang et al., "UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation," 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2023, pp. 6769-6779