Autonomous driving systems depend heavily on reliable environmental perception, and on object detection in particular. Although YOLO (You Only Look Once) models [1][2][3] have become a standard for real-time object detection because they balance accuracy and speed, they show limitations in unexpected situations, such as the sudden appearance of an animal on the road.
This project addresses rare and unexpected cases, such as an animal suddenly entering the field of view. Starting from existing datasets, we will insert suddenly appearing objects of varying speed and size and analyze how current models react. The goal is to compare model performance on these particular cases with the performance obtained on more conventional datasets.
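As an illustration of the dataset modification described above, the following is a minimal sketch, not the project's actual pipeline. It assumes a clip is stored as a NumPy array of shape (frames, height, width, 3) and that the object is a rectangular image patch pasted without blending. The function names (`nearest_resize`, `inject_sudden_object`) and the parameters `start_frame`, `start_xy`, `velocity`, and `scale` are hypothetical, and the constant-velocity motion is a simplifying assumption.

```python
import numpy as np


def nearest_resize(patch, scale):
    """Resize an (h, w, 3) patch by nearest-neighbour sampling."""
    h, w = patch.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map each output row/column back to a source row/column.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return patch[rows][:, cols]


def inject_sudden_object(frames, patch, start_frame, start_xy, velocity, scale=1.0):
    """Paste `patch` into a copy of `frames` from `start_frame` onward.

    Hypothetical helper: the object is absent before `start_frame`, then
    moves at a constant `velocity` (pixels per frame) from `start_xy`.
    Returns the modified clip and one ground-truth box per frame
    (x0, y0, x1, y1), or None where the object is not visible.
    """
    out = frames.copy()
    n, height, width = frames.shape[:3]
    obj = nearest_resize(patch, scale)
    ph, pw = obj.shape[:2]
    boxes = [None] * n
    for t in range(start_frame, n):
        dt = t - start_frame
        x = int(round(start_xy[0] + velocity[0] * dt))
        y = int(round(start_xy[1] + velocity[1] * dt))
        # Clip the pasted region to the image borders.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + pw, width), min(y + ph, height)
        if x0 >= x1 or y0 >= y1:
            continue  # object lies entirely outside the frame
        out[t, y0:y1, x0:x1] = obj[y0 - y:y1 - y, x0 - x:x1 - x]
        boxes[t] = (x0, y0, x1, y1)
    return out, boxes
```

Varying `scale` and `velocity` across clips would give the size and speed variations mentioned above, and the returned boxes could serve as ground truth when scoring a detector on the modified clips against the unmodified ones.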
To address these challenges, the project will explore a comprehensive approach that improves existing YOLO frameworks and integrates techniques such as transformers, attention mechanisms, and open-world recognition strategies [4][5]. The objective is to create a robust object detection system capable of adapting to complex and unexpected environments while maintaining real-time performance [6]. Thus, the main objectives of this project are as follows:
References:
[1] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors," 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] J. Cao, T. Zhang, L. Hou, et al., "An improved YOLOv8 algorithm for small object detection in autonomous driving," J Real-Time Image Proc 21, 138 (2024).
[4] K. J. Joseph, S. Khan, F. S. Khan, and V. N. Balasubramanian, "Towards Open World Object Detection," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Y. Li, Y. Wang, W. Wang, D. Lin, B. Li, and K.-H. Yap, "Open World Object Detection: A Survey," arXiv, 2024.
[6] J. M. Pierre, "Incremental Lifelong Deep Learning for Autonomous Vehicles," 2018 21st International Conference on Intelligent Transportation Systems (ITSC).
[7] X. Zhu, W. Su, L. Lu, B. Li, X. Wang, and J. Dai, "Deformable DETR: Deformable Transformers for End-to-End Object Detection," ICLR 2021.
[8] A. Gupta, S. Narayan, K. J. Joseph, S. Khan, F. S. Khan, and M. Shah, "OW-DETR: Open-World Detection Transformer," arXiv, 2022.
[9] X. Zhao, X. Zhang, D. Yang, M. Sun, M. Li, S. Wang, and L. Zhang, "MaskBEV: Towards a Unified Framework for BEV Detection and Map Segmentation," arXiv, 2024.
(c) GdR IASIS - CNRS - 2024.