Context
Pedestrian Attribute Recognition (PAR) aims to develop algorithms and systems that accurately identify the attributes of pedestrians from visual data, typically images or video frames. This technology has applications in surveillance, crowd analysis, human-computer interaction, and autonomous driving. It can be used for tasks such as monitoring crowded areas, identifying individuals in security footage, or enhancing the perception capabilities of autonomous vehicles so that they better understand their surroundings. The attributes of interest include, but are not limited to:
• Clothing color and style (e.g., shirt color, pants type)
• Accessories (e.g., backpack, hat, glasses)
• Age group (e.g., adult, child, elderly)
• Gender
• Hair style and color
• Presence of objects (e.g., umbrella, bag)
• Body pose or posture
• Activities or behaviors (e.g., walking, running, standing)
Deep learning, particularly through convolutional neural networks (CNNs), has advanced pedestrian attribute recognition research by learning intricate features directly from data, enabling end-to-end training without manual feature extraction. Its flexibility allows adaptation to diverse datasets and to variations in lighting, pose, and occlusion. With attention mechanisms, deep learning models focus on the relevant parts of pedestrian images, which improves attribute recognition. Furthermore, deep learning can integrate contextual information, such as spatial and temporal cues, and benefits from transfer learning with pretrained models, which accelerates training and improves performance. Together, these properties lead to more accurate and robust attribute recognition systems.
Objectives
The objective of this internship is to survey the literature on deep-learning-based PAR and to assess the performance of well-known methods on large benchmarks, including recent ones, using a variety of PAR-specific metrics. The internship also aims to propose a novel CNN architecture coupled with attention mechanisms to exploit contextual cues. Finally, it will focus on synthesizing a realistic training dataset by means of Stable Diffusion pipelines and recent generative AI architectures.
Applicant Profile
To ensure a successful internship in pedestrian attribute recognition using deep learning with attention mechanisms, candidates should possess the following skills:
• Image Processing.
• Deep Convolutional Neural Networks.
• Generative AI.
• Stable Diffusion with controlled and guided generation.
• Human Pose detection and segmentation.
• Attention-based feature learning.
• PyTorch Library for Deep Learning.
• Good level of English, sufficient to write a potential scientific paper.
The candidate should be in the final year of a Master's or engineering school program in data science, artificial intelligence, applied mathematics, or a related field.
Host Laboratory
The internship will take place at the CIAD laboratory (https://www.ciad-lab.fr/) on the Montbéliard campus. CIAD is a leading research group specializing in distributed knowledge systems and artificial intelligence, particularly in applications that involve robotics and deep learning. The team includes a dedicated group of four permanent professors and numerous PhD students, providing a collaborative and stimulating research environment.
The campus is well equipped with high-performance GPU machines for training deep models, allowing students to work on cutting-edge AI research. The lab also features advanced robotic platforms and autonomous vehicles, enabling researchers to validate their developments on real-world robotic systems.
Duration
6 months, starting in February or March 2024. This internship may lead to a PhD thesis if funding is granted, depending on the internship outcomes (scientific publication).
Supervisors
The intern will be supervised by:
- Full Prof. Yassine Ruichek
- Asst Prof. Mohamed KAS
Candidates are invited to send their application (résumé, transcripts of the last two years, motivation letter, …) to mohamed.kas@utbm.fr