CESI LINEACT has an open position for a PhD student to develop affordance characterization algorithms based on contextual information. The position is located at CESI Campus Dijon.
Toward Proactive Intelligence: Environmental Contextual Information and Gesture Recognition for Characterizing Affordances in Human-System Interactions
Scientific fields: Artificial Intelligence, Computer Vision, Robotics
Keywords: Context-based object affordance, Object recognition, Gesture recognition/evaluation.
Yuehua DING, Associate professor, HDR
Youssef MOURCHID & Nicolas RAGOT, Associate professors
Vision systems, previously confined to industrial settings, are now used in open environments such as homes, hospitals, and public spaces. This shift raises new challenges, particularly in Human-System Interaction (HSI), and calls for vision systems capable of more flexible and adaptive decision-making. Despite advances in mechatronics and artificial intelligence (AI), challenges remain, notably in interpreting human intentions and responding intuitively to unforeseen situations.
Gibson (1979) [1] and Norman (2013) [2] introduced the concept of "affordance," defined as the action possibilities an object offers to an agent based on its physical properties and the agent's abilities [5,6,9]. Affordances provide a promising means of enhancing Human-System Interactions (HSI) [3,4]. However, ambiguity arises because a single object may have multiple affordances. In human-robot scenarios where a robot must provide objects for specific tasks, such ambiguities are unacceptable because they degrade the interaction.
Environmental contextual information plays a key role here: it should make it possible to resolve ambiguity in the system's (here, a robot's) selection of a given affordance. The contextual information we refer to includes geometric, semantic, and map-related information about the other objects in the scene, as well as information about the human, such as their gestures and the actions they have performed or are about to perform.
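As an illustration only (a minimal sketch under assumed object names, affordance labels, and cue weights, not the method to be developed in the thesis), contextual cues can be used to re-weight the candidate affordances of an object and retain the least ambiguous one:

    # Minimal sketch: disambiguating an object's affordance with contextual cues.
    # All object names, affordance labels, and weights below are hypothetical.

    AFFORDANCE_PRIORS = {
        # prior scores, e.g. from a visual affordance detector
        "mug": {"grasp": 0.5, "pour": 0.3, "contain": 0.2},
        "screwdriver": {"grasp": 0.4, "screw": 0.5, "pry": 0.1},
    }

    CONTEXT_COMPATIBILITY = {
        # assumed compatibility between a contextual cue and an affordance
        ("reaching_gesture", "grasp"): 2.0,  # an observed reach favours handing over
        ("tilting_gesture", "pour"): 2.5,    # a wrist tilt favours pouring
        ("screw_nearby", "screw"): 3.0,      # a screw in the scene favours screwing
    }

    def disambiguate(obj, context_cues):
        """Return the most plausible affordance of `obj` given contextual cues."""
        scores = dict(AFFORDANCE_PRIORS[obj])
        for cue in context_cues:
            for affordance in scores:
                scores[affordance] *= CONTEXT_COMPATIBILITY.get((cue, affordance), 1.0)
        return max(scores, key=scores.get)

    if __name__ == "__main__":
        # A screwdriver next to a screw, with the operator reaching out:
        print(disambiguate("screwdriver", ["screw_nearby", "reaching_gesture"]))  # "screw"

In the actual work, such cues would come from object detection, scene mapping, and gesture recognition modules rather than from hand-coded tables.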
At CESI LINEACT, the "Engineering and Digital Tools" team has conducted research on affordances within Human-System Interaction, focusing on operator-device interactions for industrial assembly, dismantling, or maintenance tasks. Building on Simonian's definition of affordance, the research emphasizes identifying the minimal data set required to characterize object affordance from a human perspective.
This thesis complements these efforts by focusing on robot-centered affordances. The scientific challenge is to resolve ambiguity in affordance selection during human-robot interactions by leveraging contextual environmental data to improve the fluidity and intuitiveness of the interaction. Two use cases are prioritized: 1) collaborative human-robot industrial assembly [7]; 2) assistive collaboration between caregivers and robots in healthcare [8].
Scientific challenges
Thesis Objectives
1. Developing a Theoretical Framework for Affordance Learning:
2. Gesture Recognition and Affordance Integration for Action Anticipation (see the sketch after this list):
3. Practical Implementation and Experimental Validation:
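To illustrate objective 2 (purely as an assumed, simplified example rather than the pipeline that will actually be designed), a recognized operator gesture can be mapped to an anticipated next action and, from there, to the object affordance the robot should prepare:

    # Minimal sketch: coupling gesture recognition with affordances for action
    # anticipation in a collaborative assembly task. The gesture classes, action
    # labels, and mappings are hypothetical placeholders.

    GESTURE_TO_NEXT_ACTION = {
        "reach_toward_parts_bin": "pick_part",
        "align_two_parts": "insert_screw",
        "rotate_wrist": "tighten_screw",
    }

    ACTION_TO_REQUIRED_AFFORDANCE = {
        "pick_part": ("part", "grasp"),
        "insert_screw": ("screw", "insert"),
        "tighten_screw": ("screwdriver", "screw"),
    }

    def anticipate(recognized_gesture):
        """Map a recognized gesture to the (object, affordance) the robot should prepare."""
        next_action = GESTURE_TO_NEXT_ACTION.get(recognized_gesture)
        if next_action is None:
            return None  # unknown gesture: no anticipation
        return ACTION_TO_REQUIRED_AFFORDANCE[next_action]

    if __name__ == "__main__":
        # The operator aligns two parts -> the robot anticipates providing a screw.
        print(anticipate("align_two_parts"))  # ('screw', 'insert')

In practice, the gesture recognizer would be a learned model (for example, trained on skeleton sequences such as those in InHARD [10]), and the mapping would be learned or probabilistic rather than a fixed lookup table.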
Work Plan
This research work will be carried out at CESI Campus Dijon and will be organized as follows:
Expected Scientific/Technical Outcomes
CESI LINEACT (UR 7527), the Digital Innovation Laboratory for Businesses and Learning in support of Territorial Competitiveness, anticipates and supports technological transformations in sectors and services related to industry and construction. CESI's historical ties with businesses are a determining factor in its research activities and have led to a focus on applied research in partnership with industry. A human-centered approach coupled with the use of technology, together with regional networking and links with education, has enabled cross-disciplinary research centered on human needs and uses while addressing technological challenges.
Its research is organized into two interdisciplinary scientific teams and two application domains:
· Team 1, "Learning and Innovating," is primarily focused on Cognitive Sciences, Social Sciences, Management Sciences, Education Science, and Innovation Sciences. The main scientific objectives are understanding the effects of the environment, particularly instrumented situations with technical objects (platforms, prototyping workshops, immersive systems), on learning, creativity, and innovation processes.
· Team 2, "Engineering and Digital Tools," is mainly focused on Digital Sciences and Engineering. Its main scientific objectives include modeling, simulation, optimization, and data analysis of cyber-physical systems. Research also covers decision-support tools and studies of human-system interactions, especially through digital twins coupled with virtual or augmented environments.
These two teams cross and develop their research in the two application domains of Industry of the Future and City of the Future, supported by research platforms, primarily the Rouen platform dedicated to the Factory of the Future and the Nanterre platform dedicated to the Factory and Building of the Future.
Modalities: application documents and an interview. Submit the following documents to ymourchid@cesi.fr, nragot@cesi.fr, and yding@cesi.fr with the subject line "[Application]" followed by the thesis title given above.
Your application will contain:
Required skills:
Scientific and Technical skills:
· Proficiency in Python and C++.
Relational Skills:
· Advanced written/oral communication.
· Independence, rigor, and teamwork.
[1] Gibson, E. J. (2003). The world is so full of a number of things: On specification and perceptual learning. Ecological Psychology, 15(4), 283-287.
[2] Norman, D. A. (2013). The design of everyday things (Revised and expanded edition). Cambridge, MA: The MIT Press. ISBN 978-0-262-52567-1.
[3] Hassanin, M., Khan, S., & Tahtali, M. (2021). Visual affordance and function understanding: A survey. ACM Computing Surveys (CSUR), 54(3), 1-35.
[4] Corona, E., Pumarola, A., Alenya, G., Moreno-Noguer, F., & Rogez, G. (2020). GanHand: Predicting human grasp affordances in multi-object scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5031-5041).
[5] Deng, S., Xu, X., Wu, C., Chen, K., & Jia, K. (2021). 3D AffordanceNet: A benchmark for visual object affordance understanding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 1778-1787).
[6] Zhai, W., Luo, H., Zhang, J., Cao, Y., & Tao, D. (2022). One-shot object affordance detection in the wild. International Journal of Computer Vision, 130(10), 2472-2500.
[7] Eswaran, M., & Bahubalendruni, M. R. (2023). Augmented reality aided object mapping for worker assistance/training in an industrial assembly context: Exploration of affordance with existing guidance techniques. Computers & Industrial Engineering, 185, 109663.
[8] Kim, N. G., Effken, J. A., & Lee, H. W. (2022). Impaired affordance perception as the basis of tool use deficiency in Alzheimer's disease. Healthcare, 10(5), 839. MDPI.
[9] Girish, D. S. (2020). Action recognition in still images and inference of object affordances (Doctoral dissertation, University of Cincinnati).
[10] Dallel, M., Havard, V., Baudry, D., & Savatier, X. (2020). InHARD - Industrial Human Action Recognition Dataset in the context of industrial collaborative robotics. In 2020 IEEE International Conference on Human-Machine Systems (ICHMS) (pp. 1-6). IEEE. https://doi.org/10.1109/ICHMS49158.2020.9209531