We remind you that, in order to guarantee all registered participants access to the meeting rooms, registration for meetings is free but mandatory.
Registration for this meeting is closed.
25 members of GdR ISIS and 21 non-members are registered for this meeting.
Room capacity: 100 people.
Deep learning techniques have progressively established themselves as the best-performing tools for solving many computer vision problems. However, 3D vision is grounded in well-established theoretical concepts from physics, which are not explicitly taken into account in learning models. The aim of this "3D Vision and Learning" day is to bring together researchers whose work lies at the interface between these two fields.
The day will focus in particular on what learning methods can bring to 3D vision and, conversely, on how to introduce aspects of 3D geometry into learning techniques. For example, we will seek to answer the following questions: For which aspects of 3D vision are learning methods suitable, and how should they be applied? Are there still applications for which purely geometric methods remain better suited, and why? How can scene dynamics or object deformation be taken into account in deep learning methods?
This day will take place in the Amphi Durand at Sorbonne Université (Jussieu) on 30 May 2024, from 10:00 to 17:00, and will include two invited talks, by Céline Loscos (Huawei) and Sylvain Lefebvre (INRIA).
We are also issuing a call for contributions, aimed in particular at PhD students. Those wishing to present their work are invited to email their proposal (title and half-page abstract) to the organizers (yvain.queau@ensicaen.fr ; sylvie.chambon@toulouse-inp.fr) before 26 April. Depending on the proposals received, we will offer an oral or poster presentation.
The organizers,
Sylvie Chambon, IRIT, INP Toulouse
Yvain Queau, GREYC, CNRS
Owing to the temporary absence of a dedicated administrator for GdR IASIS, we are unfortunately unable to cover travel expenses from the GdR budget.
9:30 - 9:45: Welcome
9:45 - 10:45: Keynote I
Céline Loscos (Huawei)
10:45 - 11:00: Coffee break
11:00 - 12:30: Contributions I
Nabil Madali (INSA Rennes)
Guenole Fiche (Centrale Supelec)
Fabien Castan (Technicolor Creative Studios)
12:30 - 14:00: Lunch
14:00 - 15:00: Keynote II
Sylvain Lefebvre (INRIA)
15:00 - 16:00: Contributions II
Benjamin Coupry (IRIT)
Hala Djeghim (Univ. Saclay, Huawei)
16:00 - 16:30: Discussions, 3D Vision working group (GT Vision 3D)
Céline Loscos (Huawei)
Bio:
Céline LOSCOS has more than 20 years of research experience in Computer Graphics, contributing to global illumination rendering solutions, crowd simulation AR/VR, 3D reconstruction, and HDR imaging. Since 2022, she has worked in the 3D graphics team of the Huawei Nice Research Center, where she explores solutions for next-generation low-power premium smartphones in order to extend playtime while targeting a gaming experience on par with high-end desktops, with a special focus on future real-time game rendering approaches.
Huawei Technologies is a leading global information and communications technology (ICT) solutions provider. Huawei is involved in the development of the digital sector and supports research and development in Europe. Huawei Nice Research Center is located in the Sophia Antipolis Technology Park. Among other missions, the team in Huawei Nice Research Center innovates in the field of low power 3D graphics rendering systems for high-end smartphone gaming use cases.
Abstract:
A typical 3D mobile game uses different hardware components, such as CPUs, GPUs and NPUs, while paying attention to memory bandwidth. The rise of machine learning and its use for reducing computation costs encourage greater use of NPUs, with a better distribution of computations between CPU, GPU and NPU.
Our goal is to provide an optimized system solution for high-visual quality with low compute energy cost and low latency. For a mobile manufacturer, system optimization requires considering together chip design and software.
In this presentation, I will present the typical mobile game pipeline and the different necessary/possible hardware optimizations in order to reach real time. I will review some existing deep learning solutions relevant for 3D gaming, with a special focus on super-resolution and frame-generation approaches. I will explain what the challenges are when implementing deep learning for accelerating mobile game rendering framerate and how it fits in the game pipeline.
Sylvain Lefebvre (INRIA)
Bio:
Sylvain Lefebvre is a senior researcher at Inria (France), where he leads the MFX team. His main research focus is on geometry modeling, processing and procedural synthesis in the context of additive manufacturing, most often targeting GPU algorithms. Together with his team, Sylvain revisited many aspects of the 3D printing processing pipeline, introducing novel path planning techniques, novel support generation, novel infill methodologies based on procedural methods, adaptive slicing, varying width deposition and non-planar slicing. Sylvain received the EUROGRAPHICS Young Researcher Award in 2010. From 2012 to 2017 he was the principal investigator of the ERC ShapeForge (StG) and IceXL (PoC) projects. Sylvain is a recurring member of the EUROGRAPHICS, SIGGRAPH and SIGGRAPH Asia program committees and the EUROGRAPHICS paper advisory board. He was associate editor for TOG from 2012 to 2017 and served as program co-chair for EG (short papers and STAR), SMI and SGP (2024). He created and is the lead developer of the IceSL software for additive manufacturing, which has brought together most of his team's research since 2012.
Abstract:
This seminar will focus on how to design shapes and plates that exhibit specific behaviors thanks to a precise control of their fabrication process. Specifically, by orienting the deposition trajectories of a fused filament 3D printer, we introduce anisotropies that impact the observed properties of the final object. In one case, the orientations trigger anisotropic deformations under heat, allowing a plate to take a target curved shape. In the second case, the changes in deposition orientation trigger an anisotropic light reflectance, creating brushed-metal effects on the surface of the 3D printed object.
Both approaches rely on the optimization of oscillating fields, a topic we initially explored in the context of Computer Graphics, and that naturally evolved toward fabricating shapes with anisotropic structures.
"Shrink-and-morph"
- https://inria.hal.science/hal-04252044
- video: https://www.youtube.com/watch?v=YNufMqcDk5I
"Orientable Dense Cyclic Infill for Anisotropic Appearance Fabrication"
- https://xavierchermain.github.io/fdm_aa/
- https://youtu.be/aUDzZrlRnNU
Nabil Madali (INSA Rennes)
Motion estimation in holographic videos requires extracting and analyzing variations in 3D geometry from the holographic signal. Unfortunately, recovering the scene from a single digital hologram is an ill-posed inverse problem for which no exact solution exists. Indeed, the light wave scattered by each point of the scene contributes to every pixel when the hologram is recorded. As a result, the holographic signal scrambles the 3D scene information, which cannot be recovered directly. In particular, a slight change in the scene results in very different holographic patterns, making motion estimation a non-trivial research topic. To address this problem, we will study several approaches based on space-frequency analysis tools and deep learning to recover the scene from holographic data. The motions of the objects in the scene will then be estimated from the extracted data. Extensive experiments will be conducted to evaluate, on the one hand, the performance of the proposed methods for recovering the scene geometry under various conditions and, on the other hand, the effect of the resulting degradations on the quality of the estimated motion vectors.
Guenole Fiche (Centrale Supelec)
Previous works on Human Pose and Shape Estimation (HPSE) from RGB images can be broadly categorized into two main groups: parametric and non-parametric approaches. Parametric techniques leverage a low-dimensional statistical body model for realistic results, whereas recent non-parametric methods achieve higher precision by directly regressing the 3D coordinates of the human body mesh.
This work introduces a novel paradigm to address the HPSE problem, involving a low dimensional discrete latent representation of the human mesh and framing HPSE as a classification task. Instead of predicting body model parameters or 3D vertex coordinates, we focus on predicting the proposed discrete latent representation, which can be decoded into a registered human mesh. This innovative paradigm offers two key advantages. Firstly, predicting a low-dimensional discrete representation confines our predictions to the space of anthropomorphic poses and shapes even when little training data is available. Secondly, by framing the problem as a classification task, we can harness the discriminative power inherent in neural networks.
The proposed model, VQ-HPS, predicts the discrete latent representation of the mesh. The experimental results demonstrate that VQ-HPS outperforms the current state-of-the-art non-parametric approaches while yielding results as realistic as those produced by parametric methods when trained with little data. VQ-HPS also shows promising results when trained on large-scale datasets, highlighting the significant potential of the classification approach for HPSE.
Fabien Castan (Technicolor Creative Studios)
In this presentation, we will present the latest status of Meshroom and introduce the new Meshroom-Research project.
This new project has been designed for experimenting, nurturing new ideas and benchmarking various methods.
Benjamin Coupry (IRIT)
Numerous multi-view 3D reconstruction solutions are available to the general public. However, these methods cannot reconstruct high frequencies as accurately as the so-called photometric stereo (PS) method, which involves estimating the shape and reflectance of a surface from several photographs obtained from the same viewpoint, under different illumination conditions.
In order to estimate the illumination conditions at the time of shooting, it is customary to use a sphere. This approach has a number of disadvantages. First, it can be tedious to install a sphere in the scene without obscuring it. Moreover, this method estimates illumination at a single point in the scene. The latter, in the absence of any more reliable information, is then wrongly generalized to the entire scene. Poor illumination estimation degrades the low frequencies of the PS reconstruction. We therefore propose to locally estimate the illumination of the scene from a coarse reconstruction obtained by photogrammetry, in order to improve PS results.
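For readers unfamiliar with photometric stereo, the basic idea can be sketched on a single pixel. The snippet below is a minimal textbook illustration, not the method presented in this talk: it assumes a Lambertian surface, three known non-coplanar light directions (for instance calibrated with a sphere), and no shadows. The function names `solve_3x3` and `photometric_stereo_pixel` are hypothetical.

```python
import math


def solve_3x3(a, y):
    """Solve the 3x3 linear system a @ x = y with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    x = []
    for col in range(3):
        # Replace one column of the matrix by the right-hand side.
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = y[r]
        x.append(det(m) / d)
    return x


def photometric_stereo_pixel(lights, intensities):
    """Recover (albedo, unit normal) at one pixel.

    Assumed Lambertian model: intensity_k = albedo * dot(normal, light_k),
    with one unit light direction per row of `lights`.
    """
    b = solve_3x3(lights, intensities)  # b = albedo * normal
    albedo = math.sqrt(sum(v * v for v in b))
    if albedo == 0.0:
        raise ValueError("zero albedo: the normal is undefined")
    normal = [v / albedo for v in b]
    return albedo, normal


# Three unit light directions and the intensities they would produce
# on a surface with albedo 0.5 and normal (0, 0, 1).
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
albedo, normal = photometric_stereo_pixel(lights, [0.5, 0.4, 0.4])
```

In this simplified model, an error in the assumed light directions directly biases the recovered normals, which is why the quality of the illumination estimate matters for the reconstruction.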
Hala Djeghim (Univ. Saclay, Huawei)
Neural implicit surface representation methods have recently shown impressive 3D reconstruction results. However, existing solutions struggle to reconstruct urban outdoor scenes due to their large, unbounded, and highly detailed nature. Hence, to achieve accurate reconstructions, additional supervision data such as LiDAR, strong geometric priors, and long training times are required. To tackle such issues, we present SCILLA, a new hybrid implicit surface learning method to reconstruct large driving scenes from 2D images. SCILLA's hybrid architecture models two separate implicit fields: one for the volumetric density and another for the signed distance to the surface. To accurately represent urban outdoor scenarios, we introduce a novel volume-rendering strategy that relies on self-supervised probabilistic density estimation to sample points near the surface and transition progressively from volumetric to surface representation. Unlike concurrent methods, our solution permits a proper and fast initialization of the signed distance field without relying on any geometric prior on the scene. By conducting extensive experiments on four outdoor driving datasets, we show that SCILLA can learn an accurate and detailed 3D surface scene representation in various urban scenarios while being twice as fast to train as previous state-of-the-art solutions.
Date: 2024-05-30
Venue: Sorbonne Université (Jussieu), Amphi Durand
(c) GdR IASIS - CNRS - 2024.