Weakly supervised or unsupervised learning for image and video analysis

Please note that, to ensure that all registered participants can access the meeting rooms, registration for meetings is free but mandatory.

Registration for this meeting is closed.

Registration

73 members of GdR ISIS and 58 non-members are registered for this meeting.
Room capacity: 160 people.

Announcement

Supervised learning is at the core of current techniques in computer vision and in image and video analysis. One limitation of supervised learning methods is that they require large labeled datasets. Labeling can be costly, or even impossible. Weakly supervised learning approaches work around this problem by using both labeled and unlabeled data, or partially labeled data.

The aim of the day is to review techniques for unsupervised, semi-supervised and weakly supervised learning, knowledge transfer, and multiple instance learning, applied to image and video analysis as well as to the automatic or semi-automatic annotation of large image databases, where incremental learning comes into play.
This day is jointly organized by the cross-cutting theme T, "Learning for signal analysis", and the action "Analysis, processing and decision-making for massive and multimodal data in the life sciences" of theme B, "Image and Vision". It is open to theoretical contributions on partially supervised or unsupervised learning as well as to applications in computer vision and in the analysis of medical images or image sequences.

The program includes three invited talks:
- "Weakly-Supervised Localization and Classification of Proximal Femur Fractures", Diana Mateus, Laboratoire des Sciences du Numérique de Nantes (LS2N, UM 6004)
- "Semi-supervised and weakly supervised learning", Nicolas Thome, CEDRIC lab, CNAM Paris
- "Learning with less labels in medical image analysis", Veronika Cheplygina, Medical Image Analysis group, Eindhoven University of Technology, The Netherlands

The abstracts of their talks follow.

The day also includes contributed talks, for which we are issuing a call for contributions. If you would like to present your work, please send your proposal (title, authors, affiliation, 15-line abstract) by April 22, 2019 at the latest to the organizers:

Christian Wolf: christian.wolf AT liris.cnrs.fr

Carole Lartizien: Carole.Lartizien AT creatis.insa-lyon.fr

Caroline Petitjean: caroline.petitjean AT univ-rouen.fr

Su Ruan: su.ruan AT univ-rouen.fr

************************

Weakly-Supervised Localization and Classification of Proximal Femur Fractures

Pr Diana Mateus, Laboratoire des Sciences du Numérique de Nantes (LS2N, UM 6004)

************************

We target the problem of fracture classification from clinical X-Ray images towards an automated Computer Aided Diagnosis (CAD) system. Although primarily dealing with an image classification problem, we argue that localizing the fracture in the image is crucial to make good class predictions. We therefore analyze several schemes for simultaneous fracture localization and classification, and show that using an auxiliary localization task improves the classification performance. Moreover, with recent advancements in weakly-supervised deep learning, we demonstrate that it is possible to localize the fractures and improve the classification performance even without bounding box localization annotations for training.

************************

Semi-supervised and weakly supervised learning

Pr Nicolas Thome, CEDRIC lab, CNAM Paris

************************

Deep learning methods are currently enjoying considerable success thanks to their very good predictive performance. An important practical limitation to their use is the need for large volumes of annotated data. In this talk, I will present solutions based on weakly supervised and semi-supervised learning to overcome this problem. Weakly supervised learning consists in training a model to make fine-grained predictions from coarse annotations. This type of approach is particularly relevant when fine-grained annotation is costly, for example for image segmentation. After an overview of state-of-the-art methods that rely on deep learning models for visual recognition, I will present "negative evidence" models, which learn local predictions that model the absence of a class. Semi-supervised learning consists in relying on unannotated data to improve the performance of models learned from few examples. I will present practical solutions from the literature, in particular approaches based on a reconstruction criterion and those exploiting the predictive stability of models. I will then detail a method based on a hybrid architecture that decouples the discriminative representations used for a recognition problem from those used for data reconstruction.
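As a rough illustration of the "negative evidence" idea mentioned in this abstract, the sketch below pools a set of local class scores into a single image-level score by combining the highest-scoring regions with the lowest-scoring ones. The function name `negative_evidence_pool`, the parameters `k` and `alpha`, and the pooling rule itself are assumptions made for this example; they are not necessarily the formulation presented in the talk.

```python
def negative_evidence_pool(local_scores, k=2, alpha=1.0):
    """Pool local class scores into one image-level score.

    local_scores: flat list of per-region scores for a single class.
    k: number of top and bottom regions to average (hypothetical parameter).
    alpha: weight of the negative evidence term (hypothetical parameter).
    """
    if not 1 <= k <= len(local_scores):
        raise ValueError("k must be between 1 and the number of regions")
    ordered = sorted(local_scores)
    lowest = ordered[:k]    # regions that most strongly argue against the class
    highest = ordered[-k:]  # regions that most strongly support the class
    return sum(highest) / k + alpha * sum(lowest) / k


# Two score maps with the same positive peaks; the second also contains
# strongly negative regions, which lowers its image-level score.
clean = [0.1, 0.0, 2.0, 1.8, 0.2, 0.1]
contradicted = [0.1, -3.0, 2.0, 1.8, -2.5, 0.1]

score_clean = negative_evidence_pool(clean, k=2, alpha=1.0)
score_contradicted = negative_evidence_pool(contradicted, k=2, alpha=1.0)
```

Setting `alpha=0` in this sketch reduces it to a plain top-k average, which makes the contribution of the negative term easy to isolate.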

************************

Learning with less labels in medical image analysis

Dr. ir. Veronika Cheplygina, Medical Image Analysis group, Eindhoven University of Technology, The Netherlands

************************

Machine learning (ML) has vast potential in medical image analysis, improving possibilities for early diagnosis and prognosis of disease. However, ML needs large amounts of representative, annotated examples for good performance. The annotation process, often consisting of outlining structures in (possibly 3D) medical images, is time-consuming and expensive. Furthermore, annotated data may not always be representative of new data being acquired, for example due to changes in scanners and scanning protocols. In this talk I will give an overview of approaches such as multiple instance learning and transfer learning, used to address these challenges, and discuss examples from my own work on classifying chronic obstructive pulmonary disease (COPD) in chest CT images.

Program

9h25-9h30 Opening

9h30-10h30 Semi-supervised and weakly supervised learning

Pr. Nicolas THOME, invited speaker, CEDRIC lab, CNAM Paris

10h30-10h50 NonAdjLoss: A Semi-Supervised Non-Adjacency Constraint for Semantic Segmentation

Pierre-Antoine Ganaye, Michaël Sdika, Bill Triggs, Hugues Benoit-Cattin, INSA-Lyon, Université Claude Bernard Lyon 1, UJM-Saint Etienne, CREATIS, Laboratoire Jean Kuntzmann

10h50-11h10 SMILE, a deep learning method for training ConvNets with partially annotated data for the task of semantic segmentation

Olivier Petit, Nicolas Thome, Arnaud Charnoz, Alexandre Hostettler, Luc Soler, CEDRIC, CNAM

11h10-11h25 Break

11h25-11h45 Development of weakly supervised metrics to complement unsupervised learning - Concurrent training using auxiliary networks

Anaël Leinert, Dominique Houzet, Gipsa-Lab, AGPIG team

11h45-12h45 Learning with less labels in medical image analysis

Veronika CHEPLYGINA, invited speaker, Eindhoven University of Technology

12h45-14h Lunch

14h-15h Weakly-Supervised Localization and Classification of Proximal Femur Fractures

Pr Diana Mateus, invited speaker, Laboratoire des Sciences du Numérique de Nantes (LS2N, UM 6004)

15h-15h20 Weakly-supervised deep learning approaches for sound event detection

Thomas Pellegrini, Léo Cances, IRIT, Université Toulouse III Paul Sabatier

15h20-15h40 Towards Semi-supervised Segmentation of Organs at Risk Using Deep Convolutional Neural Networks

Rosana El Jurdi, Caroline Petitjean, Paul Honeine, Fahed Abdallah, LITIS Lab Université de Rouen Normandie, Université Libanaise, ICD UTT

15h40-16h00 Weakly supervised object detection in art images

Nicolas Gonthier, Yann Gousseau, Saïd Ladjal, LTCI, Télécom ParisTech

16h00-16h15 Break

16h15-16h35 Deep Embedded Clustering for lymphoma segmentation in PET images

Haigen Hu, Jerome Lapuyade-Lahorgue, Pierre Decazes, Su Ruan, LITIS-Quantif, Université de Rouen Normandie.

16h35-16h55 Deep unsupervised ensemble clustering

Severine Affeldt, Lazhar Labiod and Mohamed Nadif, LIPADE Laboratory, University of Paris

16h55-17h15 Automatic document annotation by image/text clustering

Nicolas Audebert, Catherine Herold, Kuider Slimani, Quicksign

17h15-17h30 Round table

Abstracts of contributions

Short presentations:

1) Title: NonAdjLoss: A Semi-Supervised Non-Adjacency Constraint for Semantic Segmentation

Authors: Pierre-Antoine Ganaye, Michaël Sdika, Bill Triggs, Hugues Benoit-Cattin

Affiliations: Univ Lyon, INSA-Lyon, Université Claude Bernard Lyon 1, UJM-Saint Etienne, CNRS, Inserm, CREATIS UMR 5220, U1206, F-69100, Lyon, France, Laboratoire Jean Kuntzmann, B.P. 53, 38041 Grenoble Cedex 9, France

Abstract: The advent of deep learning has pushed medical image analysis to new levels, rapidly replacing more traditional machine learning and computer vision pipelines. However, segmenting and labelling anatomical regions remains challenging owing to appearance variations, imaging artifacts, the paucity and variability of annotated data, and the difficulty of fully exploiting domain constraints such as anatomical knowledge about inter-region relationships. We address the last point, improving the network's region-labeling consistency by introducing NonAdjLoss, an adjacency-graph based auxiliary training loss that penalizes outputs containing regions with anatomically-incorrect adjacency relationships. NonAdjLoss supports both fully-supervised training and a semi-supervised extension in which it is applied to unlabeled supplementary training data. The approach significantly reduces brain-structure segmentation anomalies on the MICCAI-2012 and IBSRv2 MRI datasets, especially when semi-supervised training is included.
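To make the notion of an adjacency-based penalty more concrete, here is a deliberately simplified sketch. It counts, on a hard 2D label map, the neighboring pixel pairs whose labels are not allowed to touch. The function name `forbidden_adjacency_count`, the `allowed_pairs` set and the toy label maps are assumptions for illustration; the loss described in the abstract is a training loss applied to network outputs, which this non-differentiable count does not reproduce.

```python
def forbidden_adjacency_count(label_map, allowed_pairs):
    """Count 4-connected neighbor pairs whose labels may not touch.

    label_map: 2D list of integer region labels (a predicted segmentation).
    allowed_pairs: set of frozensets {a, b} of labels that may be adjacent,
        e.g. collected from annotated training segmentations (assumption).
    """
    rows, cols = len(label_map), len(label_map[0])
    violations = 0
    for r in range(rows):
        for c in range(cols):
            here = label_map[r][c]
            # Look only right and down so that each pair is counted once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    other = label_map[rr][cc]
                    if other != here and frozenset((here, other)) not in allowed_pairs:
                        violations += 1
    return violations


# Hypothetical prior: label 0 may touch 1, and 1 may touch 2, but 0 and 2 may not.
allowed = {frozenset((0, 1)), frozenset((1, 2))}

consistent = [[0, 1, 2],
              [0, 1, 2]]
anomalous = [[0, 2, 2],
             [0, 1, 2]]

penalty_consistent = forbidden_adjacency_count(consistent, allowed)
penalty_anomalous = forbidden_adjacency_count(anomalous, allowed)
```

Because the count needs no ground-truth segmentation of the image being scored, a penalty of this kind can in principle be evaluated on unlabeled images, which is what makes a semi-supervised use possible.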

2) Title: Weakly-supervised deep learning approaches for sound event detection

Authors: Thomas Pellegrini, Léo Cances

Affiliation: IRIT, Université Toulouse III Paul Sabatier

Abstract: Weakly-supervised approaches aim at lowering the need for carefully annotated data and are a way to eventually strengthen the generalization power of the models regarding unseen conditions. In this contribution, I propose to review weakly-supervised deep learning approaches in audio content analysis and in particular for the task of Sound Event Detection (SED). SED systems aim to detect possibly overlapping audio events, and locate the events temporally in recordings, i.e. determining event onsets and offsets. I will present state-of-the-art deep neural networks for SED trained when only "weak labels" are available for learning. Weak labels refer to audio tags at recording level with no information on temporal onsets and offsets of the annotated events. I will review two main research directions: i) the introduction of attention mechanisms in the network architecture, ii) the use of Multiple Instance Learning inspired objective functions. I will comment on their limitations and how these could be overcome. In particular, the introduction of a similarity penalty between predictions of co-occurring classes seems to increase the discriminative power of the models.
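The sketch below illustrates what a Multiple Instance Learning inspired objective can look like when only recording-level tags are available: frame-level probabilities are pooled into one clip-level probability, and the loss is computed against the weak label. The function names, the two pooling options and the toy values are assumptions chosen for this example, not the specific objective functions reviewed in the talk.

```python
import math


def clip_probability(frame_probs, pooling="max"):
    """Aggregate frame-level probabilities into one clip-level probability."""
    if pooling == "max":
        # Standard MIL assumption: a clip is positive if any frame is positive.
        return max(frame_probs)
    if pooling == "linear_softmax":
        # Each frame is weighted by its own probability.
        total = sum(frame_probs)
        return sum(p * p for p in frame_probs) / total if total > 0 else 0.0
    raise ValueError(f"unknown pooling: {pooling}")


def weak_label_loss(frame_probs, clip_label, pooling="max", eps=1e-7):
    """Binary cross-entropy between the pooled prediction and the weak label."""
    p = min(max(clip_probability(frame_probs, pooling), eps), 1.0 - eps)
    return -(clip_label * math.log(p) + (1 - clip_label) * math.log(1.0 - p))


# Frame-level predictions for one event class in a single recording.
frames = [0.05, 0.10, 0.90, 0.80, 0.10]
loss_if_present = weak_label_loss(frames, clip_label=1)
loss_if_absent = weak_label_loss(frames, clip_label=0)
```

With max pooling only the highest-scoring frame receives a gradient in a real network, whereas a weighted pooling spreads it over several frames; this is one of the trade-offs that motivates alternative pooling functions and attention mechanisms.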

3) Title: Automatic document annotation by image/text clustering

Authors: Nicolas Audebert, Catherine Herold, Kuider Slimani

Affiliation: Quicksign (https://www.quicksign.com/)

Abstract: With the rise of digital technology, most administrative procedures are now carried out by transmitting scanned or photographed documents. OCR and deep learning classification techniques have made it possible to automate much of the processing of these documents. However, new formats and new types of documents appear over time, forcing specialists to renew their annotations regularly in order to keep model performance constant. In particular, deep convolutional networks require several thousand annotated documents for their supervision. We therefore address three questions:

- How can the annotation process be sped up?

- How can new document classes be discovered?

- How can unannotated documents be exploited for classification?

To this end, we rely on visual and textual features extracted from the documents. We compare several dimensionality reduction techniques (PCA, t-SNE, UMAP) and clustering techniques (k-means, HDBSCAN). We show that a generic deep network (VGG-type), or one trained on a small number of administrative documents, extracts features that are particularly robust for exploring datasets, significantly speeding up the annotation process. In addition, exploring the latent space makes it possible to identify unknown classes and to better understand the model's invariances and failure cases.

4) Title: Deep Embedded Clustering for lymphoma segmentation in PET images

Authors: Haigen HU, Jerome LAPUYADE-LAHORGUE, Pierre DECAZES, Su RUAN

Affiliation: LITIS-QUANTIF, Université de Rouen Normandie

Abstract: In medical image analysis, it is usually difficult to obtain a huge annotated dataset owing to the costliness and scarcity of expert annotation. In this work, an unsupervised method is proposed for 3D lymphoma segmentation from PET images by adopting the framework of Deep Embedded Clustering (DEC), which consists of a deep autoencoder and a clustering algorithm. The algorithm learns feature representations from the data space and optimizes a clustering objective to group assignments. Different clustering objective functions such as Kullback-Leibler (KL) divergence and entropy are investigated. A series of comparison experiments is conducted on 14 patients with DLBCL (Diffuse Large B-Cell Lymphoma).
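For readers unfamiliar with the KL-based clustering objective commonly associated with DEC, the sketch below computes it on toy data: soft assignments from a Student's t kernel, a sharpened target distribution, and the KL divergence between the two. The toy embeddings and centroids are stand-ins for autoencoder features and are assumptions for illustration; the entropy-based variants mentioned in the abstract are not shown, and nothing here reflects the authors' implementation.

```python
import math


def soft_assignments(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of embedded point i to cluster j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            dist2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        norm = sum(row)
        q.append([v / norm for v in row])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j] proportional to q[i][j]**2 / cluster frequency."""
    freq = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(len(row))]
        norm = sum(weights)
        p.append([w / norm for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row))


# Toy 2D embeddings (stand-ins for autoencoder features) and two centroids.
z = [[0.0, 0.1], [0.2, 0.0], [2.0, 2.1], [1.9, 2.0]]
mu = [[0.0, 0.0], [2.0, 2.0]]

q = soft_assignments(z, mu)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

In a full pipeline this loss would be minimized with respect to both the encoder weights and the centroids, with the target distribution recomputed periodically; the sketch only evaluates it once.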

5) Title: Deep unsupervised ensemble clustering

Authors: Severine Affeldt, Lazhar Labiod and Mohamed Nadif

Affiliation: LIPADE Laboratory, University of Paris, France

Abstract: We report a novel clustering method that combines the advantages of deep learning and ensemble strategy. In several studies, the sequential or joint association of a classical partitioning method with a deep architecture generally improves the clustering of large datasets. Yet, these methods are penalized by the need for an optimal hyperparameter setting (e.g., weight initialization, structure design).

To circumvent this issue, we combine several unsupervised deep models before applying a spectral clustering. The efficiency and originality of our ensemble method are based on the concatenation of several affinity matrices whose sparsity is obtained from representative data points. Our approach does not require pretraining and enables an ensemble strategy on deep structure designs, epoch numbers or weight initialization. The robustness and clustering performance of our method are demonstrated by various experiments on real and synthetic image datasets.

6) Title: Weakly supervised object detection in art images

Authors: Nicolas Gonthier, Yann Gousseau, Saïd Ladjal

Affiliation: LTCI, Télécom ParisTech

Abstract: Machine learning methods, and deep neural networks in particular [GBC16], have recently achieved remarkable experimental results for image classification with thousands of different visual categories, and also for object detection [RHGS15]. This task no longer consists only in determining whether an image contains an object, but also in localizing it correctly in the image. However, training these neural networks requires many images annotated with the precise location of the objects. For object detection in graphic artworks, the datasets are often small and weakly annotated, meaning that keywords are available only at the level of the whole image. In this context, transfer learning of detection algorithms can be an interesting method, in the same way that Crowley et al. [CZ16] proposed transferring classification networks from natural images to images of paintings. To carry out this transfer, we proposed a simple and fast model that extends the perceptron to the multiple-instance learning setting. This allows us to apply networks trained for detection on photographic image datasets to weakly annotated art image datasets. It also allows us to obtain results close to the state of the art [GGLB18] while being competitive with more classical multiple-instance learning methods.

[CZ16] Elliot J. Crowley and Andrew Zisserman. The Art of Detection. In European Conference on Computer Vision, pages 721-737. Springer, 2016.

[GBC16] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.

[GGLB18] N. Gonthier, Y. Gousseau, S. Ladjal, and O. Bonfait. Weakly supervised object detection in artworks. In Computer Vision - ECCV 2018 Workshops, pages 692-709. Springer International Publishing, 2018.

[RHGS15] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv:1506.01497 [cs], June 2015.

7) Title: Development of weakly supervised metrics to complement unsupervised learning - Concurrent training using auxiliary networks

Authors: Anaël Leinert, Dominique Houzet

Affiliation: Gipsa-Lab, AGPIG team

Abstract: The most popular machine learning problems today are arguably those in the field of vision. This enthusiasm can be explained by the success of convolutional algorithms, which are particularly well suited to implementation on available hardware (GPUs), and by the many possible applications (medical imaging, security, robotics, visual substitution aids...).

However, the unstable and/or unpredictable behavior of these approaches still makes them poorly suited to these uses, for which reliability requirements are high. During my Master's internship, I therefore propose to dive into the heart of how convolutional neural networks learn, and more specifically generative models. Images generated by an unsupervised framework such as VAEs (variational auto-encoders) exhibit regularities in their encoding, resulting mainly from an optimization tied to the size of the available encoding: images that are similar in the least-squares sense (or under any other distance chosen during training) systematically end up as neighbors in the VAE's representation space (or latent space).

Building on these observations, and in order to benefit from the advantages of the unsupervised approach while preserving a certain coherence, I propose here to introduce new metrics that take these ambiguous cases into account, in order to reduce unwanted biases that lead to false interpretations.

8) Title: SMILE, a deep learning method for training ConvNets with partially annotated data for the task of semantic segmentation

Authors: Olivier Petit, Nicolas Thome, Arnaud Charnoz, Alexandre Hostettler, Luc Soler

Affiliations: CEDRIC laboratory, CNAM

Abstract: Fully automatic segmentation of medical images is a major challenge and is extensively studied in the medical imaging community. Recently, deep convolutional neural networks (ConvNets) have brought impressive results on this task. However, they need a large amount of data to be trained, and the annotation process is very expensive and time-consuming, especially for medical images. Moreover, clinical experts often focus on specific anatomical structures and thus produce partially annotated images. In this presentation, I will talk about SMILE, a method for training deep ConvNets with incomplete annotations. The first contribution aims to identify ambiguous labels in order to ignore them during training, and thus avoid propagating incorrect or noisy information. A second step, which we call SMILEr, consists in automatically relabeling missing annotations using a curriculum strategy. We performed experiments on an abdominal CT-scan dataset composed of three organs (liver, stomach and pancreas) and we show the relevance of the method for the task of semantic segmentation. With 70% of missing annotations, SMILEr performs similarly to a baseline trained with complete ground truth annotations.
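The idea of ignoring ambiguous labels during training can be sketched with a masked loss, as below. How SMILE decides which labels are ambiguous is not detailed in the abstract, so the mask is simply given as input here; the function name `masked_bce`, the use of `None` to mark ambiguous entries and the toy values are all assumptions for illustration.

```python
import math


def masked_bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over the entries that carry a usable label.

    probs: predicted probabilities, one per (pixel, organ) entry.
    labels: 1 (annotated organ), 0 (trusted background) or None (ambiguous,
        e.g. an organ the expert may simply not have annotated).
    """
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y is None:
            continue  # ambiguous entries contribute nothing to the loss
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


# Predictions for one organ channel on five pixels. The last two pixels
# are unannotated, so treating them as background would penalize a
# possibly correct prediction.
probs = [0.9, 0.8, 0.1, 0.85, 0.9]
naive_labels = [1, 1, 0, 0, 0]          # missing annotation read as background
masked_labels = [1, 1, 0, None, None]   # missing annotation ignored

loss_naive = masked_bce(probs, naive_labels)
loss_masked = masked_bce(probs, masked_labels)
```

In this toy case the naive loss is dominated by the two unannotated pixels, which shows how missing annotations read as background could push a network away from correct predictions.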

9) Title: Towards Semi-supervised Segmentation of Organs at Risk Using Deep Convolutional Neural Networks

Authors: Rosana El Jurdi, Caroline Petitjean(1), Paul Honeine(1), Fahed Abdallah(2,3)

Affiliations: (1) LITIS Lab, Université de Rouen Normandie, France (2) Université Libanaise, Hadath, Beyrouth, Liban (3) ICD, M2S, Université de technologie de Troyes, France

Abstract: One of the main research challenges today in the medical field in general, and in cancer treatment in particular, is the automatic segmentation of Organs At Risk (OAR). Current approaches to this task usually involve powerful machine learning tools such as deep convolutional neural networks. However, the latter often require huge amounts of data to achieve their high generalization ability. Within the medical domain, annotating data is a tedious process that is subject to many constraints. Instead, inaccurate or incomplete data annotations, such as bounding boxes or seeds, are often more common. The objective of our study is to infer relations between bounding boxes and label segments. We use two machine learning models for this purpose, a primary model and an ancillary one. On the one hand, the ancillary model takes into consideration the image, the bounding box, and the segmentation mask label of a dataset for organ segmentation. This model is trained in a fully supervised manner and is later used to infer label estimates for a much larger weakly annotated dataset where only the bounding box information is available. On the other hand, the primary model improves on the label estimates provided by the ancillary model for better segmentation accuracy. Within this context, multiple training and input/output scenarios are considered for accurate inference of label estimates and automatic segmentation of the OAR. Experiments conducted on the Segmentation of THoracic Organs at Risk in CT images (SegTHOR) dataset illustrate the relevance of the proposed method.
