Machine learning and pattern recognition in signal and image processing

As a reminder, registration for meetings is free but mandatory, so that every registered participant is guaranteed access to the meeting rooms.

Registration for this meeting is closed.

Registrations

29 members of the GdR ISIS and 32 non-members are registered for this meeting.
Room capacity: 75 people.

Announcement

Statistical learning is a research field at the crossroads of statistics, computer science and optimization. Its goal is to model complex systems from examples.

This problem is particularly central in signal and image processing, where sensors, potentially heterogeneous and numerous, deliver large volumes of generally noisy data. The success of learning methods is mainly due to their flexibility and to the efficient and often elegant solutions they lead to, even for large volumes of data.

The aim of this second workshop day is to review some of the new directions explored by the community, in connection with applications in signal and image processing.

Organizers

Grégoire Mercier, ENST Bretagne, gregoire.mercier@enst-bretagne.fr
Cédric Richard, Laboratoire Fizeau, cedric.richard@unice.fr
Alain Rakotomamonjy, LITIS, Université de Rouen, alain.rakoto@insa-rouen.fr

Program

Venue: Amphi Emeraude, ParisTech, Rue Barrault

10h00 - 10h45 S. Canu : Learning and matrix factorization: application to collaborative filtering
10h45 - 11h15 V. Emiya : Matching Pursuit with Stochastic Selection
11h15 - 11h45 A. Barachant : A kernel for the classification of covariance matrices: applications to brain-computer interfaces

13h30 - 14h15 M. Cord : Image classification : BoW is beautiful
14h15 - 14h45 G. Mesnil : Learning Semantic Representations of Objects and Parts
14h45 - 15h15 M. Fauvel : Parsimonious Mahalanobis Kernel for the Classification of High Dimensional Data

15h30 - 16h00 D. Tuia : Human-machine interaction for the processing of remote sensing images
16h00 - 16h30 R. Flamary : Selecting from an infinite set of features

Abstracts of the contributions

Learning and matrix factorization: application to collaborative filtering

S. Canu, LITIS.

Many websites now offer personalized advice produced by a recommender system, a set of software tools and methods for automatically suggesting items that may be useful to a user. The arrival of online social networks has added a new dimension to recommendation, in which the structure of the social graph can be used as a source of information. We will present a state of the art of model-based recommendation methods, namely the factorization methods whose value was demonstrated during the Netflix challenge. It will be complemented by a case study that takes the social network into account and performs prediction on implicit feedback data. We tested this model on large real-world data sets and observed an improvement over the performance obtained by other state-of-the-art methods.
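As an illustration of the factorization methods mentioned above, here is a minimal sketch of rating prediction by low-rank matrix factorization trained with stochastic gradient descent. It is not the speaker's model: the social graph and implicit feedback are not modelled, and the function name `factorize`, the toy ratings and all hyperparameters are assumptions chosen for the example.

```python
import numpy as np

def factorize(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.02,
              epochs=500, seed=0):
    """Fit R ~ P @ Q.T on observed (user, item, rating) triples by SGD."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, rank))  # user factors
    Q = 0.1 * rng.standard_normal((n_items, rank))  # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error on this rating
            p_old = P[u].copy()            # keep the old value for Q's update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

# Hypothetical toy data: (user index, item index, rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize(ratings, n_users=3, n_items=3)
rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))
print("training RMSE:", rmse)
print("predicted rating of item 2 by user 0:", P[0] @ Q[2])
```

Only the observed entries contribute to the loss, which is what allows the model to predict the missing ones.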

Matching Pursuit With Stochastic Selection

V. Emiya, LIF, Marseille.

We propose a stochastic selection strategy designed to make sparse approximation with large dictionaries tractable by accelerating Matching Pursuit algorithms. This strategy consists of randomly selecting a subset of atoms and a subset of rows of the full dictionary at each step of Matching Pursuit, in order to obtain a sub-optimal but fast atom selection. The performance of the proposed algorithm is studied in terms of approximation accuracy (decrease of the residual norm), exact-sparse recovery and audio declipping of real data. Numerical experiments show the relevance of the approach. The proposed stochastic selection strategy is presented with Matching Pursuit but applies to any pursuit algorithm whose selection step is based on the computation of correlations.

Joint work with Thomas Peel, Liva Ralaivola and Sandrine Anthoine.
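The selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes unit-norm atoms, draws the row and atom subsets uniformly without replacement, and updates the residual with the full selected atom. The function name `stochastic_mp` and the subset sizes are hypothetical.

```python
import numpy as np

def stochastic_mp(D, y, n_iter, n_atoms_sub, n_rows_sub, seed=0):
    """Matching Pursuit where each selection uses a random sub-dictionary."""
    rng = np.random.default_rng(seed)
    n_rows, n_atoms = D.shape
    coef = np.zeros(n_atoms)
    residual = y.astype(float).copy()
    for _ in range(n_iter):
        rows = rng.choice(n_rows, size=n_rows_sub, replace=False)
        atoms = rng.choice(n_atoms, size=n_atoms_sub, replace=False)
        # Cheap, approximate correlations computed on the sub-dictionary only.
        corr = D[np.ix_(rows, atoms)].T @ residual[rows]
        k = atoms[np.argmax(np.abs(corr))]
        # Assumption: the update itself uses the full (unit-norm) atom.
        c = D[:, k] @ residual
        coef[k] += c
        residual -= c * D[:, k]
    return coef, residual

rng = np.random.default_rng(1)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)            # normalise the atoms
y = D[:, [3, 50, 120]] @ np.array([1.0, -2.0, 0.5])
coef, residual = stochastic_mp(D, y, n_iter=200, n_atoms_sub=64, n_rows_sub=32)
print("relative residual norm:", np.linalg.norm(residual) / np.linalg.norm(y))
```

Each iteration costs a product with a 32 x 64 block instead of the full 64 x 256 dictionary, at the price of a possibly sub-optimal atom choice.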

A kernel for the classification of covariance matrices: applications to brain-computer interfaces

A. Barachant, GIPSA-Lab.

The covariance matrix is a central element of many signal processing algorithms. In brain-computer interfaces, covariance matrices of the EEG signal can be used as descriptors of brain activity; the goal is therefore to classify them in order to detect the type of command the user wishes to send to the interface. These covariance matrices are, by definition, symmetric and positive definite. They belong to a Riemannian manifold equipped with a metric that makes it possible to manipulate covariance matrices while respecting the topology of this manifold. From this metric, a kernel will be defined and applied to classification by means of an SVM, yielding results superior to the state of the art.
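One common way to derive a kernel from the Riemannian geometry of symmetric positive definite matrices is to map each matrix to the tangent space at a reference point and take the Euclidean inner product there. The sketch below follows that construction; the abstract does not give the exact definition used in the talk, so the formula, the choice of reference matrix and the names `tangent_kernel` and `C_ref` are assumptions.

```python
import numpy as np

def sym_apply(C, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(C)
    return (V * fun(w)) @ V.T

def inv_sqrt(w):
    return 1.0 / np.sqrt(w)

def tangent_kernel(A, B, C_ref):
    """Inner product of A and B after projection on the tangent space at C_ref."""
    W = sym_apply(C_ref, inv_sqrt)                 # C_ref^(-1/2)
    SA = sym_apply(W @ A @ W, np.log)              # log-map of A at C_ref
    SB = sym_apply(W @ B @ W, np.log)              # log-map of B at C_ref
    return np.trace(SA @ SB)

rng = np.random.default_rng(0)

def random_spd(n):
    X = rng.standard_normal((n, 4 * n))
    return X @ X.T / (4 * n)

A, B, C_ref = random_spd(4), random_spd(4), random_spd(4)
k_ab = tangent_kernel(A, B, C_ref)
k_ba = tangent_kernel(B, A, C_ref)
k_aa = tangent_kernel(A, A, C_ref)
k_ref = tangent_kernel(C_ref, A, C_ref)
print(k_ab, k_aa, k_ref)
```

Because the kernel is an inner product between tangent vectors, the resulting Gram matrix is positive semi-definite and can be passed to any SVM solver that accepts precomputed kernels.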

Image classification: BoW is beautiful

M. Cord, LIP6.

In this talk, I will focus on recent developments in image classification. The Bag-of-(Visual)-Words (BoVW) model is the most widely used approach to represent visual documents. BoVW relies on the quantization of local descriptors and their aggregation into a single feature vector. The underlying concepts, such as the visual codebook, coding and pooling, will be introduced. I will also explore the impact of the main parameters of the BoVW pipeline. Recently, unsupervised learning methods have emerged to jointly learn visual codebooks and codes. I will present approaches based on restricted Boltzmann machines (RBM) to achieve this joint optimization. To enhance feature coding, RBMs may be regularized with a sparsity constraint term. I will show experimental results of this code learning strategy embedded in the BoVW pipeline for image classification. I will also make a few proposals about spatial pooling (sum/max, spatial pyramid, etc.). Finally, some extensions concerning hierarchical models and deep learning approaches for image representation will be discussed.
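The coding and pooling steps of the BoVW pipeline can be sketched as follows, assuming hard assignment of each local descriptor to its nearest codeword and a codebook that is already given (in practice it would be learned, for example by k-means or, as in the talk, with an RBM). The function name `bovw` and the toy data are hypothetical.

```python
import numpy as np

def bovw(descriptors, codebook, pooling="sum"):
    """Hard-assignment coding followed by sum or max pooling."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Coding: one-hot vector marking the nearest codeword of each descriptor.
    codes = np.zeros_like(d2)
    codes[np.arange(len(descriptors)), d2.argmin(axis=1)] = 1.0
    # Pooling: aggregate the codes of one image into a single vector.
    if pooling == "sum":
        return codes.sum(axis=0)
    if pooling == "max":
        return codes.max(axis=0)
    raise ValueError("pooling must be 'sum' or 'max'")

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 9.0]])
hist_sum = bovw(descriptors, codebook, pooling="sum")
hist_max = bovw(descriptors, codebook, pooling="max")
print(hist_sum)  # [2. 1.]
print(hist_max)  # [1. 1.]
```

Sum pooling counts how often each visual word occurs, while max pooling only records whether it occurs at all.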

Learning Semantic Representations of Objects and Parts

G. Mesnil

Recently, large-scale image annotation datasets have been collected with millions of images and thousands of possible annotations. Latent variable models, or embedding methods, that simultaneously learn semantic representations of object labels and image representations can provide tractable solutions on such tasks. In this work, we are interested in jointly learning representations both for the objects in an image and for the parts of those objects, because such deeper semantic representations could bring a leap forward in image retrieval or browsing. Despite the size of these datasets, annotated data for objects and parts is costly to obtain and may not be available. In this paper, we propose to bypass this cost with a method able to learn to jointly label objects and parts without requiring exhaustively labeled data. We design a model architecture that can be trained under a proxy supervision obtained by combining standard image annotation (from ImageNet) with semantic part-based within-label relations (from WordNet). The model itself is designed to capture both object image to object label similarities and object label to object part label similarities in a single joint system. Experiments conducted on our combined data and a precisely annotated evaluation set demonstrate the usefulness of our method.

Parsimonious Mahalanobis Kernel for the Classification of High Dimensional Data

M. Fauvel, Université de Toulouse.

The classification of high-dimensional data with kernel methods is considered in this article. Exploiting the emptiness property of high-dimensional spaces, a kernel based on the Mahalanobis distance is proposed. The computation of the Mahalanobis distance requires the inversion of a covariance matrix. In high-dimensional spaces, the estimated covariance matrix is ill-conditioned and its inversion is unstable or impossible. Using a parsimonious statistical model, namely the High Dimensional Discriminant Analysis model, the specific signal and noise subspaces are estimated for each considered class, making the inverse of the class-specific covariance matrix explicit and stable and leading to the definition of a parsimonious Mahalanobis kernel. An SVM-based framework is used for selecting the hyperparameters of the parsimonious Mahalanobis kernel by optimizing the so-called radius-margin bound. Experimental results on three high-dimensional data sets show that the proposed kernel is suitable for classifying high-dimensional data, providing better classification accuracies than the conventional Gaussian kernel.
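The idea of an explicit, stable inverse can be illustrated with a simplified sketch for a single class. It assumes a covariance model with `p` leading eigen-directions (signal subspace) and one shared variance `b` for all remaining directions (noise subspace), and a single scale `gamma` in the kernel. The kernel in the talk has its own hyperparameters tuned through the radius-margin bound, which is not reproduced here; all names and values are assumptions.

```python
import numpy as np

def parsimonious_precision(X, p):
    """Explicit inverse covariance under a signal-plus-noise subspace model."""
    d = X.shape[1]
    S = np.cov(X, rowvar=False)
    w, V = np.linalg.eigh(S)
    idx = np.argsort(w)[::-1][:p]          # indices of the p largest eigenvalues
    a, Q = w[idx], V[:, idx]               # signal variances and directions
    b = (np.trace(S) - a.sum()) / (d - p)  # shared noise variance
    # Inverse: 1/a_i on the signal subspace, 1/b on its orthogonal complement.
    return (Q / a) @ Q.T + (np.eye(d) - Q @ Q.T) / b

def mahalanobis_kernel(x, z, precision, gamma=1.0):
    diff = x - z
    return np.exp(-0.5 * gamma * diff @ precision @ diff)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))   # fewer samples than dimensions
precision = parsimonious_precision(X, p=3)
k_xx = mahalanobis_kernel(X[0], X[0], precision, gamma=0.01)
k_xz = mahalanobis_kernel(X[0], X[1], precision, gamma=0.01)
print(k_xx, k_xz)
```

With 20 samples in dimension 50 the sample covariance is singular, yet the model-based precision matrix is well defined because no small eigenvalue is ever inverted individually.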

Human-machine interaction for the processing of remote sensing images

D. Tuia, EPFL Lausanne.

Remote sensing is nowadays widely used for a variety of survey-based tasks such as the monitoring of agriculture or the planning of cities. Satellite or airborne sensors can provide updated, non-intrusive and large-scale information about the processes occurring at the surface of the Earth. Appealing as it is, the use of remote sensing data requires a modelling step to make pixel information intelligible. Among the different approaches, supervised methods are the most used, but they need labelled information to be effective, and the collection of those samples can be costly and time-consuming. In contrast to common space-filling techniques or (stratified) random sampling strategies, I will present sampling strategies based on active learning, a learning framework in which the model interacts with the user: the former highlights spectral configurations that are uncertain and the latter provides the corresponding labels, thus allowing faster convergence to models that generalize better. In short: sample less, but better. After presenting the general concept of active learning, I will focus on two issues: the transfer of models across different hyperspectral images and the integration of fallible users in the active learning loop.
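The query step of such an active learning loop can be sketched with a margin-based uncertainty criterion: the pixels whose two most probable classes are closest are sent to the user for labelling. This is one common heuristic and not necessarily the one used in the talk; the function name `margin_sampling` and the probabilities below are hypothetical.

```python
import numpy as np

def margin_sampling(proba, n_queries):
    """Return indices of the samples with the smallest top-two class margin."""
    ordered = np.sort(proba, axis=1)
    margin = ordered[:, -1] - ordered[:, -2]   # small margin = uncertain sample
    return np.argsort(margin)[:n_queries]

# Class probabilities predicted by the current model for three unlabelled pixels.
proba = np.array([[0.90, 0.10],
                  [0.55, 0.45],
                  [0.60, 0.40]])
queries = margin_sampling(proba, n_queries=2)
print(queries)  # [1 2]
```

In a full loop, the user labels the queried pixels, they are added to the training set, the model is retrained, and the selection is repeated.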

Selecting from an infinite set of features

R. Flamary, Laboratoire Lagrange, Université de Nice.

We propose a principled framework for learning with infinitely many features, a situation that is usually induced by continuously parametrized feature extraction methods. Such cases occur for instance when considering Gabor-based features in computer vision problems or when dealing with Fourier features for kernel approximations. We cast the problem as that of finding a finite subset of features that minimizes a regularized empirical risk. After analyzing the optimality conditions of this problem, we propose a simple algorithm which has the flavour of a column-generation technique. We also show that, using Fourier-based features, it is possible to perform approximate infinite kernel learning. Our experimental results on several datasets show the benefits of the proposed approach in several situations, including texture classification and large-scale kernelized problems (involving about 100 thousand examples).
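A column-generation scheme of the kind described can be sketched as follows, under assumptions that are not in the abstract: a squared loss with an l1 penalty, cosine features with random frequency and phase, random sampling of candidate features at each round, and a plain ISTA solver on the active set. All function names and settings are hypothetical.

```python
import numpy as np

def feature(X, w, b):
    """One feature from a continuously parametrized family (Fourier-type)."""
    return np.cos(X @ w + b)

def lasso_ista(Phi, y, lam, n_steps=500):
    """Minimize 0.5 * mean((Phi @ beta - y)**2) + lam * ||beta||_1 by ISTA."""
    n = len(y)
    L = np.linalg.norm(Phi, 2) ** 2 / n      # Lipschitz constant of the gradient
    beta = np.zeros(Phi.shape[1])
    for _ in range(n_steps):
        z = beta - Phi.T @ (Phi @ beta - y) / (n * L)
        beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return beta

def column_generation(X, y, lam, n_rounds=20, n_candidates=50, seed=0):
    rng = np.random.default_rng(seed)
    params, Phi, beta = [], np.empty((len(y), 0)), np.zeros(0)
    for _ in range(n_rounds):
        residual = y - Phi @ beta
        cands = [(rng.standard_normal(X.shape[1]), rng.uniform(0, 2 * np.pi))
                 for _ in range(n_candidates)]
        cols = np.column_stack([feature(X, w, b) for w, b in cands])
        score = np.abs(cols.T @ residual) / len(y)
        j = int(np.argmax(score))
        if score[j] <= lam:      # no sampled feature violates optimality
            break
        params.append(cands[j])                      # add the worst violator
        Phi = np.column_stack([Phi, cols[:, j]])
        beta = lasso_ista(Phi, y, lam)               # re-solve on the active set
    return params, beta

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(100, 2))
y = np.sin(X[:, 0]) + 0.5 * np.cos(2 * X[:, 1])
params, beta = column_generation(X, y, lam=0.01)
Phi = np.column_stack([feature(X, w, b) for w, b in params])
mse = np.mean((y - Phi @ beta) ** 2)
print(len(params), "features selected, training MSE:", mse)
```

A feature whose correlation with the residual exceeds `lam` violates the optimality conditions of the l1-regularized problem, which is why it is the one added to the active set.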
