Echantillonnage Monte Carlo et Apprentissage Statistique

Nous vous rappelons que, afin de garantir l'accès de tous les inscrits aux salles de réunion, l'inscription aux réunions est gratuite mais obligatoire.

Inscriptions closes à cette réunion.

Inscriptions

67 personnes membres du GdR ISIS, et 52 personnes non membres du GdR, sont inscrits à cette réunion.
Capacité de la salle : 144 personnes.

Annonce

Les développements récents en traitement du signal et des images conduisent à des problèmes d'estimation de plus en plus difficiles, de part leur forme complexe, le grand nombre de variables à estimer, et la dimension croissante des données mises en jeu.

L'objectif de cette journée est de présenter de nouvelles méthodologies Monte Carlo, et leurs interactions avec l'apprentissage statistique notamment via la modélisation Bayésienne et l'optimisation stochastique.

Nous aurons deux exposés tutoriels par Francis Bach (DIENS, INRIA) et Nicolas Chopin (CREST, ENSAE) relatifs, respectivement à l'optimisation stochastique pour l'apprentissage, et à des méthodes de Monte Carlo accélérées. La journée contiendra aussi des exposés invités et des exposés contribués.

La journée se déroulera à Telecom ParisTech (46 rue Barrault, Paris 13ème), Amphi B310

Elle commencera à 10h, pour se terminer avant 17h00.

Orateurs confirmés :

Francis Bach (DIENS, INRIA): ``Stochastic gradient descent methods for machine learning''
Julien Bect (L2S, CentraleSupelec): ``On the use of SMC methods in the sequential design of computer experiments''
Nicolas Chopin (CREST, ENSAE): ``Sequential Monte Carlo: some recent developments''
Michal Valko (Sequel, INRIA): ``Pliable rejection sampling''
Herwig Wendt (IRIT, CNRS): ``A Bayesian estimator for the multifractal analysis of multivariate data''

Appel à contribution :

Les personnes souhaitant présenter leurs travaux sont invitées à faire part de leur intention aux deux organisateurs avant le 15 janvier.

Organisateurs :

Gersende Fort (IMT, CNRS) : gersende.fort@math.univ-toulouse.fr
François Septier (IMT Lille Douai - CRIStAL) : francois.septier@imt-lille-douai.fr

Programme

10h - 10h50 : Francis Bach (DIENS, INRIA)
"Stochastic gradient descent methods for machine learning"

10h50 - 11h20 : Herwig Wendt (IRIT, CNRS)
"A Bayesian estimator for the multifractal analysis of multivariate data"

11h20 - 11h30 (pause)

11h30 - 11h50 : Simon Barthelmé (GipsaLab, CNRS)
"Data sub-sampling with Determinantal Point Processes"

11h50 - 12h20 : Julien Bect (L2S, CentraleSupelec)
"On the use of SMC methods in the sequential design of computer experiments"

---- PAUSE REPAS ----

14h - 14h50 : Nicolas Chopin (CREST, ENSAE)
"Sequential Monte Carlo: some recent developments"

14h50 - 15h10 : Manon Michel (CMAP, Ecole Polytechnique)
"Irreversible Markov chain Monte Carlo methods"

15h10 - 15h30 : Jean-François Giovannelli (IMS, Univ. de Bordeaux)
"Segmentation-déconvolution d'images texturées : approche bayésienne hiérarchique et échantillonnage stochastique"

15h30 - 15h50 (pause)

15h50 - 16h20 : Michal Valko (Sequel, INRIA)
"Pliable rejection sampling"

16h20 -16h40 : Adil Salim (LTCI, Telecom ParisTech)
"Snake: a Stochastic Proximal Gradient Algorithm for Regularized Problems over Large Graphs"

16h40 - 17h : Umut Simsekli (LTCI, Telecom ParisTech)
"Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for MCMC"

Résumés des contributions

Stochastic gradient descent methods for machine learning

Francis Bach (DIENS, INRIA)

In this talk, I will review the main recent advances in stochastic approximation algorithms for large-scale machine learning, with an emphasis on constant-step size stochastic gradient and variance-reduced techniques.

A Bayesian estimator for the multifractal analysis of multivariate data

Herwig Wendt (IRIT, CNRS)

Multifractal analysis allows to assess the fluctuations of the pointwise regularity of signals and images and has been successfully used in a large panel of applications of very different natures. Yet, despite past successes, the accurate estimation of multifractal parameters still remains a challenging task and requires the availability of large sample sizes. This work studies a Bayesian framework that makes use of the jointly recorded data components available in multivariate data to improve estimation of multifractal parameters. It relies on a statistical model for the logarithm of wavelet leaders that comprises a gamma Markov random field joint prior which acts as a regularizer and counterbalances the statistical variability induced by small sample size. The model is specifically designed to lead to an efficient sampler for the estimation of multifractal parameters and yields excellent results on data with potentially many components.

Data sub-sampling with Determinantal Point Processes

Simon Barthelmé (GipsaLab, CNRS)
work with Pierre Olivier Amblard, Nicolas Tremblay

Determinantal Point Processes (DPPs) are a class of point processes that exhibit "repulsion". This property can be leveraged to obtain high-diversity subsets, meaning that DPPs can be used to sub-sample various objects (surfaces, datasets, graphs, etc.) with relatively high fidelity.

This idea has been suggested by several authors and holds tremendous theoretical appeal. However, many difficulties crop up in the implementation, and our goal has been to lift some of them.

One aspect is that DPPs come in two variants: fixed sample size (so-called k-DPPs) and varying sample size. DPPs with varying sample sizes are more tractable, since their inclusion probabilities admit a closed form. k-DPPs make more sense in many applications, but are less tractable, since inclusion probabilities are much harder to compute. We show that k-DPPs and DPPs are asymptotically equivalent in the limit of large databases, which leads to tractable formulas for inclusion probabilities.

On the use of SMC methods in the sequential design of computer experiments

Julien Bect (L2S, CentraleSupelec)

Sequential design methods have received a lot of attention in the computer experiments community, to address designs problems where the task is to estimate some "local" features of an expensive-to-evaluate computer model. Typical examples of such features include : maximizers and/or maxima of the function (optimization), failure probabilities or quantiles (reliability analysis), and so on and so forth. A popular approach to construct such sequential designs is to adopt a Bayesian point of view : the computer model is endowed with a prior distribution (typically, a Gaussian process prior) and a "sampling criterion" (aka aquisition function, or infill criterion) is defined, using the posterior distribution, to quantify the expected benefit of a new evaluation at a given location. This talk will provide an introduction to these methods, and then discuss how Sequential Monte Carlo (SMC) methods can be used to implement them efficiently. The Bayesian subset simulation (BSS) algorithm will be presented as an example.

Main refs (available here) :

[1] Julien Bect, Ling Li and Emmanuel Vazquez (2017). Bayesian subset simulation. SIAM Journal on Uncertainty Quantification, 5:1, 762-786.

[2] Paul Feliot, Julien Bect and Emmanuel Vazquez (2017). A Bayesian approach to constrained single- and multi-objective optimization. Journal of Global Optimization, 67:1, 97-133.

Sequential Monte Carlo: some recent developments

Nicolas Chopin (CREST, ENSAE)

SMC (Sequential Monte Carlo) is a class of Monte Carlo algorithms for filtering and related sequential problems. This talk will cover some recent research on how to connect SMC with quasi-Monte Carlo sampling, and how to study formally certain advanced resampling schemes.

Irreversible Markov chain Monte Carlo methods

Manon Michel (CMAP, Ecole Polytechnique)
work with A. Durmus, S. Sénécal, Y. Deng and X. Tan.

Recently developed in Physics, irreversible MCMC schemes are under a growing attention from the Bayesian statistics community, as they embody a serious candidate for scalable MCMC methods. These methods limits the random-walk behavior of the underlying Markov chain by breaking free from the reversibility condition. Based on piecewise deterministic Markov processes, these methods are rejection-free, provide a continuum of valid samples and do not require critical parameter fine-tuning. Accelerations up to several orders of magnitude have been exhibited for different target distributions.

In this talk, I will provide a broad overview of these methods, starting from the first developments in Physics to the recent applications in Bayesian statistics. I will then discuss recent progress about how the random-walk behavior can be furtherly reduced by the new scheme Forward event-chain Monte Carlo and how the factorization of the transition probabilities is the key to the reduction of the computational complexity of these methods, down to constant-time complexity.

Segmentation-déconvolution d'images texturées: approche bayésienne hiérarchique et échantillonnage stochastique

Jean-François Giovannelli (IMS, Univ. de Bordeaux)
work with C. Vacar

Le travail concerne la question de la déconvolution-segmentation jointe pour des images texturées. Les images sont constituées de régions présentant des patchs de textures, en particulier orientées, appartenant à un ensemble de K classes données. Chaque classe est décrite par un champ gaussien piloté par une densité spectrale paramétrique dont les paramètres sont inconnus. Par ailleurs, les étiquettes de classes sont décrites par un champ de Potts dont le paramètre est également inconnu. La méthode repose sur un modèle hiérarchique et une stratégie bayésienne pour estimer conjointement les étiquettes, les K images texturées, ainsi que les hyperparamètres: les niveaux du bruit et des images ainsi que les paramètres de texture et le paramètre du champ de Potts notamment. La stratégie permet de définir des estimateurs optimaux au sens d'un risque joint: maximiseur marginal a posteriori et moyenne a posteriori selon les paramètres. Ils sont calculés numériquement à partir d'échantillons sous loi a posteriori, eux-mêmes simulés par un algorithme de Gibbs par bloc. Deux des étapes sont délicates: (1) le tirage des images texturées, gaussiennes de grande dimension, est réalisé par un algorithme de Perturbation-Optimization [a] et (2) le tirage des paramètres des images texturées obtenu grâce à une étape de Fisher Metropolis-Hastings [b]. On donnera plusieurs illustrations numériques. Le travail est en cours de publication [c].

[a] F. Orieux, O. Féron and J.-F. Giovannelli, "Sampling high-dimensional Gaussian distributions for general linear inverse problems", Signal Processing Letters, May 2012.
[b] C. Vacar, J.-F. Giovannelli, Y. Berthoumieu, "Langevin and Hessian with Fisher approximation stochastic sampling for parameter estimation of structured covariance" ICASSP 2011.
[b'] M. Girolami, B. Calderhead, "Riemannian manifold Hamiltonian Monte Carlo", Journal of the Royal Statistical Society, 2011.
[c] C. Vacar, J.-F. Giovannelli, "Unsupervised joint deconvolution and segmentation method for textured images: A Bayesian approach and an advanced sampling algorithm", EURASIP Journal on Advances in Signal Processing, special issue on 'Advanced Computational Methods for Bayesian Signal Processing'.

Pliable rejection sampling

Michal Valko (Sequel, INRIA)

Rejection sampling is a known technique for sampling from difficult distributions. However, its use is limited due to a high rejection rate. Common adaptive rejection sampling methods either work for very specific distributions or without performance guarantees. In this talk, we present pliable rejection sampling (PRS), a new approach to rejection sampling, where we adapt the sampling envelope using a kernel estimator. Since our method builds on rejection sampling, the samples obtained are i.i.d. and w.h.p. exactly distributed according to f. Another benefit of PRS is that it comes with a guarantee on the number of accepted samples. We will also discuss extensions that followed these results.

Related paper: Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric-Ambrym Maillard: Pliable rejection sampling, in International Conference on Machine Learning (ICML 2016)

Snake: a Stochastic Proximal Gradient Algorithm for Regularized Problems over Large Graphs

Adil Salim (LTCI, Telecom ParisTech)
work with Pascal Bianchi (Télécom ParisTech) and Walid Hachem (UPEM).

A regularized optimization problem over a large unstructured graph is studied, where the regularization term is tied to the graph geometry. Typical regularization examples include the total variation and the Laplacian regularizations over the graph. When applying the proximal gradient algorithm to solve this problem, there exist quite affordable methods to implement the proximity operator (backward step) in the special case where the graph is a simple path without loops. In this paper, an algorithm, referred to as "Snake", is proposed to solve such regularized problems over general graphs, by taking benefit of these fast methods. The algorithm consists in properly selecting random simple paths in the graph and performing the proximal gradient algorithm over these simple paths. This algorithm is an instance of a new general stochastic proximal gradient algorithm, whose convergence is proven.

Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for MCMC

Umut Simsekli (LTCI, Telecom ParisTech)

Along with the recent advances in scalable Markov Chain Monte Carlo methods, sampling techniques that are based on Langevin diffusions have started receiving increasing attention. These so called Langevin Monte Carlo (LMC) methods are based on diffusions driven by a Brownian motion, which gives rise to Gaussian proposal distributions in the resulting algorithms. Even though these approaches have proven successful in many applications, their performance can be limited by the light-tailed nature of the Gaussian proposals. In this talk, I will present an extension to the classical LMC and develop a novel Fractional LMC (FLMC) framework that is based on a family of heavy-tailed distributions, called alpha-stable Lévy distributions. As opposed to classical approaches, the proposed approach can possess large jumps while targeting the correct distribution, which would be beneficial for efficient exploration of the state space. I will also present novel computational methods that can scale up to large-scale problems and provide formal convergence analysis of the proposed scheme. Our experiments support our theory: FLMC can provide superior performance in multi-modal settings, improved convergence rates, and robustness to algorithm parameters.
Lien arXiv

Identification