Optimisation en milieu aléatoire

Nous vous rappelons que, afin de garantir l'accès de tous les inscrits aux salles de réunion, l'inscription aux réunions est gratuite mais obligatoire.

Inscriptions closes à cette réunion.

Inscriptions

27 personnes membres du GdR ISIS, et 11 personnes non membres du GdR, sont inscrits à cette réunion.
Capacité de la salle : 130 personnes.

Annonce

En traitement du signal et en apprentissage statistique, les algorithmes d'optimisation de nature stochastique sont omniprésents. En premier lieu, la grande dimension de l'espace des paramètres, les volumes de données ou la complexité des fonctionnelles à optimiser, nécessitent l'emploi de techniques d'approximation stochastique, méthodes de Monte-Carlo ou méthodes de descente par coordonnées, en renfort des algorithmes d'optimisation usuels. En deuxième lieu, le traitement en ligne de flux de données nécessite l'emploi d'algorithmes adaptatifs capables de poursuivre un environnement fluctuant. Enfin, dans divers contextes industriels, la recherche de variables de contrôle pertinentes met en jeu des problèmes d'optimisation dont la nature stochastique intervient dans l'expression de la fonctionnelle et dans les contraintes, elles-mêmes probabilistes.

Cette journée vise à faire le point sur certaines avancées récentes couplant les algorithmes d'optimisation numérique aux méthodes stochastiques, dans le périmètre suivant :

Algorithmes d'approximation stochastique pour l'optimisation
Méthodes de Monte-Carlo et optimisation
Méthodes adaptatives pour le traitement du signal et données de streaming
Optimisation stochastique.

Programme

9h50 - 10h00 : Introduction

10h00 - 10h50 : René Henrion (WIAAS, Berlin)
Initiation aux problèmes d'optimisation sous contraintes en probabilité

10h50 - 11h30 : Rémi Bardenet (CNRS & CRIStAL, Université de Lille)
Monte Carlo with determinantal point processes

11h30 - 11h45 : Pause

11h45 - 12h35 : Walid Hachem (CNRS LTCI / Télécom ParisTech)
Opérateurs maximaux monotones et approximation stochastique

12h35 - 14h00 : Repas

14h00 - 14h40 : Alain Durmus (Télécom ParisTech)
Echantillonnage de loi en grandes dimensions avec l'algorithme de Langevin non-ajusté

14h40 - 15h20 : Richard Combes (CentraleSupélec)
Non-Smooth Unimodal Stochastic Optimization

15h20 - 15h40 : Pause

15h40 - 16h30 : Julien Mairal (INRIA Grenoble)
A Generic Quasi-Newton Algorithm for Faster Gradient-Based Optimization

16h30 - 17h10 : Apostolos Destounis (Huawei)
Applications of Stochastic Control to Software Defined Networking

Résumés des contributions

Initiation aux problèmes d'optimisation sous contraintes en probabilité

René Henrion (WIAAS, Berlin)

Les contraintes en probabilité représentent un modèle de base de l'optimisation stochastique. Elles permettent de trouver des décisions robustes par rapport à l'action des aléas sans causer des coûts excessifs. Le défi principal de ce type de contraintes repose sur le fait qu'elles ne sont pas données par une formule explicite. Seul leur approximation numérique est possible en général. Cette difficulté nécessite des considérations particulières dans l'analyse de la structure (e.g. continuité, différentiabilité, convexité), de la numérique (calcul de gradients) où de la stabilité des solutions (par rapport aux perturbations de la loi nominale de l'aléa). L'exposé tourne autour plusieurs de ces aspects et présente des résultats numériques pour certaines applications.

Monte Carlo with determinantal point processes

Rémi Bardenet (CNRS, Université de Lille), joint work with Adrien Hardy (Univ. Lille)

In this talk, we show that using repulsive random variables, it is possible to build Monte Carlo methods that converge faster than vanilla Monte Carlo. More precisely, we build estimators of integrals, the variance of which decreases as $N^{-1-1/d}$, where $N$ is the number of integrand evaluations, and $d$ is the ambient dimension. To do so, we propose stochastic numerical quadratures involving determinantal point processes (DPPs) associated to multivariate orthogonal polynomials. The proposed method can be seen as a stochastic version of Gauss' quadrature, where samples from a determinantal point process replace zeros of orthogonal polynomials. Formally, it is also strongly connected to model-based optimization using Gaussian processes, to which DPPs can bring new analysis tools.

This talk is based on the following preprint https://arxiv.org/abs/1605.00361.

Opérateurs maximaux monotones et approximation stochastique

Walid Hachem (CNRS LTCI / Télécom ParisTech)

Nous étudions le comportement de processus itératifs engendrés par opérateurs monotones aléatoires. Pour ceci, nous faisons appel à des outils d'approximation stochastique, en nous inspirant de la méthode bien connue de l'équation différentielle ordinaire (ODE). Nous mettons l'accent sur une version de l'algorithme forward-backward qui met en jeu deux opérateurs maximaux monotones aléatoires. Les cas du pas décroissant et du pas constant sont considérés. Dans le premier cas, nous montrons que le processus interpolé construit à partir des itérées est une pseudo trajectoire asymptotique de l'inclusion différentielle associée à la somme des moyennes (intégrales d'Aumann) des deux opérateurs maximaux monotones aléatoires. Dans le second cas, un résultat analogue est établi à l'aide d'outils de convergence faible de processus.

Echantillonnage de loi en grandes dimensions avec l'algorithme de Langevin non-ajusté

Alain Durmus (Télécom ParisTech)

Récemment, le problème d'échantillonner des lois en grandes dimensions avec des garanties théoriques raisonnables a suscité un grand intérêt. En effet, les applications dans lesquelles ce problème se pose sont nombreuses en statistiques, par exemple l'inférence en grandes dimensions pour le machine learning et la résolution de problèmes inverses. Lorsque la densité possède un potentiel qui est gradient Lipschitz, nous analyserons une méthodes MCMC sans réjection basée sur la discrétisation de l?équation de diffusion de Langevin. Nous présenterons plusieurs résultats de convergence explicites pour des pas de discrétisations qui peuvent être constants ou tendre vers 0. En particulier, suivant les hypothèses satisfaites par le potentiel de la densité cible, la dépendance de la convergence de l'algorithme en la dimension sera présentée. Dans le cas où la densité est fortement log-concave, nous proposerons de plus une analyse d'une mesure empirique convenablement pondérée, pour lequel nous fournirons des bornes sur le MSE et des déviations exponentielles. Enfin, basé sur des techniques d'optimisations, nous proposerons de nouveaux algorithmes pour échantillonner suivant une loi en grandes dimensions. Nous nous intéresserons en particulier à l'échantillonnage de densités qui ne sont pas continûment différentiables, ce qui est typiquement le cas pour des lois a priori induisant de la sparsité. Des simulations numériques illustreront nos résultats.

Non-Smooth Unimodal Stochastic Optimization

Richard Combes (CentraleSupélec)

Abstract: We consider stochastic bandit problems with a continuous set of arms and where the expected reward is a continuous and unimodal function of the arm. No further assumption is made regarding the smoothness and the structure of the expected reward function. For these problems, we propose the Stochastic Pentachotomy (SP) algorithm, and derive finite-time upper bounds on its regret and optimization error. In particular, we show that, for any expected reward function \mu that behaves as \mu(x) = \mu(x^*) - C |x - x^*|^\xi locally around its maximizer x^* for some \xi, C > 0, the SP algorithm is order-optimal. Namely its regret and optimization error scale as O( \sqrt{T \log(T)}) and O(\sqrt{\log(T) / T}) , respectively, when the time horizon T grows large. These scalings are achieved without the knowledge of \xi or C. Our algorithm is based on asymptotically optimal sequential statistical tests used to successively trim an interval that contains the best arm with high probability. To our knowledge, the SP algorithm constitutes the first sequential arm selection rule that achieves the optimal regret and optimization error scaling as O(\sqrt{T}) and O(\sqrt{1/T}), respectively, up to a logarithmic factor for non-smooth functions, as well as for smooth functions with unknown smoothness. More information at: http://arxiv.org/abs/1406.7447.

Joint work with Alexandre Proutiere (KTH, Sweden).

A Generic Quasi-Newton Algorithm for Faster Gradient-Based Optimization

Julien Mairal (INRIA Grenoble)

Abstract: We propose a technique to accelerate gradient-based optimization algorithms by giving them the ability to exploit L-BFGS heuristics. Our scheme is (i) generic and can be applied to a large class of first-order algorithms; (ii) it is compatible with composite objectives, meaning that it may provide exactly sparse solutions when a sparsity-inducing regularization is involved; (iii) it admits a linear convergence rate for strongly-convex problems; (iv) it is easy to use and it does not require any line search. Our work is inspired in part by the Catalyst meta-algorithm, which accelerates gradient-based techniques in the sense of Nesterov; here, we adopt a different strategy based on L-BFGS rules to learn and exploit the local curvature. In most practical cases, we observe significant improvements over Catalyst for solving large-scale high-dimensional machine learning problems. This is a joint work with Hongzhou Lin and Zaid Harchaoui.

Applications of Stochastic Control to Software Defined Networking

Apostolos Destounis (Huawei)

In this talk we discuss applications on Lyapunov drift ? based stochastic network optimization in two fundamental problems arising in Software Defined Networking with dynamic arrivals and departures of flows, namely: (i) Deciding when to reconfigure the routing of flows in the network according to the state of the solver of the min cost multicommodity flow problem and (ii) cost-minimizing fast dynamic routing of incoming demands. For the first problem, we note that since iterative methods are employed to solve the optimization problems, arrivals and departures of demands change the optimization instances and require further iterations. Applying the current iteration improves the optimality gap but results in frequent flow reconfigurations, which impacts QoS and stability. The problem then is to decide when to apply the current iteration or not in a way that a low cost is attained with infrequent reconfigurations. To limit the latter, we propose two policies that try to minimize the time average flow cost while respecting a reconfiguration budget. Experiments in practical SDN settings show that our policies provide a practical means to keep the optimality gap small while respecting a reconfiguration constraint. For the second problem, we propose the notion of a pathbook, which is a small subset of paths per commodity that is good for low cost routing under different traffic matrices. Once we find the pathbook, based on a derivative-free heuristic and past observations of traffic matrices, we use stochastic control theory and an adaptation of a randomized algorithm inspired by Glauber dynamics and used in VM scheduling to allocate demands to paths so that the cost is minimized subject to eventually routing all incoming flows. We showcase the performance of the proposed schemes on datasets from the GEANT network.

Organisateurs

P. Bianchi, Télécom ParisTech (pascal.bianchi@telecom-paristech.fr)
S. Moussaoui, IRCCyN (said.moussaoui@irccyn.ec-nantes.fr)
N. Pustelnik, CNRS/Ens de Lyon (nelly.pustelnik@ens-lyon.fr)
C. Richard, Université de Nice Sophia-Antipolis (cedric.richard@unice.fr)
F. Septier, Télécom Lille (francois.septier@telecom-lille.fr)

Identification