Tirer parti de la non-uniformité dans l'optimisation non convexe du premier ordre

Résumé :

Classical global convergence results for first-order methods rely on uniform smoothness and the Łojasiewicz inequality. Motivated by properties of objective functions that arise in machine learning, we propose a non-uniform refinement of these notions, leading to \emph{Non-uniform Smoothness} (NS) and \emph{Non-uniform Łojasiewicz inequality} (NŁ). The new definitions inspire new geometry-aware first-order methods that are able to converge to global optimality faster than the classical Ω(1/t2) lower bounds. To illustrate the power of these geometry-aware methods and their corresponding non-uniform analysis, we consider two important problems in machine learning: policy gradient optimization in reinforcement learning (PG), and generalized linear model training in supervised learning (GLM). For PG, we find that normalizing the gradient ascent method can accelerate convergence to O(e−t) while incurring less overhead than existing algorithms. For GLM, we show that geometry-aware normalized gradient descent can also achieve a linear convergence rate, which significantly improves the best known results. We additionally show that the proposed geometry-aware descent methods escape landscape plateaus faster than standard gradient descent. Experimental results are used to illustrate and complement the theoretical findings.

Tirer parti de la non-uniformité dans l'optimisation non convexe du premier ordre

Résumé :

Derniers documents de recherche

UCTransNet : Repenser les connexions de saut dans U-Net d'une perspective de canal avec Transformer

Habitat-Matterport 3D Dataset (HM3D) : 1000 environnements 3D à grande échelle pour l'IA incarnée

Roominoes : Génération de nouveaux plans d'étage 3D à partir de pièces 3D existantes

Laissez-nous vous aider

Connectez-vous avec la communauté

Explorer la formation et l'enseignement supérieur

Exploiter le potentiel de l'intelligence artificielle

Connectez-vous avec la communauté

Explorer la formation et l'enseignement supérieur

Exploiter le potentiel de l'intelligence artificielle