
Bridging the gap between stochastic gradient MCMC and stochastic optimization

  • Changyou Chen
  • David Carlson
  • Zhe Gan
  • Chunyuan Li
  • Lawrence Carin

Research output: Contribution to conference › Paper › peer-review

57 Scopus citations

Abstract

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs to popular stochastic optimization methods; however, this connection is not well studied. We explore this relationship by applying simulated annealing to an SG-MCMC algorithm. Furthermore, we extend recent SG-MCMC methods with two key components: i) adaptive preconditioners (as in ADAgrad or RMSprop), and ii) adaptive element-wise momentum weights. The zero-temperature limit gives a novel stochastic optimization method with adaptive element-wise momentum weights, while conventional optimization methods only have a shared, static momentum weight. Under certain assumptions, our theoretical analysis suggests the proposed simulated annealing approach converges close to the global optima. Experiments on several deep neural network models show state-of-the-art results compared to related stochastic optimization algorithms.
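
The link between sampling and optimization described in the abstract can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: it is a simplified Langevin-style update with an RMSprop-like preconditioner and a temperature that is annealed toward zero, and it leaves out the adaptive element-wise momentum and any correction terms. The function names, the 1/t temperature schedule, the toy quadratic objective, and all hyperparameter values are assumptions chosen only for illustration.

```python
import math
import random


def annealed_preconditioned_langevin(grad_fn, theta0, steps=3000, step_size=0.01,
                                     decay=0.99, eps=1e-5, seed=0):
    """Toy annealed Langevin update with an RMSprop-style preconditioner.

    Assumption: a simplified illustration, not the method from the paper.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    v = [0.0] * len(theta)  # running average of squared gradients
    for t in range(1, steps + 1):
        # Hypothetical annealing schedule: temperature decays as 1/t.
        temperature = 1.0 / t
        g = grad_fn(theta, rng)
        for i in range(len(theta)):
            v[i] = decay * v[i] + (1.0 - decay) * g[i] ** 2
            precond = 1.0 / (math.sqrt(v[i]) + eps)
            # Injected Gaussian noise scaled by the temperature; as the
            # temperature goes to zero this term vanishes and the update
            # reduces to a plain RMSprop-like optimization step.
            noise_scale = math.sqrt(2.0 * step_size * precond * temperature)
            theta[i] += -step_size * precond * g[i] + noise_scale * rng.gauss(0.0, 1.0)
    return theta


def noisy_quadratic_grad(theta, rng):
    """Stochastic gradient of 0.5 * ||theta - target||^2 (toy objective)."""
    target = [1.0, -2.0]
    return [(x - c) + 0.1 * rng.gauss(0.0, 1.0) for x, c in zip(theta, target)]


result = annealed_preconditioned_langevin(noisy_quadratic_grad, [5.0, 5.0])
print(result)
```

Early iterations behave like a sampler because the injected noise is large; late iterations behave like an optimizer because the noise has been annealed away. On this toy objective the iterate should end near the minimizer (1, -2), though the exact values depend on the seed.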

Original language: English
Pages: 1051-1060
Number of pages: 10
State: Published - 2016
Event: 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016 - Cadiz, Spain
Duration: May 9 2016 → May 11 2016

Conference

Conference: 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016
Country/Territory: Spain
City: Cadiz
Period: 05/9/16 → 05/11/16
