
Data Sampling Affects the Complexity of Online SGD over Dependent Data

Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

3 Scopus citations

Abstract

Conventional machine learning applications typically assume that data samples are independently and identically distributed (i.i.d.). However, practical scenarios often involve a data-generating process that produces highly dependent data samples, which are known to heavily bias the stochastic optimization process and slow down the convergence of learning. In this paper, we conduct a fundamental study on how different stochastic data sampling schemes affect the sample complexity of online stochastic gradient descent (SGD) over highly dependent data. Specifically, with a φ-mixing process of data, we show that online SGD with proper periodic data-subsampling achieves an improved sample complexity over the standard online SGD in the full spectrum of the data dependence level. Interestingly, even subsampling a subset of data samples can accelerate the convergence of online SGD over highly dependent data. Moreover, we show that online SGD with mini-batch sampling can further substantially improve the sample complexity over online SGD with periodic data-subsampling over highly dependent data. Numerical experiments validate our theoretical results.
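The three sampling schemes named in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm or experimental setup: the AR(1) data source, the quadratic loss, and every function name and parameter (`ar1_stream`, `period`, `batch_size`, `lr`) are assumptions chosen only to show how standard online SGD, periodic subsampling, and mini-batch sampling consume a dependent data stream differently.

```python
import random


def ar1_stream(n, rho=0.9, mu=1.0, sigma=1.0, seed=0):
    """Yield n samples from an AR(1) chain with mean mu.

    Hypothetical stand-in for a dependent data source; rho close to 1
    means consecutive samples are strongly correlated.
    """
    rng = random.Random(seed)
    x = mu
    for _ in range(n):
        x = mu + rho * (x - mu) + sigma * rng.gauss(0.0, 1.0)
        yield x


def grad(theta, x):
    """Gradient of the toy loss 0.5 * (theta - x)**2 with respect to theta."""
    return theta - x


def online_sgd(stream, lr=0.05, theta0=0.0):
    """Standard online SGD: one update per incoming sample."""
    theta = theta0
    for x in stream:
        theta -= lr * grad(theta, x)
    return theta


def subsampled_sgd(stream, period, lr=0.05, theta0=0.0):
    """Periodic subsampling: update on every `period`-th sample, skip the rest."""
    theta = theta0
    for t, x in enumerate(stream):
        if (t + 1) % period == 0:
            theta -= lr * grad(theta, x)
    return theta


def minibatch_sgd(stream, batch_size, lr=0.05, theta0=0.0):
    """Mini-batch sampling: average the gradient over consecutive samples."""
    theta = theta0
    batch = []
    for x in stream:
        batch.append(x)
        if len(batch) == batch_size:
            g = sum(grad(theta, xi) for xi in batch) / batch_size
            theta -= lr * g
            batch.clear()
    return theta


if __name__ == "__main__":
    n = 20000
    print("standard  :", online_sgd(ar1_stream(n)))
    print("subsampled:", subsampled_sgd(ar1_stream(n), period=10))
    print("mini-batch:", minibatch_sgd(ar1_stream(n), batch_size=10))
```

In this sketch the subsampled variant discards the samples between updates, so the ones it uses are less correlated with each other, while the mini-batch variant keeps every sample and averages out part of the correlated noise. How these trade-offs translate into sample complexity is the subject of the paper's analysis, not something this toy run establishes.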

Original language: English
Title of host publication: Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022
Publisher: Association For Uncertainty in Artificial Intelligence (AUAI)
Pages: 1296-1305
Number of pages: 10
ISBN (Electronic): 9781713863298
State: Published - 2022
Event: 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022 - Eindhoven, Netherlands
Duration: Aug 1 2022 to Aug 5 2022

Publication series

Name: Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022

Conference

Conference: 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022
Country/Territory: Netherlands
City: Eindhoven
Period: 08/1/22 to 08/5/22
