
EcoEdgeInfer: Dynamically Optimizing Latency and Sustainability for Inference on Edge Devices

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The use of Deep Neural Networks (DNNs) has skyrocketed in recent years. While their applications have brought many benefits, they also have a significant environmental impact due to the high energy consumption of DNN execution. The literature has already acknowledged that training DNNs is computationally expensive and requires large amounts of energy. However, the energy consumption of DNN inference has not yet received much attention. With the increasing adoption of online tools, the usage of inference has grown significantly and will likely continue to grow. Unlike training, inference is user-facing, requires low latency, and is invoked more frequently. As such, edge devices are being considered for DNN inference due to their low latency and privacy benefits. In this context, inference on the edge is a timely area that requires closer attention to regulate its energy consumption. We present EcoEdgeInfer, a system that balances performance and sustainability for DNN inference on edge devices. The core component of EcoEdgeInfer is an adaptive optimization algorithm, EcoGD, that strategically and quickly sweeps through the hardware and software configuration space to find the jointly optimal configuration that minimizes energy consumption and latency. EcoGD is agile by design and adapts the configuration parameters in response to time-varying and unpredictable inference workloads. We evaluate EcoEdgeInfer on different DNN models using real-world traces and show that EcoGD consistently outperforms existing baselines, lowering energy consumption by 31% and reducing tail latency by 14%, on average.
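The paper's EcoGD implementation is not reproduced here, but the abstract's idea of sweeping a joint hardware/software configuration space to balance energy and latency can be illustrated with a toy greedy coordinate-descent sketch. All names, knobs, and the cost model below are hypothetical stand-ins, not the authors' algorithm or measurements:

```python
# Illustrative sketch only: greedy coordinate descent over a discrete
# configuration space (hypothetical knobs: CPU frequency and batch size),
# minimizing a weighted energy-latency objective. The "measurements" are
# a synthetic model, not real device profiling.

FREQS = [0.6, 1.0, 1.4, 1.9]   # hypothetical CPU frequencies (GHz)
BATCHES = [1, 2, 4, 8]         # hypothetical inference batch sizes

def measure(freq, batch):
    """Synthetic stand-in for profiling one configuration on a device."""
    latency = batch / freq + 0.5           # seconds per batch (toy model)
    energy = freq ** 2 * latency * 10.0    # joules per batch (toy model)
    return energy, latency

def cost(freq, batch, lam=1.0):
    """Scalarized objective: energy plus lambda-weighted latency."""
    energy, latency = measure(freq, batch)
    return energy + lam * latency

def coordinate_descent(start, lam=1.0, max_rounds=10):
    """Greedily improve one knob at a time until no single change helps."""
    fi, bi = start  # indices into FREQS and BATCHES
    for _ in range(max_rounds):
        improved = False
        # Sweep frequency while holding batch size fixed.
        best_fi = min(range(len(FREQS)),
                      key=lambda i: cost(FREQS[i], BATCHES[bi], lam))
        if best_fi != fi:
            fi, improved = best_fi, True
        # Sweep batch size while holding frequency fixed.
        best_bi = min(range(len(BATCHES)),
                      key=lambda j: cost(FREQS[fi], BATCHES[j], lam))
        if best_bi != bi:
            bi, improved = best_bi, True
        if not improved:
            break  # local optimum under single-knob moves
    return FREQS[fi], BATCHES[bi]

print(coordinate_descent(start=(3, 3)))
```

In a real system such a loop would re-run whenever the workload shifts, since the jointly optimal configuration depends on request rate and latency targets; the abstract's point is that this search must be fast and adaptive, not exhaustive.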

Original language: English
Title of host publication: Proceedings - 2024 IEEE/ACM Symposium on Edge Computing, SEC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 191-205
Number of pages: 15
ISBN (Electronic): 9798350378283
DOIs
State: Published - 2024
Event: 9th Annual IEEE/ACM Symposium on Edge Computing, SEC 2024 - Rome, Italy
Duration: Dec 4 2024 - Dec 7 2024

Publication series

Name: Proceedings - 2024 IEEE/ACM Symposium on Edge Computing, SEC 2024

Conference

Conference: 9th Annual IEEE/ACM Symposium on Edge Computing, SEC 2024
Country/Territory: Italy
City: Rome
Period: 12/4/24 - 12/7/24

Keywords

  • energy
  • inference
  • latency
  • workload changes

