
Toward a Holistic Performance Evaluation of Large Language Models Across Diverse AI Accelerators

  • Murali Emani
  • Sam Foreman
  • Varuni Sastry
  • Zhen Xie
  • Siddhisanket Raskar
  • William Arnold
  • Rajeev Thakur
  • Venkatram Vishwanath
  • Michael E. Papka
  • Sanjif Shanmugavelu
  • Darshan Gandhi
  • Hengyu Zhao
  • Dun Ma
  • Kiran Ranganath
  • Rick Weisner
  • Jiunn Yeu Chen
  • Yuting Yang
  • Natalia Vassilieva
  • Bin C. Zhang
  • Sylvia Howland
  • Alexander Tsyplikhin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Artificial intelligence (AI) methods have become critical in scientific applications, helping to accelerate scientific discovery. Large language models (LLMs) are considered a promising approach to some challenging problems because of their superior generalization capabilities across domains. The effectiveness of these models and the accuracy of the resulting applications are contingent upon their efficient execution on the underlying hardware infrastructure. Specialized AI accelerator hardware systems have recently become available for accelerating AI applications, but the comparative performance of these accelerators on large language models has not previously been studied. In this paper, we systematically evaluate LLMs on multiple AI accelerators and GPUs and characterize their performance for these models. We evaluate these systems with (i) a micro-benchmark using a core transformer block, (ii) a GPT-2 model, and (iii) an LLM-driven science use case, GenSLM. We present our findings and analyses of the models' performance to better understand the intrinsic capabilities of AI accelerators. Furthermore, our analysis takes into account key factors such as sequence length, scaling behavior, and sensitivity to the number of gradient accumulation steps.
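To make the evaluation setup described in the abstract concrete, the sketch below shows what a transformer-block micro-benchmark of the kind named in (i) might look like: it times forward/backward passes of a single transformer block across a sweep of sequence lengths while applying gradient accumulation. This is a minimal, illustrative PyTorch sketch, not the paper's actual harness; the model dimensions, sequence lengths, batch size, iteration count, and accumulation steps are assumed placeholders.

```python
# Illustrative micro-benchmark sketch: throughput of one transformer block
# across sequence lengths, with gradient accumulation. All configuration
# values below are assumptions, not the paper's settings.
import time
import torch
import torch.nn as nn

def bench_block(d_model=1024, n_heads=16, seq_lens=(128, 512, 1024, 2048),
                batch_size=8, accum_steps=4, iters=10):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    block = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                       batch_first=True).to(device)
    opt = torch.optim.AdamW(block.parameters())
    for seq_len in seq_lens:
        x = torch.randn(batch_size, seq_len, d_model, device=device)
        # Warm-up pass to exclude one-time kernel setup costs from timing.
        block(x).sum().backward()
        opt.zero_grad(set_to_none=True)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for i in range(iters):
            loss = block(x).sum() / accum_steps
            loss.backward()                    # gradients accumulate in place
            if (i + 1) % accum_steps == 0:     # optimizer step every N micro-batches
                opt.step()
                opt.zero_grad(set_to_none=True)
        if device == "cuda":
            torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
        tokens = batch_size * seq_len * iters
        print(f"seq_len={seq_len:5d}  {tokens / elapsed:12.0f} tokens/s")

if __name__ == "__main__":
    bench_block()
```

Sweeping sequence length and accumulation steps in this way mirrors the sensitivity factors the abstract highlights; a full study like the paper's would additionally vary model scale and run the equivalent harness on each accelerator's native software stack.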

Original language: English
Title of host publication: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 48-57
Number of pages: 10
ISBN (Electronic): 9798350364606
DOIs
State: Published - 2024
Event: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024 - San Francisco, United States
Duration: May 27, 2024 → May 31, 2024

Publication series

Name: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024

Conference

Conference: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024
Country/Territory: United States
City: San Francisco
Period: 05/27/24 → 05/31/24
