TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling

  • Bang Di
  • Daokun Hu
  • Zhen Xie
  • Jianhua Sun
  • Hao Chen
  • Jinkui Ren
  • Dong Li

  • Hunan University
  • Alibaba Group Holding Ltd.
  • University of California Merced

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but it raises concerns about application security. We reveal that the translation lookaside buffer (TLB) attack, a common attack on CPUs, can also occur on GPUs when multiple kernels co-run. We investigate the conditions and principles under which a TLB attack can take effect, including awareness of the GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitectural properties of GPUs, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors, taking into account the hardware isolation of last-level TLBs and each application's resource requirements. TLB-pilot employs lightweight online profiling to collect kernel information before kernels launch. By coordinating software- and hardware-based scheduling and employing a kernel-splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. Our results show that, under a TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround time and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. Under a TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround time and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficient scheduling of thread blocks.
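The core scheduling idea in the abstract can be illustrated with a small simulation: streaming multiprocessors (SMs) that share a last-level TLB form a group, and each co-running kernel is bound to its own disjoint set of groups, sized in proportion to its profiled block count so that load stays balanced. This is a hypothetical sketch, not the paper's implementation; the group sizes, SM counts, and kernel names below are illustrative assumptions.

```python
# Hypothetical sketch of TLB-pilot-style scheduling: SMs are partitioned
# into groups that share a last-level TLB, and co-running kernels are
# bound to disjoint groups so they never contend for the same TLB.
# All constants here (8 SMs, 2 SMs per TLB group) are assumptions.

def partition_sms(num_sms, sms_per_tlb_group):
    """Group SM ids by the last-level TLB they are assumed to share."""
    return [list(range(i, min(i + sms_per_tlb_group, num_sms)))
            for i in range(0, num_sms, sms_per_tlb_group)]

def assign_kernels(kernels, groups):
    """Give each kernel a disjoint share of TLB groups, sized by its
    profiled thread-block count (the load-balancing motivation behind
    kernel splitting). Each kernel gets at least one group."""
    total_blocks = sum(k["blocks"] for k in kernels)
    assignment, next_group = {}, 0
    for k in sorted(kernels, key=lambda k: -k["blocks"]):
        share = max(1, round(len(groups) * k["blocks"] / total_blocks))
        # leave at least one group for every kernel not yet placed
        remaining_kernels = len(kernels) - len(assignment) - 1
        share = min(share, len(groups) - next_group - remaining_kernels)
        assignment[k["name"]] = groups[next_group:next_group + share]
        next_group += share
    return assignment

groups = partition_sms(num_sms=8, sms_per_tlb_group=2)  # 4 TLB groups
kernels = [{"name": "victim", "blocks": 96}, {"name": "other", "blocks": 32}]
print(assign_kernels(kernels, groups))
```

Because no two kernels share a TLB group, a malicious co-running kernel cannot thrash the last-level TLB entries used by the victim; the proportional split reflects the paper's goal of avoiding load imbalance when thread blocks are pinned to SM subsets.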

Original language: English
Article number: 9
Journal: Transactions on Architecture and Code Optimization
Volume: 19
Issue number: 1
DOIs
State: Published - Mar 2022

Keywords

  • CUDA
  • GPU
  • High performance
  • TLB contention
