TY - GEN
T1 - Managing High-Bandwidth Memory is a Parallel Scheduling Problem (full paper only)
AU - Agrawal, Kunal
AU - Bender, Michael A.
AU - Pruhs, Kirk
AU - Moseley, Benjamin
AU - Stein, Clifford
N1 - Publisher Copyright: © 2025 Association for Computing Machinery. All rights reserved.
PY - 2025/7/16
Y1 - 2025/7/16
N2 - High-Bandwidth Memory (HBM) is a decade-old memory technology that is increasingly commonly being used in highly-parallel machines such as GPUs and multicores. Comparatively, HBM has higher bandwidth, smaller capacity, and similar latency to other DRAM technologies. Many systems use both HBM and other DRAM technologies, where HBM is naturally closer to the processor in the conceptual memory hierarchy. Thus, a natural resulting question is how one should best manage a collection of processes running on a HBM/DRAM memory hierarchy. Prior work introduced a theoretical model for addressing this question, and gave a competitive policy for the objective of minimizing makespan. Our main technical contribution is to give a competitive policy for the more commonly appropriate total/average response/completion time objective. However, we believe the broader, and more important contribution, is to make explicit the case (hinted at in the prior literature) that managing an HBM/DRAM hierarchy should be thought of as a parallel scheduling problem. To that end, we introduce a new online scheduling model that we call the semi-normal model. We then show how to use a competitive algorithm for scheduling in the semi-normal model as a black box to obtain a competitive algorithm for managing a HBM/DRAM memory hierarchy. Thus, as a result of this black-box conversion, competitiveness results in the semi-normal model translate (essentially) automatically into competitiveness results in the HBM/DRAM management model. Our main technical result is then an application of such a translation. That is, we show that a natural variant of the Round Robin (processor sharing) algorithm, naturally adapted for the seminormal model, is competitive for the objective of average/total completion time. Thus, we obtain an algorithm for managing a HBM/DRAM hierarchy that is competitive for the objective of average/total completion time, using this black-box reduction.
AB - High-Bandwidth Memory (HBM) is a decade-old memory technology that is increasingly commonly being used in highly-parallel machines such as GPUs and multicores. Comparatively, HBM has higher bandwidth, smaller capacity, and similar latency to other DRAM technologies. Many systems use both HBM and other DRAM technologies, where HBM is naturally closer to the processor in the conceptual memory hierarchy. Thus, a natural resulting question is how one should best manage a collection of processes running on a HBM/DRAM memory hierarchy. Prior work introduced a theoretical model for addressing this question, and gave a competitive policy for the objective of minimizing makespan. Our main technical contribution is to give a competitive policy for the more commonly appropriate total/average response/completion time objective. However, we believe the broader, and more important contribution, is to make explicit the case (hinted at in the prior literature) that managing an HBM/DRAM hierarchy should be thought of as a parallel scheduling problem. To that end, we introduce a new online scheduling model that we call the semi-normal model. We then show how to use a competitive algorithm for scheduling in the semi-normal model as a black box to obtain a competitive algorithm for managing a HBM/DRAM memory hierarchy. Thus, as a result of this black-box conversion, competitiveness results in the semi-normal model translate (essentially) automatically into competitiveness results in the HBM/DRAM management model. Our main technical result is then an application of such a translation. That is, we show that a natural variant of the Round Robin (processor sharing) algorithm, naturally adapted for the seminormal model, is competitive for the objective of average/total completion time. Thus, we obtain an algorithm for managing a HBM/DRAM hierarchy that is competitive for the objective of average/total completion time, using this black-box reduction.
KW - Approximation algorithms
KW - High-bandwidth memory
KW - Multicore paging
KW - Online algorithms
KW - Paging
KW - Scheduling
UR - https://www.scopus.com/pages/publications/105012713894
U2 - 10.1145/3694906.3743336
DO - 10.1145/3694906.3743336
M3 - Conference contribution
T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures
SP - 171
EP - 180
BT - SPAA 2025 - Proceedings of the 2025 37th ACM Symposium on Parallelism in Algorithms and Architectures
PB - Association for Computing Machinery
T2 - 37th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2025
Y2 - 28 July 2025 through 1 August 2025
ER -