TY - GEN
T1 - Cuckoo directory
T2 - 17th International Symposium on High-Performance Computer Architecture, HPCA 2011
AU - Ferdman, Michael
AU - Lotfi-Kamran, Pejman
AU - Balet, Ken
AU - Falsafi, Babak
PY - 2011
Y1 - 2011
N2 - Growing core counts have highlighted the need for scalable on-chip coherence mechanisms. The increase in the number of on-chip cores exposes the energy and area costs of scaling the directories. Duplicate-tag-based directories require highly associative structures that grow with core count, precluding scalability due to prohibitive power consumption. Sparse directories overcome the power barrier by reducing directory associativity, but require storage area over-provisioning to avoid high invalidation rates. We propose the Cuckoo directory, a power- and area-efficient scalable distributed directory. The Cuckoo directory scales to high core counts without the energy costs of wide associative lookup and without gross capacity over-provisioning. Simulation of a 16-core CMP with commercial server and scientific workloads shows that the Cuckoo directory eliminates invalidations while being up to four times more power-efficient than the Duplicate-tag directory and 24% more power-efficient and up to seven times more area-efficient than the Sparse directory organization. Analytical projections indicate that the Cuckoo directory retains its energy and area benefits with increasing core count, efficiently scaling to at least 1024 cores.
AB - Growing core counts have highlighted the need for scalable on-chip coherence mechanisms. The increase in the number of on-chip cores exposes the energy and area costs of scaling the directories. Duplicate-tag-based directories require highly associative structures that grow with core count, precluding scalability due to prohibitive power consumption. Sparse directories overcome the power barrier by reducing directory associativity, but require storage area over-provisioning to avoid high invalidation rates. We propose the Cuckoo directory, a power- and area-efficient scalable distributed directory. The Cuckoo directory scales to high core counts without the energy costs of wide associative lookup and without gross capacity over-provisioning. Simulation of a 16-core CMP with commercial server and scientific workloads shows that the Cuckoo directory eliminates invalidations while being up to four times more power-efficient than the Duplicate-tag directory and 24% more power-efficient and up to seven times more area-efficient than the Sparse directory organization. Analytical projections indicate that the Cuckoo directory retains its energy and area benefits with increasing core count, efficiently scaling to at least 1024 cores.
UR - https://www.scopus.com/pages/publications/79955887509
U2 - 10.1109/HPCA.2011.5749726
DO - 10.1109/HPCA.2011.5749726
M3 - Conference contribution
SN - 9781424494323
T3 - Proceedings - International Symposium on High-Performance Computer Architecture
SP - 169
EP - 180
BT - Proceedings - 17th International Symposium on High-Performance Computer Architecture, HPCA 2011
Y2 - 12 February 2011 through 16 February 2011
ER -