TY - GEN
T1 - Optimizing reduction computations in a distributed environment
AU - Kurc, Tahsin
AU - Lee, Feng
AU - Agrawal, Gagan
AU - Catalyurek, Umit
AU - Ferreira, Renato
AU - Saltz, Joel
PY - 2003
Y1 - 2003
N2 - We investigate runtime strategies for data-intensive applications that involve generalized reductions on large, distributed datasets. Our set of strategies includes replicated filter state, partitioned filter state, and hybrid options between these two extremes. We evaluate these strategies using emulators of three real applications, different query and output sizes, and a number of configurations. We consider execution in a homogeneous cluster and in a distributed environment where only a subset of nodes host the data. Our results show that replicating the filter state scales well and outperforms other schemes if sufficient memory is available and sufficient computation is involved to offset the cost of the global merge step. In other cases, hybrid is usually the best. Moreover, in almost all cases, the performance of the hybrid strategy is quite close to the best strategy. Thus, we believe that hybrid is an attractive approach when the relative performance of different schemes cannot be predicted.
AB - We investigate runtime strategies for data-intensive applications that involve generalized reductions on large, distributed datasets. Our set of strategies includes replicated filter state, partitioned filter state, and hybrid options between these two extremes. We evaluate these strategies using emulators of three real applications, different query and output sizes, and a number of configurations. We consider execution in a homogeneous cluster and in a distributed environment where only a subset of nodes host the data. Our results show that replicating the filter state scales well and outperforms other schemes if sufficient memory is available and sufficient computation is involved to offset the cost of the global merge step. In other cases, hybrid is usually the best. Moreover, in almost all cases, the performance of the hybrid strategy is quite close to the best strategy. Thus, we believe that hybrid is an attractive approach when the relative performance of different schemes cannot be predicted.
UR - https://www.scopus.com/pages/publications/84877080088
U2 - 10.1145/1048935.1050160
DO - 10.1145/1048935.1050160
M3 - Conference contribution
SN - 1581136951
SN - 9781581136951
T3 - Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, SC 2003
BT - Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, SC 2003
T2 - 2003 ACM/IEEE Conference on Supercomputing, SC 2003
Y2 - 15 November 2003 through 21 November 2003
ER -