TY - GEN
T1 - DeQA
T2 - 17th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2019
AU - Cao, Qingqing
AU - Weber, Noah
AU - Balasubramanian, Niranjan
AU - Balasubramanian, Aruna
N1 - Publisher Copyright: © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2019/6/12
Y1 - 2019/6/12
N2 - Today there is no effective support for device-wide question answering on mobile devices. State-of-the-art QA models are deep learning behemoths designed for the cloud, which run extremely slowly and require more memory than is available on phones. We present DeQA, a suite of latency and memory optimizations that adapts existing QA systems to run completely locally on mobile phones. Specifically, we design two latency optimizations that (1) stop processing documents if further processing cannot improve answer quality, and (2) identify computation that does not depend on the question and move it offline. These optimizations do not depend on the QA model internals and can be applied to several existing QA models. DeQA also implements a set of memory optimizations by (i) loading partial indexes in memory, (ii) working with smaller units of data, and (iii) replacing in-memory lookups with a key-value database. We use DeQA to port three state-of-the-art QA systems to the mobile device and evaluate over three datasets. The first is a large-scale SQuAD dataset defined over a Wikipedia collection. We also create two on-device QA datasets, one over a publicly available email data collection and the other using a cross-app data collection we obtain from two users. Our evaluations show that DeQA can run QA models with only a few hundred MB of memory and provides at least a 13x speedup on average on the mobile phone across all three datasets.
AB - Today there is no effective support for device-wide question answering on mobile devices. State-of-the-art QA models are deep learning behemoths designed for the cloud, which run extremely slowly and require more memory than is available on phones. We present DeQA, a suite of latency and memory optimizations that adapts existing QA systems to run completely locally on mobile phones. Specifically, we design two latency optimizations that (1) stop processing documents if further processing cannot improve answer quality, and (2) identify computation that does not depend on the question and move it offline. These optimizations do not depend on the QA model internals and can be applied to several existing QA models. DeQA also implements a set of memory optimizations by (i) loading partial indexes in memory, (ii) working with smaller units of data, and (iii) replacing in-memory lookups with a key-value database. We use DeQA to port three state-of-the-art QA systems to the mobile device and evaluate over three datasets. The first is a large-scale SQuAD dataset defined over a Wikipedia collection. We also create two on-device QA datasets, one over a publicly available email data collection and the other using a cross-app data collection we obtain from two users. Our evaluations show that DeQA can run QA models with only a few hundred MB of memory and provides at least a 13x speedup on average on the mobile phone across all three datasets.
KW - Mobile Devices
KW - Mobile Systems
KW - Question Answering
UR - https://www.scopus.com/pages/publications/85069183447
U2 - 10.1145/3307334.3326071
DO - 10.1145/3307334.3326071
M3 - Conference contribution
T3 - MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services
SP - 27
EP - 40
BT - MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services
PB - Association for Computing Machinery
Y2 - 17 June 2019 through 21 June 2019
ER -