TY - GEN
T1 - Using variability as a guiding principle to reduce latency in web applications via OS profiling
AU - Suresh, Amoghvarsha
AU - Gandhi, Anshul
N1 - Publisher Copyright: © 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
PY - 2019/5/13
Y1 - 2019/5/13
AB - Request latency is a critical metric in determining the usability of web services. The latency of a request includes service time - the time when the request is being actively serviced - and waiting time - the time when the request is waiting to be served. Most existing works aim to reduce request latency by focusing on reducing the mean service time (that is, shortening the critical path). In this paper, we explore an alternative approach to reducing latency - using variability as a guiding principle when designing web services. By tracking the service time variability of the request as it traverses across software layers within the user and kernel space of the web server, we identify the most critical stages of request processing. We then determine control knobs in the OS and application, such as thread scheduling and request batching, that regulate the variability in these stages, and demonstrate that tuning these specific knobs can significantly improve end-to-end request latency. Our experimental results with Memcached and Apache web server under different request rates, including real-world traces, show that this alternative approach can reduce mean and tail latency by 30-50%.
UR - https://www.scopus.com/pages/publications/85066914397
U2 - 10.1145/3308558.3313406
DO - 10.1145/3308558.3313406
M3 - Conference contribution
T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
SP - 1759
EP - 1770
BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
PB - Association for Computing Machinery, Inc
T2 - 2019 World Wide Web Conference, WWW 2019
Y2 - 13 May 2019 through 17 May 2019
ER -