TY - GEN
T1 - Generalizable deep clustering based on Bi-LSTM with applications to sepsis and acute kidney disease populations
AU - Tan, Yongsen
AU - Huang, Jiahui
AU - Zhuang, Jinhu
AU - Liu, Yong
AU - Huang, Haofan
AU - Yu, Xiaxia
AU - Wang, Fusheng
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Despite the abundance of subphenotype clustering studies on sepsis and acute kidney injury (AKI), few models consider the real-time information of clinical features. The lack of supervision may lead to patient subgroups being derived as clusters without stratifying patients by the outcome of interest. The sensitivity of clustering methods to dimension is generally ignored, so clusters lack robustness. In this study, we propose an ensembled outcome-driven bidirectional long short-term memory autoencoder (BiLSTM-AE) architecture with high robustness and transferability to identify subphenotypes. BiLSTM-AE learns an advanced representation of the time-series clinical features by co-training the encoder and a weak predictor to achieve risk-stratified clustering of patients. Clusters of a variety of dimensions are ensembled to combine global and local information. Four cohorts drawn from three public databases, MIMIC-III-AKI, MIMIC-IV-sepsis, eICU-AKI, and eICU-sepsis, were used to assess the method's effectiveness in clustering and prediction. Compared to baseline approaches including latent class analysis (LCA), subgroups generated by BiLSTM-AE exhibited the highest mortality risk ratios between subgroups: the mortality for subphenotypes 1, 2, and 3 of BiLSTM-AE and LCA was 6.91%, 17.53%, and 75.56% vs. 13.2%, 14.4%, and 19.7% for MIMIC-III-AKI. The area under the receiver operating characteristic curve for prediction was 0.86 for MIMIC-III-AKI, 0.91 for eICU-AKI, 0.88 for MIMIC-IV-sepsis, and 0.89 for eICU-sepsis. Additionally, clinical evaluation of BiLSTM-AE-generated subgroups revealed more meaningful distributions of member characteristics across subgroups. Thus, the method is an effective means to consider the real-time information of clinical features.
AB - Despite the abundance of subphenotype clustering studies on sepsis and acute kidney injury (AKI), few models consider the real-time information of clinical features. The lack of supervision may lead to patient subgroups being derived as clusters without stratifying patients by the outcome of interest. The sensitivity of clustering methods to dimension is generally ignored, so clusters lack robustness. In this study, we propose an ensembled outcome-driven bidirectional long short-term memory autoencoder (BiLSTM-AE) architecture with high robustness and transferability to identify subphenotypes. BiLSTM-AE learns an advanced representation of the time-series clinical features by co-training the encoder and a weak predictor to achieve risk-stratified clustering of patients. Clusters of a variety of dimensions are ensembled to combine global and local information. Four cohorts drawn from three public databases, MIMIC-III-AKI, MIMIC-IV-sepsis, eICU-AKI, and eICU-sepsis, were used to assess the method's effectiveness in clustering and prediction. Compared to baseline approaches including latent class analysis (LCA), subgroups generated by BiLSTM-AE exhibited the highest mortality risk ratios between subgroups: the mortality for subphenotypes 1, 2, and 3 of BiLSTM-AE and LCA was 6.91%, 17.53%, and 75.56% vs. 13.2%, 14.4%, and 19.7% for MIMIC-III-AKI. The area under the receiver operating characteristic curve for prediction was 0.86 for MIMIC-III-AKI, 0.91 for eICU-AKI, 0.88 for MIMIC-IV-sepsis, and 0.89 for eICU-sepsis. Additionally, clinical evaluation of BiLSTM-AE-generated subgroups revealed more meaningful distributions of member characteristics across subgroups. Thus, the method is an effective means to consider the real-time information of clinical features.
KW - BiLSTM
KW - acute kidney injury
KW - deep learning
KW - ensemble
KW - sepsis
KW - subphenotypes
UR - https://www.scopus.com/pages/publications/85146675179
U2 - 10.1109/BIBM55620.2022.9995179
DO - 10.1109/BIBM55620.2022.9995179
M3 - Conference contribution
T3 - Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
SP - 1745
EP - 1750
BT - Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
A2 - Adjeroh, Donald
A2 - Long, Qi
A2 - Shi, Xinghua
A2 - Guo, Fei
A2 - Hu, Xiaohua
A2 - Aluru, Srinivas
A2 - Narasimhan, Giri
A2 - Wang, Jianxin
A2 - Kang, Mingon
A2 - Mondal, Ananda M.
A2 - Liu, Jin
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
Y2 - 6 December 2022 through 8 December 2022
ER -