TY - GEN
T1 - A Mispronunciation-Based Voice-Omics Representation Framework for Screening Specific Language Impairments in Children
AU - Bo, Wei
AU - Rubino, Matthew
AU - Xu, Wenyao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - This paper introduces an innovative end-to-end (E2E) framework for screening Specific Language Impairment (SLI) in children, centralizing phoneme-level mispronunciation (PLM) detection to enhance the precision and reliability. We have developed a unique voice-omics representation that translates PLM predictions into symbolic sequences, yielding significant phenotyping biomarkers that provide objective and quantifiable assessments of children's speech patterns. Through meticulous fine-tuning of the Connectionist Temporal Classification (CTC) model on the L2-ARCTIC dataset and rigorous five-fold cross-validation, our E2E models have demonstrated remarkable accuracy, with Area Under the Curve (AUC) values exceeding 0.71 and a notable recall rate of up to 71.5% on the CHILDES dataset. Our approach signifies a substantial advancement in SLI screening, leveraging cutting-edge technology to capture the complexities of spontaneous speech in children.
AB - This paper introduces an innovative end-to-end (E2E) framework for screening Specific Language Impairment (SLI) in children, centralizing phoneme-level mispronunciation (PLM) detection to enhance the precision and reliability. We have developed a unique voice-omics representation that translates PLM predictions into symbolic sequences, yielding significant phenotyping biomarkers that provide objective and quantifiable assessments of children's speech patterns. Through meticulous fine-tuning of the Connectionist Temporal Classification (CTC) model on the L2-ARCTIC dataset and rigorous five-fold cross-validation, our E2E models have demonstrated remarkable accuracy, with Area Under the Curve (AUC) values exceeding 0.71 and a notable recall rate of up to 71.5% on the CHILDES dataset. Our approach signifies a substantial advancement in SLI screening, leveraging cutting-edge technology to capture the complexities of spontaneous speech in children.
KW - Connectionist Temporal Classification Model
KW - Phenotyping Biomarkers
KW - Phoneme-level Mispronunciation Detection
KW - SLI Screening
KW - Symbolic Sequence
UR - https://www.scopus.com/pages/publications/85203721678
U2 - 10.1109/ICHI61247.2024.00045
DO - 10.1109/ICHI61247.2024.00045
M3 - Conference contribution
T3 - Proceedings - 2024 IEEE 12th International Conference on Healthcare Informatics, ICHI 2024
SP - 294
EP - 304
BT - Proceedings - 2024 IEEE 12th International Conference on Healthcare Informatics, ICHI 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th IEEE International Conference on Healthcare Informatics, ICHI 2024
Y2 - 3 June 2024 through 6 June 2024
ER -