TY - GEN
T1 - TransASL
T2 - 28th International Conference on Intelligent User Interfaces, IUI 2023
AU - Jin, Yincheng
AU - Choi, Seokmin
AU - Gao, Yang
AU - Li, Jiyang
AU - Li, Zhengxiong
AU - Jin, Zhanpeng
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/3/27
Y1 - 2023/3/27
AB - Sign language is a primary language used by deaf and hard-of-hearing (DHH) communities. However, existing sign language translation solutions primarily focus on recognizing manual markers. The non-manual markers, such as negative head shaking, question markers, and mouthing, are critical grammatical and semantic components of sign language for better usability and generalizability. Considering the significant role of non-manual markers, we propose TransASL, a real-time, end-to-end system for sign language recognition and translation. TransASL extracts features from both manual markers and non-manual markers via a customized eyeglasses-style wearable device with two parallel sensing modalities. Manual marker information is collected by two pairs of outward-facing microphones and speakers mounted to the legs of the eyeglasses. In contrast, non-manual marker information is acquired from a pair of inward-facing microphones and speakers connected to the eyeglasses. Both manual and non-manual marker features are fed into a multi-modal, multi-channel fusion network and are eventually recognized as comprehensible ASL content. We evaluate the recognition performance of various sign language expressions at both the word and sentence levels. Given 80 frequently used ASL words and 40 meaningful sentences consisting of manual and non-manual markers, TransASL achieves a word error rate (WER) of 8.3% and 7.1%, respectively. Our proposed work shows great potential for convenient ASL recognition in daily communications between ASL signers and hearing people.
KW - ASL recognition
KW - Acoustic sensing
KW - manual markers
KW - non-manual markers
KW - smart glasses
UR - https://www.scopus.com/pages/publications/85152141014
U2 - 10.1145/3581641.3584071
DO - 10.1145/3581641.3584071
M3 - Conference contribution
T3 - International Conference on Intelligent User Interfaces, Proceedings IUI
SP - 802
EP - 818
BT - IUI 2023 - Proceedings of the 28th International Conference on Intelligent User Interfaces
PB - Association for Computing Machinery
Y2 - 27 March 2023 through 31 March 2023
ER -