TY - GEN
T1 - MeLT: Message-Level Transformer with Masked Document Representations as Pre-training for Stance Detection
T2 - 2021 Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021
AU - Matero, Matthew
AU - Soni, Nikita
AU - Balasubramanian, Niranjan
AU - Schwartz, H. Andrew
N1 - Publisher Copyright: © 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
AB - Much of natural language processing is focused on leveraging large-capacity language models, typically trained over single messages with a task of predicting one or more tokens. However, modeling human language at higher levels of context (i.e., sequences of messages) is underexplored. In stance detection and other social media tasks where the goal is to predict an attribute of a message, we have contextual data that is loosely semantically connected by authorship. Here, we introduce the Message-Level Transformer (MeLT), a hierarchical message encoder pre-trained over Twitter and applied to the task of stance prediction. We focus on stance prediction as a task benefiting from knowing the context of the message (i.e., the sequence of previous messages). The model is trained using a variant of masked-language modeling: instead of predicting tokens, it seeks to generate an entire masked (aggregated) message vector via reconstruction loss. We find that applying this pre-trained masked message-level transformer to the downstream task of stance detection achieves an F1 performance of 67%.
UR - https://www.scopus.com/pages/publications/85129222039
DO - 10.18653/v1/2021.findings-emnlp.253
M3 - Conference contribution
T3 - Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021
SP - 2959
EP - 2966
BT - Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021
A2 - Moens, Marie-Francine
A2 - Huang, Xuanjing
A2 - Specia, Lucia
A2 - Yih, Scott Wen-Tau
PB - Association for Computational Linguistics (ACL)
Y2 - 7 November 2021 through 11 November 2021
ER -