
Abductive natural language inference by interactive model with structural loss

  • Linhao Li
  • Ao Wang
  • Ming Xu
  • Yongfeng Dong
  • Xin Li
  • Hebei University of Technology

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

The abductive natural language inference task (αNLI) asks a model to infer the most plausible explanation linking a cause and an event. In αNLI, two observations are given, and the most plausible hypothesis must be picked from a set of candidates. Existing methods model each candidate hypothesis separately and penalize the inference network uniformly. In this paper, we argue that it is unnecessary to distinguish the reasoning abilities among correct hypotheses and that, similarly, all wrong hypotheses contribute equally when explaining the observations. We therefore propose to group the hypotheses instead of ranking them, and we design a structural loss called “joint softmax focal loss”. Based on the observation that the hypotheses are generally semantically related, we design a novel interactive language model that exploits the rich interaction among competing hypotheses. We name this new model for αNLI the Interactive Model with Structural Loss (IMSL). Experimental results show that IMSL achieves the highest performance with the RoBERTa-large pretrained model, with ACC and AUC increased by about 1% and 5%, respectively. We also compare precision and sensitivity against methods with publicly available code, demonstrating the efficiency and robustness of the proposed approach.
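The idea of grouping hypotheses rather than ranking them can be illustrated with a small sketch. The abstract does not give the formula for the joint softmax focal loss, so the code below is only one plausible reading and not the authors' definition: it takes a softmax over all candidate scores, sums the probability mass assigned to the correct group, and applies a focal-style modulating factor. The function names, the `gamma` parameter, and the clamping constant are assumptions made for this illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_focal_loss(scores, labels, gamma=2.0):
    """Hypothetical group-level focal loss (illustrative assumption only).

    scores: one raw score per candidate hypothesis.
    labels: 1 for a correct hypothesis, 0 for a wrong one.
    gamma:  focal exponent; gamma = 0 reduces to plain negative log-likelihood.
    """
    probs = softmax(scores)
    # Probability mass of the whole correct group: members are not ranked
    # against each other, only the group total matters.
    p_correct = sum(p for p, y in zip(probs, labels) if y == 1)
    # Clamp into (0, 1] so the logarithm stays finite.
    p_correct = min(max(p_correct, 1e-12), 1.0)
    # Focal factor down-weights examples the model already gets right.
    return -((1.0 - p_correct) ** gamma) * math.log(p_correct)
```

Under this reading, swapping the scores of two correct hypotheses leaves the loss unchanged, which matches the abstract's claim that correct hypotheses need not be distinguished from one another.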

Original language: English
Pages (from-to): 82-88
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 177
State: Published - Jan 2024

Keywords

  • Abductive inference
  • BiLSTM
  • Deep neural network
  • Natural language inference
  • Pretrained model (RoBERTa)
