
TRIGGER HUNTING WITH A TOPOLOGICAL PRIOR FOR TROJAN DETECTION

  • Xiaoling Hu
  • Xiao Lin
  • Michael Cogswell
  • Yi Yao
  • Susmit Jha
  • Chao Chen

Research output: Contribution to conference, Paper, peer-reviewed

25 Scopus citations

Abstract

Despite their success and popularity, deep neural networks (DNNs) are vulnerable to backdoor attacks. This impedes their wider adoption, especially in mission-critical applications. This paper tackles the problem of Trojan detection, namely, identifying Trojaned models, i.e., models trained with poisoned data. One popular approach is reverse engineering, i.e., recovering the triggers on a clean image by manipulating the model's prediction. One major challenge of the reverse engineering approach is the enormous search space of triggers. To this end, we propose innovative priors such as diversity and topological simplicity, which not only increase the chances of finding the appropriate triggers but also improve the quality of the triggers found. Moreover, by encouraging a diverse set of trigger candidates, our method can perform effectively in cases with unknown target labels. We demonstrate that these priors can significantly improve the quality of the recovered triggers, resulting in substantially improved Trojan detection accuracy, as validated on both synthetic and publicly available TrojAI benchmarks.

Original language: English
State: Published - 2022
Event: 10th International Conference on Learning Representations, ICLR 2022 - Virtual, Online
Duration: Apr 25 2022 to Apr 29 2022

Conference

Conference: 10th International Conference on Learning Representations, ICLR 2022
City: Virtual, Online
Period: 04/25/22 to 04/29/22
