
Attribute De-biased Vision Transformer (AD-ViT) for Long-Term Person Re-identification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

Person re-identification (re-ID) aims to retrieve images of the same identity from a gallery of person images across cameras and viewpoints. However, most work in person re-ID assumes a short-term setting in which appearance remains invariant. In contrast, the long-term setting frequently exhibits high visual variance due to changes in apparel and accessories, which makes the task more challenging. Therefore, learning identity-specific features that are agnostic to temporally variant features is crucial for robust long-term person re-ID. To this end, we propose an Attribute De-biased Vision Transformer (AD-ViT) that provides direct supervision for learning identity-specific features. Specifically, we produce attribute labels for person instances and use them to guide our model to focus on identity features through gradient reversal. Our experiments on two long-term re-ID datasets, LTCC and NKUP, show that the proposed method consistently outperforms current state-of-the-art methods.
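
The gradient reversal idea mentioned in the abstract can be illustrated with a toy example. The sketch below is not the AD-ViT implementation: it assumes a scalar "feature extractor" and a scalar "attribute head" with hand-written gradients, and every name in it (`forward`, `backward`, `lam`, `w`, `v`) is hypothetical. It only shows the mechanism: the reversal layer is the identity in the forward pass and multiplies the incoming gradient by `-lam` in the backward pass, so the attribute head learns to predict the attribute while the extractor is pushed to make that prediction harder.

```python
# Toy illustration of a gradient reversal layer (GRL) with manual gradients.
# Assumption: scalar weights stand in for a real backbone and attribute head.

def forward(x, w, v):
    """Return (feature, attribute prediction) for input x."""
    f = w * x   # feature extractor with a single weight w
    r = f       # gradient reversal layer: identity in the forward pass
    a = v * r   # attribute predictor with a single weight v
    return f, a


def backward(x, w, v, target, lam):
    """Gradients of the attribute loss L = (a - target)**2 w.r.t. w and v.

    lam is the (hypothetical) reversal strength; lam = 0 blocks the
    attribute gradient from reaching the extractor entirely.
    """
    f, a = forward(x, w, v)
    dL_da = 2.0 * (a - target)
    dL_dv = dL_da * f       # attribute head: ordinary gradient
    dL_dr = dL_da * v       # gradient arriving at the reversal layer
    dL_df = -lam * dL_dr    # reversal: flip the sign and scale by lam
    dL_dw = dL_df * x       # extractor receives the reversed gradient
    return dL_dw, dL_dv
```

With `x = 2.0`, `w = 0.5`, `v = 1.5`, `target = 0.0` and `lam = 1.0`, `backward` returns `-9.0` for the extractor weight and `3.0` for the head weight. Without the reversal the extractor gradient would be `+9.0`, so a descent step on `w` now increases the attribute loss instead of decreasing it.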

Original language: English
Title of host publication: AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665463829
DOIs
State: Published - 2022
Event: 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022 - Virtual, Online, Spain
Duration: Nov 29 2022 to Dec 2 2022

Publication series

Name: AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance

Conference

Conference: 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022
Country/Territory: Spain
City: Virtual, Online
Period: 11/29/22 to 12/2/22
