TY - GEN
T1 - Attribute De-biased Vision Transformer (AD-ViT) for Long-Term Person Re-identification
AU - Lee, Kyung Won
AU - Jawade, Bhavin
AU - Mohan, Deen
AU - Setlur, Srirangaraj
AU - Govindaraju, Venu
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Person re-identification (re-ID) aims to retrieve images of the same identity from a gallery of person images across cameras and viewpoints. However, most works in person re-ID assume a short-term setting characterized by invariance in appearance. In contrast, high visual variance is frequently seen in a long-term setting due to changes in apparel and accessories, which makes the task more challenging. Therefore, learning identity-specific features agnostic of temporally variant features is crucial for robust long-term person re-ID. To this end, we propose an Attribute De-biased Vision Transformer (AD-ViT) to provide direct supervision to learn identity-specific features. Specifically, we produce attribute labels for person instances and utilize them to guide our model to focus on identity features through gradient reversal. Our experiments on two long-term re-ID datasets, LTCC and NKUP, show that the proposed work consistently outperforms current state-of-the-art methods.
AB - Person re-identification (re-ID) aims to retrieve images of the same identity from a gallery of person images across cameras and viewpoints. However, most works in person re-ID assume a short-term setting characterized by invariance in appearance. In contrast, high visual variance is frequently seen in a long-term setting due to changes in apparel and accessories, which makes the task more challenging. Therefore, learning identity-specific features agnostic of temporally variant features is crucial for robust long-term person re-ID. To this end, we propose an Attribute De-biased Vision Transformer (AD-ViT) to provide direct supervision to learn identity-specific features. Specifically, we produce attribute labels for person instances and utilize them to guide our model to focus on identity features through gradient reversal. Our experiments on two long-term re-ID datasets, LTCC and NKUP, show that the proposed work consistently outperforms current state-of-the-art methods.
UR - https://www.scopus.com/pages/publications/85143907988
U2 - 10.1109/AVSS56176.2022.9959509
DO - 10.1109/AVSS56176.2022.9959509
M3 - Conference contribution
T3 - AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance
BT - AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022
Y2 - 29 November 2022 through 2 December 2022
ER -