
Manifold regularized cross-modal embedding for zero-shot learning

Research output: Contribution to journal › Article › peer-review

35 Scopus citations

Abstract

Zero-Shot Learning (ZSL) aims at classifying samples from previously unseen classes and has gained popularity in applications where training samples for some categories are scarce. The basic idea is to transfer knowledge from seen classes to unseen classes by mapping visual features into an embedding space spanned by class semantic information, which can be obtained from human-labeled attributes or from a text corpus in an unsupervised fashion. The embedding function from the visual space to the embedding space is therefore extremely important. However, existing embedding approaches to ZSL mainly focus on aligning pairwise semantic consistency across heterogeneous spaces while ignoring the intrinsic local structure of the homogeneous visual space. In order to preserve the local visual structure during embedding, this paper proposes a Manifold regularized Cross-Modal Embedding (MCME) approach for ZSL, which formulates a manifold constraint on the intrinsic structure of the visual features in addition to aligning pairwise consistency. The linear, closed-form solution makes MCME efficient to compute. Furthermore, rather than applying the embedding function learned from the seen classes directly, we also propose a new domain adaptation strategy to overcome the domain-shift problem during knowledge transfer; MCME combined with this domain adaptation method is called MCME-DA. Extensive experiments on the benchmark datasets AwA and CUB validate the superiority and promise of MCME and MCME-DA.
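The abstract describes a linear embedding with a manifold (graph Laplacian) regularizer that admits a closed-form solution. The sketch below illustrates the general recipe under stated assumptions; it is not the paper's exact objective. The loss form, the k-NN graph construction, the Gaussian weights, and the function name `mcme_sketch` are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def mcme_sketch(X, S, lam=0.1, gamma=0.1, k=5):
    """Illustrative manifold-regularized linear embedding (NOT the paper's
    exact formulation).

    X : (n, d) visual features of seen-class training samples.
    S : (n, m) class semantic vectors (e.g. attributes) per sample.

    Minimizes  ||X W - S||^2 + lam * tr(W^T X^T L X W) + gamma * ||W||^2,
    where L is the unnormalized Laplacian of a k-NN graph on X, so nearby
    visual samples receive nearby embeddings. Closed form:
        W = (X^T (I + lam * L) X + gamma * I)^{-1} X^T S.
    """
    n, d = X.shape
    # Pairwise squared distances and Gaussian affinities (illustrative graph).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / (np.median(d2) + 1e-12))
    np.fill_diagonal(A, 0.0)
    # Keep only each row's k largest affinities, then symmetrize.
    drop = np.argsort(-A, axis=1)[:, k:]
    for i in range(n):
        A[i, drop[i]] = 0.0
    A = np.maximum(A, A.T)
    L = np.diag(A.sum(axis=1)) - A  # unnormalized graph Laplacian
    # Closed-form solve; gamma > 0 keeps the system positive definite.
    W = np.linalg.solve(X.T @ (np.eye(n) + lam * L) @ X + gamma * np.eye(d),
                        X.T @ S)
    return W
```

At test time, an unseen sample `x` would typically be classified by embedding it as `x @ W` and picking the nearest unseen-class semantic prototype; the domain adaptation step (MCME-DA) would further adjust this embedding before matching, which this sketch omits.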

Original language: English
Pages (from-to): 48-58
Number of pages: 11
Journal: Information Sciences
Volume: 378
State: Published - Feb 1, 2017

Keywords

  • Cross-modal embedding
  • Domain adaptation
  • Image classification
  • Manifold
  • Zero-shot learning
