Cross-media semantic representation via bi-directional learning to rank

  • Fei Wu
  • Xinyan Lu
  • Zhongfei Zhang
  • Shuicheng Yan
  • Yong Rui
  • Yueting Zhuang
  • Zhejiang University
  • National University of Singapore
  • Microsoft USA

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

75 Scopus citations

Abstract

In multimedia information retrieval, most classic approaches represent different media modalities in the same feature space. Existing approaches take either one-to-one paired data or uni-directional ranking examples (i.e., only text-query-image ranking examples or only image-query-text ranking examples) as training data, and so do not make full use of bi-directional ranking examples (where both text-query-image and image-query-text ranking examples are used during training) to achieve better performance. In this paper, we consider learning a cross-media representation model from the perspective of optimizing a listwise ranking problem while taking advantage of bi-directional ranking examples. We propose a general cross-media ranking algorithm that optimizes the bi-directional listwise ranking loss with a latent space embedding, which we call the Bi-directional Cross-Media Semantic Representation Model (Bi-CMSRM). The latent space embedding is learned discriminatively by structural large-margin learning, which directly optimizes a chosen ranking criterion (mean average precision in this paper). We evaluate Bi-CMSRM on the Wikipedia and NUS-WIDE datasets and show that using bi-directional ranking examples achieves much better performance than using only uni-directional ranking examples.
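
The ranking criterion named in the abstract, mean average precision over both retrieval directions in a shared latent space, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear projections `U` and `V`, the dot-product similarity, and all function names are assumptions made for illustration, and the code only evaluates the criterion rather than performing the structural large-margin training described in the paper.

```python
from typing import List, Sequence, Set

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def project(matrix: Matrix, x: Vector) -> List[float]:
    """Map a feature vector into the latent space: z = matrix @ x."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def dot(a: Vector, b: Vector) -> float:
    return sum(p * q for p, q in zip(a, b))


def rank(query_z: Vector, candidates_z: Sequence[Vector]) -> List[int]:
    """Candidate indices sorted by descending similarity (assumed: dot product)."""
    scores = [dot(query_z, c) for c in candidates_z]
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def average_precision(ranking: Sequence[int], relevant: Set[int]) -> float:
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits, total = 0, 0.0
    for position, idx in enumerate(ranking, start=1):
        if idx in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def bidirectional_map(U: Matrix, V: Matrix,
                      texts: Sequence[Vector], images: Sequence[Vector],
                      rel: Sequence[Sequence[bool]]) -> float:
    """MAP averaged over text-query-image and image-query-text rankings.

    rel[i][j] is True when text i and image j are relevant to each other.
    U and V are hypothetical linear maps into the shared latent space.
    """
    text_z = [project(U, t) for t in texts]
    image_z = [project(V, m) for m in images]
    aps = []
    # Direction 1: each text queries the image collection.
    for i, q in enumerate(text_z):
        relevant = {j for j in range(len(image_z)) if rel[i][j]}
        aps.append(average_precision(rank(q, image_z), relevant))
    # Direction 2: each image queries the text collection.
    for j, q in enumerate(image_z):
        relevant = {i for i in range(len(text_z)) if rel[i][j]}
        aps.append(average_precision(rank(q, text_z), relevant))
    return sum(aps) / len(aps)
```

A uni-directional variant would keep only one of the two loops; scoring both directions with the same pair of projections is what makes the criterion bi-directional in this sketch.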

Original language: English
Title of host publication: MM 2013 - Proceedings of the 2013 ACM Multimedia Conference
Pages: 877-886
Number of pages: 10
State: Published - 2013
Event: 21st ACM International Conference on Multimedia, MM 2013 - Barcelona, Spain
Duration: Oct 21 2013 to Oct 25 2013

Publication series

Name: MM 2013 - Proceedings of the 2013 ACM Multimedia Conference

Conference

Conference: 21st ACM International Conference on Multimedia, MM 2013
Country/Territory: Spain
City: Barcelona
Period: 10/21/13 to 10/25/13

Keywords

  • Bi-directional learning to rank
  • Cross-media representation
  • Latent space embedding
