RMIT University

Deep Top-k Ranking for Image-Sentence Matching

journal contribution
posted on 2024-11-02, 17:48 authored by Lingling Zhang, Minnan Luo, Jun Liu, Xiaojun Chang, Yi Yang, Alexander Hauptmann
Image-sentence matching is a challenging task because of the heterogeneity gap between different modalities. Ranking-based methods have achieved excellent performance on this task in past decades. Given an image query, these methods typically assume that the correctly matched image-sentence pair must rank before all mismatched ones. However, this assumption may be too strict and prone to overfitting, especially when some sentences in a massive database are similar to and easily confused with one another. In this paper, we relax the traditional ranking loss and propose a novel deep multi-modal network with a top-k ranking loss to mitigate the data ambiguity problem. With this strategy, query results are not penalized unless the ground truth falls outside the top-k query results. Because the initial top-k ranking loss is non-smooth and non-convex, we approximate it with a tight convex upper bound and then use the traditional back-propagation algorithm to optimize the deep multi-modal network. Finally, we apply the method to three benchmark datasets, namely Flickr8k, Flickr30k, and MSCOCO. Empirical results on the R@K metrics (K = 1, 5, 10) show that our method achieves performance comparable to that of state-of-the-art methods.
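
The relaxation described in the abstract can be illustrated with a small sketch. The code below is not the authors' formulation: the abstract does not give the exact convex bound, so the sketch substitutes a generic top-k hinge surrogate (in the style of top-k multiclass SVM losses) as one possible convex upper bound. The function names, the `margin` parameter, and the example scores are all assumptions made for illustration. It also shows how an R@K value could be computed from the ranks of the ground-truth items.

```python
from typing import Sequence


def topk_zero_one_loss(scores: Sequence[float], gt_index: int, k: int) -> int:
    """Return 0 if the ground truth is among the k highest scores, else 1."""
    gt_score = scores[gt_index]
    # Count candidates that strictly outrank the ground truth.
    outranking = sum(
        1 for j, s in enumerate(scores) if j != gt_index and s > gt_score
    )
    return 0 if outranking < k else 1


def topk_hinge_loss(
    scores: Sequence[float], gt_index: int, k: int, margin: float = 1.0
) -> float:
    """Generic top-k hinge surrogate (an assumption, not the paper's bound).

    Averages the k largest margin violations, where the ground truth itself
    contributes a violation of 0. With margin >= 1 this value is never below
    the top-k zero-one loss above.
    """
    gt_score = scores[gt_index]
    violations = [
        margin + s - gt_score for j, s in enumerate(scores) if j != gt_index
    ]
    violations.append(0.0)  # entry for the ground truth
    largest = sorted(violations, reverse=True)[:k]
    return max(0.0, sum(largest) / k)


def recall_at_k(gt_ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose ground truth has 1-indexed rank <= k."""
    return sum(1 for r in gt_ranks if r <= k) / len(gt_ranks)


# Hypothetical similarity scores of one query against four candidates;
# the candidate at index 1 is the ground truth and is ranked second.
scores = [0.9, 0.8, 0.3, 0.1]
print(topk_zero_one_loss(scores, gt_index=1, k=1))  # 1: penalized under top-1
print(topk_zero_one_loss(scores, gt_index=1, k=2))  # 0: tolerated under top-2
print(topk_hinge_loss(scores, gt_index=1, k=2))

# Hypothetical ground-truth ranks for four queries.
ranks = [1, 3, 7, 12]
print([recall_at_k(ranks, k) for k in (1, 5, 10)])
```

The zero-one variant makes the relaxation explicit: the same ranking is penalized for k = 1 but not for k = 2. The hinge variant is piecewise linear and convex in the scores, which is what makes a surrogate of this kind usable with gradient-based training.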

Funding

Towards data-efficient future action prediction in the wild

Australian Research Council

Related Materials

  1. DOI - Is published in 10.1109/TMM.2019.2931352
  2. ISSN - Is published in 15209210

Journal

IEEE Transactions on Multimedia

Volume

22

Number

8777191

Issue

3

Start page

775

End page

785

Total pages

11

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2019 IEEE

Former Identifier

2006109324

Esploro creation date

2021-08-28
