RMIT University

Memory-Based Network for Scene Graph with Unbalanced Relations

conference contribution
posted on 2024-11-03, 14:44 authored by Weitao Wang, Ruyang Liu, Meng Wang, Sen Wang, Xiaojun Chang, Yang Chen
A scene graph, which can be represented as a set of visual triples, is composed of objects and the relations between object pairs. It is vital for image captioning, visual question answering, and many other applications. However, scene graph datasets follow a long-tail distribution, and tail relations cannot be accurately identified because they have too few training samples. Nonstandard labels and feature overlap in scene graphs further hinder the extraction of discriminative features and exacerbate the effect of data imbalance on the model. For these reasons, we propose a novel scene graph generation model that effectively improves the detection of low-frequency relations. We use memory features to transfer the features of high-frequency relations to low-frequency relations. Extensive experiments on scene graph datasets show that our model significantly improves performance on two evaluation metrics, R@K and mR@K, compared with state-of-the-art baselines.
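The two metrics named in the abstract respond differently to a long-tail distribution, which a small sketch can illustrate. The code below is not the authors' evaluation code. It assumes that triples are plain (subject, predicate, object) string tuples matched exactly, whereas real benchmarks typically also match bounding boxes and aggregate over many images. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def recall_at_k(gt_triples, ranked_predictions, k):
    """R@K: fraction of ground-truth triples found in the top-k predictions."""
    top_k = set(ranked_predictions[:k])
    hits = sum(1 for triple in gt_triples if triple in top_k)
    return hits / len(gt_triples)


def mean_recall_at_k(gt_triples, ranked_predictions, k):
    """mR@K: recall computed separately per predicate, then averaged."""
    top_k = set(ranked_predictions[:k])
    hits = defaultdict(int)
    totals = defaultdict(int)
    for subj, pred, obj in gt_triples:
        totals[pred] += 1
        if (subj, pred, obj) in top_k:
            hits[pred] += 1
    return sum(hits[p] / totals[p] for p in totals) / len(totals)


# Toy ground truth: "on" is frequent, "riding" is a tail relation.
gt = [
    ("man", "on", "horse"),
    ("hat", "on", "man"),
    ("dog", "on", "grass"),
    ("man", "riding", "horse"),
]

# Hypothetical model output, ranked by confidence; the tail relation ranks last.
preds = [
    ("man", "on", "horse"),
    ("hat", "on", "man"),
    ("dog", "on", "grass"),
    ("man", "on", "grass"),
    ("man", "riding", "horse"),
]

r_at_4 = recall_at_k(gt, preds, 4)
mr_at_4 = mean_recall_at_k(gt, preds, 4)
print(r_at_4, mr_at_4)
```

In this toy case the model recovers every frequent "on" triple but misses the single "riding" triple within the top 4, so R@4 stays high while mR@4 drops, because each predicate counts equally in the average. This is why mR@K is the more sensitive measure of performance on low-frequency relations.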

History

Start page

2400

End page

2408

Total pages

9

Outlet

Proceedings of the 28th ACM International Conference on Multimedia (MM 2020)

Name of conference

MM 2020

Publisher

Association for Computing Machinery

Place published

United States

Start date

2020-10-12

End date

2020-10-16

Language

English

Copyright

© 2020 Association for Computing Machinery.

Former Identifier

2006109364

Esploro creation date

2021-08-29
