RMIT University

Reinforcement Cutting-Agent Learning for Video Object Segmentation

conference contribution
posted on 2024-11-03, 14:43 authored by Junwei Han, Le Yang, Dingwen Zhang, Xiaojun Chang, Xiaodan Liang
Video object segmentation (VOS) is a fundamental yet challenging task in the computer vision community. In this paper, we formulate the problem as a Markov Decision Process, in which agents learn to segment object regions under a deep reinforcement learning framework. Learning agents for segmentation is nontrivial because segmentation is a nearly continuous decision-making process: the number of agents involved (pixels or superpixels) and the number of action steps from the seed (super)pixels to the whole object mask can be extremely large. To overcome this difficulty, this paper simplifies the learning of segmentation agents to the learning of a cutting-agent, which has only a limited number of action units and can converge in just a few action steps. The basic assumption is that object segmentation mainly relies on the interaction between object regions and their context. Thus, given an optimal object (box) region and context (box) region, we can obtain the desired segmentation mask through further inference. Based on this assumption, we establish a novel reinforcement cutting-agent learning framework, where the cutting-agent consists of a cutting-policy network and a cutting-execution network. The former learns policies for deciding the optimal object-context box pair, while the latter executes the cutting function based on the inferred object-context box pair. Through the collaborative interaction between the two networks, our method achieves superior VOS performance on two public benchmarks, which demonstrates the soundness of our assumption as well as the effectiveness of the proposed learning framework.
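The Markov Decision Process view in the abstract can be illustrated with a toy box-adjustment loop. The sketch below is not the authors' method: the action set (move one box edge by a fixed step), the reward (gain in IoU against a known target box), the greedy search standing in for the learned cutting-policy network, and the fixed-margin `context_box` helper are all assumptions made for illustration. The cutting-execution network, which would turn the box pair into a mask, is left out entirely.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

# Hypothetical discrete action set: move a single edge in or out.
ACTIONS = [
    (-1, 0, 0, 0), (1, 0, 0, 0),   # left edge
    (0, -1, 0, 0), (0, 1, 0, 0),   # top edge
    (0, 0, -1, 0), (0, 0, 1, 0),   # right edge
    (0, 0, 0, -1), (0, 0, 0, 1),   # bottom edge
]

def apply_action(box: Box, delta, step: int, width: int, height: int) -> Box:
    """State transition: shift one edge by `step` pixels, clamped to the frame."""
    x0, y0, x1, y1 = (c + step * d for c, d in zip(box, delta))
    x0 = min(max(x0, 0), width - 1)
    y0 = min(max(y0, 0), height - 1)
    x1 = min(max(x1, x0 + 1), width)
    y1 = min(max(y1, y0 + 1), height)
    return (x0, y0, x1, y1)

def context_box(box: Box, margin: int, width: int, height: int) -> Box:
    """Assumed context region: the object box grown by a fixed margin."""
    return (max(0, box[0] - margin), max(0, box[1] - margin),
            min(width, box[2] + margin), min(height, box[3] + margin))

def greedy_episode(start: Box, target: Box, step: int = 2,
                   max_steps: int = 50, width: int = 100, height: int = 100):
    """Run one episode; the reward for an action is its IoU gain.

    A greedy one-step lookahead stands in for a learned policy. The episode
    ends (an implicit "stop" action) when no action has a positive reward.
    """
    box, steps = start, 0
    for _ in range(max_steps):
        current = iou(box, target)
        best_box, best_gain = box, 0.0
        for delta in ACTIONS:
            candidate = apply_action(box, delta, step, width, height)
            gain = iou(candidate, target) - current
            if gain > best_gain:
                best_box, best_gain = candidate, gain
        if best_box == box:
            break
        box, steps = best_box, steps + 1
    return box, steps

final_box, n_steps = greedy_episode(start=(10, 10, 50, 50), target=(20, 20, 60, 60))
pair = (final_box, context_box(final_box, margin=10, width=100, height=100))
```

In this toy setting the state is a single box, so each episode needs only a handful of discrete actions, which is the property the abstract attributes to a cutting-agent as opposed to per-pixel agents. A learned policy would replace the greedy lookahead, since the target box is unknown at test time.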

Related Materials

  1. DOI - Is published in 10.1109/CVPR.2018.00946
  2. ISBN - Is published in 9781538664216 (urn:isbn:9781538664216)

Start page

9080

End page

9089

Total pages

10

Outlet

Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018)

Name of conference

CVPR 2018

Publisher

IEEE

Place published

United States

Start date

2018-06-18

End date

2018-06-23

Language

English

Copyright

© 2018 IEEE.

Former Identifier

2006109426

Esploro creation date

2021-08-29
