RMIT University

Identification of moving objects in complex dynamic scenes using semantics

thesis
posted on 2024-11-25, 19:06 authored by Sundaram Muthu

Identification of independently moving objects in a complex dynamic scene and estimation of their individual motion parameters (i.e. motion segmentation) are important tasks in many computer vision applications, including autonomous driving, robotics, surveillance and tracking. However, solving the motion segmentation problem is challenging when there is no prior knowledge of the number of moving objects. Prior methods suffer from issues such as poor generalization (inability to segment objects not previously seen in training data), over-reliance on inaccurate motion information, and difficulty capturing fine details at object boundaries. Segmentation of small or slow-moving objects in a complex dynamic scene also remains a particular challenge. This work identifies the significant research gaps in the field and aims to bridge them by robustly segmenting and tracking multiple moving objects. The proposed research is specifically intended to enhance existing machine vision technology towards developing solutions for safe vehicle operation in constrained industrial environments.

To achieve these aims, this research focuses on using higher-order visual information (object detections) to improve the performance of motion segmentation. First, a novel statistical inference method that avoids the over-segmentation problem is presented. Then, a model that uses multi-cue neural attention to improve segmentation and tracking in videos is developed. The accuracy of object segmentation boundaries is refined using edge information. To make the model suitable for real-time applications, an efficient clustering algorithm is designed that uses message-passing graph neural networks to aggregate higher-order information. The algorithms were tested on challenging datasets. The results show that the method can detect multiple unseen objects with accurate object boundaries and can handle complex object motions, camouflaged object motions, occlusions, articulated non-rigid object motions and background clutter.

The thesis first outlines the aims and objectives of the research and the motivation behind it. The second chapter provides a literature review that builds the background and foundations for understanding the motion segmentation, video object segmentation and clustering tasks. The main research gaps are highlighted: how to handle small motions, multiple moving objects and objects from unseen test videos, and how to improve the efficiency of the solution. The third chapter proposes a method that uses semantic object detection and statistical inference to improve motion segmentation. The fourth chapter develops a video object segmentation method that combines multiple available cues (appearance, motion, image edge, flow edge and tracking information). Next, the thesis contributes a generic, efficient method suitable for real-world applications that uses message-passing graph neural networks to speed up the clustering process. Finally, the thesis presents a summary of contributions, conclusions and future research directions.

History

Degree Type

Doctorate by Research

Imprint Date

2021-01-01

School name

School of Engineering, RMIT University

Former Identifier

9922088233601341

Open access

  • Yes