RMIT University

Hybrid CNN-transformer network for interactive learning of challenging musculoskeletal images

journal contribution
posted on 2024-11-03, 10:51 authored by Lei Bi, Ulrich Buehner, Xiaohang Fu, Tom Williamson, Peter Choong, Jinman Kim
Background and objectives: Segmentation of regions of interest (ROIs) such as tumors and bones plays an essential role in the analysis of musculoskeletal (MSK) images. Segmentation results can help orthopaedic surgeons with surgical outcome assessment and simulation of a patient's gait cycle. Deep learning-based automatic segmentation methods, particularly those using fully convolutional networks (FCNs), are considered the state of the art. However, when the training data is insufficient to account for all the variations in ROIs, these methods struggle to segment challenging ROIs that have less common image characteristics, such as low contrast to the background, inhomogeneous textures, and fuzzy boundaries.

Methods: We propose a hybrid convolutional neural network and transformer network (HCTN) for semi-automatic segmentation to overcome the limitations of segmenting challenging MSK images. Specifically, we propose to fuse user inputs (manual, e.g., mouse clicks) with high-level semantic image features derived from the neural network (automatic), where the user inputs are used in interactive training for uncommon image characteristics. In addition, we propose to leverage the transformer network (TN), a deep learning model designed for handling sequence data, together with features derived from FCNs for segmentation. This addresses a limitation of FCNs: they operate only on small kernels, so they tend to focus on local patterns and miss global context.

Results: We purposely selected three MSK imaging datasets covering a variety of structures to evaluate the generalizability of the proposed method. Our semi-automatic HCTN method achieved a Dice similarity coefficient (DSC) of 88.46 ± 9.41 for segmenting soft-tissue sarcoma tumors from magnetic resonance (MR) images, 73.32 ± 11.97 for segmenting osteosarcoma tumors from MR images, and 93.93 ± 1.84 for segmenting clavicle bones from chest radiographs. When compared to the current state-of-the-art automatic segmentation method, our HCTN method is 11.7%, 19.11% and 7.36% higher in DSC on the three datasets, respectively.

Conclusion: Our experimental results demonstrate that HCTN achieved more generalizable results than the current methods, especially on challenging MSK studies.
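The results above are reported as DSC values. For readers unfamiliar with the metric, the following is a minimal sketch of how a Dice coefficient between a predicted and a reference binary mask is commonly computed. It is not the authors' evaluation code: the function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration. The function returns a value between 0 and 1; the scores in the abstract appear to be on a 0 to 100 scale, which is an inference from the reported numbers.

```python
def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks.

    Hypothetical helper for illustration only. Masks are 2D sequences
    (lists of rows) of 0/1 or bool values with identical shapes.
    """
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 foreground pixels each, 2 of them overlapping.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
print(f"DSC = {score:.4f} ({100 * score:.2f} on a 0-100 scale)")
```

In this toy case the overlap is 2 pixels out of 3 in each mask, so the score is 4/6. A per-dataset mean and standard deviation, like the figures in the abstract, would be obtained by applying such a function to every test image and aggregating the results.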

History

Related Materials

  1. DOI - Is published in 10.1016/j.cmpb.2023.107875
  2. ISSN - Is published in 01692607

Journal

Computer Methods and Programs in Biomedicine

Volume

243

Number

107875

Start page

1

End page

10

Total pages

10

Publisher

Elsevier

Place published

Ireland

Language

English

Copyright

© 2023 Elsevier B.V. All rights reserved.

Former Identifier

2006126703

Esploro creation date

2023-12-09
