RMIT University

CHEESE: Distributed Clustering-Based Hybrid Federated Split Learning Over Edge Networks

journal contribution
posted on 2024-11-03, 10:54 authored by Zhipeng Cheng, Xiaoyu Xia, Minghui Liwang, Xuwei Fan, Yanglong Sun, Xianbin Wang, Lianfen Huang
Implementing either federated learning (FL) or split learning (SL) over clients with limited computation and communication resources faces challenges in achieving delay-efficient model training. To overcome these challenges, we investigate a novel distributed Clustering-based Hybrid fEdErated Split lEarning (CHEESE) framework, which consolidates distributed resources among clients via device-to-device (D2D) communications and operates in an intra-serial, inter-parallel manner. In CHEESE, each learning client can form a cluster with its neighboring helping clients via D2D communications to train an FL model collaboratively. Within each cluster, the model is split into multiple segments via a model splitting and allocation (MSA) strategy, and each cluster member trains one segment. After intra-cluster training completes, a transmission client (TC) is selected from each cluster to upload the complete model to the base station for global model aggregation over its allocated bandwidth. Accordingly, an overall training delay cost minimization problem is formulated, comprising four subproblems: client clustering, MSA, TC selection, and bandwidth allocation. Due to its NP-hardness, the problem is decoupled and solved iteratively. The client clustering problem is first transformed into a distributed clustering game based on potential game theory, in which each cluster solves the remaining three subproblems to evaluate the utility of each clustering strategy. Specifically, a heuristic algorithm is proposed to solve the MSA problem under a given clustering strategy, and a greedy-based convex optimization approach is introduced to solve the joint TC selection and bandwidth allocation problem. Extensive experiments on practical models and datasets demonstrate that CHEESE significantly reduces training delay costs.
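
To make the model splitting and allocation (MSA) idea in the abstract concrete, the sketch below shows one possible way a cluster could partition a model's layers into contiguous segments roughly in proportion to each member's compute speed. This is a minimal illustration only, not the paper's actual MSA heuristic or delay model; all names and numbers (split_and_allocate, layer_costs, client_speeds) are hypothetical.

```python
# Illustrative sketch (assumed, not the paper's algorithm): assign contiguous
# layer segments of a model to cluster members so that each member's segment
# cost is roughly proportional to its compute speed, keeping the serial
# intra-cluster forward/backward pipeline balanced.

from typing import List, Tuple


def split_and_allocate(layer_costs: List[float],
                       client_speeds: List[float]) -> List[Tuple[int, int]]:
    """Return one (start, end) layer range per cluster member (end exclusive)."""
    total_cost = sum(layer_costs)
    total_speed = sum(client_speeds)
    segments = []
    start = 0
    acc = 0.0
    member = 0
    # Target cost for the current member, proportional to its speed.
    target = total_cost * client_speeds[member] / total_speed
    for i, cost in enumerate(layer_costs):
        acc += cost
        last_member = member == len(client_speeds) - 1
        last_layer = i == len(layer_costs) - 1
        if acc >= target and not last_member and not last_layer:
            segments.append((start, i + 1))  # close current segment
            start = i + 1
            member += 1
            acc = 0.0
            target = total_cost * client_speeds[member] / total_speed
    segments.append((start, len(layer_costs)))  # remaining layers go to the last member
    return segments


if __name__ == "__main__":
    # Example: a 6-layer model split across a cluster of 3 devices with
    # relative compute speeds 1.0, 2.0 and 1.0 (all values made up).
    layer_costs = [4.0, 3.0, 5.0, 2.0, 6.0, 1.0]
    client_speeds = [1.0, 2.0, 1.0]
    print(split_and_allocate(layer_costs, client_speeds))
    # -> [(0, 2), (2, 5), (5, 6)]
```

In this toy run the fastest device receives the largest contiguous segment; the paper's heuristic additionally accounts for D2D communication and the overall training delay cost, which this sketch omits.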

History

Journal

IEEE Transactions on Parallel and Distributed Systems

Volume

34

Issue

12

Start page

3174

End page

3191

Total pages

18

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2023 IEEE.

Former Identifier

2006126357

Esploro creation date

2023-11-12
