Federated Learning with Momentum Acceleration in Multi-tier Networks

Yang, Zhengjie

Access status:

USyd Access

Field	Value	Language
dc.contributor.author	Yang, Zhengjie
dc.date.accessioned	2023-12-20T04:55:47Z
dc.date.available	2023-12-20T04:55:47Z
dc.date.issued	2023	en
dc.identifier.uri	https://hdl.handle.net/2123/32023
dc.description	Includes publication
dc.description.abstract	Federated learning (FL) is a fast-developing technique that allows multiple workers to train a global model based on a distributed dataset. Conventional FL (FedAvg) employs the gradient descent algorithm, which may not be efficient enough. Momentum can improve the situation by adding an additional momentum step to accelerate the convergence. While existing momentum-based FL algorithms have demonstrated desirable performance, they still encounter challenges related to data heterogeneity, out-of-date momentum, infrequent utilization of momentum, and disagreement between worker and aggregator momenta. Additionally, the advantages of Nesterov Accelerated Gradient (NAG), a more advantageous form of momentum, have not been quantified in the context of FL. In this thesis, we investigate how to efficiently address the aforementioned issues. Firstly, we propose FedNAG, which focuses on NAG momentum acceleration on workers in each local iteration. FedNAG incorporates aggregation and redistribution of both worker models and momenta. Secondly, we introduce FastSlowMo, a novel algorithm that combines worker and aggregator momenta. By leveraging momentum acceleration on both workers and the aggregator, FastSlowMo improves the overall efficiency of the training process. Furthermore, considering the advantages of three-tier hierarchical architecture in reducing communication burdens within local networks, we propose HierMo, a momentum-based FL algorithm which accelerates three-tier FL systems and enhances their performance. Finally, we develop HierAdMo, which dynamically adjusts the momentum factor to mitigate the negative effects of disagreement between workers and edge nodes, ultimately improving the long-run performance. In summary, this thesis proposes four novel algorithms to address the aforementioned challenges. Theoretical analyses and experimental validations are conducted to support the effectiveness of these algorithms in real-world FL scenarios.	en
dc.language.iso	en	en
dc.subject	federated learning	en
dc.subject	edge computing	en
dc.subject	momentum	en
dc.subject	distributed machine learning	en
dc.title	Federated Learning with Momentum Acceleration in Multi-tier Networks	en
dc.type	Thesis
dc.type.thesis	Doctor of Philosophy	en
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en
usyd.faculty	SeS faculties schools::Faculty of Engineering::School of Computer Science	en
usyd.degree	Doctor of Philosophy Ph.D.	en
usyd.awardinginst	The University of Sydney	en
usyd.advisor	Bao, Wei
usyd.include.pub	Yes	en
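
Illustrative note (not part of the repository record): the abstract describes FedNAG as applying NAG momentum on workers in each local iteration, then aggregating and redistributing both worker models and momenta. The sketch below is one possible reading of that description, not the thesis's implementation. Everything specific in it is an assumption made for illustration: the scalar quadratic worker objectives F_i(w) = 0.5 (w - c_i)^2, the particular NAG formulation, the weighted-average aggregation rule, and the names `nag_step`, `fednag_sketch`, `targets`, and `weights`.

```python
def nag_step(w, v, grad, lr, gamma):
    """One NAG update in a common formulation (assumed, not taken from the thesis)."""
    g = grad(w)
    v_new = gamma * v - lr * g          # momentum update
    w_new = w + gamma * v_new - lr * g  # look-ahead model update
    return w_new, v_new


def fednag_sketch(targets, weights, rounds=50, local_steps=5, lr=0.1, gamma=0.5):
    """Toy FedNAG-style loop on hypothetical scalar objectives 0.5 * (w - c_i)**2.

    targets: per-worker minimizers c_i (hypothetical data).
    weights: aggregation weights summing to 1 (assumed, e.g. data-size shares).
    """
    w_global, v_global = 0.0, 0.0
    for _ in range(rounds):
        local_states = []
        for c in targets:
            # Redistribution: each worker starts from the global model AND momentum.
            w, v = w_global, v_global
            grad = lambda x, c=c: x - c  # gradient of 0.5 * (x - c)**2
            for _ in range(local_steps):
                w, v = nag_step(w, v, grad, lr, gamma)
            local_states.append((w, v))
        # Aggregation: weighted average of both the models and the momenta.
        w_global = sum(p * w for p, (w, _) in zip(weights, local_states))
        v_global = sum(p * v for p, (_, v) in zip(weights, local_states))
    return w_global, v_global


if __name__ == "__main__":
    w, v = fednag_sketch([1.0, 3.0, 5.0], [0.2, 0.3, 0.5])
    print(w, v)
```

On this toy problem the aggregated model should approach the weighted mean of the worker targets while the aggregated momentum decays toward zero. That behavior is a property of the quadratic example and says nothing about the thesis's experimental results.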