dc.contributor.author | Mills, J | |
dc.contributor.author | Hu, J | |
dc.contributor.author | Min, G | |
dc.date.accessioned | 2023-05-17T08:03:35Z | |
dc.date.issued | 2023-05-17 | |
dc.date.updated | 2023-05-16T14:33:12Z | |
dc.description.abstract | In Federated Learning (FL), client devices connected over the internet collaboratively train a machine learning model without sharing their private data with a central server or with other clients. The seminal Federated Averaging (FedAvg) algorithm trains a single global model by performing rounds of local training on clients followed by model averaging. FedAvg can improve the communication efficiency of training by performing more steps of Stochastic Gradient Descent (SGD) on clients in each round. However, client data in real-world FL is highly heterogeneous, which has been extensively shown to slow model convergence and harm final performance when K > 1 steps of SGD are performed on clients per round. In this work, we propose decaying K as training progresses, which can jointly improve the final performance of the FL model whilst reducing the wall-clock time and the total computational cost of training compared to using a fixed K. We analyse the convergence of FedAvg with decaying K for strongly-convex objectives, providing novel insights into the convergence properties, and derive three theoretically-motivated decay schedules for K. We then perform thorough experiments on four benchmark FL datasets (FEMNIST, CIFAR100, Sentiment140, Shakespeare) to show the real-world benefit of our approaches in terms of convergence time, computational cost, and generalisation performance. | en_GB |
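To make the idea in the abstract concrete, below is a minimal sketch of FedAvg with a decaying number of local SGD steps K. It is not the authors' implementation: the synthetic least-squares clients, the learning rate, and the halve-K-every-10-rounds schedule are illustrative assumptions standing in for the paper's theoretically-motivated schedules.

```python
# Minimal sketch (assumed, not the paper's code) of FedAvg where the number of
# local SGD steps K decays over communication rounds.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic heterogeneous client data: each client gets a shifted linear target (non-IID).
NUM_CLIENTS, DIM, SAMPLES = 10, 5, 50
clients = []
for c in range(NUM_CLIENTS):
    w_true = rng.normal(size=DIM) + 0.1 * c
    X = rng.normal(size=(SAMPLES, DIM))
    y = X @ w_true + 0.01 * rng.normal(size=SAMPLES)
    clients.append((X, y))

def local_sgd(w, X, y, steps, lr=0.05, batch=10):
    """Run `steps` of mini-batch SGD on one client's least-squares objective."""
    w = w.copy()
    for _ in range(steps):
        idx = rng.choice(len(y), size=batch, replace=False)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad
    return w

ROUNDS, K0 = 30, 16
w_global = np.zeros(DIM)
for t in range(ROUNDS):
    # Hypothetical decay schedule: halve K every 10 rounds, never below 1 step.
    K = max(1, K0 // (2 ** (t // 10)))
    # Each client starts from the current global model and runs K local steps.
    local_models = [local_sgd(w_global, X, y, K) for X, y in clients]
    # FedAvg aggregation: the server averages the returned client models.
    w_global = np.mean(local_models, axis=0)
    loss = np.mean([np.mean((X @ w_global - y) ** 2) for X, y in clients])
    print(f"round {t:2d}  K={K:2d}  avg client loss {loss:.4f}")
```

Under this sketch, early rounds use many local steps to cut communication, while later rounds use fewer steps so heterogeneous client updates drift less from the global optimum, which is the trade-off the abstract describes.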
dc.description.sponsorship | Engineering and Physical Sciences Research Council (EPSRC) | en_GB |
dc.description.sponsorship | UK Research and Innovation | en_GB |
dc.description.sponsorship | European Union Horizon 2020 | en_GB |
dc.identifier.citation | Published online 17 May 2023 | en_GB |
dc.identifier.doi | 10.1109/TPDS.2023.3277367 | |
dc.identifier.grantnumber | EP/X019160/1 | en_GB |
dc.identifier.grantnumber | EP/X038866/1 | en_GB |
dc.identifier.grantnumber | 101086159 | en_GB |
dc.identifier.uri | http://hdl.handle.net/10871/133155 | |
dc.identifier | ORCID: 0000-0001-5406-8420 (Hu, Jia) | |
dc.language.iso | en | en_GB |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_GB |
dc.rights | © 2023 IEEE | |
dc.subject | Federated Learning | en_GB |
dc.subject | Deep Learning | en_GB |
dc.subject | Edge Computing | en_GB |
dc.subject | Computational Efficiency | en_GB |
dc.title | Faster Federated Learning with decaying number of local SGD steps | en_GB |
dc.type | Article | en_GB |
dc.date.available | 2023-05-17T08:03:35Z | |
dc.identifier.issn | 1045-9219 | |
dc.description | This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record | en_GB |
dc.identifier.eissn | 1558-2183 | |
dc.identifier.journal | IEEE Transactions on Parallel and Distributed Systems | en_GB |
dc.rights.uri | http://www.rioxx.net/licenses/all-rights-reserved | en_GB |
dcterms.dateAccepted | 2023-05-09 | |
dcterms.dateSubmitted | 2021-10-27 | |
rioxxterms.version | AM | en_GB |
rioxxterms.licenseref.startdate | 2023-05-09 | |
rioxxterms.type | Journal Article/Review | en_GB |
refterms.dateFCD | 2023-05-16T14:33:14Z | |
refterms.versionFCD | AM | |
refterms.dateFOA | 2023-05-26T13:56:54Z | |
refterms.panel | B | en_GB |