Simple item record

dc.contributor.author: Mills, J
dc.contributor.author: Hu, J
dc.contributor.author: Min, G
dc.date.accessioned: 2023-05-17T08:03:35Z
dc.date.issued: 2023-05-17
dc.date.updated: 2023-05-16T14:33:12Z
dc.description.abstract: In Federated Learning (FL), client devices connected over the internet collaboratively train a machine learning model without sharing their private data with a central server or with other clients. The seminal Federated Averaging (FedAvg) algorithm trains a single global model by performing rounds of local training on clients followed by model averaging. FedAvg can improve the communication efficiency of training by performing more steps of Stochastic Gradient Descent (SGD) on clients in each round. However, client data in real-world FL is highly heterogeneous, which has been extensively shown to slow model convergence and harm final performance when K > 1 steps of SGD are performed on clients per round. In this work we propose decaying K as training progresses, which can jointly improve the final performance of the FL model whilst reducing the wall-clock time and the total computational cost of training compared to using a fixed K. We analyse the convergence of FedAvg with decaying K for strongly-convex objectives, providing novel insights into the convergence properties, and derive three theoretically-motivated decay schedules for K. We then perform thorough experiments on four benchmark FL datasets (FEMNIST, CIFAR100, Sentiment140, Shakespeare) to show the real-world benefit of our approaches in terms of real-world convergence time, computational cost, and generalisation performance. [en_GB] (See the illustrative decaying-K FedAvg sketch after this record.)
dc.description.sponsorship: Engineering and Physical Sciences Research Council (EPSRC) [en_GB]
dc.description.sponsorship: UK Research and Innovation [en_GB]
dc.description.sponsorship: European Union Horizon 2020 [en_GB]
dc.identifier.citation: Published online 17 May 2023 [en_GB]
dc.identifier.doi: 10.1109/TPDS.2023.3277367
dc.identifier.grantnumber: EP/X019160/1 [en_GB]
dc.identifier.grantnumber: EP/X038866/1 [en_GB]
dc.identifier.grantnumber: 101086159 [en_GB]
dc.identifier.uri: http://hdl.handle.net/10871/133155
dc.identifier: ORCID: 0000-0001-5406-8420 (Hu, Jia)
dc.language.iso: en [en_GB]
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_GB]
dc.rights: © 2023 IEEE
dc.subject: Federated Learning [en_GB]
dc.subject: Deep Learning [en_GB]
dc.subject: Edge Computing [en_GB]
dc.subject: Computational Efficiency [en_GB]
dc.title: Faster Federated Learning with decaying number of local SGD steps [en_GB]
dc.type: Article [en_GB]
dc.date.available: 2023-05-17T08:03:35Z
dc.identifier.issn: 1045-9219
dc.description: This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. [en_GB]
dc.identifier.eissn: 1558-2183
dc.identifier.journal: IEEE Transactions on Parallel and Distributed Systems [en_GB]
dc.rights.uri: http://www.rioxx.net/licenses/all-rights-reserved [en_GB]
dcterms.dateAccepted: 2023-05-09
dcterms.dateSubmitted: 2021-10-27
rioxxterms.version: AM [en_GB]
rioxxterms.licenseref.startdate: 2023-05-09
rioxxterms.type: Journal Article/Review [en_GB]
refterms.dateFCD: 2023-05-16T14:33:14Z
refterms.versionFCD: AM
refterms.dateFOA: 2023-05-26T13:56:54Z
refterms.panel: B [en_GB]
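
The abstract above outlines how FedAvg operates (rounds of K local SGD steps on sampled clients, followed by server-side model averaging) and the paper's proposal to decay K as training progresses. The Python sketch below is a rough illustration of that idea only, under stated assumptions: the synthetic least-squares clients, the client sampling, the learning rate, and the linear decay schedule are choices made for this example, not the authors' implementation or the three theoretically-motivated schedules derived in the paper.

```python
import numpy as np

# Illustrative-only sketch of FedAvg with a decaying number of local SGD
# steps K. The quadratic client objectives, the linear decay schedule and
# all hyperparameters below are assumptions for this example; they are not
# the paper's implementation or its derived decay schedules.

rng = np.random.default_rng(0)

# Synthetic heterogeneous clients: client c holds a least-squares problem
# f_c(w) = 0.5/n * ||A_c w - b_c||^2 with its own optimum (data heterogeneity).
NUM_CLIENTS, DIM, SAMPLES = 20, 10, 30
clients = []
for _ in range(NUM_CLIENTS):
    A = rng.normal(size=(SAMPLES, DIM))
    w_c = rng.normal(size=DIM)                      # client-specific optimum
    b = A @ w_c + 0.1 * rng.normal(size=SAMPLES)
    clients.append((A, b))

def local_sgd(w, A, b, k, lr=0.01, batch=5):
    """Run k steps of mini-batch SGD on one client's local objective."""
    w = w.copy()
    for _ in range(k):
        idx = rng.choice(len(b), size=batch, replace=False)
        grad = A[idx].T @ (A[idx] @ w - b[idx]) / batch
        w -= lr * grad
    return w

def decayed_k(t, k0=20, total_rounds=100):
    """One possible schedule: decay K linearly from k0 down to 1 over training."""
    frac = t / max(total_rounds - 1, 1)
    return max(1, round(k0 * (1.0 - frac)))

# FedAvg loop: sample clients, run K_t local steps on each, average the models.
TOTAL_ROUNDS, CLIENTS_PER_ROUND = 100, 5
w_global = np.zeros(DIM)
for t in range(TOTAL_ROUNDS):
    k_t = decayed_k(t, k0=20, total_rounds=TOTAL_ROUNDS)
    sampled = rng.choice(NUM_CLIENTS, size=CLIENTS_PER_ROUND, replace=False)
    local_models = [local_sgd(w_global, *clients[c], k=k_t) for c in sampled]
    w_global = np.mean(local_models, axis=0)        # server-side averaging

avg_loss = np.mean([0.5 * np.mean((A @ w_global - b) ** 2) for A, b in clients])
print(f"rounds={TOTAL_ROUNDS}, final K={decayed_k(TOTAL_ROUNDS - 1)}, avg client loss={avg_loss:.4f}")
```

In this toy setting, early rounds with a large K make fast initial progress per communication round, while later rounds with K approaching 1 behave like plain mini-batch distributed SGD; the paper's analysis and its three derived schedules formalise this trade-off for strongly-convex objectives.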

