Federated Over-the-Air Subspace Learning from Incomplete Data

Federated learning refers to a distributed learning scenario in which users/nodes keep their data private and share only intermediate, locally computed iterates with the master node. The master, in turn, shares a global aggregate of these iterates with all the nodes at each iteration. In this work, we consider a wireless federated learning scenario in which the nodes communicate with the master node over a wireless channel. Current and upcoming technologies such as 5G (and beyond) will operate mostly in a non-orthogonal multiple access (NOMA) mode, in which transmissions from the users occupy the same bandwidth and interfere at the access point. These technologies naturally lend themselves to "over-the-air" superposition, whereby the signals received from the user nodes are directly summed at the master node. However, over-the-air aggregation also means that channel noise corrupts the algorithm iterates at the time of aggregation at the master. This iteration noise introduces a novel set of challenges not previously studied in the literature, and it must be treated differently from the well-studied setting of noise or corruption in the dataset itself. In this work, we first study the subspace learning problem in a federated over-the-air setting. Subspace learning involves computing the subspace spanned by the top $r$ singular vectors of a given matrix. We develop a federated over-the-air version of the power method (FedPM) and show that its iterates converge as long as (i) the channel noise is very small compared to the $r$-th singular value of the matrix, and (ii) the ratio between its $(r+1)$-th and $r$-th singular values is bounded by a constant strictly less than one. The second important contribution of this work is to show how over-the-air FedPM can be used to obtain a provably accurate federated solution for subspace tracking in the presence of missing data.
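To make the over-the-air aggregation model concrete, the following is a minimal simulation sketch, not the paper's actual FedPM algorithm or analysis: each node holds a column block $A_k$ of the data matrix, locally computes $A_k (A_k^\top U)$, and the master receives the superposed sum plus a single additive channel-noise term before re-orthonormalizing via QR. All sizes, the noise level `sigma_c`, and the iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: K nodes each hold a column block A_k of a global
# rank-r data matrix; the master wants the span of the top-r left
# singular vectors (here, the span of U_star).
n, K, cols_per_node, r = 50, 4, 30, 3
U_star = np.linalg.qr(rng.standard_normal((n, r)))[0]   # ground-truth subspace
blocks = [U_star @ rng.standard_normal((r, cols_per_node)) for _ in range(K)]

sigma_c = 1e-4   # per-entry channel-noise std, assumed small vs the r-th singular value
U = np.linalg.qr(rng.standard_normal((n, r)))[0]        # random orthonormal init

for _ in range(30):
    # Each node computes A_k (A_k^T U) locally and transmits it in analog;
    # over-the-air superposition sums the transmissions, and the channel
    # adds noise once, at the point of aggregation at the master.
    Y = sum(A_k @ (A_k.T @ U) for A_k in blocks)
    Y += sigma_c * rng.standard_normal(Y.shape)         # aggregation (iteration) noise
    U, _ = np.linalg.qr(Y)                              # master re-orthonormalizes

# Subspace error: distance between span(U) and span(U_star)
err = np.linalg.norm(U_star - U @ (U.T @ U_star), 2)
```

Note that, unlike noise added to the dataset itself, the noise here re-enters at every iteration, which is why the convergence conditions in the abstract bound the channel-noise level relative to the $r$-th singular value.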
