Digital Twin for Federated Analytics Using a Bayesian Approach

We are now in an information era and the volume of data is growing explosively. However, due to privacy issues, data often cannot be freely shared among the data-generating devices. Federated analytics was recently proposed to derive analytical insights across data-generating devices by exposing only intermediate analytics results, never the raw data. Because the computing resources at the data-generating devices are limited, on-device execution of computing-intensive tasks is challenging. We therefore propose to apply the digital twin technique, which emulates the resource-limited physical/end side while utilizing the rich resources at the virtual/computing side. Nevertheless, how to use the digital twin technique to assist federated analytics while preserving distributed data privacy remains challenging. To address this challenge, this work first formulates a problem of digital twin-assisted federated distribution discovery. Then, we propose a federated Markov chain Monte Carlo with delayed rejection (FMCMC-DR) method to estimate the representative parameters of the global distribution. We combine a rejection–acceptance sampling technique with a delayed rejection technique, allowing our method to explore the full state space. Finally, we evaluate FMCMC-DR against the Metropolis–Hastings (MH) algorithm and the random walk Markov chain Monte Carlo (RW-MCMC) method using numerical experiments. The results show that our algorithm outperforms the other two methods by 50% and 95% contour accuracy, respectively, and has a better convergence rate.
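For readers unfamiliar with delayed rejection, the following is a minimal single-chain sketch of a generic two-stage delayed-rejection Metropolis sampler. It is not the FMCMC-DR method itself and contains no federated or digital twin component. The target density (a standard normal), the function names, and the step sizes `s1` and `s2` are all assumptions chosen for illustration. The idea shown is that when a bold first-stage proposal is rejected, a more cautious second-stage proposal is tried before the chain gives up and stays in place.

```python
import numpy as np

def log_target(x):
    # Hypothetical target for illustration: standard normal, up to a constant.
    return -0.5 * x * x

def log_q1(a, b, s1):
    # Log density (up to a constant) of the first-stage Gaussian proposal a -> b.
    return -0.5 * ((b - a) / s1) ** 2

def log_alpha1(a, b):
    # Log of the first-stage Metropolis acceptance probability (symmetric proposal).
    return min(0.0, log_target(b) - log_target(a))

def dr_metropolis(n_samples, x0=0.0, s1=5.0, s2=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = x0
    out = np.empty(n_samples)
    for i in range(n_samples):
        # Stage 1: bold proposal with a large step size.
        y1 = x + s1 * rng.normal()
        la_fwd = log_alpha1(x, y1)
        if np.log(rng.uniform()) < la_fwd:
            x = y1
        else:
            # Stage 2: cautious proposal centred on the current state.
            y2 = x + s2 * rng.normal()
            la_rev = log_alpha1(y2, y1)
            # If the reverse first-stage move would always be accepted,
            # the second-stage acceptance probability is zero.
            if la_rev < 0.0:
                # log(1 - exp(la)) is computed as log(-expm1(la)) for stability.
                log_num = (log_target(y2) + log_q1(y2, y1, s1)
                           + np.log(-np.expm1(la_rev)))
                log_den = (log_target(x) + log_q1(x, y1, s1)
                           + np.log(-np.expm1(la_fwd)))
                if np.log(rng.uniform()) < log_num - log_den:
                    x = y2
        out[i] = x
    return out

samples = dr_metropolis(20000)
```

The second-stage acceptance ratio includes the first-stage proposal densities and rejection probabilities so that the chain still satisfies detailed balance with respect to the target. The symmetric second-stage proposal cancels out of the ratio.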