Multiagent Systems: Learning, Strategic Behavior, Cooperation, and Network Formation

Abstract: Many applications, ranging from crowdsourcing to recommender systems, involve informationally decentralized agents that repeatedly interact with each other to reach their goals. These networked agents base their decisions on incomplete information, which they gather through interactions with their neighbors or through cooperation, which is often costly. This chapter discusses decentralized learning algorithms that enable the agents to achieve their goals through repeated interaction. First, we discuss cooperative online learning algorithms that help the agents discover beneficial connections with each other and exploit these connections to maximize their reward. For this case, we explain the relation between learning speed, network topology, and cooperation cost. Then, we focus on how informationally decentralized agents form cooperation networks through learning. We explain how learning features prominently in many real-world interactions and greatly affects the evolution of social networks: links that otherwise would not have formed may appear, and a much greater variety of network configurations can be reached. We show that the impact of learning on efficiency and social welfare can be either positive or negative. We also demonstrate the use of these methods in popularity prediction, recommender systems, expert selection, and multimedia content aggregation.
