Reinforcement learning in a Multi-agent Framework for Pedestrian Simulation
