Reinforcement learning in a Multi-agent Framework for Pedestrian Simulation
