Emergent behavior and neural dynamics in artificial agents tracking turbulent plumes

Tracking a turbulent plume to locate its source is a complex control problem because it requires multi-sensory integration and must be robust to intermittent odors, changing wind direction, and variable plume statistics. Flying insects routinely perform this task, often over long distances, in pursuit of food or mates. Several aspects of this remarkable behavior have been characterized in detail by experimental studies. Here, we take a complementary in silico approach, using artificial agents trained with reinforcement learning to develop an integrated understanding of the behaviors and neural computations that support plume tracking. Specifically, we use deep reinforcement learning (DRL) to train recurrent neural network (RNN) agents to locate the source of simulated turbulent plumes. Interestingly, the agents’ emergent behaviors resemble those of flying insects, and the RNNs learn to represent task-relevant variables, such as head direction and time since last odor encounter. Our analyses suggest an intriguing, experimentally testable hypothesis for tracking plumes in changing wind direction: that agents follow local plume shape rather than the current wind direction. While reflexive short-memory behaviors are sufficient for tracking plumes in constant wind, longer timescales of memory are essential for tracking plumes that switch direction. At the level of neural dynamics, the RNNs’ population activity is low-dimensional and organized into distinct dynamical structures, with some correspondence to behavioral modules. Our in silico approach provides key intuitions for turbulent plume tracking strategies and motivates future targeted experimental and theoretical developments.
