The Active Sensing Testbed

The Active Sensing Testbed (AST) is a novel framework for research in machine perception and world-view reasoning. The AST supports exploratory development of perception systems that build internal models of the world by combining multi-view and multi-modal analytics, use these models to form hypotheses about a scene, and potentially take action to fill gaps in knowledge or make predictions about future world states. As a modular software framework, the AST is intended to lower the barrier to entry for researchers and developers applying state-of-the-art computer vision techniques to real-world problems.

Background and Project Goals

The Active Sensing Testbed (AST) provides a development environment to facilitate novel research in machine perception, particularly research involving active perception, in which a system can take actions such as changing sensor positions or adjusting parameters to investigate hypotheses, fill gaps in knowledge, or make predictions about future states. The AST is intended to help advance machine perception research beyond narrowly trained, single-purpose computer vision algorithms that may provide state-of-the-art pattern recognition but can be fragile when handling the complexities of real-world scenes. The AST provides a modular architecture for synthesizing multiple views, multiple sensing modalities, and complementary analytics to support robust inference and prediction under real-world conditions.

Framework Architecture

The architecture of the AST centers on a server that receives data feeds from multiple sensors, computes selected analytics and transformations on input data, and then sends analytics and metadata to subscribers for visualization. An example of this architecture is shown in Figure 1.

Figure 1: Active Sensing Testbed server architecture.

We have created a research testbed around the AST at the Intelligent Systems Center of the Johns Hopkins Applied Physics Laboratory (JHU/APL); for more information on the Active Sensing Testbed, visit https://www.jhuapl.edu/isc. Our testbed includes four ceiling-mounted pan-tilt-zoom cameras to facilitate data collection as well as algorithm design and evaluation. Our AST implementation also includes an operator interface to enable human-machine interaction. Through the operator interface and the associated application programming interface (API), a system operator or other remote user can view data feeds, overlay computed analytics and metadata, and issue commands to the system. We envision that our AST software framework can support similar setups in additional locations for research or other applications.

To allow other projects to easily interface with the AST server functions, we have implemented both a REST API and a Python library. Clients can subscribe to server workflows consisting of user-defined graphs that connect input data sources to sequences of operations. For example, a user may specify a networked camera as a data source, apply a series of transformations to the video from that camera, and then perform analytics on the transformed images. Using these tools allows researchers to build higher-level analytics and reasoning on top of baseline analytics, or to quickly investigate how different transformations affect a scene. A sketch of such a client-side workflow definition is shown below.
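To make the workflow concept concrete, the sketch below shows how a client might register a workflow graph with the server over a REST-style interface and subscribe to its output. The endpoint paths, JSON fields, and node names (camera_1, grayscale, object_detection) are assumptions for illustration and are not the published AST API; the sketch only mirrors the source-to-transformation-to-analytic graph structure described above.

```python
import requests

# Hypothetical AST server address; deployment details are not specified
# in the paper, so this value is purely illustrative.
AST_SERVER = "http://localhost:8080"

# A workflow is a user-defined graph connecting an input data source
# (here, a networked camera) to a sequence of operations: a series of
# image transformations followed by an analytic. All node types and
# parameter names below are assumed for this sketch.
workflow = {
    "name": "detect_objects_cam1",
    "nodes": [
        {"id": "source",    "type": "camera",    "params": {"camera_id": "camera_1"}},
        {"id": "grayscale", "type": "transform", "params": {"op": "grayscale"}},
        {"id": "resize",    "type": "transform", "params": {"op": "resize", "width": 640, "height": 480}},
        {"id": "detector",  "type": "analytic",  "params": {"model": "object_detection"}},
    ],
    # Edges define the order in which data flows through the graph.
    "edges": [
        ["source", "grayscale"],
        ["grayscale", "resize"],
        ["resize", "detector"],
    ],
}

# Register the workflow with the server (hypothetical endpoint).
resp = requests.post(f"{AST_SERVER}/workflows", json=workflow)
resp.raise_for_status()
workflow_id = resp.json()["id"]

# Subscribe to the workflow's stream of analytics and metadata
# (also a hypothetical endpoint) and print each result as it arrives.
with requests.get(f"{AST_SERVER}/workflows/{workflow_id}/results", stream=True) as stream:
    for line in stream.iter_lines():
        if line:
            print(line.decode("utf-8"))
```

A client built on the AST Python library would presumably wrap these calls in higher-level objects, but the underlying idea is the same: a graph of operations is defined once, run on the server, and its analytics are streamed to any number of subscribers.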
