Hierarchical Manifold Sensing with Foveation and Adaptive Partitioning of the Dataset

The authors present a novel method, Hierarchical Manifold Sensing, for adaptive and efficient visual sensing. In contrast to the previously introduced Manifold Sensing algorithm, the new version learns a hierarchical partitioning of the dataset based on k-means clustering. The algorithm can operate on whole images but also on a foveated dataset, where only salient regions are sensed. The authors evaluate the proposed algorithms on the COIL, ALOI, and MNIST datasets. Although they use a very simple nearest-neighbor classifier, on the easier benchmarks, COIL and ALOI, perfect recognition is possible with only six or ten sensing values. Moreover, they show that their sensing scheme yields better recognition performance than compressive sensing with random projections. On MNIST, state-of-the-art performance cannot be reached, but they show that a large number of test images can be recognized with only very few sensing values. However, for many applications, performance on challenging benchmarks may be less relevant than the simplicity of the solution (processing power, bandwidth) when solving a less challenging problem.

© 2016 Society for Imaging Science and Technology. [DOI: 10.2352/J.ImagingSci.Technol.2016.60.2.020402]

INTRODUCTION

We present a novel method called Hierarchical Manifold Sensing (HMS). The objective is to develop sensing algorithms that increase the efficiency of visual sensing by adopting adaptive sensing strategies. The algorithms can also be used for resampling and compression before transmitting a densely sampled signal over a low-bandwidth channel. In other words, we address the question of how to efficiently sample the visual world under the constraint of a limited bandwidth. For example, the bandwidth is limited in human vision by the capacity of the optic nerve and in technical systems by the performance and cost of hardware.
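
As an illustration only, the comparison mentioned in the abstract (nearest-neighbor recognition from a handful of sensing values, obtained either with random projections or with weights adapted to the dataset) can be sketched as follows. Everything specific here is an assumption: the synthetic Gaussian data stand in for an image dataset, plain (non-hierarchical) PCA stands in for the learned sensing weights, and the names `Phi_random`, `Phi_learned`, and `nearest_neighbor_accuracy` are hypothetical. The sketch does not reproduce the reported experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an image dataset (assumption): 3 classes in 64 dimensions.
n_classes, n_per_class, dim, m = 3, 40, 64, 6
centers = rng.normal(scale=5.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
labels = np.repeat(np.arange(n_classes), n_per_class)

# Hold out every fourth sample to play the role of unknown scenes.
test_mask = np.arange(len(X)) % 4 == 0
X_train, y_train = X[~test_mask], labels[~test_mask]
X_test, y_test = X[test_mask], labels[test_mask]
mean = X_train.mean(axis=0)


def nearest_neighbor_accuracy(Phi):
    """Sense with the rows of Phi, then classify by the nearest training point."""
    # Each sensing value is a weighted sum (inner product) of the signal entries.
    S_train = (X_train - mean) @ Phi.T
    S_test = (X_test - mean) @ Phi.T
    d2 = ((S_test[:, None, :] - S_train[None, :, :]) ** 2).sum(axis=2)
    predicted = y_train[d2.argmin(axis=1)]
    return float((predicted == y_test).mean())


# Data-independent sensing matrix, in the spirit of compressive sensing.
Phi_random = rng.normal(size=(m, dim)) / np.sqrt(m)

# Data-adapted sensing matrix: the first m principal directions of the training set.
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
Phi_learned = Vt[:m]

acc_random = nearest_neighbor_accuracy(Phi_random)
acc_learned = nearest_neighbor_accuracy(Phi_learned)
print(f"random projections: {acc_random:.2f}, learned projections: {acc_learned:.2f}")
```

On such well-separated synthetic classes both variants are likely to do well; the sketch is meant to show the mechanics of sensing with m weighted sums, not to support any claim about relative performance.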
In contrast to classical sensing and compression schemes, HMS is based on unsupervised learning, which involves a hierarchical partitioning of the dataset. Hierarchical Manifold Sensing is inspired by Compressive Sensing (CS).1 Compressive Sensing is based on the fact that natural images can be encoded sparsely,2 and thus the number of samples used for representing an image accurately can be reduced by sensing with a random matrix.3 As opposed to classical sampling, with CS each acquired sensing value is a weighted sum of the original unknown signal. Hierarchical Manifold Sensing works in a similar way, i.e., CS and HMS both make use of a sensing matrix. As opposed to CS, where the sensing matrix does not depend on the sensed data, HMS introduces a two-fold adaptivity: (i) the sensing algorithm adapts to a particular dataset, and (ii) every new sensing value depends on the already acquired sensing values. Thus, sensing in HMS is performed adaptively with optimized weights, and not randomly as in CS.

Received June 30, 2015; accepted for publication Nov. 8, 2015; published online Dec. 10, 2015. Associate Editor: Chunghui Kuo. 1062-3701/2016/60(2)/020402/10/$25.00

Schütze et al.4 presented an alternative adaptive hierarchical sensing (AHS) scheme for efficiently obtaining the sparse coefficients of an image. The sensing process is performed by partially traversing a binary tree and making a measurement at each visited node. The method is adaptive in the sense that after each sensing action, depending on how much gain the sensing operation brings, it is decided whether the entire subtree of the current node is traversed further or whether it is omitted. Adaptive hierarchical sensing was applied to patches of natural images, and it was shown that the performance of the method can be improved by choosing an appropriate sparse coding basis and by properly arranging the AHS tree.
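
The tree traversal described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of Ref. 4: we assume that each node measures the inner product of the signal with the sum of the basis vectors in its subtree, and that the gain is judged by comparing the magnitude of that measurement with a fixed threshold. The function name `adaptive_hierarchical_sensing` and the test signal are hypothetical.

```python
import numpy as np


def adaptive_hierarchical_sensing(x, basis, threshold):
    """Sense x by partially traversing a binary tree over the basis vectors.

    Each node covers a contiguous range of basis vectors (columns of `basis`).
    Assumptions: the basis is orthonormal, the number of basis vectors is a
    power of two, and a subtree is omitted when |measurement| <= threshold.
    """
    n = basis.shape[1]
    coefficients = np.zeros(n)
    n_measurements = 0
    stack = [(0, n // 2), (n // 2, n)]  # the two children of the root
    while stack:
        lo, hi = stack.pop()
        # One sensing action: a weighted sum of the signal entries.
        y = basis[:, lo:hi].sum(axis=1) @ x
        n_measurements += 1
        if abs(y) <= threshold:
            continue  # too little gain: omit the whole subtree
        if hi - lo == 1:
            coefficients[lo] = y  # leaf: the measurement is one coefficient
        else:
            mid = (lo + hi) // 2
            stack.extend([(lo, mid), (mid, hi)])
    return coefficients, n_measurements


rng = np.random.default_rng(0)
n = 16
basis, _ = np.linalg.qr(rng.normal(size=(n, n)))  # random orthonormal basis
true_coefficients = np.zeros(n)
true_coefficients[[2, 11]] = [3.0, -2.0]  # a sparse signal with two active atoms
x = basis @ true_coefficients

estimate, n_measurements = adaptive_hierarchical_sensing(x, basis, threshold=0.5)
print(n_measurements, "measurements for", n, "coefficients")
```

Note that with this summed-subtree measurement, coefficients of opposite sign within one subtree can cancel and cause the subtree to be skipped, which is one reason the arrangement of the tree and the choice of threshold matter.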
The results of the method strongly depend on the decision step, where a threshold is compared with the measurement values corresponding to the binary tree.

Baraniuk presented a theoretical analysis of CS for manifolds in 2009 (Ref. 5) and showed that, similarly to the theory of CS, only a small number of random linear projections is sufficient to preserve the key information about a signal modeled by a manifold. Later, Chen et al.6 proposed a statistical framework for CS on manifolds. Their article presents a nonparametric hierarchical Bayesian algorithm that learns a mixture of factor analyzers for manifolds based on the training data. Afterwards, the signal is reconstructed using a limited number of random projections. The method is validated on synthetic and on real datasets, but it is evaluated only for a subset of the MNIST database.

The Manifold Sensing concept was introduced earlier by Burciu et al. as visual Manifold Sensing (MS),7 and it was later extended to the foveated version of Manifold Sensing (FMS),8 which was inspired by the sampling strategy of biological systems. Like Manifold Sensing, HMS is based on a geometric approach. Both MS and FMS are based on learning manifolds of increasing but low dimensionality by using a nonlinear algorithm, namely Locally Linear Embedding (LLE).9 While sensing, the dataset is continuously adapted and the corresponding embedding is learned. As a further and optional feature, HMS can involve foveation as in the FMS approach.

J. Imaging Sci. Technol. 020402-1 Mar.-Apr. 2016
Burciu, Martinetz, and Barth: Hierarchical Manifold Sensing with foveation and adaptive partitioning of the dataset

Both MS and FMS strongly depend on the choice of the following parameters: (i) the number of neighbors used for LLE, (ii) the decreasing sizes of the adaptive dataset, and (iii) the dimensions of the manifolds at each iteration of the algorithm. Moreover, MS and FMS operate on the entire dataset while sensing.
As highlighted in the FMS article,8 in a real-time sensing scenario one would need to learn all of the manifolds corresponding to all possible subsets of data before performing the actual sensing. In this article we therefore provide an extended version of MS and FMS, which includes an adequate partitioning learned prior to sensing.

Hierarchical Manifold Sensing as used in this article is also based on learning manifolds of different, low dimensionalities. However, for simplicity we here use a linear method, Principal Component Analysis (PCA), to learn the low-dimensional representations of the foveated dataset. The hierarchical partitioning of the dataset is performed by clustering the data in the low-dimensional manifolds using the k-means algorithm. Several approaches aim at developing efficient clustering methods for high-dimensional data; see, for example, Ref. 10. In this work we focus on solving the sensing problem and not on optimizing the approach for hierarchical partitioning of the data. Therefore, we just combine two simple approaches: PCA for dimensionality reduction and k-means for clustering (k-means++ implementation11). Like MS and FMS, HMS is optimized and evaluated with respect to particular recognition tasks and not with respect to the reconstruction error.

In the following section we present the Hierarchical Manifold Sensing method: we first explain how the foveated dataset is created, we then present in detail the steps of hierarchical partitioning of the dataset, and we finally show how the unknown scenes are sensed in a hierarchical way. After that we present the results of this work and conclusions.

HIERARCHICAL MANIFOLD SENSING

Hierarchical Manifold Sensing (HMS) is based on a geometric approach to the problem of efficient sensing. A particular type of environment is represented by the images $I_i$ in a dataset $\mathcal{D} = \{I_1, \dots, I_p\}$, with $p$ data points of dimension $D$.
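
The combination named above, PCA for dimensionality reduction followed by k-means clustering with k-means++ seeding, could be applied recursively to obtain a tree over the dataset. The following is a minimal sketch of that idea, not the authors' implementation. The recursion scheme, the per-node PCA, the parameter values (`n_components`, `k`, `min_size`, `depth`), the helper names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def pca_basis(X, n_components):
    """Return the data mean and the first principal directions (as rows)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def nearest_center(Z, centers):
    """Index of the closest center for every row of Z."""
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(Z, k, rng, n_iter=50):
    """Lloyd's algorithm with k-means++ seeding (assumes the points are not all identical)."""
    centers = [Z[rng.integers(len(Z))]]
    for _ in range(1, k):
        # Sample the next seed with probability proportional to squared distance.
        d2 = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[rng.choice(len(Z), p=d2 / d2.sum())])
    centers = np.array(centers)
    for _ in range(n_iter):
        assign = nearest_center(Z, centers)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = Z[assign == j].mean(axis=0)
    return centers, nearest_center(Z, centers)


def build_tree(X, indices, depth, rng, n_components=2, k=2, min_size=4):
    """Recursively partition the rows X[indices] into a tree of clusters."""
    node = {"indices": indices, "children": []}
    if depth == 0 or len(indices) < max(min_size, k):
        return node  # leaf
    mean, components = pca_basis(X[indices], n_components)
    Z = (X[indices] - mean) @ components.T  # low-dimensional representation
    centers, assign = kmeans(Z, k, rng)
    node.update(mean=mean, components=components, centers=centers)
    node["children"] = [
        build_tree(X, indices[assign == j], depth - 1, rng, n_components, k, min_size)
        for j in range(k)
    ]
    return node


def leaves(node):
    """Collect the index sets stored at the leaves of the tree."""
    if not node["children"]:
        return [node["indices"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]


rng = np.random.default_rng(0)
blob_centers = rng.normal(scale=6.0, size=(4, 20))  # synthetic data, 4 groups
X = np.vstack([c + rng.normal(size=(15, 20)) for c in blob_centers])
tree = build_tree(X, np.arange(len(X)), depth=2, rng=rng)
leaf_sets = leaves(tree)
print([len(s) for s in leaf_sets])
```

Each inner node stores its own mean, principal directions, and cluster centers, so the partitioning can be learned once before any sensing takes place.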
In the foveated version of HMS, which is considered here, the dataset $\mathcal{D}$ is first transformed into a foveated dataset $\mathcal{D}_{\mathrm{foveated}}$ that contains only regions of interest out of the original dataset. The goal is to learn efficient features for classification. This problem is, however, not approached by just unsupervised learning on the whole dataset $\mathcal{D}_{\mathrm{foveated}}$. Instead, a tree structure that involves a hierarchical partitioning of the dataset is learned. The resulting partitioning is used to solve the sensing problem more efficiently, i.e., to use as few sensing actions as possible in order to sense and classify an unknown scene or object.

In the following subsections we first review the procedure of creating the foveated dataset, which was presented in more detail in Ref. 8. Next, we describe the approach for the hierarchical partitioning of the dataset. These two steps define the offline part of the HMS algorithm. After we have learned the foveated hierarchical representation of the given dataset, we can project on it an unknown scene, i.e., a test point outside $\mathcal{D}_{\mathrm{foveated}}$ that we wish to sense. Hierarchical Manifold Sensing thus includes the following main steps:

(i) Creating a Foveated Dataset based on a dataset containing images of known scenes.
(ii) Hierarchical Partitioning of the Dataset.
(iii) Hierarchical Sensing of Unknown Scenes (here implemented by resampling of unknown test images).

Creating a Foveated Dataset

The foveated dataset $\mathcal{D}_{\mathrm{foveated}}$ contains only the pixels that are salient on average over the dataset. Although these pixels do not necessarily form a compact region of interest (ROI), we will denote the collection of salient pixels as the ROI. The ROI is extracted by using a saliency model based on the geometric invariants of the structure tensor of the images in the dataset $\mathcal{D}$. The invariants of the structure tensor are known to be good predictors of human eye movements for static scenes.12 In Ref.
12 the properties of the image regions selected by the saccadic eye movements during experiments were analyzed in terms of higher-order statistics. It was shown that image regions with a statistically less redundant structure, such as the ones given by the signals with intrinsic dimension two, contain all the necessary information of a static scene. Therefore