Nonparametric Entropy Estimation for Stationary Processesand Random Fields, with Applications to English Text
We discuss a family of estimators for the entropy rate of a stationary ergodic process and prove their pointwise and mean consistency under a Doeblin-type mixing condition. The estimators are Cesaro averages of longest match-lengths, and their consistency follows from a generalized ergodic theorem due to Maker (1940). We provide examples of their performance on English text, and we generalize our results to countable alphabet processes and to random fields.
Diagram structure recognition by Bayesian conditional random fields
Hand-drawn diagrams present a complex recognition problem. Elements of the diagram are often individually ambiguous, and require context to be interpreted. We present a recognition method based on Bayesian conditional random fields (BCRFs) that jointly analyzes all drawing elements in order to incorporate contextual cues. The classification of each object affects the classification of its neighbors. BCRFs allow flexible and correlated features, and take both spatial and temporal information into account. BCRFs estimate the posterior distribution of parameters during training, and average predictions over the posterior for testing. As a result of model averaging, BCRFs avoid the overfitting problems associated with maximum likelihood training. We also incorporate automatic relevance determination (ARD), a Bayesian feature selection technique, into BCRFs. The result is significantly lower error rates compared to ML- and MAP-trained CRFs.
neural network sensor network wireless sensor network wireless sensor deep learning comparative study base station information retrieval feature extraction sensor node programming language cellular network random field digital video number theory rate control network lifetime river basin hyperspectral imaging distributed algorithm chemical reaction carnegie mellon university fly ash visual feature boundary detection video retrieval diabetes mellitu semantic indexing oryza sativa water storage user association efficient wireles shot boundary shot boundary detection data assimilation system retrieval task controlled trial terrestrial television video search gps network sensor network consist efficient wireless sensor information retrieval task concept detection video captioning retrieval evaluation rice seed safety equipment endangered species station operation case study involving dublin city university high-level feature seed germination brown coal high plain study involving structure recognition climate experiment gravity recovery table structure land data assimilation instance search combinatorial number randomised controlled trial recovery and climate randomised controlled combinatorial number theory adult male high-level feature extraction complete proof music perception robust computation optimization-based method perception and cognition global land datum social perception terrestrial water storage trec video retrieval terrestrial water object-oriented conceptual video retrieval evaluation trec video seed variety base station operation table structure recognition transgenic rice concept detector total water storage groundwater storage regional gp grace gravity randomized distributed algorithm ibm tivoli workload scheduler cerebrovascular accident case study united state