Estimation of distributions involving unobservable events: the case of optimal search with unknown Target Distributions

We consider the problem of estimating the parameters of a distribution when the underlying events are themselves unobservable. The aim is to perform a task (for example, searching a web site or querying a distributed database) based on a distribution involving the state of nature, even though we are not allowed to observe the various “states of nature” involved. In particular, we concentrate on the task of searching for an object in a set of N locations (or bins) {C1, C2,…, CN}, in which the probability of the object being in location Ci is pi, where P = [p1, p2,…, pN]^T is called the Target Distribution. The probability of locating the object in a bin within a specified time, given that it is in that bin, is given by a function called the Detection function, which, in its most common instantiation, is an exponential function. The intention is to allocate the available resources so as to maximize the probability of locating the object. The handicap, however, is that the time allowed is limited, and thus failing to locate the object in bin Ci within a specified time does not necessarily imply that the object is not in Ci. This problem has applications in searching large databases, distributed databases, and the world-wide web, where the location of the files sought is unknown, and in developing various military and strategic policies. All of the research done in this area has assumed knowledge of the {pi}. In this paper we consider the problem of obtaining error bounds, estimating the Target Distribution, and allocating the search times when the {pi} are unknown. To the best of our knowledge, these are the first available results in this area, and they are particularly interesting because, as mentioned earlier, the events concerning the Target Distribution are themselves unobservable.
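To make the allocation problem concrete, the following is a minimal sketch of the classical known-{pi} baseline that the abstract contrasts itself with; it is not the paper's method for unknown {pi}. It assumes a Detection function of the form 1 - exp(-k_i t_i), where the rate constants k_i, the total time budget, and all function names are illustrative assumptions rather than quantities taken from the paper. Under those assumptions, maximizing the sum of p_i (1 - exp(-k_i t_i)) subject to a fixed total time has the Lagrangian solution t_i = max(0, ln(p_i k_i / lambda) / k_i), and lambda can be found by bisection.

```python
import math


def success_probability(p, k, t):
    """Probability of locating the object, assuming the exponential
    Detection function 1 - exp(-k_i * t_i) (an assumed form)."""
    return sum(pi * (1.0 - math.exp(-ki * ti)) for pi, ki, ti in zip(p, k, t))


def allocate_search_time(p, k, total_time, iters=100):
    """Split total_time across bins to maximize success_probability,
    assuming the Target Distribution p is KNOWN (classical baseline).

    p: probabilities of the object being in each bin
    k: hypothetical detection rate constants, one per bin
    """
    def times(lam):
        # Bins whose marginal return p_i * k_i does not exceed lam get no time.
        return [math.log(pi * ki / lam) / ki if pi * ki > lam else 0.0
                for pi, ki in zip(p, k)]

    # The total allocated time decreases as lam grows, so bisect on lam.
    lo, hi = 0.0, max(pi * ki for pi, ki in zip(p, k))
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if sum(times(mid)) > total_time:
            lo = mid
        else:
            hi = mid
    return times(hi)


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]   # illustrative Target Distribution
    k = [1.0, 1.0, 1.0]   # illustrative detection rates
    t = allocate_search_time(p, k, total_time=3.0)
    print(t, success_probability(p, k, t))
```

With a small time budget the sketch spends all of the time in the most likely bin, and it spreads effort to the other bins only as the budget grows. When the {pi} are unknown, as in the setting of this paper, `p` would have to be replaced by an estimate, which is the step this sketch does not address.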
