Neural Network Metalearning for Parallel Textual Information Retrieval

In this study, a triple-phase neural network metalearning technique is proposed for parallel textual information retrieval. In the first phase, massive, terabyte-scale textual data collections are partitioned into relatively small textual data subsets, and these subsets are then distributed to different computational agents. In the second phase, a single neural network model, acting as an intelligent learning agent, is applied to each textual data subset to retrieve the text documents relevant to a query. For a given query, each learning agent, trained with the back-propagation algorithm on its underlying text documents, produces a relevance score between 0 and 1 for each candidate text document. In the third phase, a neural-network-based metamodel is built by integrating the relevance scores produced in the previous phase, yielding a proactive information extraction model that can be applied to unseen queries to determine the degree of relevance between a query and the candidate text documents. For illustration and testing purposes, a practical web textual information retrieval experiment is performed to verify the effectiveness and efficiency of the proposed neural-network-based metalearning technique.
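The paper does not include an implementation, so the following is a minimal, illustrative sketch of the triple-phase pipeline under stated assumptions: a toy corpus with hand-assigned relevance labels, simple term-overlap features, and a hand-rolled one-hidden-layer network trained by back-propagation. The BackpropScorer class, the featurize function, and all data below are hypothetical stand-ins, not artifacts of the study. Phase 1 partitions the document collection into subsets, phase 2 trains one scorer per subset to emit relevance scores in [0, 1], and phase 3 trains a metamodel on the agents' scores so it can rank documents for an unseen query.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class BackpropScorer:
    """One-hidden-layer network trained by back-propagation that maps a
    (query, document) feature vector to a relevance score in [0, 1]."""

    def __init__(self, n_in, n_hidden=8, lr=0.5, epochs=2000, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
        self.b2 = np.zeros(1)
        self.lr, self.epochs = lr, epochs

    def fit(self, X, y):
        y = y.reshape(-1, 1)
        for _ in range(self.epochs):
            h = sigmoid(X @ self.W1 + self.b1)           # forward pass
            out = sigmoid(h @ self.W2 + self.b2)
            d_out = (out - y) * out * (1 - out)          # back-propagate MSE gradient
            d_h = (d_out @ self.W2.T) * h * (1 - h)
            self.W2 -= self.lr * (h.T @ d_out) / len(X)
            self.b2 -= self.lr * d_out.mean(axis=0)
            self.W1 -= self.lr * (X.T @ d_h) / len(X)
            self.b1 -= self.lr * d_h.mean(axis=0)
        return self

    def score(self, X):
        return sigmoid(sigmoid(X @ self.W1 + self.b1) @ self.W2 + self.b2).ravel()


# Toy corpus and hand-assigned relevance labels (illustrative only).
docs = [
    "neural network learning for text retrieval",
    "stock index futures forecasting with hybrid systems",
    "vector space model for automatic text indexing",
    "bank failure prediction with neural networks",
    "distributed agents for parallel data mining",
    "term weighting and document ranking in retrieval",
]
train_queries = ["text retrieval", "neural network prediction"]
relevance = np.array([[1, 0, 1, 0, 0, 1],       # labels for train_queries[0]
                      [1, 0, 0, 1, 0, 0]], float)

vocab = sorted({w for d in docs for w in d.split()})


def featurize(query, doc):
    """Term-overlap indicator vector for a (query, document) pair."""
    q, d = set(query.split()), set(doc.split())
    return np.array([1.0 if (w in q and w in d) else 0.0 for w in vocab])


# Phase 1: partition the collection into subsets, one per computational agent.
n_agents = 3
subsets = np.array_split(np.arange(len(docs)), n_agents)

# Phase 2: train one back-propagation scorer on each subset.
agents = []
for idx in subsets:
    X = np.array([featurize(train_queries[qi], docs[i])
                  for qi in range(len(train_queries)) for i in idx])
    y = np.array([relevance[qi, i]
                  for qi in range(len(train_queries)) for i in idx])
    agents.append(BackpropScorer(n_in=len(vocab)).fit(X, y))


def agent_scores(query, doc):
    """Relevance scores for one (query, document) pair, one score per agent."""
    x = featurize(query, doc).reshape(1, -1)
    return np.array([a.score(x)[0] for a in agents])


# Phase 3: train a metamodel on the agents' scores
# (row-major pair order matches relevance.ravel()).
meta_X = np.array([agent_scores(train_queries[qi], docs[i])
                   for qi in range(len(train_queries)) for i in range(len(docs))])
meta = BackpropScorer(n_in=n_agents).fit(meta_X, relevance.ravel())

# Apply the metamodel to an unseen query and rank the candidate documents.
unseen = "parallel neural retrieval"
ranked = sorted(((meta.score(agent_scores(unseen, d).reshape(1, -1))[0], d)
                 for d in docs), reverse=True)
for s, d in ranked:
    print(f"{s:.2f}  {d}")
```

In practice the metamodel's inputs would be scores returned by agents running in parallel over terabyte-scale partitions; the toy corpus above only illustrates the data flow between the three phases.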
