Internet searching and browsing in a multilingual world: An experiment on the Chinese Business Intelligence Portal (CBizPort)

The rapid growth of the non-English-speaking Internet population has created a need for better searching and browsing capabilities in languages other than English. However, existing search engines may not serve the needs of many non-English-speaking Internet users. In this paper, we propose a generic and integrated approach to searching and browsing the Internet in a multilingual world. Based on this approach, we have developed the Chinese Business Intelligence Portal (CBizPort), a meta-search engine that searches for business information of mainland China, Taiwan, and Hong Kong. Additional functions provided by CBizPort include encoding conversion (between Simplified Chinese and Traditional Chinese), summarization, and categorization. Experimental results of our user evaluation study show that the searching and browsing performance of CBizPort was comparable to that of regional Chinese search engines, and CBizPort could significantly augment these search engines. Subjects' verbal comments indicate that CBizPort performed best in terms of analysis functions, cross-regional searching, and user-friendliness, whereas regional search engines were more efficient and more popular. Subjects especially liked CBizPort's summarizer and categorizer, which helped in understanding search results. These encouraging results suggest a promising future of our approach to Internet searching and browsing in a multilingual world.
