Evaluating and selecting Web sources as external information resources of a data warehouse

A company's local data is often insufficient for analyzing market trends and making sound business plans; decisions must also draw on information about suppliers, partners, and competitors. Systematically integrating suitable external data from the Web into a data warehouse addresses this need. However, the autonomy and dynamics of the Web make it challenging to select relevant, high-quality external data. We develop a set of criteria for evaluating and selecting Web resources as external data sources for a data warehouse and discuss how to screen Web data sources using multi-criteria decision making (MCDM) methods. The final selection is sensitive to two critical factors: the criterion weights and the performance score of each alternative on each criterion. We analyze the sensitivity of the final ranking of alternatives to these factors in order to gain insight into the stability of the decision. We also compare several MCDM approaches for Web source screening.
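
As an illustration of how criterion weights and performance scores drive the ranking, the following is a minimal sketch that assumes a simple weighted-sum model, which is only one of many MCDM methods and is not necessarily the one used in the paper. The source names, criteria, scores, and weights are all hypothetical, as are the helper names `weighted_sum_rank` and `rank_changes`. The second helper perturbs one criterion weight and reports whether the ranking changes, a crude stand-in for a full sensitivity analysis.

```python
def weighted_sum_rank(scores, weights):
    """Rank alternatives by the weighted sum of their scores (higher is better).

    Weights are renormalized to sum to 1 before scoring.
    """
    total = sum(weights.values())
    norm = {c: w / total for c, w in weights.items()}
    totals = {
        name: sum(norm[c] * perf[c] for c in norm)
        for name, perf in scores.items()
    }
    ranking = sorted(totals, key=totals.get, reverse=True)
    return ranking, totals


def rank_changes(scores, weights, criterion, deltas):
    """For each delta, report whether shifting one criterion weight alters the ranking."""
    base, _ = weighted_sum_rank(scores, weights)
    result = {}
    for d in deltas:
        perturbed = dict(weights)
        perturbed[criterion] = max(0.0, perturbed[criterion] + d)
        result[d] = weighted_sum_rank(scores, perturbed)[0] != base
    return result


# Hypothetical performance scores in [0, 1] for three candidate Web sources.
scores = {
    "A": {"relevance": 0.9, "timeliness": 0.5, "accessibility": 0.6},
    "B": {"relevance": 0.6, "timeliness": 0.9, "accessibility": 0.7},
    "C": {"relevance": 0.4, "timeliness": 0.6, "accessibility": 0.9},
}
# Hypothetical criterion weights.
weights = {"relevance": 0.5, "timeliness": 0.3, "accessibility": 0.2}

ranking, totals = weighted_sum_rank(scores, weights)
changed = rank_changes(scores, weights, "timeliness", [-0.1, 0.1])
print(ranking, changed)
```

With these made-up numbers, sources A and B score close to each other, so a modest increase in the weight of `timeliness` is enough to swap them. This is the kind of instability a sensitivity analysis is meant to expose.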
