Paper information - Measuring conference quality by mining program committee characteristics

Bibliometrics are important measures for venue quality in digital libraries. Impacts of venues are usually the major consideration for subscription decision-making, and for ranking and recommending high-quality venues and documents. For digital libraries in the Computer Science literature domain, conferences play a major role as an important publication and dissemination outlet. However, with a recent profusion of conferences and rapidly expanding fields, it is increasingly challenging for researchers and librarians to assess the quality of conferences. We propose a set of novel heuristics to automatically discover prestigious (and low-quality) conferences by mining the characteristics of Program Committee members. We examine the proposed cues both in isolation and combination under a classification scheme. Evaluation on a collection of 2,979 conferences and 16,147 PC members shows that our heuristics, when combined, correctly classify about 92% of the conferences, with a low false positive rate of 0.035 and a recall of more than 73% for identifying reputable conferences. Furthermore, we demonstrate empirically that our heuristics can also effectively detect a set of low-quality conferences, with a false positive rate of merely 0.002. We also report our experience of detecting two previously unknown low-quality conferences. Finally, we apply the proposed techniques to the entire quality spectrum by ranking conferences in the collection.
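
The figures quoted in the abstract (overall accuracy, false positive rate, recall) are standard binary-classification measures. As a minimal sketch of how they relate to one another, the snippet below computes them from a confusion matrix for a "reputable vs. not reputable" classifier. The class name, the function names, and the counts are all hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for a binary 'reputable' classifier."""
    true_pos: int   # reputable conferences labeled reputable
    false_pos: int  # other conferences wrongly labeled reputable
    true_neg: int   # other conferences correctly rejected
    false_neg: int  # reputable conferences that were missed


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all conferences that were classified correctly."""
    total = c.true_pos + c.false_pos + c.true_neg + c.false_neg
    return (c.true_pos + c.true_neg) / total


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of non-reputable conferences wrongly labeled reputable."""
    return c.false_pos / (c.false_pos + c.true_neg)


def recall(c: ConfusionCounts) -> float:
    """Fraction of reputable conferences that were actually found."""
    return c.true_pos / (c.true_pos + c.false_neg)


# Hypothetical counts for illustration only (NOT the paper's results).
example = ConfusionCounts(true_pos=30, false_pos=5, true_neg=95, false_neg=10)

if __name__ == "__main__":
    print(f"accuracy = {accuracy(example):.3f}")
    print(f"false positive rate = {false_positive_rate(example):.3f}")
    print(f"recall = {recall(example):.3f}")
```

A low false positive rate paired with a more moderate recall, as reported in the abstract, describes a classifier that rarely mislabels other conferences as reputable but misses some of the reputable ones.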
