Unsupervised Graph-Based Similarity Learning Using Heterogeneous Features

[1]  Burr Settles,et al.  Active Learning Literature Survey , 2009 .

[2]  Andrzej Stachurski,et al.  Parallel Optimization: Theory, Algorithms and Applications , 2000, Parallel Distributed Comput. Pract..

[3]  Ronald Rousseau,et al.  Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient , 2003, J. Assoc. Inf. Sci. Technol..

[4]  Johan Bollen,et al.  Toward alternative metrics of journal impact: A comparison of download and citation data , 2005, Inf. Process. Manag..

[5]  Arindam Banerjee,et al.  Active Semi-Supervision for Pairwise Constrained Clustering , 2004, SDM.

[6]  Elizabeth D. Liddy,et al.  Advances in Automatic Text Summarization , 2001, Information Retrieval.

[7]  Choon Hui Teo,et al.  Fast and space efficient string kernels using suffix arrays , 2006, ICML.

[8]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[9]  中尾 光輝,et al.  KEGG (Kyoto Encyclopedia of Genes and Genomes) [in Japanese] (Special issue: The present and future of genomic medicine: basic and clinical) (Databases) , 2000 .

[10]  Dragomir R. Radev,et al.  The ACL anthology network corpus , 2009, Language Resources and Evaluation.

[11]  Lada A. Adamic,et al.  Information Diffusion in Computer Science Citation Networks , 2009, ICWSM.

[12]  Berthier A. Ribeiro-Neto,et al.  Combining link-based and content-based methods for web document classification , 2003, CIKM '03.

[13]  Mikkel Christoffersen.  Identifying core documents with a multiple evidence relevance filter , 2004, Scientometrics.

[14]  Jörg Sander,et al.  Analysis of SIGMOD's co-authorship graph , 2003, SGMD.

[15]  Alessandro Moschitti,et al.  A Study on Convolution Kernels for Shallow Semantic Parsing , 2004, ACL.

[16]  Andreas Thor,et al.  Citation analysis of database publications , 2005, SGMD.

[17]  M. Newman,et al.  Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. , 2001, Physical review. E, Statistical, nonlinear, and soft matter physics.

[18]  Shimon Edelman,et al.  Similarity-based Word Sense Disambiguation , 1998, CL.

[19]  Inderjit S. Dhillon,et al.  Weighted Graph Cuts without Eigenvectors: A Multilevel Approach , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Michael Collins,et al.  Convolution Kernels for Natural Language , 2001, NIPS.

[21]  W. G. Parrott,et al.  Emotions in social psychology: essential readings , 2001 .

[22]  Jacob Cohen.  A Coefficient of Agreement for Nominal Scales , 1960 .

[23]  Jade Goldstein-Stewart,et al.  The use of MMR, diversity-based reranking for reordering documents and producing summaries , 1998, SIGIR '98.

[24]  Razvan C. Bunescu,et al.  Subsequence Kernels for Relation Extraction , 2005, NIPS.

[25]  Christopher J. C. Burges,et al.  Spectral clustering and transductive learning with multiple views , 2007, ICML '07.

[26]  Vasileios Hatzivassiloglou,et al.  Event-Based Extractive Summarization , 2004 .

[27]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[28]  P. Jaccard.  Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines , 1901 .

[29]  Thomas Hofmann,et al.  Probabilistic latent semantic indexing , 1999, SIGIR '99.

[30]  M. E. J. Newman,et al.  Power laws, Pareto distributions and Zipf's law , 2005 .

[31]  Lise Getoor,et al.  Collective Classification in Network Data , 2008, AI Mag..

[32]  Avrim Blum,et al.  Learning from Labeled and Unlabeled Data using Graph Mincuts , 2001, ICML.

[33]  Philip Resnik,et al.  Using Information Content to Evaluate Semantic Similarity in a Taxonomy , 1995, IJCAI.

[34]  A. Barabasi,et al.  Lethality and centrality in protein networks , 2001, Nature.

[35]  Eugene Charniak,et al.  A Maximum-Entropy-Inspired Parser , 2000, ANLP.

[36]  Ramanathan V. Guha,et al.  Information diffusion through blogspace , 2004, WWW '04.

[37]  Lillian Lee,et al.  Measures of Distributional Similarity , 1999, ACL.

[38]  Peter D. Turney.  Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL , 2001, ECML.

[39]  Shenghuo Zhu,et al.  Learning multiple graphs for document recommendations , 2008, WWW.

[40]  A. D. Jackson,et al.  Measures for measures , 2006, Nature.

[41]  Jonathan J. Hull,et al.  A Database for Handwritten Text Recognition Research , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[42]  U. Feige,et al.  Combination can be hard: approximability of the unique coverage problem , 2006, SODA 2006.

[43]  David J. C. MacKay,et al.  A hierarchical Dirichlet language model , 1995, Natural Language Engineering.

[44]  Mike Paterson,et al.  Longest Common Subsequences , 1994, MFCS.

[45]  C. Lee Giles,et al.  Scholarly publishing in the Internet age: a citation analysis of computer science literature , 2001, Inf. Process. Manag..

[46]  Rong Jin,et al.  Active query selection for semi-supervised clustering , 2008, 2008 19th International Conference on Pattern Recognition.

[47]  Graeme Hirst,et al.  Evaluating WordNet-based Measures of Lexical Semantic Relatedness , 2006, CL.

[48]  A. Tversky.  Features of Similarity , 1977 .

[49]  John B. Goodenough,et al.  Contextual correlates of synonymy , 1965, CACM.

[50]  Loet Leydesdorff,et al.  Indicators of structural change in the dynamics of science: Entropy statistics of the SCI Journal Citation Reports , 2009, Scientometrics.

[51]  Martin F. Porter,et al.  An algorithm for suffix stripping , 1997, Program.

[52]  Ben Taskar,et al.  Introduction to statistical relational learning , 2007 .

[53]  Andreas Paepcke,et al.  Seeing the whole in parts: text summarization for web browsing on handheld devices , 2001, WWW '01.

[54]  David Harel,et al.  On Clustering Using Random Walks , 2001, FSTTCS.

[55]  B. Bollobás.  The evolution of random graphs , 1984 .

[56]  Liang Zhou,et al.  On the Summarization of Dynamically Introduced Information: Online Discussions and Blogs , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[57]  Raymond J. Mooney,et al.  Integrating constraints and metric learning in semi-supervised clustering , 2004, ICML.

[58]  Dragomir R. Radev,et al.  Sub-event based multi-document summarization , 2003, HLT-NAACL 2003.

[59]  W. Bruce Croft,et al.  A language modeling approach to information retrieval , 1998, SIGIR '98.

[60]  Daisuke Ikeda,et al.  Automatically Linking News Articles to Blog Entries , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[61]  Duncan J. Watts,et al.  Collective dynamics of ‘small-world’ networks , 1998, Nature.

[62]  James C. Bezdek,et al.  Some Notes on Alternating Optimization , 2002, AFSS.

[63]  Daniel Pauly,et al.  Equivalence of results from two citation analyses: Thomson ISI's Citation Index and Google's Scholar service , 2005 .

[64]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[65]  Peter Nijkamp,et al.  Accessibility of Cities in the Digital Economy , 2004, cond-mat/0412004.

[66]  Nees Jan van Eck,et al.  How to normalize cooccurrence data? An analysis of some well-known similarity measures , 2009 .

[67]  Vladimir I. Levenshtein,et al.  Binary codes capable of correcting deletions, insertions, and reversals , 1965 .

[68]  Steven P. Abney,et al.  Bootstrapping , 2002, ACL.

[69]  Marshall S. Smith,et al.  The general inquirer: A computer approach to content analysis. , 1967 .

[70]  Nello Cristianini,et al.  Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..

[71]  Dragomir R. Radev,et al.  CLAIRLIB Documentation v1.03 , 2007, ArXiv.

[72]  Thorsten Joachims,et al.  Making large-scale support vector machine learning practical , 1999 .

[73]  Danushka Bollegala,et al.  Measuring semantic similarity between words using web search engines , 2007, WWW '07.

[74]  Hinrich Schütze,et al.  Automatic Word Sense Discrimination , 1998, Comput. Linguistics.

[75]  Eugene Garfield,et al.  Citation indexing - its theory and application in science, technology, and humanities , 1979 .

[76]  Ian Davidson,et al.  Measuring Constraint-Set Utility for Partitional Clustering Algorithms , 2006, PKDD.

[77]  Wessel Kraaij,et al.  TNO at TDT2001: Language Model-Based Topic Detection , 2001 .

[78]  Dragomir R. Radev,et al.  Detecting Multiple Facets of an Event using Graph-Based Unsupervised Methods , 2008, COLING.

[79]  Janyce Wiebe,et al.  Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2005, HLT.

[80]  Loet Leydesdorff,et al.  Can networks of journal-journal citations be used as indicators of change in the social sciences? , 2003, J. Documentation.

[81]  Kathleen R. McKeown,et al.  Generating natural language summaries from multiple on-line sources , 1998 .

[82]  Shailesh V. Date,et al.  A Probabilistic Functional Network of Yeast Genes , 2004, Science.

[83]  Linton C. Freeman.  A set of measures of centrality based upon betweenness , 1977 .

[84]  Wolfgang Glänzel,et al.  Towards a model for diachronous and synchronous citation analyses , 2004, Scientometrics.

[85]  S. Redner.  How popular is your paper? An empirical study of the citation distribution , 1998, cond-mat/9804163.

[86]  Yi Zhang,et al.  Novelty and redundancy detection in adaptive filtering , 2002, SIGIR '02.

[87]  Francis Narin,et al.  Interrelationships of scientific journals , 1972, J. Am. Soc. Inf. Sci..

[88]  Richard K. Belew,et al.  Scientific impact quantity and quality: Analysis of two sources of bibliographic data , 2005, ArXiv.

[89]  Ee-Peng Lim,et al.  Comments-oriented blog summarization by sentence extraction , 2007, CIKM '07.

[90]  Mark E. J. Newman,et al.  The Structure and Function of Complex Networks , 2003, SIAM Rev..

[91]  H. S. Sichel,et al.  A bibliometric distribution which really works , 1985, J. Am. Soc. Inf. Sci..

[92]  Samir Khuller,et al.  The Budgeted Maximum Coverage Problem , 1999, Inf. Process. Lett..

[93]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2007, ICML '07.

[94]  James Allan,et al.  Retrieval and novelty detection at the sentence level , 2003, SIGIR.

[95]  Loet Leydesdorff,et al.  Top-down decomposition of the Journal Citation Report of the Social Science Citation Index: Graph- and factor-analytical approaches , 2004, Scientometrics.

[96]  Alessandro Moschitti,et al.  Making Tree Kernels Practical for Natural Language Learning , 2006, EACL.

[97]  John Langford,et al.  Cover trees for nearest neighbor , 2006, ICML.

[98]  Daniel Jurafsky,et al.  Studying the History of Ideas Using Topic Models , 2008, EMNLP.

[99]  Stanley Milgram.  The Small World Problem , 1967 .

[100]  Dag W. Aksnes,et al.  Does self-citation pay? , 2007, Scientometrics.

[101]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.

[102]  Hema Raghavan,et al.  Active Learning with Feedback on Features and Instances , 2006, J. Mach. Learn. Res..

[103]  L. da F. Costa,et al.  Characterization of complex networks: A survey of measurements , 2005, cond-mat/0505185.

[104]  Simone Paolo Ponzetto,et al.  WikiRelate! Computing Semantic Relatedness Using Wikipedia , 2006, AAAI.

[105]  Matthew Hurst,et al.  A Language Model Approach to Keyphrase Extraction , 2003, ACL 2003.

[106]  Andrew W. Moore,et al.  An Investigation of Practical Approximate Nearest Neighbor Algorithms , 2004, NIPS.

[107]  Maria-Florina Balcan,et al.  A PAC-Style Model for Learning from Labeled and Unlabeled Data , 2005, COLT.

[108]  Yehuda Koren,et al.  Factorization meets the neighborhood: a multifaceted collaborative filtering model , 2008, KDD.

[109]  Gilad Mishne,et al.  Applied text analytics for blogs , 2007 .

[110]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[111]  D.A. Dervos,et al.  cc-IFF: A Cascading Citations Impact Factor Framework for the Automatic Ranking of Research Publications , 2005, 2005 IEEE Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications.

[112]  Loet Leydesdorff,et al.  Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment , 2006, J. Assoc. Inf. Sci. Technol..

[113]  Loet Leydesdorff,et al.  Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment , 2006 .

[114]  Andrew McCallum,et al.  Active Learning by Labeling Features , 2009, EMNLP.

[115]  Kazuhiko Kato,et al.  Extracting Topics From Weblogs Through Frequency Segments , 2006 .

[116]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[117]  Pascale Fung,et al.  Combining Optimal Clustering and Hidden Markov Models for Extractive Summarization , 2003, ACL 2003.

[118]  Kam-Fai Wong,et al.  Building Document Graphs for Multiple News Articles Summarization: An Event-Based Approach , 2006, ICCPOL.

[119]  G. Miller,et al.  Contextual correlates of semantic similarity , 1991 .

[120]  Kiri Wagstaff,et al.  Value, Cost, and Sharing: Open Issues in Constrained Clustering , 2006, KDID.

[121]  John Shawe-Taylor,et al.  Syllables and other String Kernel Extensions , 2002, ICML.

[122]  Yahiko Kambayashi,et al.  A longest common subsequence algorithm suitable for similar text strings , 1982, Acta Informatica.

[123]  Ali Pinar,et al.  Latent Clustering on Graphs with Multiple Edge Types , 2011, WAW.

[124]  J. E. Hirsch,et al.  An index to quantify an individual's scientific research output , 2005, Proc. Natl. Acad. Sci. USA.

[125]  Rolf A. Zwaan,et al.  Assessing the usefulness of bibliometric indicators for the humanities and the social and behavioural sciences: A comparative study , 1989, Scientometrics.

[126]  Loet Leydesdorff,et al.  The delineation of specialties in terms of journals using the dynamic journal set of the SCI , 2005, Scientometrics.

[127]  Raymond J. Mooney,et al.  A probabilistic framework for semi-supervised clustering , 2004, KDD.

[128]  Hans A. Kestler,et al.  Generalized Venn diagrams: a new method of visualizing complex genetic set relations , 2005, Bioinform..

[129]  Zhu Zhang,et al.  NewsInEssence: A System For Domain-Independent, Real-Time News Clustering and Multi-Document Summarization , 2001, HLT.

[130]  Rajeev Motwani,et al.  The PageRank Citation Ranking: Bringing Order to the Web , 1999, WWW 1999.

[131]  William M. Pottenger,et al.  Leveraging Higher Order Dependencies Between Features for Text Classification , 2009 .

[132]  Mehran Sahami,et al.  A web-based kernel function for measuring the similarity of short text snippets , 2006, WWW '06.

[133]  Moses Charikar,et al.  Similarity estimation techniques from rounding algorithms , 2002, STOC '02.

[134]  Thorsten Joachims,et al.  Learning a Distance Metric from Relative Comparisons , 2003, NIPS.

[135]  Yannis Manolopoulos,et al.  Generalized Hirsch h-index for disclosing latent facts in citation networks , 2007, Scientometrics.

[136]  Ramon Ferrer i Cancho,et al.  The small world of human language , 2001, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[137]  Nizar Grira,et al.  Unsupervised and Semi-supervised Clustering: a Brief Survey , 2004 .

[138]  Joydeep Ghosh,et al.  Cluster Ensembles: A Knowledge Reuse Framework for Combining Partitionings , 2002, AAAI/IAAI.

[139]  Susan T. Dumais,et al.  Similarity Measures for Short Segments of Text , 2007, ECIR.

[140]  Christopher H. Brooks,et al.  Improved annotation of the blogosphere via autotagging and hierarchical clustering , 2006, WWW '06.

[141]  Loet Leydesdorff,et al.  Similarity Measures, Author Cocitation Analysis, and Information Theory , 2005, J. Assoc. Inf. Sci. Technol..

[142]  Nello Cristianini,et al.  Classification using String Kernels , 2000 .

[143]  Jennifer Widom,et al.  SimRank: a measure of structural-context similarity , 2002, KDD.

[144]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[145]  Ronald Rousseau,et al.  A classification of author co-citations: Definitions and search strategies , 2004, J. Assoc. Inf. Sci. Technol..

[146]  Wei Dai,et al.  Minimal document set retrieval , 2005, CIKM '05.

[147]  Andrew McCallum,et al.  An Integrated, Conditional Model of Information Extraction and Coreference with Appli , 2004, UAI.

[148]  Tom M. Mitchell,et al.  Learning to Extract Symbolic Knowledge from the World Wide Web , 1998, AAAI/IAAI.

[149]  Christina S. Leslie,et al.  Fast String Kernels using Inexact Matching for Protein Sequences , 2004, J. Mach. Learn. Res..

[150]  C. Lee Giles,et al.  CiteSeer: an automatic citation indexing system , 1998, DL '98.

[151]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[152]  E. Garfield.  Citation analysis as a tool in journal evaluation. , 1972, Science.

[153]  Z. K. Silagadze,et al.  Citations and the Zipf-Mandelbrot Law , 1999, Complex Syst..

[154]  W. Bruce Croft,et al.  Cluster-based retrieval using language models , 2004, SIGIR '04.

[155]  J. Lafferty,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[156]  Mark E. J. Newman.  A measure of betweenness centrality based on random walks , 2005, Soc. Networks.

[157]  R. Mooney,et al.  Impact of Similarity Measures on Web-page Clustering , 2000 .

[158]  Henry G. Small,et al.  Co-citation in the scientific literature: A new measure of the relationship between two documents , 1973, J. Am. Soc. Inf. Sci..

[159]  Sanjoy Dasgupta,et al.  PAC Generalization Bounds for Co-training , 2001, NIPS.

[160]  Dongwon Lee,et al.  On six degrees of separation in DBLP-DB and more , 2005, SGMD.

[161]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[162]  Dragomir R. Radev,et al.  Single-document and multi-document summary evaluation using Relative Utility , 2007 .

[163]  J. MacQueen.  Some methods for classification and analysis of multivariate observations , 1967 .

[164]  W. John Wilbur,et al.  Corpus-based statistical screening for content-bearing terms , 2001, J. Assoc. Inf. Sci. Technol..

[165]  Chaomei Chen,et al.  Visualizing knowledge domains , 2005, Annu. Rev. Inf. Sci. Technol..

[166]  Gene H. Golub,et al.  Singular value decomposition and least squares solutions , 1970, Milestones in Matrix Computation.

[167]  Michael I. Jordan,et al.  Multiple kernel learning, conic duality, and the SMO algorithm , 2004, ICML.

[168]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[169]  Ronald Rousseau,et al.  Author cocitation analysis and Pearson's r , 2004, J. Assoc. Inf. Sci. Technol..

[170]  P. Tseng,et al.  On the convergence of the coordinate descent method for convex differentiable minimization , 1992 .

[171]  Duncan J. Watts,et al.  Six Degrees: The Science of a Connected Age , 2003 .

[172]  P. Lawrence.  The politics of publication , 2003, Nature.

[173]  Toyoaki Nishida,et al.  Analyzing concerns of people using Weblog articles and real world temporal data , 2005 .

[174]  Stephan Bloehdorn,et al.  Combined Syntactic and Semantic Kernels for Text Classification , 2007, ECIR.

[175]  M. Amin,et al.  Impact factors: use and abuse. , 2003, Medicina.

[176]  Loet Leydesdorff,et al.  Tracking areas of strategic importance using scientometric journal mappings , 1994 .

[177]  Ben Taskar,et al.  Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning) , 2007 .

[178]  C. Lee Giles,et al.  Digital Libraries and Autonomous Citation Indexing , 1999, Computer.

[179]  Chris H. Q. Ding,et al.  Automatic topic identification using webpage clustering , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[180]  P. Bonacich.  Power and Centrality: A Family of Measures , 1987, American Journal of Sociology.

[181]  Vasek Chvátal,et al.  A Greedy Heuristic for the Set-Covering Problem , 1979, Math. Oper. Res..

[182]  E. Garfield.  Citation indexes for science; a new dimension in documentation through association of ideas. , 1955, Science.

[183]  Peng Cui.  A Tighter Analysis of Set Cover Greedy Algorithm for Test Set , 2007, ESCAPE.

[184]  S. N. Dorogovtsev,et al.  Evolution of networks , 2001, cond-mat/0106144.

[185]  Vipin Kumar,et al.  A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs , 1998, SIAM J. Sci. Comput..

[186]  John Yearwood,et al.  Opinion Search in Web Logs , 2007, ADC.

[187]  Inderjit S. Dhillon,et al.  Semi-supervised graph clustering: a kernel approach , 2005, ICML '05.

[188]  James Allan,et al.  Topic detection and tracking: event-based information organization , 2002 .

[189]  Susumu Goto,et al.  KEGG: Kyoto Encyclopedia of Genes and Genomes , 2000, Nucleic Acids Res..

[190]  Stephen J. Bensman,et al.  Classification and powerlaws: The logarithmic transformation , 2006, J. Assoc. Inf. Sci. Technol..

[191]  D. York.  Least-squares fitting of a straight line. , 1966 .

[192]  Dragomir R. Radev,et al.  Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies , 2000, ArXiv.

[193]  M. Newman,et al.  Finding community structure in very large networks. , 2004, Physical review. E, Statistical, nonlinear, and soft matter physics.

[194]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2003, J. Mach. Learn. Res..

[195]  Hsin-Hsi Chen,et al.  Major topic detection and its application to opinion summarization , 2005, SIGIR '05.

[196]  M. Newman.  Coauthorship networks and patterns of scientific collaboration , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[197]  W. Bruce Croft,et al.  Corpus-based stemming using cooccurrence of word variants , 1998, TOIS.

[198]  Jon M. Kleinberg,et al.  Group formation in large social networks: membership, growth, and evolution , 2006, KDD '06.

[199]  Laurie J. Heyer,et al.  Exploring expression data: identification and analysis of coexpressed genes. , 1999, Genome research.

[200]  Maria-Florina Balcan,et al.  Person Identification in Webcam Images: An Application of Semi-Supervised Learning , 2005 .

[201]  Stephen J. Wright,et al.  Dissimilarity in Graph-Based Semi-Supervised Classification , 2007, AISTATS.

[202]  Noah A. Smith,et al.  The Web as a Parallel Corpus , 2003, CL.

[203]  Gerard Salton,et al.  A vector space model for automatic indexing , 1975, CACM.

[204]  Yun Chi,et al.  Eigen-trend: trend analysis in the blogosphere based on singular value decompositions , 2006, CIKM '06.

[205]  Carlo Strapparava,et al.  Corpus-based and Knowledge-based Measures of Text Semantic Similarity , 2006, AAAI.

[206]  David W. Conrath,et al.  Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy , 1997, ROCLING/IJCLCLP.

[207]  Yiming Yang,et al.  Topic Detection and Tracking Pilot Study Final Report , 1998 .

[208]  Mikhail Belkin,et al.  Regularization and Semi-supervised Learning on Large Graphs , 2004, COLT.

[209]  Carl Gutwin,et al.  KEA: practical automatic keyphrase extraction , 1999, DL '99.

[210]  Carl Gutwin,et al.  Domain-Specific Keyphrase Extraction , 1999, IJCAI.

[211]  Delbert Dueck,et al.  Clustering by Passing Messages Between Data Points , 2007, Science.

[212]  ChengXiang Zhai,et al.  Discovering evolutionary theme patterns from text: an exploration of temporal text mining , 2005, KDD '05.

[213]  ChengXiang Zhai,et al.  A study of smoothing methods for language models applied to information retrieval , 2004, TOIS.

[214]  T. Landauer,et al.  Indexing by Latent Semantic Analysis , 1990 .

[215]  James Allan,et al.  Introduction to topic detection and tracking , 2002 .

[216]  William M. Pottenger,et al.  A framework for understanding Latent Semantic Indexing (LSI) performance , 2006, Inf. Process. Manag..

[217]  Hinrich Schütze.  Automatic Word Sense Discrimination , 1998 .

[218]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[219]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[220]  Nees Jan van Eck,et al.  How to normalize cooccurrence data? An analysis of some well-known similarity measures , 2009, J. Assoc. Inf. Sci. Technol..

[221]  Andreas Krause,et al.  Cost-effective outbreak detection in networks , 2007, KDD '07.

[222]  Dragomir R. Radev,et al.  The ACL Anthology Reference Corpus: A Reference Dataset for Bibliographic Research in Computational Linguistics , 2008, LREC.

[223]  Robert Krovetz,et al.  Viewing morphology as an inference process , 1993, Artif. Intell..

[224]  L. R. Dice.  Measures of the Amount of Ecologic Association Between Species , 1945 .

[225]  Loet Leydesdorff,et al.  Mapping the Chinese Science Citation Database in terms of aggregated journal-journal citation relations , 2005, J. Assoc. Inf. Sci. Technol..

[226]  Henry Small.  Visualizing science by citation mapping , 1999 .

[227]  M. L. Fisher,et al.  An analysis of approximations for maximizing submodular set functions—I , 1978, Math. Program..

[228]  Qin Lu,et al.  Extractive Summarization using Inter- and Intra- Event Relevance , 2006, ACL.

[229]  R. Jones,et al.  Active Learning with Feedback on Both Features and Instances , 2006 .

[230]  Ali Pinar,et al.  On Clustering on Graphs with Multiple Edge Types , 2011, Internet Math..

[231]  S H Strogatz,et al.  Random graph models of social networks , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[232]  Jan van Leeuwen,et al.  A Family of Similarity Measures Between Two Strings , 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[233]  E. A. Leicht,et al.  Large-scale structure of time evolving citation networks , 2007, 0706.0015.

[234]  Weiguo Fan,et al.  WebInEssence: A Personalized Web-Based Multi-Document Summarization and Recommendation System , 2008 .

[235]  P. Mahalanobis.  On the generalized distance in statistics , 1936 .

[236]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[237]  Loet Leydesdorff,et al.  Theories of Citation , 1998 .

[238]  Peter D. Turney.  Learning Algorithms for Keyphrase Extraction , 2000, Information Retrieval.

[239]  Yannis Manolopoulos,et al.  Generalized h-index for Disclosing Latent Facts in Citation Networks , 2006, ArXiv.

[240]  Loet Leydesdorff,et al.  Can scientific journals be classified in terms of aggregated journal-journal citation relations using the Journal Citation Reports? , 2009, J. Assoc. Inf. Sci. Technol..

[241]  Michael Gamon.  Graph-Based Text Representation for Novelty Detection , 2006 .

[242]  Albert-László Barabási,et al.  Statistical mechanics of complex networks , 2001, ArXiv.

[243]  Karen Spärck Jones.  A statistical interpretation of term specificity and its application in retrieval , 1972, J. Documentation.

[244]  Dragomir R. Radev,et al.  LexRank: Graph-based Lexical Centrality as Salience in Text Summarization , 2004, J. Artif. Intell. Res..

[245]  J. Hirsch.  Does the h index have predictive power? , 2007, Proceedings of the National Academy of Sciences.

[246]  Kai Li,et al.  Efficient k-nearest neighbor graph construction for generic similarity measures , 2011, WWW.

[247]  Alexander J. Smola,et al.  Fast Kernels for String and Tree Matching , 2002, NIPS.

[248]  B. John Oommen,et al.  The Noisy Substring Matching Problem , 1983, IEEE Transactions on Software Engineering.

[249]  Gideon S. Mann,et al.  Bibliometric impact measures leveraging topic analysis , 2006, Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '06).

[250]  Graeme Hirst,et al.  Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures , 2004 .

[251]  Dragomir R. Radev,et al.  Citation Analysis, Centrality, and the ACL Anthology , 2008 .

[252]  Dragomir R. Radev,et al.  Edge Weight Regularization over Multiple Graphs for Similarity Learning , 2010, 2010 IEEE International Conference on Data Mining.

[253]  Belver C. Griffith,et al.  A method for partitioning the journal literature , 1980, J. Am. Soc. Inf. Sci..

[254]  Dragomir R. Radev,et al.  The ACL Anthology Network , 2009 .

[255]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[256]  Inderjit S. Dhillon,et al.  Information-theoretic co-clustering , 2003, KDD '03.

[257]  Anette Hulth,et al.  Improved Automatic Keyword Extraction Given More Linguistic Knowledge , 2003, EMNLP.

[258]  Loet Leydesdorff,et al.  Clusters and Maps of Science Journals Based on Bi-connected Graphs in the Journal Citation Reports , 2009, ArXiv.

[259]  ChengXiang Zhai,et al.  Automatic labeling of multinomial topic models , 2007, KDD '07.

[260]  Bernadette Bouchon-Meunier,et al.  Enhanced web document summarization using hyperlinks , 2003, HYPERTEXT '03.

[261]  Nick Koudas,et al.  BlogScope: spatio-temporal analysis of the blogosphere , 2007, WWW '07.

[262]  Patrick Doreian,et al.  Structural equivalence in a journal network , 1985, J. Am. Soc. Inf. Sci..

[263]  Dan Roth,et al.  Learning with Feature Description Logics , 2002, ILP.

[264]  Deng Cai,et al.  Topic modeling with network regularization , 2008, WWW.

[265]  Lawrence D. Jackel,et al.  Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.