Power Law Distributions in Information Retrieval
[1] J. Drucker,et al. Regional Industrial Dominance and Business Success: A Productivity-Based Analysis [Dissertation] , 2007 .
[2] Hans Peter Luhn,et al. The Automatic Creation of Literature Abstracts , 1958, IBM J. Res. Dev..
[3] Ripunjai K. Shukla,et al. On the proficient use of GEV distribution: a case study of subtropical monsoon region in India , 2012, 1203.0642.
[4] Gang Wang,et al. Exploiting query term correlation for list caching in web search engines , 2013, CIKM.
[5] Wolfgang G. Stock,et al. "Power tags" in information retrieval , 2010, Libr. Hi Tech.
[6] B. M. Hill,et al. A Simple General Approach to Inference About the Tail of a Distribution , 1975 .
[7] Ramon Ferrer-i-Cancho,et al. Random Texts Do Not Exhibit the Real Zipf's Law-Like Rank Distribution , 2010, PloS one.
[8] Joshua Drucker,et al. Regional dominance and industrial success: a productivity-based analysis , 2007 .
[9] Gregory W. Corder,et al. Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach , 2009 .
[10] Alan F. Smeaton,et al. Replicating Web Structure in Small-Scale Test Collections , 2004, Information Retrieval.
[11] Raisa E. Feldman,et al. Limit Distributions for Sums of Independent Random Vectors , 2002 .
[12] Matthew Hurst,et al. BlogPulse: Automated Trend Discovery for Weblogs , 2003 .
[13] Avi Arampatzis,et al. A signal-to-noise approach to score normalization , 2009, CIKM.
[14] Stasa Milojevic,et al. Power law distributions in information science: Making the case for logarithmic binning , 2010, J. Assoc. Inf. Sci. Technol..
[15] Leonid Kopylev,et al. Constrained Parameters in Applications: Review of Issues and Approaches , 2012 .
[16] Wolfgang Kellerer,et al. Outtweeting the Twitterers - Predicting Information Cascades in Microblogs , 2010, WOSN.
[17] Enhong Chen,et al. Context-aware query suggestion by mining click-through and session data , 2008, KDD.
[18] Ricardo A. Baeza-Yates,et al. A Three Level Search Engine Index Based in Query Log Distribution , 2003, SPIRE.
[19] Q. Vuong. Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses , 1989 .
[20] Debora Donato,et al. Determining Factors Behind the PageRank Log-Log Plot , 2007, WAW.
[21] Serena H. Chen,et al. Good practice in Bayesian network modelling , 2012, Environ. Model. Softw..
[22] Ingemar J. Cox,et al. On the Feasibility of Unstructured Peer-to-Peer Information Retrieval , 2011, ICTIR.
[23] Victor R. Lesser,et al. Multi-agent based peer-to-peer information retrieval systems with concurrent search sessions , 2006, AAMAS '06.
[24] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[25] Reka Albert,et al. Mean-field theory for scale-free random networks , 1999 .
[26] John W. Emerson,et al. Nonparametric Goodness-of-Fit Tests for Discrete Null Distributions , 2011, R J..
[27] H. Akaike. A new look at the statistical model identification , 1974 .
[28] Serge Fdida,et al. From popularity prediction to ranking online news , 2014, Social Network Analysis and Mining.
[29] David R. Anderson,et al. Model selection and multimodel inference : a practical information-theoretic approach , 2003 .
[30] Greg N. Gregoriou. Operational Risk Toward Basel III: Best Practices and Issues in Modeling, Management, and Regulation , 2009 .
[31] Jaap Kamps,et al. The Importance of Link Evidence in Wikipedia , 2008, ECIR.
[32] Gerard Salton,et al. Research and Development in Information Retrieval , 1982, Lecture Notes in Computer Science.
[33] C. Mallows. More comments on C_p , 1995 .
[34] Ibrahim Matta,et al. On the origin of power laws in Internet topologies , 2000, CCRV.
[35] Abraham Bookstein,et al. Informetric distributions, part I: Unified overview , 1990, J. Am. Soc. Inf. Sci..
[36] Matjaz Perc,et al. Zipf's law and log-normal distributions in measures of scientific output across fields and institutions: 40 years of Slovenia's research as an example , 2010, J. Informetrics.
[37] Leo Egghe,et al. The Distribution of N-Grams , 2000, Scientometrics.
[38] Abdur Chowdhury,et al. A picture of search , 2006, InfoScale '06.
[39] Gilad Mishne,et al. Leave a Reply: An Analysis of Weblog Comments , 2006 .
[40] Hosung Park,et al. What is Twitter, a social network or a news media? , 2010, WWW '10.
[41] Clifford M. Hurvich,et al. Regression and time series model selection in small samples , 1989 .
[42] Ricardo A. Baeza-Yates,et al. Extracting semantic relations from query logs , 2007, KDD '07.
[43] Stephen E. Fienberg,et al. Testing Statistical Hypotheses , 2005 .
[44] M. Newman. Power laws, Pareto distributions and Zipf's law , 2005 .
[45] Peter Ingwersen,et al. Developing a Test Collection for the Evaluation of Integrated Search , 2010, ECIR.
[46] Laboratoires d'Electronique. An Informational Theory of the Statistical Structure of Language , 2010 .
[47] J. Eeckhout. Gibrat's Law for (All) Cities , 2004 .
[48] Pasquale Cirillo,et al. Are your data really Pareto distributed , 2013, 1306.0100.
[49] Wang Dahui,et al. True reason for Zipf's law in language , 2005 .
[50] D. Cox,et al. An Analysis of Transformations , 1964 .
[51] Bruce M. Maggs,et al. Efficient content location using interest-based locality in peer-to-peer systems , 2003, IEEE INFOCOM 2003. Twenty-second Annual Joint Conference of the IEEE Computer and Communications Societies (IEEE Cat. No.03CH37428).
[52] C. Schunn,et al. Evaluating Goodness-of-Fit in Comparison of Models to Data , 2005 .
[53] Christina Lioma. Part of speech N-grams for information retrieval , 2008 .
[54] Wolfgang Gatterbauer,et al. Rules of Thumb for Information Acquisition from Large and Redundant Data , 2010, ECIR.
[55] Ian T. Foster,et al. Mapping the Gnutella Network: Macroscopic Properties of Large-Scale Peer-to-Peer Systems , 2002, IPTPS.
[56] Christos Faloutsos,et al. Graph mining: Laws, generators, and algorithms , 2006, CSUR.
[57] David M. Pennock,et al. Winners don't take all: Characterizing the competition for links on the web , 2002, Proceedings of the National Academy of Sciences of the United States of America.
[58] Karen Spärck Jones. A statistical interpretation of term specificity and its application in retrieval , 2021, J. Documentation.
[59] P. Hall,et al. Estimating a tail exponent by modelling departure from a Pareto distribution , 1999 .
[60] Christopher D. Manning,et al. Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol..
[61] Hongyuan Zha,et al. Exploring social annotations for information retrieval , 2008, WWW.
[62] H. Simon,et al. On a Class of Skew Distribution Functions , 1955 .
[63] Wei-Ying Ma,et al. Optimizing web search using web click-through data , 2004, CIKM '04.
[64] Pavlin Mavrodiev,et al. Social resilience in online communities: the autopsy of friendster , 2013, COSN '13.
[65] Paul Ormerod,et al. to be published , 1995 .
[66] Geoffrey Sampson,et al. Word frequency distributions , 2002, Computational Linguistics.
[67] Michael A. Bean,et al. Probability: The Science of Uncertainty with Applications to Investments, Insurance, and Engineering , 2000 .
[68] Jorma Rissanen,et al. Minimum Description Length Principle , 2010, Encyclopedia of Machine Learning.
[69] Kenneth Ward Church,et al. Heavy-tailed distributions and multi-keyword queries , 2007, SIGIR.
[70] C. L. Mallows. Some comments on C_p , 1973 .
[71] Nicole A. Lazar,et al. Statistics of Extremes: Theory and Applications , 2005, Technometrics.
[72] Torsten Suel,et al. Batch query processing for web search engines , 2011, WSDM '11.
[73] Karen Sparck Jones. A statistical interpretation of term specificity and its application in retrieval , 1972 .
[74] Avi Arampatzis,et al. A study of query length , 2008, SIGIR '08.
[75] Luca Becchetti,et al. The distribution of PageRank follows a power-law only for particular values of the damping factor , 2006, WWW '06.
[76] Yoav Goldberg,et al. A Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books , 2013, *SEMEVAL.
[77] Markus Koppenberger,et al. Topology of music recommendation networks. , 2006, Chaos.
[78] D. Posada,et al. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. , 2004, Systematic biology.
[79] Christopher R. Palmer,et al. Generating network topologies that obey power laws , 2000, Globecom '00 - IEEE. Global Telecommunications Conference. Conference Record (Cat. No.00CH37137).
[80] Wuying Liu,et al. Power Law for Text Categorization , 2013, CCL.
[81] Hai Jin,et al. Efficient search for peer-to-peer information retrieval using semantic small world , 2006, WWW '06.
[82] Maarten de Rijke,et al. Using Prior Information Derived from Citations in Literature Search , 2007, RIAO.
[83] J. MacKinnon,et al. Several Tests for Model Specification in the Presence of Alternative Hypotheses , 1981 .
[84] Hinrich Schütze,et al. Introduction to information retrieval , 2008 .
[85] Mark Voorneveld,et al. Superstars without Talent? The Yule Distribution Controversy , 2009, The Review of Economics and Statistics.
[86] Michael Mitzenmacher,et al. A Brief History of Generative Models for Power Law and Lognormal Distributions , 2004, Internet Math..
[87] Leif Azzopardi. Query side evaluation: an empirical analysis of effectiveness and effort , 2009, SIGIR.
[88] Ioannis Partalas,et al. Re-ranking approach to classification in large-scale power-law distributed category systems , 2014, SIGIR.
[89] Li Fan,et al. Web caching and Zipf-like distributions: evidence and implications , 1999, IEEE INFOCOM '99. Conference on Computer Communications. Proceedings. Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies. The Future is Now (Cat. No.99CH36320).
[90] Aristides Gionis,et al. The impact of caching on search engines , 2007, SIGIR.
[91] Kevin A. Clarke. Nonparametric Model Discrimination in International Relations , 2003 .
[92] Ricardo A. Baeza-Yates,et al. Content-Based Image Retrieval and Characterization on Specific Web Collections , 2004, CIVR.
[93] Ian Soboroff,et al. Does WT10g look like the web? , 2002, SIGIR '02.
[94] Kevin A. Clarke. A Simple Distribution-Free Test for Nonnested Model Selection , 2007, Political Analysis.
[95] Azer Bestavros,et al. Sources and characteristics of Web temporal locality , 2000, Proceedings 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (Cat. No.PR00728).
[96] N. L. Johnson,et al. Continuous Multivariate Distributions, Volume 1: Models and Applications , 2019 .
[97] Dietrich Klakow,et al. Hierarchical Pitman-Yor language model for information retrieval , 2010, SIGIR '10.
[98] Yan Lu,et al. Characteristics of character usage in Chinese Web searching , 2009, Inf. Process. Manag..
[99] Christina Lioma,et al. Part of speech n-grams and Information Retrieval , 2008 .
[100] Peter Nijkamp,et al. Accessibility of Cities in the Digital Economy , 2004, cond-mat/0412004.
[101] S. Redner. How popular is your paper? An empirical study of the citation distribution , 1998, cond-mat/9804163.
[102] V. Strickler,et al. Statistical String Theory for Courts: If the Data Don't Fit . . . . , 2008 .
[103] R. Strawderman. Continuous Multivariate Distributions, Volume 1: Models and Applications , 2001 .
[104] Luca Vogt,et al. When Genius Failed The Rise And Fall Of Long Term Capital Management , 2016 .
[105] J. Eric Bickel,et al. Reexamining Discrete Approximations to Continuous Distributions , 2013, Decis. Anal..
[106] X. Gabaix. Power Laws in Economics and Finance , 2008 .
[107] M. Crovella,et al. Estimating the Heavy Tail Index from Scaling Properties , 1999 .
[108] W. Reed. The Pareto law of incomes—an explanation and an extension , 2003 .
[109] Wolfgang Nejdl,et al. Can all tags be used for search? , 2008, CIKM '08.
[110] Francis Jack Smith,et al. Extension of Zipf’s Law to Word and Character N-grams for English and Chinese , 2003, ROCLING/IJCLCLP.
[111] J. Hilbe. Negative Binomial Regression: Preface , 2007 .
[112] Zhiyong Lu,et al. Predicting clicks of PubMed articles , 2013, AMIA.
[113] Charles L. A. Clarke,et al. Efficient and effective spam filtering and re-ranking for large web datasets , 2010, Information Retrieval.
[114] Noriaki Kawamae,et al. Supervised N-gram topic model , 2014, WSDM.
[115] Mao Ye,et al. Exploiting geographical influence for collaborative point-of-interest recommendation , 2011, SIGIR.
[116] G. Allport. The Psycho-Biology of Language , 1936 .
[117] Andreas Hotho,et al. Information Retrieval in Folksonomies: Search and Ranking , 2006, ESWC.
[118] Andrei Z. Broder,et al. Graph structure in the Web , 2000, Comput. Networks.
[119] H Pashler,et al. How persuasive is a good fit? A comment on theory testing. , 2000, Psychological review.
[120] Lada A. Adamic,et al. Search in Power-Law Networks , 2001, Physical review. E, Statistical, nonlinear, and soft matter physics.
[121] Lada A. Adamic,et al. Evolutionary Dynamics of the World Wide Web , 1999 .
[122] R. E. Wheeler. Statistical distributions , 1983, APLQ.
[123] M. Meerschaert,et al. Limit Distributions for Sums of Independent Random Vectors: Heavy Tails in Theory and Practice , 2001 .
[124] Jérôme Kunegis,et al. Fairness on the web: alternatives to the power law , 2012, WebSci '12.
[125] G. Zipf,et al. The Psycho-Biology of Language , 1936 .
[126] Yuval Shavitt,et al. On the Applicability of Peer-to-peer Data in Music Information Retrieval Research , 2010, ISMIR.
[127] H. Bauke. Parameter estimation for power-law distributions by maximum likelihood methods , 2007, 0704.1867.
[128] M. Evans. Statistical Distributions , 2000 .
[129] Mark E. J. Newman,et al. Power-Law Distributions in Empirical Data , 2007, SIAM Rev..
[130] Matthias Hagen,et al. The power of naive query segmentation , 2010, SIGIR '10.
[131] Robert Tappan Morris,et al. DNS performance and the effectiveness of caching , 2001, IMW '01.
[132] Andreas Hotho,et al. Logsonomy - social information retrieval with logdata , 2008, Hypertext.
[133] Brian Peacock,et al. Statistical Distributions: Forbes/Statistical Distributions 4E , 2010 .
[134] Albert Maydeu-Olivares,et al. Goodness-of-Fit Testing , 2010 .
[135] Nick Craswell,et al. Random walks on the click graph , 2007, SIGIR.
[136] Yiming Yang,et al. Support vector machines classification with a very large-scale taxonomy , 2005, SKDD.
[137] William J. Reed,et al. The Double Pareto-Lognormal Distribution—A New Parametric Model for Size Distributions , 2004, WWW 2001.
[138] G. Miller,et al. Some effects of intermittent silence. , 1957, The American journal of psychology.
[139] Leif Azzopardi,et al. Age Dependent Document Priors in Link Structure Analysis , 2005, ECIR.
[140] Valentin Robu,et al. The complex dynamics of collaborative tagging , 2007, WWW '07.
[141] Jean Monnet-Saint-Etienne. Discretization of Continuous Attributes , 2015 .
[142] Mark B. Sandler,et al. Music Information Retrieval Using Social Tags and Audio , 2009, IEEE Transactions on Multimedia.
[143] Domenico Cantone,et al. Finite State Models for the Generation of Large Corpora of Natural Language Texts , 2009, FSMNLP.
[144] Venugopalan Ramasubramanian,et al. Beehive: Exploiting Power Law Query Distributions for O(1) Lookup Performance in Peer to Peer Overlays , 2003 .
[145] J. Pitman,et al. The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator , 1997 .
[146] H. S. Heaps,et al. Information retrieval, computational and theoretical aspects , 1978 .
[147] Emmanuel J. Yannakoudakis,et al. n-Grams and their implication to natural language understanding , 1990, Pattern Recognit..
[148] Iadh Ounis,et al. Light Syntactically-Based Index Pruning for Information Retrieval , 2007, ECIR.
[149] M. Clements,et al. The influence of personalization on tag query length in social media search , 2010, Inf. Process. Manag..
[150] Colin L. Mallows,et al. Some Comments on Cp , 2000, Technometrics.
[151] Hiroshi Nakagawa,et al. Topic models with power-law using Pitman-Yor process , 2010, KDD.
[152] R. Albert,et al. The large-scale organization of metabolic networks , 2000, Nature.
[153] Nitish Srivastava,et al. Modeling Documents with Deep Boltzmann Machines , 2013, UAI.
[154] Andrea Esuli,et al. CoPhIR: a Test Collection for Content-Based Image Retrieval , 2009, ArXiv.
[155] Iadh Ounis,et al. A syntactically-based query reformulation technique for information retrieval , 2008, Inf. Process. Manag..
[156] András A. Benczúr,et al. SpamRank - fully automatic link spam detection. Work in progress , 2005 .
[157] Roelof van Zwol,et al. Flickr tag recommendation based on collective knowledge , 2008, WWW.
[158] Yiming Yang,et al. A scalability analysis of classifiers in text categorization , 2003, SIGIR.