The Pushshift Reddit Dataset
Jeremy Blackburn | Savvas Zannettou | Jason Baumgartner | Brian Keegan | Megan Squire
[1] Jason Chuang,et al. Large-Scale Topical Analysis of Multiple Online News Sources with Media Cloud , 2014 .
[2] Elizabeth Gibney. Privacy hurdles thwart Facebook democracy research , 2019, Nature.
[3] Gabriel Skantze,et al. Crowdsourcing a self-evolving dialog graph , 2019, CUI.
[4] Amy Bruckman,et al. Does Transparency in Moderation Really Matter? , 2019, Proc. ACM Hum. Comput. Interact..
[5] Michael Mattioli,et al. Big data, bigger dilemmas: A critical review , 2015, J. Assoc. Inf. Sci. Technol..
[6] Elinor Ostrom,et al. Ideas, Artifacts, and Facilities: Information as a Common-Pool Resource , 2003 .
[7] Abdolreza Abhari,et al. Using Deep Learning to Recommend Discussion Threads to Users in an Online Forum , 2018, 2018 International Joint Conference on Neural Networks (IJCNN).
[8] Benno Stein,et al. TL;DR: Mining Reddit to Learn Automatic Summarization , 2017, NFiS@EMNLP.
[9] Nathalie Japkowicz,et al. Towards Ethical Content-Based Detection Of Online Influence Campaigns , 2019, 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP).
[10] Tilmann Rabl,et al. An Intermediate Representation for Optimizing Machine Learning Pipelines , 2019, Proc. VLDB Endow..
[11] Carlos Castillo,et al. Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries , 2019, Front. Big Data.
[12] Sumaru Niida,et al. The Impact of Social Network Structure on the Growth and Survival of Online Communities , 2019, 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).
[13] Sune Lehmann,et al. Accelerating dynamics of collective attention , 2019, Nature Communications.
[14] Chenhao Tan,et al. Tracing Community Genealogy: How New Communities Emerge from the Old , 2018, ICWSM.
[15] Gianluca Stringhini,et al. On the Origins of Memes by Means of Fringe Web Communities , 2018 .
[16] Ali Ahmadvand,et al. ConCET: Entity-Aware Topic Classification for Open-Domain Conversational Agents , 2019, CIKM.
[17] J. Nathan Matias,et al. Caveat emptor, computational social science: Large-scale missing data in a widely-published Reddit corpus , 2018, PloS one.
[18] Gloria Mark,et al. Detecting Potential Warning Behaviors of Ideological Radicalization in an Alt-Right Subreddit , 2019, ICWSM.
[19] Gianluca Stringhini,et al. Understanding Web Archiving Services and Their (Mis)Use on Social Media , 2018, ICWSM.
[20] D. Lazer,et al. Data ex Machina: Introduction to Big Data , 2017 .
[21] Sibel Adali,et al. The Impact of Crowds on News Engagement: A Reddit Case Study , 2017, Proceedings of the International AAAI Conference on Web and Social Media.
[22] Tilmann Rabl,et al. ScootR: Scaling R Dataframes on Dataflow Systems , 2018, SoCC.
[23] Tilmann Rabl,et al. BlockJoin: Efficient Matrix Partitioning Through Joins , 2017, Proc. VLDB Endow..
[24] Çağrı Çöltekin,et al. Identifying Depression on Reddit: The Effect of Training Data , 2018, EMNLP 2018.
[25] Daniel Arthur Hunter. Cyberspace as Place, and the Tragedy of the Digital Anticommons , 2002 .
[26] Kate Starbird,et al. Examining the Alternative Media Ecosystem Through the Production of Alternative Narratives of Mass Shooting Events on Twitter , 2017, ICWSM.
[27] Wei Wang,et al. Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking , 2018, NAACL.
[28] Alessio Botta,et al. Monetizing data: A new source of value in payments , 2017 .
[29] Bernard J. Jansen,et al. View, Like, Comment, Post: Analyzing User Engagement by Topic at 4 Levels across 5 Social Media Platforms for 53 News Organizations , 2019, ICWSM.
[30] Eric Gilbert,et al. The Internet's Hidden Rules , 2018, Proceedings of the ACM on Human-Computer Interaction.
[31] Jean-Charles Delvenne,et al. Modelling structure and predicting dynamics of discussion threads in online boards , 2018, J. Complex Networks.
[32] John Kelly,et al. Polarization, Partisanship and Junk News Consumption over Social Media in the US , 2018, ArXiv.
[33] Scott J Leischow,et al. Underage JUUL Use Patterns: Content Analysis of Reddit Messages , 2019, Journal of medical Internet research.
[34] Kathy McKeown,et al. Fixed That for You: Generating Contrastive Claims with Semantic Edits , 2019, NAACL.
[35] Wenji Mao,et al. Social Computing: From Social Informatics to Social Intelligence , 2007, IEEE Intell. Syst..
[36] Ana Paula Couto da Silva,et al. Online Social Networks in Health Care: A Study of Mental Disorders on Reddit , 2018, 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI).
[37] Cornelius Puschmann. An end to the wild west of social media research: a response to Axel Bruns , 2019, Information, Communication & Society.
[38] Gianluca Stringhini,et al. Who Let The Trolls Out?: Towards Understanding State-Sponsored Trolls , 2018, WebSci.
[39] Cliff Lampe,et al. Big Data in Survey Research AAPOR Task Force Report , 2015 .
[40] G. N. Gilbert. Computational Social Science , 2010 .
[41] Keith N. Hampton. Studying the Digital: Directions and Challenges for Digital Methods , 2017 .
[42] Mark Dredze,et al. Elites and foreign actors among the alt-right: The Gab social media platform , 2019, First Monday.
[43] David A. Broniatowski,et al. Characterizing Trends in Human Papillomavirus Vaccine Discourse on Reddit (2007-2015): An Observational Study , 2019, JMIR Public Health and Surveillance.
[44] Scott A. Golder,et al. Digital Footprints: Opportunities and Challenges for Online Social Research , 2014 .
[45] J. Nathan Matias,et al. Going Dark: Social Factors in Collective Action Against Platform Operators in the Reddit Blackout , 2016, CHI.
[46] David M. Mimno,et al. Cats and Captions vs. Creators and the Clock: Comparing Multimodal Content to Context in Predicting Relative Popularity , 2017, WWW.
[47] Tawfiq Ammari,et al. Self-declared Throwaway Accounts on Reddit , 2019, Proceedings of the ACM on Human-Computer Interaction.
[48] D. Boyd. Untangling research and practice: What Facebook’s “emotional contagion” study teaches us , 2016 .
[49] Alexander Halavais. Overcoming terms of service: a proposal for ethical distributed research , 2019, Information, Communication & Society.
[50] Denis Helic,et al. Evaluating narrative-driven movie recommendations on Reddit , 2019, IUI.
[51] Denis Helic,et al. Modeling User Dynamics in Collaboration Websites , 2017, Dynamics On and Of Complex Networks III.
[52] Georgios Gousios,et al. The GHTorrent dataset and tool suite , 2013, 2013 10th Working Conference on Mining Software Repositories (MSR).
[53] Venkata Rama Kiran Garimella,et al. WhatsApp, Doc? A First Look at WhatsApp Public Group Data , 2018, ICWSM 2018.
[54] Jacob A Rohde,et al. Topic Clustering of E-Cigarette Submissions Among Reddit Communities: A Network Perspective , 2019, Health education & behavior : the official publication of the Society for Public Health Education.
[55] Gisele L. Pappa,et al. Reddit Weight Loss Communities: Do They Have What It Takes for Effective Health Interventions? , 2018, 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI).
[56] Jacob Eisenstein,et al. You Can't Stay Here , 2017 .
[57] Alice E. Marwick,et al. Media Manipulation and Disinformation Online , 2017 .
[58] V. Burris,et al. White Supremacist Networks on the Internet , 2000 .
[59] Nancy Fulda,et al. Semantically Aligned Sentence-Level Embeddings for Agent Autonomy and Natural Language Understanding , 2019 .
[60] Georgios Paliouras,et al. TimeRank: A Random Walk Approach for Community Discovery in Dynamic Networks , 2018, COMPLEX NETWORKS.
[61] Katrin Weller,et al. A manifesto for data sharing in social media research , 2016, WebSci.
[62] Dan Mercea,et al. The disinformation landscape and the lockdown of social platforms , 2019, Information, Communication & Society.
[63] Xuan Zhu,et al. Quantifying Context Overlap for Training Word Embeddings , 2018, EMNLP.
[64] L. Manovich,et al. Trending: The Promises and the Challenges of Big Social Data , 2012 .
[65] Sergey I. Nikolenko,et al. Lost in Conversation: A Conversational Agent Based on the Transformer and Transfer Learning , 2019 .
[66] Fulya Ozcan. Bayesian Nonparametric Models on Big Data , 2017 .
[67] Jeffrey Mervis. Privacy concerns could derail Facebook data-sharing plan. , 2019, Science.
[68] Srayan Datta,et al. Identifying Misaligned Inter-Group Links and Communities , 2017, Proc. ACM Hum. Comput. Interact..
[69] L. Palen,et al. Crisis informatics—New data for extraordinary times , 2016, Science.
[70] Jure Leskovec,et al. Community Interaction and Conflict on the Web , 2018, WWW.
[71] Carlos Guestrin,et al. The Rise and Fall of Network Stars , 2017, Inf. Process. Manag..
[72] Zeynep Tufekci,et al. Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls , 2014, ICWSM.
[73] K. Erikson,et al. Discovering the Social , 2018 .
[74] Jisun An,et al. Political Discussions in Homogeneous and Cross-Cutting Communication Spaces , 2019, ICWSM.
[75] Cristian Danescu-Niculescu-Mizil,et al. Content Removal as a Moderation Strategy , 2019, Proc. ACM Hum. Comput. Interact..
[76] Adrienne Massanari,et al. #Gamergate and The Fappening: How Reddit’s algorithm, governance, and culture support toxic technocultures , 2017, New Media Soc..
[77] Eshwar Chandrasekharan,et al. Crossmod: A Cross-Community Learning-based System to Assist Reddit Moderators , 2019, Proc. ACM Hum. Comput. Interact..
[78] Daniela Stan Raicu,et al. Automatic extraction of informal topics from online suicidal ideation , 2018, BMC Bioinformatics.
[79] Tim Squirrell,et al. Platform dialectics: The relationships between volunteer moderators and end users on reddit , 2019, New Media Soc..
[80] Srayan Datta,et al. Extracting Inter-community Conflicts in Reddit , 2018, ICWSM.
[81] Mihael Arcan,et al. First Insights on a Passive Major Depressive Disorder Prediction System with Incorporated Conversational Chatbot , 2018, AICS.
[82] Gian Paolo Rossi,et al. Mastodon Content Warnings: Inappropriate Contents in a Microblogging Platform , 2019, ICWSM.
[83] Kevin Crowston,et al. Validity Issues in the Use of Social Network Analysis with Digital Trace Data , 2011, J. Assoc. Inf. Syst..
[84] Christine L. Borgman,et al. The conundrum of sharing research data , 2012, J. Assoc. Inf. Sci. Technol..
[85] Alex Wang,et al. Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling , 2018, ACL.
[86] Wen Zheng,et al. Enhancing Conversational Dialogue Models with Grounded Knowledge , 2019, CIKM.
[87] Maria Glenski,et al. Characterizing Speed and Scale of Cryptocurrency Discussion Spread on Reddit , 2019, WWW.
[88] Geoff Kaufman,et al. Moderator engagement and community development in the age of algorithms , 2019, New Media Soc..
[89] Jonathan Gemmell,et al. Discovery of Informal Topics from Post Traumatic Stress Disorder Forums , 2017, 2017 IEEE International Conference on Data Mining Workshops (ICDMW).
[90] Leon Derczynski,et al. Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition , 2017, NUT@EMNLP.
[91] Axel Bruns,et al. After the ‘APIcalypse’: social media platforms and their fight against critical scholarly research , 2019, Information, Communication & Society.
[92] André Panisson,et al. Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort , 2019, WWW.
[93] E. Walker,et al. A machine learning approach to predicting psychosis using semantic density and latent content analysis , 2019, npj Schizophrenia.
[94] Harith Alani,et al. Exploring Misogyny across the Manosphere in Reddit , 2019, WebSci.
[95] Jonathan Gemmell,et al. Detecting and Characterizing Trends in Online Mental Health Discussions , 2018, 2018 IEEE International Conference on Data Mining Workshops (ICDMW).
[96] Chenhao Tan,et al. Are All Successful Communities Alike? Characterizing and Predicting the Success of Online Communities , 2019, WWW.
[97] Mohammad Al Hasan,et al. Investigate Transitions into Drug Addiction through Text Mining of Reddit Data , 2019, KDD.
[98] Pablo Gamallo,et al. Contextualized Translations of Phrasal Verbs with Distributional Compositional Semantics and Monolingual Corpora , 2019, Computational Linguistics.
[99] James Boyle,et al. The Second Enclosure Movement and the Construction of the Public Domain , 2003 .
[100] D. Boyd,et al. Critical Questions for Big Data , 2012 .
[101] Eric Gilbert,et al. The Bag of Communities: Identifying Abusive Behavior Online with Preexisting Internet Data , 2017, CHI.
[102] Emily T Hébert,et al. A content analysis of JUUL discussions on social media: Using Reddit to understand patterns and perceptions of JUUL use. , 2019, Drug and alcohol dependence.
[103] Casey Fiesler,et al. Reddit Rules! Characterizing an Ecosystem of Governance , 2018, ICWSM.
[104] Deen Freelon,et al. On the Interpretation of Digital Trace Data in Communication and Social Computing Research , 2014 .
[105] Bernard J. Jansen,et al. Detecting Toxicity Triggers in Online Discussions , 2019, HT.
[106] Deen Freelon. Computational Research in the Post-API Age , 2018, Political Communication.
[107] Amy Bruckman,et al. "Did You Suspect the Post Would be Removed?" , 2019, Proc. ACM Hum. Comput. Interact..
[108] Steven A. Sumner,et al. Increases in Online Posts About Synthetic Opioids Preceding Increases in Synthetic Opioid Death Rates: a Retrospective Observational Study , 2019, Journal of General Internal Medicine.
[109] Andrew Johnston,et al. Identifying Extremism in Text Using Deep Learning , 2020, Development and Analysis of Deep Learning Architectures.
[110] C. Rosé,et al. The Discourse of Online Content Moderation: Investigating Polarized User Responses to Changes in Reddit’s Quarantine Policy , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[111] Ryan Wesslen,et al. Shouting into the Void: A Database of the Alternative Social Media Platform Gab , 2019, ICWSM.
[112] J. Nathan Matias,et al. The Civic Labor of Volunteer Moderators Online , 2019, Social Media + Society.