In Search of Disruptive Ideas - Outlier Detection Techniques in Crowdsourcing Innovation Platforms

A key challenge for data science in open innovation web systems is finding the best ideas among thousands of community submissions. To date, this has been done with metrics that reflect enterprise needs or community preferences. This article proposes a different direction: inspired by theoretical studies of disruptive innovation, we frame valuable ideas as those rarely taken up by the masses or by organisations, yet with the potential to change industries. Our aim is to find technological means of detecting such innovations automatically to aid decision making. Building on past findings from the business sciences on the nature of disruptive innovations, the article presents a comparative study of multiple outlier detection algorithms applied to two real-world datasets containing textual descriptions of ideas for different industries. The results demonstrate that outlier detection is capable of this task and show the k-NN algorithm with TF-IDF and cosine distance to be the best candidate.
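To make the best-performing combination concrete, the following is a minimal sketch of k-NN outlier scoring over TF-IDF vectors with cosine distance. It is not the article's implementation: the whitespace tokenisation, the smoothed IDF formula, the choice of the distance to the k-th nearest neighbour as the score, the value of `k`, and the sample `ideas` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenisation (assumption)
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Relative term frequency times a smoothed IDF (assumed variant).
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def knn_outlier_scores(docs, k=2):
    """Score each document by its distance to its k-th nearest neighbour."""
    vectors = tfidf_vectors(docs)
    scores = []
    for i, vec in enumerate(vectors):
        distances = sorted(
            cosine_distance(vec, other)
            for j, other in enumerate(vectors)
            if j != i
        )
        scores.append(distances[k - 1])
    return scores


# Hypothetical idea descriptions, invented for this sketch.
ideas = [
    "add more vegan options to the coffee menu",
    "add more vegan snacks to the menu",
    "offer vegan milk options for coffee",
    "rent idle delivery drones from local farmers overnight",
]
scores = knn_outlier_scores(ideas, k=2)
ranking = sorted(range(len(ideas)), key=lambda i: scores[i], reverse=True)
for i in ranking:
    print(f"{scores[i]:.3f}  {ideas[i]}")
```

Ideas with the highest scores sit far from their nearest neighbours in the TF-IDF space, which is the sense in which they are candidate outliers. On a real platform, `k` and the text preprocessing would need tuning against labelled examples.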
