Weak signal identification with semantic web mining

We investigate the automated identification of weak signals, in the sense of Ansoff, to improve strategic planning and technological forecasting. The literature shows that weak signals can be found in an organization's environment and that they appear in different contexts. We use internet information to represent the organization's environment and select those websites that are related to a given hypothesis. In contrast to related research, we provide a methodology that uses latent semantic indexing (LSI) to identify weak signals. This improves on existing knowledge-based approaches because LSI takes meaning into account and is therefore able to identify similar textual patterns in different contexts. We introduce a new weak signal maximization approach that replaces the prediction modeling approach commonly used in LSI. It makes it possible to calculate the largest number of relevant weak signals represented by singular value decomposition (SVD) dimensions. A case study identifies and analyses weak signals to predict trends in the field of on-site medical oxygen production, which supports the planning of research and development (R&D) for a medical oxygen supplier. The results show that the proposed methodology enables organizations to identify weak signals from the internet for a given hypothesis, which helps strategic planners to react ahead of time.
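
To illustrate the LSI step mentioned in the abstract, the following is a minimal sketch of how a truncated SVD can place documents with related vocabulary close together in a low-dimensional space. It is not the authors' implementation and does not reproduce the weak signal maximization approach; the toy corpus, the raw term counts, the choice of `k = 2`, and all function names are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical toy corpus standing in for crawled web pages (not from the paper).
docs = [
    "oxygen concentrator hospital supply",
    "hospital oxygen generator on site",
    "football league match result",
    "football match ticket sale",
]


def term_document_matrix(texts):
    """Build a raw term-count matrix with one row per term, one column per document."""
    vocab = sorted({word for text in texts for word in text.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(texts)))
    for j, text in enumerate(texts):
        for word in text.split():
            matrix[index[word], j] += 1.0
    return matrix, vocab


def lsi_document_vectors(matrix, k):
    """Project documents onto the k largest SVD dimensions (assumed k, chosen by hand)."""
    _, singular_values, vt = np.linalg.svd(matrix, full_matrices=False)
    # Each row of the result is one document in the k-dimensional latent space.
    return (np.diag(singular_values[:k]) @ vt[:k, :]).T


def cosine(a, b):
    """Cosine similarity between two latent document vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


A, vocab = term_document_matrix(docs)
D = lsi_document_vectors(A, k=2)

sim_same = cosine(D[0], D[1])  # two oxygen-related pages
sim_diff = cosine(D[0], D[2])  # oxygen page vs. football page
print(f"same topic: {sim_same:.3f}, different topic: {sim_diff:.3f}")
```

In this sketch the two oxygen-related pages end up far more similar to each other than to the football pages. A realistic pipeline would typically apply term weighting and select the number of SVD dimensions systematically rather than fixing it by hand.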
