Identifying Opportunities in Multilingual Business Environments Using Environmental Scanning and Text Mining Techniques

The identification of opportunities for growth becomes easier when managers have comprehensive information about the business environment. Recognizing such opportunities can also help sustain competitive advantage. Information relevant to the business environment is usually written and posted in many languages and can be accessed from many sources. Collecting this information is time consuming and labor intensive, and techniques such as environmental scanning, proposed in previous research, can facilitate the search. In this study, we propose a technique that uses text mining tools to automatically search, translate, and extract information from online documents. The updated information produced by these tools will be current and accessible to all levels of management, and will facilitate managerial decision making.

INTRODUCTION

As organizations expand their operations into different countries, information about the local economy, customers, and competitors must be available to management so that timely decisions can be made. This helps organizations succeed and achieve competitive advantage. The timely processing and transfer of information to decision makers has been defined in different ways in past research. In this research we use the term Environmental Scanning (ES). Decker, Wagner, and Scholz (2005) described ES as a means by which managers study relevant business environments. Culnan (1983) described ES as the acquisition of information about events taking place outside the organization that can be used to respond effectively to changes in the environment. Hambrick (1982) described ES as the key step in the process of organizational adaptation to the business setting. In summary, ES is the acquisition of information that allows management to make effective decisions. An organization must decide what information is important and what is not.
There are three sources of information that organizations typically access: human, documentary, and physical phenomena (Keegan, 1974). Many executives prefer direct human intelligence. However, with the increasing use of the Internet for accessing and disseminating information, and the need for timely information, the Internet is a quality source of information. Recently, He and Zhu (2007) looked at corporate blogs and the information that can be gained from them.

J. Strong, K. Ghosh & S. Conlon 2008 Volume 17, Numbers 3/4 190

Aguilar (1967) identified the information required by an organization and separated it into several areas (for example, market information and information about competitors). Some organizations have developed ES processes, but instead of one overall process they have developed several, including competitive intelligence (CI), knowledge management (KM), and business intelligence (BI). These processes are related to ES in that ES is the act of obtaining information about all aspects of a business. Such processes are usually departmentalized, and the information is not always combined to form actionable information in a timely fashion. Specifically, many departments control information to retain political power within the organization. Most marketing departments focus on competitor information and customer relationship management, while the production and operations department controls the supply chain management system. Each organization must therefore determine which processes have been developed and which still need to be developed, and then combine these separate processes into an overall system that improves the organization’s ability to make effective decisions with the information available. Information relevant to the business environment may be written and posted in many languages, so translating the information (in addition to searching for and storing it) becomes essential.
In this paper, we attempt to develop an environmental scanning tool that performs structured and unstructured scanning of the multilingual business environment, subject to criteria set by management, to deliver timely and relevant information. The timeliness of information can affect the quality of strategic decisions, so searching for this information and transferring it to management in time is essential. In the following sections, we discuss the motivation for this research as well as future research possibilities for ES. We also describe the tools and methods used to extract information automatically. Further, since information on the Internet is continually expanding, varied, and published in many languages, we discuss how machine translation (MT) systems facilitate the translation of information from one language (source) to another (target), thereby enhancing the capabilities of ES. Finally, an operational model that stipulates the functions and steps needed for ES is proposed and discussed, and conclusions are drawn.

Motivation

Organizations interested in expanding into international markets need comprehensive information about the business environment in those regions to make strategic decisions. To access this information, corporations need to convert local information into the language of interest (the language understood by managers). As stated earlier, a key factor for business success identified in previous literature is the use of ES (Dollinger, 1984; Daft, Sormunen, & Parks, 1988; Subramanian, Fernandes, & Harper, 1993; Ngamkroeckjoti & Johri, 2003). Salton (1970) was one of the first to examine the automatic translation of documents from one language to another.
Applying Salton’s idea of machine translation to the conversion of business information from one language to another (the target language, or language of interest), we illustrate how combining ES and MT systems can improve the quality (accuracy and timeliness) of the information available to multinational corporations.

Opportunities in Multilingual Journal of International Technology and Information Management 191

Muralidharan (2003) surveyed multinational corporations and found that these firms perform macro ES, financial ES, and market competitive ES. The first, macro ES, covers conditions in the foreign country: societal attitudes toward foreign companies, general demographic trends, government regulations on foreign investment, trade pacts involving the country, and technology development. The second, financial ES, covers the foreign country’s inflation rate, prime lending interest rate, prices of specific raw materials, and currency exchange rate. The last, market competitive ES, covers competitor actions in the foreign country and the market response to the multinational company there. With these areas of ES in mind, research scholars and commercial software firms have built several ES and CI prototypes. Liu, Turban, and Lee (2000) developed MasterScan for the pulp and paper industry. CI Spider was developed by Chen, Chau, and Zeng (2002). McManus and Snyder (2003) developed EPSS to gather information from the internal sources of an organization. Decker et al. (2005) used the Information Foraging Theory (IFT) of Pirolli and Card (1999) to develop a system capable of performing ES.
Fuld and Company reviewed seventeen ES/CI commercial software packages in their “2006-2007 Intelligence Report.” The report provided a two-page review and analysis of each package and rated each one using comparison metrics such as the ability to conduct meta-searches and the filtering of extracted information. In this paper, we investigate ways to improve ES using available technologies that gather and extract information from the Internet and other electronic sources through information extraction (IE) and machine translation (MT). Using these tools, we search, extract, translate, store, and present the information in the language of interest in a concise format to facilitate strategic decision making. Further, by targeting the area of finance, specifically the prime lending rate in India, and obtaining the data found in online articles, we demonstrate the usefulness of the tools. Finally, using an English to Hindi translator, we show how an MT system helps us acquire information written in different languages.

Text Mining Tools

Text mining is the process of acquiring information by analyzing and deriving patterns from textual data. Text mining techniques generally draw on research areas that include information retrieval, data mining, machine learning, statistics, and computational linguistics. Typical text mining tasks include text categorization, text clustering, concept/entity extraction, and document summarization. Three of the major text mining techniques are discussed in the following sections.

Information Retrieval (IR)

Several technologies are used to retrieve the documents that pertain to the information an organization needs. IR takes input from a web browser and searches the Internet or a particular Uniform Resource Locator (URL) address for documents that match the query provided by the user.
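The query-matching step of IR can be illustrated with a minimal sketch. This is not the system described in this paper: the scoring rule (the fraction of query terms found in a document), the threshold, the function names, and the sample URLs and texts are all assumptions made for illustration only.

```python
import re
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, document: str) -> float:
    """Return the fraction of distinct query terms that occur in the document."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    doc_terms = set(tokenize(document))
    return len(query_terms & doc_terms) / len(query_terms)


def retrieve(
    query: str, corpus: Dict[str, str], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return (url, score) pairs that meet the threshold, best match first."""
    scored = [(url, score(query, text)) for url, text in corpus.items()]
    hits = [(url, s) for url, s in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical documents standing in for pages fetched from the given URLs.
corpus = {
    "http://example.com/rates": "The central bank left the prime lending rate unchanged this quarter.",
    "http://example.com/sports": "The home team won the cricket match by five wickets.",
    "http://example.com/loans": "Banks may revise their lending terms next month.",
}

results = retrieve("prime lending rate", corpus)
```

With these sample documents, only the first URL contains all three query terms, so it is the only one returned at the default threshold. A production IR component would likely use weighted term statistics rather than simple term overlap.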
The search is performed using the HTML, XML, and other tags in each document that identify its subject.

Information Extraction (IE)

Information Extraction (IE) is a sub-area of Natural Language Processing (NLP) that extracts information from unstructured text documents and produces structured data. The extracted data can be stored in databases or used to fill slots in templates (Cardie, 1997; Cowie & Lehnert, 1996). Also, Kalczynski (2005) proposed the Temporal Document Retrieval Model (TD

[1] Lisa F. Rau, et al. SCISOR: extracting information from on-line news, 1990, CACM.

[2] Ram Subramanian, et al. Environmental Scanning in U.S. Companies: Their Nature and Their Relationship, 1993.

[3] Maria T. Pazienza, et al. Information Extraction, 2002, Lecture Notes in Computer Science.

[4] Arthur A. Rasher, et al. An Empirical Investigation of the Relationship Between Environmental Assessment and Corporate Performance, 1984.

[5] Douglas E. Appelt, et al. Introduction to Information Extraction Technology, 1999, IJCAI 1999.

[6] R. Muralidharan. Environmental Scanning and Strategic Decisions in Multinational Corporations, 2003.

[7] Denise Johnson McManus, et al. Knowledge Management: The Role of EPSS, 2003.

[8] Pawel J. Kalczynski, et al. Time Dimension for Business News in the Knowledge Warehouse, 2005, Journal of International Technology and Information Management.

[9] Lalit M. Johri, et al. Coping with hypercompetition in the financial services industry in Thailand: environmental scanning practices of leaders and followers, 2003.

[10] Mary J. Culnan, et al. ENVIRONMENTAL SCANNING: THE EFFECTS OF TASK COMPLEXITY AND SOURCE ACCESSIBILITY ON INFORMATION GATHERING BEHAVIOR, 1983.

[11] Peter Pirolli, et al. Information Foraging, 2009, Encyclopedia of Database Systems.

[12] George A. Miller, et al. Introduction to WordNet: An On-line Lexical Database, 1990.

[13] W. Keegan. Multinational Scanning: A Study of the Information Sources Utilized by Headquarters Executives in Multinational Companies, 1974.

[14] F. Aguilar. Scanning the business environment, 1967.

[15] Marc J. Dollinger, et al. Environmental Boundary Spanning and Information Processing Effects on Organizational Performance, 1984.

[16] Shaoyi He, et al. Corporate Blogs of 40 Fortune 500 Companies: Distribution, Categorization and Characteristics, 2007, Journal of International Technology and Information Management.

[17] D. Hambrick. Environmental scanning and organizational strategy, 1982.

[18] Claire Cardie, et al. Empirical Methods in Information Extraction, 1997, AI Mag.

[19] Hsinchun Chen, et al. CI Spider: a tool for competitive intelligence on the Web, 2002, Decis. Support Syst.

[20] R. Daft, et al. Chief executive scanning, environmental characteristics, and company performance: An empirical study, 1988.

[21] Gerard Salton, et al. Automatic Processing of Foreign Language Documents, 1969, COLING.

[22] Efraim Turban, et al. Software Agents for Environmental Scanning in Electronic Commerce, 2000, Inf. Syst. Frontiers.

[23] Reinhold Decker, et al. An internet‐based approach to environmental scanning in marketing planning, 2005.

[24] J. J. West, et al. Strategy, environmental scanning, and their effect upon firm performance: an exploratory study of the food service industry, 1988.