Big Data Fusion Model for Heterogeneous Financial Market Data (FinDf)

The advent of big data has seen the volume, variety, and velocity of data sources increase dramatically. Enormous amounts of structured, semi-structured and unstructured heterogeneous data can be gathered at a rapid rate, making analysis of such big data a herculean task. This is especially true of data relating to financial stock markets, where the biggest challenge is the 7Vs of big data, which relate to the collection, pre-processing, storage and real-time processing of huge quantities of data from disparate sources. Data fusion techniques have been adopted in a wide range of fields to cope with vast amounts of heterogeneous data from multiple sources, fusing them to produce a more comprehensive view of the data and its underlying relationships. Research into the fusion of heterogeneous financial data is scant, with existing work considering only the fusion of text-based financial documents. The lack of integration between financial stock market data, social media comments, financial discussion board posts and data from broker agencies means that the benefits of data fusion are not being fully realised. This paper proposes a novel data fusion model, inspired by the data fusion model introduced by the Joint Directors of Laboratories, for fusing disparate data sources relating to financial stocks. Data with diverse sets of features from different sources will supplement each other to form a Smart Data Layer, which will assist in scenarios such as irregularity detection and stock price prediction.
