Information Search, Integration, and Personalization: 13th International Workshop, ISIP 2019, Heraklion, Greece, May 9–10, 2019, Revised Selected Papers

In this paper we introduce LODQA, an approach for open-domain Question Answering over Linked Open Data. We confine ourselves to three kinds of questions: factoid, confirmation, and definition questions. With LODQA it is feasible to answer questions over 400 million entities of any domain without using any training data, since we simultaneously exploit 400 Linked datasets. In particular, we exploit the services of LODsyndesis, a suite of services (based on semantics-aware indexes) that supports cross-dataset reasoning over hundreds of Linked datasets and 2 billion triples. The proposed Question Answering process follows an information extraction approach and comprises several steps: question cleaning; heuristic-based question type identification; entity recognition, linking, and disambiguation using both Linked Data-based methods and pure NLP methods (specifically DBpedia Spotlight and Stanford CoreNLP); WordNet-based question expansion for tackling the lexical gap between the input question and the underlying sources; and triple scoring for producing the final answer. We discuss the benefits of this approach in terms of answerable questions and answer verification, and we investigate, through experimental results, how each of these steps affects the effectiveness and the efficiency of question answering.
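
To make the first two steps of the process concrete, the following is a minimal sketch of question cleaning and heuristic-based question type identification for the three supported question kinds. It is not the LODQA implementation: the function names `clean_question` and `identify_question_type`, the auxiliary-verb list, and the definition patterns are all illustrative assumptions, and the actual heuristics used by the system may differ.

```python
import re

# Hypothetical list of leading auxiliaries that signal a yes/no question.
AUXILIARIES = {"is", "are", "was", "were", "does", "do", "did", "has", "have", "can"}

# Hypothetical prefixes that signal a request for a definition.
DEFINITION_PREFIXES = ("define ", "what is a ", "what is an ")


def clean_question(question: str) -> str:
    """Normalise whitespace, drop trailing question marks, and lowercase."""
    text = re.sub(r"\s+", " ", question.strip())
    return text.rstrip("?").strip().lower()


def identify_question_type(question: str) -> str:
    """Return 'confirmation', 'definition', or 'factoid' using simple rules."""
    text = clean_question(question)
    tokens = text.split()
    if not tokens:
        raise ValueError("empty question")
    # Rule 1 (assumed): a leading auxiliary verb indicates a confirmation question.
    if tokens[0] in AUXILIARIES:
        return "confirmation"
    # Rule 2 (assumed): definitional phrasing indicates a definition question.
    if text.startswith(DEFINITION_PREFIXES) or (
        tokens[0] == "what" and tokens[-1] == "mean"
    ):
        return "definition"
    # Rule 3 (assumed): everything else is treated as a factoid question.
    return "factoid"
```

For example, `identify_question_type("Is Heraklion in Greece?")` yields `"confirmation"`, while `identify_question_type("Who wrote the Odyssey?")` falls through to `"factoid"`. Rules of this kind need no training data, which matches the training-free setting described above, but they are only as good as the patterns chosen.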
