Big data analytics and knowledge discovery

We welcome you to this special issue dedicated to the best papers presented at the 16th International Conference on Data Warehousing and Knowledge Discovery (DaWaK), which was held in Munich, Germany, September 1–5, 2014. The DaWaK conference has been widely accepted as a key technology event for small, large, and global corporate and noncorporate organizations to improve their capabilities in data analysis, decision support, and the automatic extraction of knowledge from data. With the exponentially growing amount of information generated by social networks, sensors, etc., to be factored into the decision-making process, the data to be considered become more and more complex in terms of both structure and semantics. In parallel, we are seeing a spectacular development of knowledge-sharing communities such as Wikipedia and the maturing of a number of public knowledge bases (KBs) such as YAGO that may be used to reduce the heterogeneity between data sources. New developments such as cloud computing, big data, and KBs add to the challenges, with massive scaling, a new computing infrastructure, and new types of data and semantics. Consequently, retrieval and knowledge discovery from this deluge of heterogeneous complex data are crucial to research in the domain. Over the past years, the DaWaK conference has become one of the most important international scientific events that bring together researchers, developers, and professionals to discuss the latest research issues and experiences in developing and deploying data warehousing and knowledge discovery systems, applications, and solutions. DaWaK is in the top 20 of the Google Scholar ranking related to Data Mining & Analysis: http://scholar.google.com/citations?view_op=top_venues&hl=fr&vq=eng_datamininganalysis. The DaWaK 2014 conference built on this tradition of facilitating the cross-disciplinary exchange of ideas, experiences, and potential research directions.
DaWaK 2014 aimed at introducing innovative principles, methods, models, algorithms and solutions, industrial products, and experiences that cover all phases of the life cycle of a data warehouse application, from design to the challenging problems met in the development of data warehousing, knowledge discovery, and data mining applications, and in the emerging area of high-performance computing. The DaWaK 2014 call for papers attracted 109 papers, and the program committee finally selected 34 full papers and 8 short papers, giving a full-paper acceptance rate of 31%. The accepted papers cover a number of broad research areas on both theoretical and practical aspects of data warehousing and knowledge discovery. In the area of data warehousing, the topics covered included modeling and ETL (extract, transform, load), ontologies, real-time data warehouses, query optimization, the MapReduce paradigm, storage models, scalability, distributed and parallel processing, the integration of data warehouses and data mining applications, recommendation and personalization, multidimensional analysis of text documents, and data warehousing for real-world applications such as health, bioinformatics, and telecommunication. In the areas of data mining and knowledge discovery, the topics included stream data analysis and mining; traditional data mining topics such as frequent item sets, clustering, association, classification, and ranking; the application of data mining technologies to real-world problems; fuzzy mining; and skylines. It is especially notable that some papers covered emerging real-world applications such as bioinformatics, social networks, telecommunication, and brain analysis. Out of the 34 full papers, we selected seven to be invited for this special issue of the journal Concurrency and Computation: Practice and Experience (Wiley). After a second round of reviews, we finally accepted four papers.
Thus, the relative acceptance rate for the papers included in this special issue is competitive. Needless to say, these four papers represent innovative and