Distributed futures for efficient data transfer between parallel processes

This paper defines distributed futures, a construct that acts both as a data container similar to a distributed vector and as a single synchronization entity that behaves like a standard future. The construct makes it easy to compose several massively data-parallel tasks in a task-parallel way. The approach is implemented and evaluated in the context of a bulk synchronous parallel (BSP) active object framework.
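As a rough illustration of the idea only, the sketch below models a distributed future as one object that holds a vector split into parts and resolves once every part has been written. It is not the paper's API or implementation: the names `DistributedFuture`, `set_part`, `get_part`, and `run_pipeline` are hypothetical, and threads in a single process stand in for the parallel processes of a BSP framework.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DistributedFuture:
    """Toy stand-in (hypothetical): one future whose value is a vector split into parts."""

    def __init__(self, num_parts):
        self._parts = [None] * num_parts
        self._remaining = num_parts
        self._lock = threading.Lock()
        self._done = threading.Event()

    def set_part(self, index, chunk):
        # Each producer writes its own part exactly once; the future
        # resolves only when every part has been written.
        with self._lock:
            self._parts[index] = chunk
            self._remaining -= 1
            if self._remaining == 0:
                self._done.set()

    def get_part(self, index):
        # One synchronization point for the whole container, then a local read.
        self._done.wait()
        return self._parts[index]


def run_pipeline(data, num_workers=4):
    """Compose two data-parallel tasks through a single distributed future."""
    chunks = [data[i::num_workers] for i in range(num_workers)]
    squared = DistributedFuture(num_workers)
    # Enough threads for all producers and consumers to run concurrently.
    with ThreadPoolExecutor(max_workers=2 * num_workers) as pool:
        # First task: every worker squares its chunk and fills its part.
        for i, chunk in enumerate(chunks):
            pool.submit(
                lambda i=i, c=chunk: squared.set_part(i, [x * x for x in c])
            )
        # Second task: every worker reads the part it needs and sums it,
        # without gathering the whole vector in one place.
        partial_sums = list(
            pool.map(lambda i: sum(squared.get_part(i)), range(num_workers))
        )
    return sum(partial_sums)
```

In this sketch the second task never collects the full vector in one place: each consumer waits on the same future and then reads only its own part, which is the kind of part-to-part data transfer between composed tasks that the abstract describes.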
