The Data Analytics Group at the Qatar Computing Research Institute

The Qatar Computing Research Institute (QCRI), a member of Qatar Foundation for Education, Science and Community Development, started its activities in early 2011. QCRI focuses on large-scale computing challenges that address national priorities for growth and development and that have global impact in computing research. QCRI currently has five research groups working on different aspects of computing: Arabic Language Technologies, Social Computing, Scientific Computing, Cloud Computing, and Data Analytics. The data analytics group at QCRI, DA@QCRI for short, has embarked on an ambitious endeavour to become a premier world-class research group by tackling diverse research topics related to data quality, data integration, information extraction, scientific data management, and data mining. In the short time since its birth, DA@QCRI has grown to eight permanent scientists, two software engineers, and around ten interns and postdocs at any given time. The group's contributions are starting to appear in top venues.
