Functional Dependencies Over Possibilistic Databases: An Interpretation Based on the Possible Worlds Semantics

In this paper, we introduce a definition of the concept of a functional dependency (FD) in the context of databases containing ill-known attribute values represented by possibility distributions. Contrary to previous proposals, this definition is based on the possible worlds model: the satisfaction of an FD by a relation is viewed as an uncertain event whose possibility and necessity can be quantified. We outline a method for incrementally computing these possibility and necessity degrees and address the issue of tuple refinement in the presence of an FD.
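To make the possible-worlds reading concrete, the following is a minimal brute-force sketch, not the incremental method the paper refers to. It assumes that each attribute value is a normalized possibility distribution (a mapping from candidate values to degrees in [0, 1]), that distributions are non-interactive so a world's degree is the minimum of the degrees of its chosen values, and that the necessity of satisfaction is one minus the possibility of violation. The function name `fd_possibility_necessity` and the data layout are hypothetical, chosen only for illustration.

```python
from itertools import product


def fd_possibility_necessity(relation, lhs, rhs):
    """Return (possibility, necessity) that `relation` satisfies lhs -> rhs.

    relation: list of tuples, each a dict mapping an attribute name to a
              possibility distribution {candidate value: degree}.
    lhs, rhs: lists of attribute names (assumed layout, not from the paper).
    Enumerates every possible world, so the cost is exponential.
    """
    attrs = list(lhs) + list(rhs)
    # One slot per (tuple index, attribute) that takes part in the FD.
    slots = [(i, a) for i in range(len(relation)) for a in attrs]
    choices = [list(relation[i][a].items()) for i, a in slots]

    poss_satisfied = 0.0  # max degree over worlds where the FD holds
    poss_violated = 0.0   # max degree over worlds where the FD fails

    for combo in product(*choices):
        # Degree of a world: min over the degrees of the chosen values.
        degree = min((d for _, d in combo), default=1.0)

        # Rebuild the fully specified tuples of this world.
        world = {}
        for (i, a), (value, _) in zip(slots, combo):
            world.setdefault(i, {})[a] = value

        # Classical FD check: equal lhs values must imply equal rhs values.
        seen = {}
        holds = True
        for i in range(len(relation)):
            key = tuple(world[i][a] for a in lhs)
            val = tuple(world[i][a] for a in rhs)
            if seen.setdefault(key, val) != val:
                holds = False
                break

        if holds:
            poss_satisfied = max(poss_satisfied, degree)
        else:
            poss_violated = max(poss_violated, degree)

    return poss_satisfied, 1.0 - poss_violated


# Illustrative data: the second tuple has ill-known values for A and B.
example = [
    {"A": {"a1": 1.0}, "B": {"b1": 1.0}},
    {"A": {"a1": 1.0, "a2": 0.5}, "B": {"b1": 0.5, "b2": 1.0}},
]
print(fd_possibility_necessity(example, ["A"], ["B"]))  # (0.5, 0.0)
```

In the example, the fully possible world (A = a1, B = b2 for the second tuple) violates A -> B, so satisfaction has necessity 0. The most possible satisfying worlds have degree 0.5, which gives possibility 0.5. Exhaustive enumeration is only practical for tiny relations, which is why an incremental computation matters in practice.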
