A survey on unsupervised outlier detection in high‐dimensional numerical data
Hans-Peter Kriegel | Arthur Zimek | Erich Schubert
[1] Reda Alhajj,et al. A comprehensive survey of numeric and symbolic outlier mining techniques , 2006, Intell. Data Anal..
[2] Dimitris Achlioptas,et al. Database-friendly random projections , 2001, PODS.
[3] Christian Böhm,et al. Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases , 2001, CSUR.
[4] Ke Zhang,et al. A New Local Distance-Based Outlier Detection Approach for Scattered Real-World Data , 2009, PAKDD.
[5] Christos Faloutsos,et al. Example-Based Outlier Detection for High Dimensional Datasets , 2005 .
[6] Philip S. Yu,et al. On High Dimensional Indexing of Uncertain Data , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[7] Sanjay Chawla,et al. SLOM: a new measure for local spatial outliers , 2006, Knowledge and Information Systems.
[8] Hans-Peter Kriegel,et al. Subspace and projected clustering: experimental evaluation and analysis , 2009, Knowledge and Information Systems.
[9] Alexandr Andoni,et al. Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[10] John A. Hartigan,et al. Clustering Algorithms , 1975 .
[11] J. Matousek,et al. On variants of the Johnson–Lindenstrauss lemma , 2008 .
[12] Christian Böhm,et al. Independent quantization: an index compression technique for high-dimensional data spaces , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).
[13] Hans-Peter Kriegel,et al. Subspace Similarity Search: Efficient k-NN Queries in Arbitrary Subspaces , 2010, SSDBM.
[14] Sridhar Ramaswamy,et al. Efficient algorithms for mining outliers from large data sets , 2000, SIGMOD '00.
[15] Elke Achtert,et al. Visual Evaluation of Outlier Detection Models , 2010, DASFAA.
[16] Hans-Peter Kriegel,et al. Angle-based outlier detection in high-dimensional data , 2008, KDD.
[17] Sameer Singh,et al. Novelty detection: a review - part 1: statistical approaches , 2003, Signal Process..
[18] Vipin Kumar,et al. Feature bagging for outlier detection , 2005, KDD '05.
[19] Huan Liu,et al. Subspace clustering for high dimensional data: a review , 2004, SKDD.
[20] Christian Böhm,et al. Fast parallel similarity search in multimedia databases , 1997, SIGMOD '97.
[21] Tok Wang Ling,et al. HOS-Miner: A System for Detecting Outlying Subspaces of High-dimensional Data , 2004, VLDB.
[22] Ira Assent,et al. OutRank: ranking outliers in high dimensional data , 2008, 2008 IEEE 24th International Conference on Data Engineering Workshop.
[23] Hans-Peter Kriegel,et al. Subspace clustering , 2012, WIREs Data Mining Knowl. Discov..
[24] J. Douglas Carroll,et al. Is the Distance Compression Effect Overstated? Some Theory and Experimentation , 2009, MLDM.
[25] Hans-Peter Kriegel,et al. The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.
[26] D. Hilbert. Ueber die stetige Abbildung einer Linie auf ein Flächenstück , 1891 .
[27] Vivekanand Gopalkrishnan,et al. Efficient Pruning Schemes for Distance-Based Outlier Detection , 2009, ECML/PKDD.
[28] Ira Assent,et al. Evaluating Clustering in Subspace Projections of High Dimensional Data , 2009, Proc. VLDB Endow..
[29] Rasmus Pagh,et al. A near-linear time approximation algorithm for angle-based outlier detection in high-dimensional data , 2012, KDD.
[30] Shirish Tatikonda,et al. Locality Sensitive Outlier Detection: A ranking driven approach , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[31] Ira Assent,et al. Subspace outlier mining in large multimedia databases , 2007, Parallel Universes and Local Patterns.
[32] Sanjay Chawla,et al. Finding Local Anomalies in Very High Dimensional Space , 2010, 2010 IEEE International Conference on Data Mining.
[33] Hans-Peter Kriegel,et al. The X-tree: An Index Structure for High-Dimensional Data , 2001, VLDB.
[34] Wolfgang Müller,et al. Faster Exact Histogram Intersection on Large Data Collections Using Inverted VA-Files , 2004, CIVR.
[35] Raymond T. Ng,et al. Distance-based outliers: algorithms and applications , 2000, The VLDB Journal.
[36] Peter Filzmoser,et al. Outlier identification in high dimensions , 2008, Comput. Stat. Data Anal..
[37] Philip S. Yu,et al. Redefining Clustering for High-Dimensional Applications , 2002, IEEE Trans. Knowl. Data Eng..
[38] Charu C. Aggarwal,et al. On the Surprising Behavior of Distance Metrics in High Dimensional Spaces , 2001, ICDT.
[39] Ira Assent,et al. EDSC: efficient density-based subspace clustering , 2008, CIKM '08.
[40] A. Zimek,et al. Deriving quantitative models for correlation clusters , 2006, KDD '06.
[41] Osmar R. Zaïane,et al. An Efficient Reference-Based Approach to Outlier Detection in Large Datasets , 2006, Sixth International Conference on Data Mining (ICDM'06).
[42] Elke Achtert,et al. Evaluation of Clusterings -- Metrics and Visual Support , 2012, 2012 IEEE 28th International Conference on Data Engineering.
[43] Victoria J. Hodge,et al. A Survey of Outlier Detection Methodologies , 2004, Artificial Intelligence Review.
[44] Michel Verleysen,et al. Quality assessment of dimensionality reduction: Rank-based criteria , 2009, Neurocomputing.
[45] Arnold P. Boedihardjo,et al. GLS-SOD: a generalized local statistical approach for spatial outlier detection , 2010, KDD '10.
[46] Michel Verleysen,et al. The Curse of Dimensionality in Data Mining and Time Series Prediction , 2005, IWANN.
[47] A. Zimek,et al. Subspace Clustering, Ensemble Clustering, Alternative Clustering, Multiview Clustering: What Can We Learn From Each Other? , 2010 .
[48] Klemens Böhm,et al. HiCS: High Contrast Subspaces for Density-Based Outlier Ranking , 2012, 2012 IEEE 28th International Conference on Data Engineering.
[49] Hans-Peter Kriegel,et al. Outlier Detection in Axis-Parallel Subspaces of High Dimensional Data , 2009, PAKDD.
[50] Marios Hadjieleftheriou,et al. R-Trees - A Dynamic Index Structure for Spatial Searching , 2008, ACM SIGSPATIAL International Workshop on Advances in Geographic Information Systems.
[51] Clara Pizzuti,et al. Fast Outlier Detection in High Dimensional Spaces , 2002, PKDD.
[52] Hans-Peter Kriegel,et al. Evaluation of Multiple Clustering Solutions , 2011, MultiClust@ECML/PKDD.
[53] Xiang Lian,et al. Similarity Search in Arbitrary Subspaces Under Lp-Norm , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[54] Felix Naumann,et al. Data fusion , 2009, CSUR.
[55] G. Peano. Sur une courbe, qui remplit toute une aire plane , 1890 .
[56] Varun Chandola,et al. Anomaly detection: A survey , 2009, CSUR.
[57] Arthur Zimek,et al. A survey on enhanced subspace clustering , 2013, Data Mining and Knowledge Discovery.
[58] Shin'ichi Satoh,et al. The SR-tree: an index structure for high-dimensional nearest neighbor queries , 1997, SIGMOD '97.
[59] Jung-Min Park,et al. An overview of anomaly detection techniques: Existing solutions and latest technological trends , 2007, Comput. Networks.
[60] Ramakrishnan Srikant,et al. Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.
[61] Emmanuel Müller,et al. SOREX: Subspace Outlier Ranking Exploration Toolkit , 2010, ECML/PKDD.
[62] Christos Faloutsos,et al. Hilbert R-tree: An Improved R-tree using Fractals , 1994, VLDB.
[63] Charu C. Aggarwal,et al. Re-designing distance functions and distance-based applications for high dimensional data , 2001, SGMD.
[64] Hans-Peter Kriegel,et al. Quality of Similarity Rankings in Time Series , 2011, SSTD.
[65] Beng Chin Ooi,et al. An adaptive and efficient dimensionality reduction algorithm for high-dimensional indexing , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).
[66] Ramakrishnan Srikant,et al. Fast algorithms for mining association rules , 1998, VLDB 1998.
[67] Ira Assent,et al. Clustering high dimensional data , 2012 .
[68] Elke Achtert,et al. Global Correlation Clustering Based on the Hough Transform , 2008, Stat. Anal. Data Min..
[69] Christos Faloutsos,et al. Example-based robust outlier detection in high dimensional datasets , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).
[70] Philip S. Yu,et al. Finding generalized projected clusters in high dimensional spaces , 2000, SIGMOD '00.
[71] Christos Faloutsos,et al. The TV-tree: An index structure for high-dimensional data , 1994, The VLDB Journal.
[72] W. B. Johnson,et al. Extensions of Lipschitz mappings into Hilbert space , 1984 .
[73] Ira Assent,et al. An Unbiased Distance-Based Outlier Detection Approach for High-Dimensional Data , 2011, DASFAA.
[74] Vivekanand Gopalkrishnan,et al. Feature Extraction for Outlier Detection in High-Dimensional Spaces , 2010, FSDM.
[75] Li Yang. Distance‐preserving dimensionality reduction , 2011, Wiley Interdiscip. Rev. Data Min. Knowl. Discov..
[76] Christian Böhm,et al. A cost model for nearest neighbor search in high-dimensional data space , 1997, PODS.
[77] Man Lung Yiu,et al. Iterative projected clustering by subspace mining , 2005, IEEE Transactions on Knowledge and Data Engineering.
[78] Ata Kabán,et al. When is 'nearest neighbour' meaningful: A converse theorem and implications , 2009, J. Complex..
[79] Jonathan Goldstein,et al. When Is "Nearest Neighbor" Meaningful? , 1999, ICDT.
[80] Hans-Peter Kriegel,et al. LoOP: local outlier probabilities , 2009, CIKM.
[81] Christos Faloutsos,et al. On the 'Dimensionality Curse' and the 'Self-Similarity Blessing' , 2001, IEEE Trans. Knowl. Data Eng..
[82] Philip S. Yu,et al. Outlier detection for high dimensional data , 2001, SIGMOD '01.
[83] Dunja Mladenic,et al. The Role of Hubness in Clustering High-Dimensional Data , 2011, IEEE Transactions on Knowledge and Data Engineering.
[84] J. S. Marron,et al. Geometric representation of high dimension, low sample size data , 2005 .
[85] Jing Gao,et al. Converting Output Scores from Outlier Detection Algorithms into Probability Estimates , 2006, Sixth International Conference on Data Mining (ICDM'06).
[86] Stephen D. Bay,et al. Mining distance-based outliers in near linear time with randomization and a simple pruning rule , 2003, KDD '03.
[87] Raymond T. Ng,et al. Finding Intensional Knowledge of Distance-Based Outliers , 1999, VLDB.
[88] Clara Pizzuti,et al. Outlier mining in large high-dimensional data sets , 2005, IEEE Transactions on Knowledge and Data Engineering.
[89] Hans-Peter Kriegel,et al. Subspace similarity search using the ideas of ranking and top-k retrieval , 2010, 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010).
[90] Raghu Ramakrishnan,et al. Theory of nearest neighbors indexability , 2006, TODS.
[91] Martin Ester,et al. Density‐based clustering , 2019, WIREs Data Mining Knowl. Discov..
[92] Bell Telephone,et al. Robust Estimates, Residuals, and Outlier Detection with Multiresponse Data , 1972 .
[93] Vladimir Pestov,et al. On the geometry of similarity search: Dimensionality curse and concentration of measure , 1999, Inf. Process. Lett..
[94] Alexandros Nanopoulos,et al. Nearest neighbors in high-dimensional data: the emergence and influence of hubs , 2009, ICML '09.
[95] Michel Verleysen,et al. The Concentration of Fractional Distances , 2007, IEEE Transactions on Knowledge and Data Engineering.
[96] Piotr Indyk,et al. Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.
[97] Hans-Peter Kriegel,et al. Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering , 2009, TKDD.
[98] Anthony Wirth,et al. Correlation Clustering , 2010, Encyclopedia of Machine Learning and Data Mining.
[99] Emmanuel Müller,et al. Adaptive outlierness for subspace outlier ranking , 2010, CIKM '10.
[100] Anthony K. H. Tung,et al. Ranking Outliers Using Symmetric Neighborhood Relationship , 2006, PAKDD.
[101] Hans-Peter Kriegel,et al. LOF: identifying density-based local outliers , 2000, SIGMOD '00.
[102] Alexandros Nanopoulos,et al. Time-Series Classification in Many Intrinsic Dimensions , 2010, SDM.
[103] A. Zimek,et al. BeyOND — Unleashing BOND , 2011 .
[104] Anthony K. H. Tung,et al. Mining top-n local outliers in large databases , 2001, KDD '01.
[105] Stefan Berchtold,et al. Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Data Sets , 2003, IEEE Trans. Knowl. Data Eng..
[106] Philip S. Yu,et al. An effective and efficient algorithm for high-dimensional outlier detection , 2005, The VLDB Journal.
[107] Douglas M. Hawkins. Identification of Outliers , 1980, Monographs on Applied Probability and Statistics.
[108] E. Gehan,et al. The properties of high-dimensional data spaces: implications for exploring gene and protein expression data , 2008, Nature Reviews Cancer.
[109] Kristin P. Bennett,et al. Density-based indexing for approximate nearest-neighbor queries , 1999, KDD '99.
[110] Ira Assent,et al. DUSC: Dimensionality Unbiased Subspace Clustering , 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).
[111] Emmanuel Müller,et al. Statistical selection of relevant subspace projections for outlier ranking , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[112] Piotr Indyk,et al. Similarity Search in High Dimensions via Hashing , 1999, VLDB.
[113] V. A. Epanechnikov. Non-Parametric Estimation of a Multivariate Probability Density , 1969 .
[114] Jörg Sander,et al. Finding non-redundant, statistically significant regions in high dimensional data: a novel approach to projected and subspace clustering , 2008, KDD.
[115] Sergey Brin,et al. Near Neighbor Search in Large Metric Spaces , 1995, VLDB.
[116] Hans-Jörg Schek,et al. A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces , 1998, VLDB.
[117] Hans-Peter Kriegel,et al. A General Framework for Increasing the Robustness of PCA-Based Correlation Clustering Algorithms , 2008, SSDBM.
[118] Hans-Peter Kriegel,et al. Efficient Query Processing in Arbitrary Subspaces Using Vector Approximations , 2006, 18th International Conference on Scientific and Statistical Database Management (SSDBM'06).
[119] Srinivasan Parthasarathy,et al. Fast mining of distance-based outliers in high-dimensional datasets , 2008, Data Mining and Knowledge Discovery.
[120] Ata Kabán,et al. On the distance concentration awareness of certain data reduction techniques , 2011, Pattern Recognit..
[121] Sameer Singh,et al. Novelty detection: a review - part 2: neural network based approaches , 2003, Signal Process..
[122] Ali S. Hadi,et al. Detection of outliers , 2009 .
[123] Xiaogang Su,et al. Outlier detection , 2011, WIREs Data Mining Knowl. Discov..
[124] Dimitris Achlioptas,et al. Database-friendly random projections: Johnson-Lindenstrauss with binary coins , 2003, J. Comput. Syst. Sci..
[125] A. Zimek,et al. On Using Class-Labels in Evaluation of Clusterings , 2010 .
[126] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SKDD.
[127] Hui Xiong,et al. Distance metrics for high dimensional nearest neighborhood recovery: Compression and normalization , 2012, Inf. Sci..
[128] Raymond T. Ng,et al. A unified approach for mining outliers , 1997, CASCON.
[129] Vipin Kumar,et al. Anomaly Detection for Discrete Sequences: A Survey , 2012, IEEE Transactions on Knowledge and Data Engineering.
[130] Alexandros Nanopoulos,et al. Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data , 2010, J. Mach. Learn. Res..
[131] Christos Faloutsos,et al. Analysis of the Clustering Properties of the Hilbert Space-Filling Curve , 2001, IEEE Trans. Knowl. Data Eng..
[132] Mario A. López,et al. High dimensional similarity search with space filling curves , 2001, Proceedings 17th International Conference on Data Engineering.
[133] Jian Tang,et al. Enhancing Effectiveness of Outlier Detections for Low Density Patterns , 2002, PAKDD.
[134] Hans-Peter Kriegel,et al. Can Shared-Neighbor Distances Defeat the Curse of Dimensionality? , 2010, SSDBM.
[135] Fionn Murtagh,et al. The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering , 2008, J. Classif..
[136] Srinivasan Parthasarathy,et al. Distance-based outlier detection , 2010, Proc. VLDB Endow..
[137] Myoung-Ho Kim,et al. FINDIT: a fast and intelligent subspace clustering algorithm using dimension voting , 2004, Inf. Softw. Technol..
[138] Jinyan Li,et al. Distance Based Subspace Clustering with Flexible Dimension Partitioning , 2007, 2007 IEEE 23rd International Conference on Data Engineering.
[139] R. Suganya,et al. Data Mining Concepts and Techniques , 2010 .
[140] Dimitrios Gunopulos,et al. Subspace Clustering of High Dimensional Data , 2004, SDM.
[141] Elke Achtert,et al. Spatial Outlier Detection: Data, Algorithms, Visualizations , 2011, SSTD.
[142] Christos Faloutsos,et al. LOCI: fast outlier detection using the local correlation integral , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).
[143] Philip S. Yu,et al. Finding generalized projected clusters in high dimensional spaces , 2000, SIGMOD 2000.
[144] Arthur Zimek,et al. Subspace Clustering Techniques , 2009, Encyclopedia of Database Systems.
[145] Hans-Peter Kriegel,et al. On Evaluation of Outlier Rankings and Outlier Scores , 2012, SDM.
[146] Suresh Venkatasubramanian,et al. The Johnson-Lindenstrauss Transform: An Empirical Study , 2011, ALENEX.
[147] Albert-László Barabási,et al. Scale-Free Networks: A Decade and Beyond , 2009, Science.
[148] Shin'ichi Satoh,et al. Distinctiveness-sensitive nearest-neighbor search for efficient similarity retrieval of multimedia information , 2001, Proceedings 17th International Conference on Data Engineering.
[149] Piotr Indyk,et al. Nearest Neighbors in High-Dimensional Spaces , 2004, Handbook of Discrete and Computational Geometry, 2nd Ed..
[150] Alexandros Nanopoulos,et al. On the existence of obstinate results in vector space models , 2010, SIGIR.
[151] Hans-Peter Kriegel,et al. Interpreting and Unifying Outlier Scores , 2011, SDM.
[152] Shashi Shekhar,et al. A Unified Approach to Detecting Spatial Outliers , 2003, GeoInformatica.
[153] Vivekanand Gopalkrishnan,et al. Mining Outliers with Ensemble of Heterogeneous Detectors on Random Subspaces , 2010, DASFAA.
[154] Edward Hung,et al. Mining Outliers with Faster Cutoff Update and Space Utilization , 2009, PAKDD.