Detection of outlier information by the use of linguistic summaries based on classic and interval‐valued fuzzy sets

Automatic summarization of databases is an important tool in strategic decision‐making. This paper applies linguistic summaries to outlier detection in databases containing both text and numeric attributes. The proposed method uses Yager’s standard summary based on interval‐valued fuzzy sets. The features sought are fuzzy similarity measures. Detecting outliers can identify defects, remove impurities from the data, and, above all, provide a basis for decision‐making processes. We introduce a definition of an outlier based on linguistic summaries and demonstrate the feasibility of the method on practical examples.
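As a rough illustration of the idea, the sketch below evaluates a summary of Yager's form "Q objects are S" for the classic (type-1) case only, where the degree of truth is T = mu_Q((1/n) * sum of mu_S(y_i)). The trapezoidal membership functions, the labels "few" and "very high", the sample data, and the rule that a positive degree of truth for a "few"-type quantifier signals outliers are all assumptions made for this example and are not taken from the paper. In an interval‐valued variant, each membership degree would be an interval rather than a single number.

```python
from typing import Callable, Sequence

Membership = Callable[[float], float]

def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership function; a == b or c == d gives a shoulder."""
    def mu(x: float) -> float:
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:                      # rising edge, here a <= x < b
            return (x - a) / (b - a)
        return (d - x) / (d - c)       # falling edge, here c < x <= d
    return mu

def truth_degree(values: Sequence[float], mu_s: Membership, mu_q: Membership) -> float:
    """Degree of truth of "Q objects are S" for a relative quantifier Q."""
    proportion = sum(mu_s(v) for v in values) / len(values)
    return mu_q(proportion)

def outlier_candidates(values: Sequence[float], mu_s: Membership, mu_q: Membership) -> list:
    """Assumed rule: if "few objects are S" holds to a positive degree,
    the objects that match S to a positive degree are reported."""
    if truth_degree(values, mu_s, mu_q) > 0.0:
        return [v for v in values if mu_s(v) > 0.0]
    return []

# Hypothetical linguistic labels and data, chosen only for illustration.
few = trapezoid(0.0, 0.0, 0.05, 0.2)                 # relative quantifier on [0, 1]
very_high = trapezoid(50.0, 80.0, 1000.0, 1000.0)    # summarizer on the attribute domain
readings = [10, 12, 11, 13, 12, 11, 10, 95]

print(truth_degree(readings, very_high, few))
print(outlier_candidates(readings, very_high, few))
```

Here one record out of eight matches "very high", so the proportion 0.125 falls on the descending edge of "few" and the summary "few records are very high" holds to a partial degree.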
