Data, especially in large item sets, hide a wealth of information about the processes that created and modified them. Often a data field, or a set of data fields, is modified not only through well-defined processes but also through latent ones; without knowledge of these latent processes, testing cannot be considered exhaustive. Changes that derive from unknown processes can cause anomalies that testing does not detect, because testing focuses on known data variation rules. The history of data variations can yield information about the nature of the changes. In my work I focus on eliciting an evolution profile of the data: the values the data may assume, the change frequencies, the temporal variation of a piece of data in relation to other data, and other constraints directly connected to the reference domain. The evolution profile is then used to detect anomalies in the evolution of the database state. Detecting such anomalies could strengthen the quality of a system, since a data anomaly may signal a defect in the applications that populate the database.
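To make the idea of an evolution profile concrete, the following is a minimal sketch and not the method of this work. It assumes a hypothetical change history of `(timestamp, field, value)` records and derives only two of the profile elements mentioned above: the set of values a field has assumed and the shortest interval observed between two changes. The names `build_profile` and `check_change`, the record layout, and the two anomaly rules are all illustrative assumptions.

```python
from collections import defaultdict


def build_profile(history):
    """Derive a simple evolution profile from a change history.

    `history` is an iterable of (timestamp, field, value) records,
    assumed to be sorted by timestamp (hypothetical layout).
    """
    values = defaultdict(set)  # values each field has assumed so far
    last_seen = {}             # timestamp of the latest change per field
    min_gap = {}               # shortest interval between two changes per field
    for ts, field, value in history:
        values[field].add(value)
        if field in last_seen:
            gap = ts - last_seen[field]
            min_gap[field] = min(gap, min_gap.get(field, gap))
        last_seen[field] = ts
    return {"values": dict(values), "min_gap": min_gap, "last_seen": last_seen}


def check_change(profile, ts, field, value):
    """Return the reasons a new change deviates from the profile (empty if none)."""
    reasons = []
    # Rule 1 (assumed): a value never observed before is suspicious.
    if value not in profile["values"].get(field, set()):
        reasons.append("unseen value")
    # Rule 2 (assumed): a change arriving sooner than any observed gap is suspicious.
    last = profile["last_seen"].get(field)
    gap_floor = profile["min_gap"].get(field)
    if last is not None and gap_floor is not None and ts - last < gap_floor:
        reasons.append("change too frequent")
    return reasons


# Illustrative history, not real data.
history = [
    (0, "status", "open"),
    (10, "status", "shipped"),
    (20, "status", "closed"),
]
profile = build_profile(history)
```

Calling `check_change(profile, 22, "status", "refunded")` on this toy history reports both deviations, while a change back to `"open"` at time 30 reports none. A fuller profile would also need to capture how one field varies in relation to others and any domain-specific constraints, which this sketch leaves out.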