Process models are an important tool that helps software engineers produce reliable software on schedule and within budget. Technically challenging domains such as machine learning especially need a supportive process model to guide developers and stakeholders through the development process. One major problem type in machine learning is anomaly detection. Its goal is to identify anomalous data points (outliers) among the normal data instances. Anomaly detection has a wide range of applications in industrial and scientific areas. Examples include detecting intruders in computer networks, distinguishing between cancerous and healthy tissue in medical images, and cleaning data of disturbing outliers before further evaluation. The cross-industry standard process for data mining (CRISP-DM) was developed to support developers across all kinds of data mining applications. It describes a generic model of six phases that covers the whole development cycle. The generality of the CRISP-DM model is as much a strength as it is a weakness, since the particularities of different problem types such as anomaly detection cannot be addressed without making the model overly complex. There is a need for a more practical, specialised process model for anomaly detection applications. We demonstrate this issue and outline an approach towards a practical process model tailored to the development of anomaly detection systems.