A hybrid model for business process event and outcome prediction

Large service companies run complex customer service processes to provide communication services to their customers. Flawless execution of these processes is essential because customer service is an important differentiator. These companies must also be able to predict whether a process will complete successfully or run into exceptions, so that they can intervene at the right time, preempt problems and maintain customer service. Business process data are sequential in nature and can be very diverse, so there is a need for an efficient sequential forecasting methodology that can cope with this diversity. This paper proposes two approaches, a sequential k nearest neighbour method and an extension of Markov models, both with an added component based on sequence alignment. The proposed approaches exploit temporal categorical features of the data to predict the next steps of a process using higher order Markov models, and the process outcome using a sequence alignment technique. The diversity of the data is addressed by considering subsets of similar process sequences selected by k nearest neighbours. A set of experiments shows that our sequential k nearest neighbour method gives better results than the original k nearest neighbour methods, and that our extended Markov model outperforms random guessing, Markov models and hidden Markov models.
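To make the combination of ideas in the abstract concrete, the following minimal Python sketch selects the k historical sequences most similar to a running process by alignment score, then fits a higher order Markov model on those neighbours only to predict the next step. It is an illustration under stated assumptions, not the authors' implementation: the choice of global (Needleman-Wunsch style) alignment, the scoring values `match`, `mismatch` and `gap`, the back-off to lower orders, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two step sequences (assumed scoring values)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def k_nearest(query, history, k):
    """Return the k historical sequences with the highest alignment score."""
    ranked = sorted(history, key=lambda s: alignment_score(query, s), reverse=True)
    return ranked[:k]


def fit_markov(sequences, order):
    """Count which step follows each context of `order` preceding steps."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[tuple(seq[i - order:i])][seq[i]] += 1
    return counts


def predict_next(query, history, k=3, order=2):
    """Predict the next step from a Markov model fitted on the k neighbours.

    Backs off to a lower order when the context was never observed
    (an assumption of this sketch). Returns None if nothing matches.
    """
    neighbours = k_nearest(query, history, k)
    for n in range(min(order, len(query)), 0, -1):
        counts = fit_markov(neighbours, n)
        context = tuple(query[-n:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None


if __name__ == "__main__":
    # Toy process logs; the step names are invented for illustration.
    history = [
        ["open", "check", "fix", "close"],
        ["open", "check", "fix", "close"],
        ["open", "check", "escalate", "fix", "close"],
        ["open", "cancel"],
    ]
    print(predict_next(["open", "check"], history, k=3, order=2))
```

Restricting the Markov model to the nearest neighbours is what lets the prediction adapt to diverse process variants; with `k` equal to the size of the history, the sketch reduces to an ordinary higher order Markov model.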
