Ranking Mortgage Origination Applications Using Customer, Product, Environment and Workflow Attributes

In this paper, we analyze the performance of an end-to-end Mortgage Origination (MO) process. The process begins with the submission of a mortgage application by an applicant to a lender and ends with one of the following outcomes: closing, i.e., loan approved by the lender and accepted by the applicant, or non-closing, i.e., loan either rejected by the lender, or approved by the lender and not accepted by the applicant. Ranking mortgage applications by their predicted likelihood of closing at various steps in the process is useful for process efficiency and identification of actionable insights to convert applications likely to non-close into those that are likely to close. To build models for ranking applications at any step of the MO process, we take into account customer- and product-specific attributes of the applications as well as environment attributes and the history of the applications or workflow. The large state-space of the workflow makes the ranking problem challenging. We propose two workflow attributes, each with a state-space of dimension one, based on the number of visits to any step and a particular step (re-work) respectively. We find that incorporating these workflow attributes into the density modeling technique that we develop results in an improvement of 4.8 percent in Average Precision over models that only incorporate customer, product and environment attributes. The simple and scalable density modeling technique allows for easy identification of applications that are likely to non-close and consequent corrective action such as change in the attributes of the mortgage product being offered. Further, our results indicate that the model is comparable to Support Vector Machines and superior to Logistic Regression for ranking.
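The two workflow attributes and the evaluation metric mentioned above can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: a workflow history is taken to be an ordered list of step names, the first attribute is read as the total count of step visits, and the second as the count of visits to one chosen step. The step names and the function names `total_visits`, `rework_visits` and `average_precision` are hypothetical. Average Precision follows its standard definition, the mean of precision at each rank where a closing application appears.

```python
from collections import Counter
from typing import Sequence


def total_visits(history: Sequence[str]) -> int:
    """Count of all step visits so far (assumed reading of 'visits to any step')."""
    return len(history)


def rework_visits(history: Sequence[str], step: str) -> int:
    """Count of visits to one particular step; a value above 1 suggests re-work."""
    return Counter(history)[step]


def average_precision(scores: Sequence[float], closed: Sequence[bool]) -> float:
    """Average Precision of the ranking obtained by sorting on descending score.

    Precision is accumulated at every rank holding a closing application,
    then divided by the number of closing applications.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if closed[i]:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical workflow history for a single application.
history = ["intake", "underwriting", "appraisal", "underwriting"]
visits = total_visits(history)
rework = rework_visits(history, "underwriting")

# Hypothetical scores for four applications and whether each one closed.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
```

Each attribute is a single integer per application, which is consistent with the abstract's description of a state-space of dimension one; how the paper feeds these counts into its density model is not shown here.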
