Applicability of Clustering and Classification Algorithms for Recruitment Data Mining

of appropriate employees and their retention are major concerns in building competitive strength in the knowledge economy. Every year, IT companies recruit fresh graduates through campus selection programs after assessing their skills through tests, group discussions and a number of interviews. The recruitment process requires an enormous amount of effort and investment. During each phase of the process, candidates are filtered on performance criteria. Close analysis of the system indicates that a pattern exists among the candidates selected for an industry. The problem domain is complex, and the aspects of candidates that affect the recruitment process are not explicit. In this research, the domain knowledge is extracted through knowledge acquisition techniques, and data mining techniques that fit the problem are determined. A study was made by applying K-means and fuzzy C-means clustering and decision tree classification algorithms to the recruitment data of an industry. Experiments were conducted with data collected from an IT industry to support its hiring decisions. Pruned and unpruned trees were constructed using the ID3, C4.5 and CART algorithms. The comparative study shows that the clustering algorithms are not well suited to the problem and that the C4.5 decision tree algorithm performs well. Using the constructed decision trees, discussions were held with the domain experts to deduce viable decision rules.

candidate among those who graduate becomes a herculean task. The process demands a lot of effort from the recruiting team, and the money spent on it is phenomenal. One mechanism used by the industries is to conduct tests and group discussions during the filtration process.
The selection process uses several criteria: the average of the candidates' semester marks; the marks obtained in the aptitude, programming and technical tests conducted by the company; and performance in the group discussion and the technical and HR interviews. These criteria are common to all students, but skill levels vary because the students come from different disciplines and backgrounds. With varying curricula, modes of delivery and evaluation methodologies across the educational system, the recruitment process becomes much more challenging. The time and expenditure for conducting group discussions and interviews account for more than 90% of the total effort of the recruitment process. It has been observed that 1 in 120 students who apply is selected, and the ratio of candidates selected to candidates interviewed after the tests is approximately 1:20. Reducing these ratios would greatly help the industries save effort. Considerable effort goes into analyzing the profiles of the applicants to determine the ones that suit the needs of the industry. The knowledge required for this process is not explicit in quantitative terms; it is a hidden convention that may be extracted by mining the profiles of previous years and
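The two ratios quoted above can be combined into a rough picture of the recruitment funnel. The following is a minimal sketch in Python, assuming that both ratios describe the same cohort of applicants (the text does not state this explicitly). The function names and the 1:10 scenario are hypothetical and are used only for illustration.

```python
from fractions import Fraction

# Ratios reported in the text.
SELECTED_PER_APPLICANT = Fraction(1, 120)    # 1 in 120 applicants is selected
SELECTED_PER_INTERVIEWEE = Fraction(1, 20)   # about 1 selected per 20 interviewed


def interviewed_share(selected_per_applicant, selected_per_interviewee):
    """Fraction of applicants who reach the interview stage.

    Assumes both ratios describe the same cohort, so that
    selected/applicants = (interviewed/applicants) * (selected/interviewed).
    """
    return selected_per_applicant / selected_per_interviewee


def interviews_per_hire(selected_per_interviewee):
    """Number of candidates interviewed for each candidate hired."""
    return 1 / selected_per_interviewee


share = interviewed_share(SELECTED_PER_APPLICANT, SELECTED_PER_INTERVIEWEE)
print(f"Share of applicants interviewed: {share} ({float(share):.1%})")
print(f"Interviews per hire: {interviews_per_hire(SELECTED_PER_INTERVIEWEE)}")

# Hypothetical scenario (not from the source): better pre-screening lifts the
# interview-stage ratio to 1:10, halving the interviews needed per hire.
improved = Fraction(1, 10)
print(f"Interviews per hire at 1:10: {interviews_per_hire(improved)}")
```

Under this assumption, roughly one applicant in six reaches the interview stage, and twenty candidates are interviewed for each hire. Since interviews and group discussions are reported to take more than 90% of the effort, any screening rule that cuts the number of interviews per hire targets the costliest part of the process.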
