Paper information - Building on the Arules Infrastructure for Analyzing Transaction Data with R

The free and extensible statistical computing environment R, with its enormous number of extension packages, already provides many state-of-the-art techniques for data analysis. Support for association rule mining, a popular exploratory method that can be used, among other purposes, for uncovering cross-selling opportunities in market baskets, has recently become available with the R extension package arules. After a brief introduction to transaction data and association rules, we present the formal framework implemented in arules and demonstrate how clustering and association rule mining can be applied together using a market basket data set from a typical retailer. This paper shows that implementing a basic infrastructure with formal classes in R provides an extensible basis that can be employed very efficiently for developing new applications (such as clustering transactions) in addition to association rule mining.

Kurt Hornik | Michael Hahsler
