Mining Association Rules between Sets of Items in Large Databases
A. KrishnaKumar (1), D. Amrita (2), N. Swathi Priya (3)
(1) A. KrishnaKumar, Department of Information Technology, SNS College of Engineering, Coimbatore (TN), India.
(2) D. Amrita, Department of Information Technology, SNS College of Engineering, Coimbatore (TN), India.
(3) N. Swathi Priya, Department of Information Technology, SNS College of Engineering, Coimbatore (TN), India.
Manuscript received on April 05, 2013. | Revised Manuscript received on April 11, 2013. | Manuscript published on April 15, 2013. | PP: 24-27 | Volume-1 Issue-5, April 2013. | Retrieval Number: E0211041513/2013©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: In data mining, the usefulness of association rules is severely limited by the very large number of rules that mining produces. To overcome this drawback, several methods have been proposed in the literature, such as concise itemset representations, redundancy reduction, and post-processing. However, because they generally rely on statistical information, most of these methods do not guarantee that the extracted rules are interesting to the user. It is therefore crucial to support the decision-maker with an efficient post-processing step that reduces the number of rules. This paper proposes a new interactive approach to pruning and filtering discovered rules. First, we propose using ontologies to improve the integration of user knowledge into the post-processing task. Second, we propose the Rule Schema formalism, which extends the specification language proposed by Liu et al. for user expectations. Furthermore, an interactive framework is designed to assist the user throughout the analysis task. Applying our approach to voluminous sets of rules, and integrating domain expert knowledge in the post-processing step, we were able to reduce the number of rules to a few dozen or fewer. Moreover, the quality of the filtered rules was validated by the domain expert at various points in the interactive process.
Keywords: Clustering, classification, and association rules, interactive data exploration and discovery, knowledge management applications.
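
To make the idea in the abstract concrete, the following minimal sketch shows one way an ontology-backed rule schema could be used to filter mined rules. It is an illustration under stated assumptions, not the authors' formalism: the `Rule` and `RuleSchema` classes, the toy `ONTOLOGY` mapping, the item names, and the conformance test (every schema concept must be matched by at least one item on the same side of the rule) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A mined association rule: antecedent items -> consequent items."""
    antecedent: frozenset
    consequent: frozenset


@dataclass(frozen=True)
class RuleSchema:
    """A user expectation written with ontology concepts instead of items (hypothetical)."""
    antecedent: frozenset
    consequent: frozenset


# Hypothetical toy ontology: each concept maps to the database items it generalizes.
ONTOLOGY = {
    "Dairy": {"milk", "cheese", "yogurt"},
    "Bakery": {"bread", "bagel"},
    "Beverage": {"beer", "juice"},
}


def covers(concepts, items, ontology):
    """Return True if every concept has at least one of its items present in `items`."""
    return all(ontology.get(concept, set()) & items for concept in concepts)


def conforms(rule, schema, ontology):
    """Assumed conformance test: both sides of the rule match the schema's concepts."""
    return (covers(schema.antecedent, rule.antecedent, ontology)
            and covers(schema.consequent, rule.consequent, ontology))


def filter_rules(rules, schema, ontology, keep_conforming=True):
    """Keep the rules that conform to the schema, or those that do not."""
    return [r for r in rules if conforms(r, schema, ontology) == keep_conforming]


rules = [
    Rule(frozenset({"milk"}), frozenset({"bread"})),
    Rule(frozenset({"beer"}), frozenset({"milk"})),
    Rule(frozenset({"cheese", "juice"}), frozenset({"bagel"})),
]

# The user expects rules of the form Dairy -> Bakery.
schema = RuleSchema(frozenset({"Dairy"}), frozenset({"Bakery"}))

kept = filter_rules(rules, schema, ONTOLOGY)
pruned = filter_rules(rules, schema, ONTOLOGY, keep_conforming=False)
```

Here `kept` holds the two rules whose antecedent contains a dairy item and whose consequent contains a bakery item, while `pruned` holds the remaining rule. Expressing the schema over concepts rather than individual items is what lets a single user expectation select or discard many item-level rules at once.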