Efficient Mining of Active and Valuable Clustered Sequential Patterns

Sahista Machchhar, Madhuri Vaghasia, Chintan Bhatt

Abstract


Clustering data sets that are inherently sequential is useful for various purposes. Over the years, many methods have been developed for clustering sequential objects according to their similarity. However, these methods tend to have a computational complexity that is at least quadratic in the number of sequences. Also, clustering algorithms often require that the entire dataset be kept in main memory. In this paper, we present a novel algorithm, constraint-based clustered sequential patterns (CBCSP), which clusters only the sequential data of interest to the user by applying recency, monetary, and compactness constraints. By applying these constraints during the mining process, the algorithm generates a compact set of clusters of sequential patterns that reflect user interest, and it minimizes the I/O cost involved. The proposed algorithm applies the well-known K-means clustering algorithm, along with prefix-projected database construction, to the set of sequential patterns. The method first performs clustering based on a novel similarity function and then captures, in each cluster, only the sequential patterns that interest the user, using a sequential pattern mining algorithm that employs the pattern-growth method. The proposed work reduces the search space, since the sequential patterns the user intends to find tend to be discovered in the resulting list. Experimental evaluation under various simulated conditions shows that the proposed method delivers excellent performance and leads to reasonably good clusters.
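The abstract does not define the constraints or the projection step, so the following is only a minimal sketch of how they might look, not the CBCSP algorithm itself. It assumes each event carries an item, a timestamp, and a monetary amount. It reads recency as "the last event is recent enough", compactness as "the time span is short enough", and monetary as "the total amount is large enough". All names and thresholds (`project`, `satisfies_constraints`, `min_last_time`, `max_span`, `min_amount`) are hypothetical. The novel similarity function and the K-means step are not shown because the abstract gives no detail on them.

```python
from typing import List, Tuple

# Assumed data model: an event is (item, timestamp, amount);
# a sequence is a time-ordered list of events.
Event = Tuple[str, int, float]
Sequence = List[Event]


def project(database: List[Sequence], item: str) -> List[Sequence]:
    """Prefix projection for a single-item prefix: keep, for each sequence,
    the non-empty suffix that follows the first occurrence of `item`."""
    projected = []
    for seq in database:
        for i, (it, _, _) in enumerate(seq):
            if it == item:
                suffix = seq[i + 1:]
                if suffix:
                    projected.append(suffix)
                break  # only the first occurrence defines the projection
    return projected


def satisfies_constraints(seq: Sequence, min_last_time: int,
                          max_span: int, min_amount: float) -> bool:
    """Assumed readings of the three constraints (hypothetical thresholds)."""
    if not seq:
        return False
    times = [t for _, t, _ in seq]
    recency_ok = times[-1] >= min_last_time        # last event is recent
    compact_ok = times[-1] - times[0] <= max_span  # events are close in time
    monetary_ok = sum(a for _, _, a in seq) >= min_amount
    return recency_ok and compact_ok and monetary_ok


# Toy data, invented purely for illustration.
db = [
    [("a", 1, 10.0), ("b", 2, 5.0), ("c", 9, 1.0)],
    [("b", 3, 2.0), ("a", 4, 20.0), ("c", 5, 30.0)],
    [("c", 1, 1.0)],
]

kept = [s for s in db
        if satisfies_constraints(s, min_last_time=5, max_span=5, min_amount=10.0)]
a_projected = project(db, "a")
```

Filtering before projection, as in this sketch, is one way constraints could shrink the search space; the paper may apply them at a different stage.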

Keywords


Data Clustering, Projected Database, Sequential Patterns, K-Means

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.