Company Trend Analysis Using Subspace Clustering and Frequent Patterns

A. Kavitha, A. Boopathybabu

Abstract


Clustering techniques and frequent pattern mining methods are used to discover events in company data analysis. A feature selection method identifies a subset of the most useful features while producing compatible results. The feature selection algorithm is constructed with both efficiency and effectiveness in mind.

Data models are analyzed along different dimensions. In a 3-dimensional data model, object, attribute and context information are linked. Cluster quality depends on domain knowledge and parameter settings. CATSeeker is a centroid-based actionable 3D subspace clustering framework used to find profitable actions. It integrates singular value decomposition, numerical optimization and 3D frequent itemset mining. Singular value decomposition (SVD) is used to calculate and prune the homogeneous tensor. The augmented Lagrangian multiplier method is used to calculate the probabilities of the values. 3D closed pattern mining is used to fetch Centroid-Based Actionable 3D Subspaces (CATS).
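To make the 3-dimensional data model concrete, the following is a minimal sketch that assumes a toy layout in which each value is addressed by an (object, attribute, context) triple. The company names, the attributes `pe` and `roe`, the function `is_homogeneous` and the tolerance `tol` are all hypothetical. The plain threshold check stands in for the idea of objects being similar to a centroid within a subspace; it is not the SVD-based pruning or the optimization step of CATSeeker.

```python
from itertools import product

# Hypothetical toy data: data[object][attribute][context] -> value.
# Objects are companies, attributes are financial ratios, contexts are years.
data = {
    "A": {"pe": [10.0, 11.0], "roe": [0.20, 0.21]},
    "B": {"pe": [10.5, 11.2], "roe": [0.19, 0.22]},
    "C": {"pe": [30.0, 5.0], "roe": [0.05, 0.40]},
}

def is_homogeneous(data, centroid, objects, attributes, contexts, tol):
    """Return True if every object stays within a relative tolerance `tol`
    of the centroid's value on every (attribute, context) pair."""
    for obj, attr, ctx in product(objects, attributes, contexts):
        ref = data[centroid][attr][ctx]
        if abs(data[obj][attr][ctx] - ref) > tol * abs(ref):
            return False
    return True

# A candidate 3D subspace is a set of objects, attributes and contexts.
close = is_homogeneous(data, "A", ["A", "B"], ["pe", "roe"], [0, 1], tol=0.1)
mixed = is_homogeneous(data, "A", ["A", "B", "C"], ["pe", "roe"], [0, 1], tol=0.1)
```

Here company `B` tracks the assumed centroid `A` on both attributes in both years, while adding company `C` breaks the homogeneity of the candidate subspace.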

Clustering and pattern mining techniques are integrated in the CATSeeker method. The CATSeeker framework is improved with an optimal centroid estimation scheme. An intra-cluster accuracy factor is used to fetch centroid values, and inter-cluster distance is also considered in the centroid estimation process. Dimensionality analysis is applied to improve the subspace selection process.
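The interplay of the intra-cluster and inter-cluster factors in centroid estimation can be illustrated with a small sketch. It assumes Euclidean distance and a simple score, separation minus compactness, neither of which is stated in the abstract. The names `dist`, `centroid_score` and `pick_centroid` are hypothetical and do not reproduce the authors' estimation scheme.

```python
import math

def dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid_score(candidate, members, other_centroids):
    """Hypothetical score: higher is better.
    intra = mean distance from the candidate to the cluster members (compactness)
    inter = distance from the candidate to the nearest other centroid (separation)"""
    intra = sum(dist(candidate, m) for m in members) / len(members)
    inter = min(dist(candidate, c) for c in other_centroids)
    return inter - intra

def pick_centroid(members, other_centroids):
    """Choose the cluster member with the best score as the centroid."""
    return max(members, key=lambda m: centroid_score(m, members, other_centroids))

members = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
other_centroids = [(1.0, 10.0)]
best = pick_centroid(members, other_centroids)
```

In this toy case the middle point `(1.0, 0.0)` is selected, because it has the smallest mean distance to the other members and that advantage outweighs its slightly smaller separation from the other centroid.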


Keywords


Clustering, Centroid-Based 3D Subspace Clustering, Singular Value Decomposition.



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.