
Analysis of Text Clusters Based On Fuzzy and Rough K-Means Strategies

B. Gayathri


Text classification in data mining still faces several open issues, in particular how to determine accuracy parameters when fuzzy methods are applied [1]. Results obtained from fuzzy rules rest purely on probabilistic means, so the classified output cannot be guaranteed. The rough k-means strategy typically delivers gains in performance, speed, or accuracy. In addition, association rule mining efficiently extracts a feature subset from transactional data. The main goal of this approach is to apply association rules so that results are produced more quickly and the algorithm's processing is improved. With Ofrecca (one by one Fuzzy Relational Eigenvector Centrality-based Clustering Algorithm) applied to ordinary text, the conditions are not connected to more than one cluster [2,3]. A rough k-means algorithm is also used to fix the cluster heads more efficiently [14]. In general a single element can belong to multiple clusters, whereas under the k-means algorithm every element belongs to exactly one cluster. Both HFRECCA and rough k-means clustering are combined with association rules to establish the fuzzy relation successfully [4]. The approach produces results more quickly, reduces the computational load of the classifier, and improves the stability of the information.
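The contrast drawn above between hard, fuzzy, and rough cluster membership can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the fuzzifier `m`, and the ratio-based `threshold` are assumptions chosen for illustration, using the standard fuzzy c-means membership formula and one common ratio variant of the rough k-means boundary test.

```python
import math

def distances(point, centroids):
    """Euclidean distance from one point to every centroid."""
    return [math.dist(point, c) for c in centroids]

def hard_assignment(point, centroids):
    """k-means style: index of the single nearest centroid."""
    d = distances(point, centroids)
    return d.index(min(d))

def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy c-means style: one membership degree per cluster, summing to 1.

    `m` (> 1) is the fuzzifier; m=2.0 is an assumed, commonly used value.
    """
    d = distances(point, centroids)
    # A point lying exactly on a centroid belongs fully to that cluster.
    if 0.0 in d:
        hits = [1.0 if x == 0.0 else 0.0 for x in d]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [x ** -power for x in d]
    total = sum(inverse)
    return [v / total for v in inverse]

def rough_assignment(point, centroids, threshold=1.5):
    """Rough k-means style (ratio variant, assumed threshold).

    Returns every cluster whose distance is within `threshold` times the
    nearest distance. One index: the point is in that cluster's lower
    approximation. Several indices: the point lies in the boundary region
    and is placed only in the upper approximations of those clusters.
    """
    d = distances(point, centroids)
    nearest = min(d)
    return [i for i, x in enumerate(d) if x <= threshold * nearest]
```

For a point midway between two centroids, `hard_assignment` still picks a single cluster, `fuzzy_memberships` splits the membership evenly, and `rough_assignment` returns both cluster indices, marking the point as a boundary case.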


Association Rule, Fuzzy Theory, Computational Load, K-Means Algorithm




An Overview of Recent Machine Learning Strategies in Data Mining - Battula, B. P. and Prasad. 2005. pages 20-32

Clustering Sentence-Level Text Using a Novel Fuzzy Relational Clustering Algorithm - Skabar, A. and Abdalgader. pages 76-83.

Measuring Similarity between Sentence Fragments - Huang, G. and Sheng. pp. 478–499

Comparative Study of Data Clustering Techniques, Proceedings National Conference on Emerging trends in Engineering, Technology & Architecture - Wazarkar, S. and Khot. pages 13-20

Jiang, S. and An, Q. (2008) Clustering Based Outlier Detection Method, Fifth International Conference on Fuzzy Systems and Knowledge Discovery.

Loureiro, A., Torgo, L. and Soares, C. (2004) Outlier Detection using Clustering Methods: A Data Cleaning Application, in Proceedings of KDNet Symposium on Knowledge-Based Systems.

Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander, “LOF: Identifying density-based local outliers”, 2000 ACM SIGMOD international conference on Management of data, pp. 93-104, ACM, New York, NY, USA.

Ian H. Witten and Eibe Frank, “Data Mining: Practical Machine learning tools with Java implementations”, Morgan Kaufmann, San Francisco, 2000.

J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In Proceedings of 2000 ACM-SIGMOD International Conference on Management of Data (SIGMOD’00), pp. 1–12, Dallas, TX, May 2000.

Forman, G., An Experimental Study of Feature Selection Metrics for Text Categorization. Journal of Machine Learning Research, 3, 2003, pp. 1289-1305.

Y. Grandvalet and S. Canu. Adaptive scaling for feature selection in SVMs. In NIPS 15, 2002. pages 2-22

Frayling N., Mladenic D., “Interaction of Feature Selection Methods and Linear Classification Models”, Proc. of the 19th International Conference on Machine Learning, Australia, 2002. pp. 22-32

Torkkola K., “Discriminative Features for Text Document Classification”, Proc. International Conference on Pattern Recognition, Canada, 2002. pp. 14-23

L. Bottou and Y. Bengio, “Convergence Properties of the k-means Algorithms,” Advances in Neural Information Processing Systems 7, G. Tesauro and D. Touretzky, eds., pp. 585-592. MIT Press, 1995. Pages 39-42

P.S. Bradley and U. Fayyad, “Refining Initial Points for K-means Clustering,” Proc. 15th Int'l Conf. Machine Learning, pp. 91-99, 1998.

Geoffrey J. McLachlan and Thriyambakam Krishnan. The EM Algorithm and Extensions. John Wiley & Sons, Inc., New York, 1997. pp. 1190-1233

An Efficient Concept-Based Mining Model for Enhancing Text Clustering - Shehata, S., Karray, F. and Mohamed, 1994. pp. 29-42

Agrawal, Shipra and Krishnan, Vijay and Haritsa, Jayant R (2004) On Addressing Efficiency Concerns in Privacy-Preserving Mining. Proceedings 4th International Conference Advances in Knowledge Discovery and Data Mining, volume 3918 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2006, pages 577–593

A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39:1–38, 1977.

Steve Carr and Ken Kennedy. Blocking linear algebra codes for memory hierarchies. In Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, Society for Industrial and Applied Mathematics, 1989. pages 33-83

Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. Proceedings of the 11th Annual Conference on Computational Learning Theory (COLT '98), pp. 92-100 (1998)

Omidiora Elijah Olusayo and Olabiyisi Stephen Olatunde An Exploratory Study of K-Means and Expectation Maximization Algorithms Adigun pages 23-43

John Peter. S., Department of computer science and research center St. Xavier’s College, Palayamkottai, An Efficient Algorithm for Local Outlier Detection Using Minimum Spanning Tree, International Journal of Research and Reviews in Computer Science (IJRRCS), March 2011. pages 8-20

Knorr, E. and Ng, R. (1997). A unified approach for mining outliers. In Proc. KDD, pp. 219–222.

Ron Kohavi, George H. John. 1997. Wrappers for feature subset Selection, Artificial Intelligence, Vol. 97, No. 1-2. pp. 273-324



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.