
Fuzzy SVM Clustering Using Plane Sweep Algorithm in Big Data Mining

M. M. Kavitha, Dr. B. Ananthi

Abstract


The era of Big Data is driving enormous growth in digital information. Text mining is closely related to data mining: it is the practice of examining and exploring large amounts of unstructured text with software that can identify patterns, topics, concepts, keywords and other attributes in the data. The Support Vector Machine (SVM) is a powerful technique for data classification. An SVM builds a decision surface by constructing an optimal separating hyperplane that divides data points of different classes in the vector space. Text clustering holds a special place because it is reliable and easy to configure. To cluster big datasets and reduce computation time, the Hadoop distributed platform and the MapReduce programming model are used to distribute the clustering job across multiple computing nodes; MapReduce is a proven, state-of-the-art technology for handling Big Data. In Big Text Data analysis, MapReduce is a powerful tool that provides a faster implementation. If a server has difficulty providing its services, additional servers in the cluster can take over the load. The plane sweep algorithm is a powerful approach for solving problems involving geometric objects in a plane. Its key benefit here is that it groups more closely related data into the same cluster, so retrieval is faster and more accurate. The evaluation results show that the proposed approach performs better than the other approaches it was compared with.
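The abstract does not spell out how the plane sweep is applied, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes 2-D points and a hypothetical distance threshold `eps`: a vertical sweep line moves left to right, only points within `eps` of the line in x are kept "active", and any two points closer than `eps` are merged into the same cluster. The function name `sweep_clusters` and the union-find merging step are assumptions made for the example.

```python
from collections import deque
from math import hypot


def sweep_clusters(points, eps):
    """Group 2-D points linked by chains of neighbours at distance <= eps.

    Illustrative plane-sweep sketch (hypothetical helper, not the paper's code).
    """
    # Visit points in order of increasing x: this is the sweep direction.
    order = sorted(range(len(points)), key=lambda i: points[i])
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the cluster root, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    active = deque()  # indices whose x lies within eps of the sweep line
    for i in order:
        x, y = points[i]
        # Retire points that the sweep line has left more than eps behind.
        while active and x - points[active[0]][0] > eps:
            active.popleft()
        # Only the active points can be within eps of the current point.
        for j in active:
            px, py = points[j]
            if abs(y - py) <= eps and hypot(x - px, y - py) <= eps:
                parent[find(i)] = find(j)
        active.append(i)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(points[i])
    return sorted(sorted(c) for c in clusters.values())


pts = [(0, 0), (0.5, 0), (1.0, 0.2), (5, 5), (5.3, 5.1), (10, 0)]
for cluster in sweep_clusters(pts, eps=1.0):
    print(cluster)
```

With `eps=1.0` the six sample points fall into three groups. The sweep avoids comparing every pair of points, because each point is checked only against the points currently near the sweep line.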


Keywords


Big Data, Clustering, MapReduce, Plane Sweep, SVM.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.