
Effective Feature Selection Method for Cervical Cancer Dataset Using Data Mining Classification Analytical Model

Dr. D. Rajakumari, S. Karthika


Data mining is a set of techniques for deriving hidden patterns from data. Its purpose is to find information that is not directly visible or retrievable by reading the data or running simple queries against it. A key use of data mining techniques is to predict future outcomes from past and present data. Such predictions are widely needed to improve future outcomes, and an accurate and timely prediction can help avert future problems to some degree. Healthcare is a field in which critical diseases such as cancers must be diagnosed before they become life-threatening. This paper explains how data mining techniques can be useful in healthcare, specifically for predicting the likelihood that a patient suffers from cervical cancer. The main goal is to design a database that can be used for data mining in the future. The paper implements feature model construction and a comparative analysis for improving the prediction accuracy for cervical cancer patients in four phases. In the first phase, a min-max normalization algorithm is applied to the original cervical cancer patient datasets collected from the UCI repository. In the second phase, feature selection is used to obtain, from the whole normalized dataset, a subset that comprises only the significant attributes. In the third phase, classification algorithms are applied to the dataset. In the fourth phase, accuracy is calculated using the root mean square value and the root mean error value. The KNN and SVM algorithms are found to perform best after feature selection is applied. Finally, the evaluation is done based on accuracy values.
The results of the proposed GA-based feature extraction with classification model indicate that, with the help of feature selection, the KNN and SVM algorithms outperform all other classification algorithms, with an accuracy of 97.60%.
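
The four phases described above can be illustrated with a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the toy rows, the labels, the `KEEP` column subset, and all function names are hypothetical. A fixed column subset stands in for the GA-based feature selection, only KNN is shown (SVM is omitted), and the error measure is assumed to be the usual root mean square error.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Phase 1: rescale each column to [0, 1] via (x - min) / (max - min)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def select_features(rows, keep):
    """Phase 2: keep only the chosen column indices.

    Assumption: `keep` is a hand-picked list; the paper uses a GA instead.
    """
    return [[row[i] for i in keep] for row in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Phase 3: majority vote among the k nearest training rows (Euclidean)."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Phase 4: fraction of correctly predicted labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Phase 4: root mean square error between true and predicted labels."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical toy data: four numeric attributes per patient, binary label.
rows = [
    [20, 1, 0, 5],
    [25, 2, 0, 3],
    [40, 5, 1, 4],
    [45, 6, 1, 5],
    [22, 1, 0, 4],
    [42, 5, 1, 3],
]
labels = [0, 0, 1, 1, 0, 1]
KEEP = [0, 1, 2]  # hypothetical "significant" attributes

# The whole dataset is normalized before splitting, following the phase order above.
normalized = min_max_normalize(rows)
selected = select_features(normalized, KEEP)

# First four rows train the classifier, the last two are held out.
train_x, train_y = selected[:4], labels[:4]
test_x, test_y = selected[4:], labels[4:]

predicted = [knn_predict(train_x, train_y, q, k=3) for q in test_x]
print("accuracy:", accuracy(test_y, predicted))
print("rmse:", rmse(test_y, predicted))
```

In practice the feature subset would be searched for rather than fixed, and the scores would be compared across several classifiers on held-out data.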


Cervical Cancer dataset, Data Mining Algorithm, KNN, SVM







This work is licensed under a Creative Commons Attribution 3.0 License.