
Fast and Efficient Cancer Prediction System using Data Mining Techniques

H. Lubsen, R. Mizumoto, C. Gravalos


Cancer is one of the deadliest diseases. Detecting cancer at an early stage improves the chance of curing it completely. Lung cancer is a disease that is commonly misdiagnosed. Early diagnosis of lung cancer has saved many lives; without it, the disease may lead to other critical problems and sudden death. Accurate prediction depends mainly on early detection and diagnosis of the disease. Diagnostic error is one of the most common forms of medical malpractice worldwide. Applying data mining techniques to healthcare systems can yield useful knowledge. In this study, we briefly examine the potential use of classification-based data mining techniques such as Rule Based, Decision Tree, Naïve Bayes and Artificial Neural Network on the massive volume of healthcare data which, unfortunately, is not “mined” to discover hidden information. Gathering information before processing helps in making effective decisions. The One Dependency Augmented Naïve Bayes Classifier (ODANB) and the Naïve Credal Classifier 2 (NCC2) are used. NCC2 is an extension of Naïve Bayes to imprecise probabilities that aims to deliver robust classifications even when dealing with small or incomplete data sets. Hidden patterns and relationships in such data are difficult to discover. The proposed lung cancer diagnosis system can answer complex “what if” queries that traditional decision support systems cannot. Using generic patient attributes and lung cancer symptoms such as age, sex, wheezing, shortness of breath, and pain in the shoulder, chest, or arm, it can predict the likelihood of a patient developing lung cancer. The aim of this paper is to propose a model for early detection and correct diagnosis of the disease, which will help doctors save patients' lives.
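As a rough illustration of the kind of symptom-based classification described in the abstract, the following is a minimal sketch of a plain Naïve Bayes classifier with add-one smoothing. It is not ODANB or NCC2, and it is not the authors' implementation. The feature names, the "high"/"low" risk labels, and the six training records are invented for illustration only and carry no clinical meaning.

```python
from collections import Counter, defaultdict


def train_naive_bayes(records, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][feature][value] -> number of matching training records
    value_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for record, label in zip(records, labels):
        for feature, value in record.items():
            value_counts[label][feature][value] += 1
            feature_values[feature].add(value)
    return class_counts, value_counts, feature_values


def predict_proba(model, record):
    """Return the normalized posterior probability of each class for one record."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for feature, value in record.items():
            # Add-one (Laplace) smoothing avoids zero probabilities
            numerator = value_counts[label][feature][value] + 1
            denominator = count + len(feature_values[feature])
            score *= numerator / denominator
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical toy data: NOT real patient records.
records = [
    {"age": "over_60", "wheezing": "yes", "short_breath": "yes"},
    {"age": "over_60", "wheezing": "yes", "short_breath": "no"},
    {"age": "under_60", "wheezing": "no", "short_breath": "no"},
    {"age": "under_60", "wheezing": "no", "short_breath": "yes"},
    {"age": "over_60", "wheezing": "no", "short_breath": "yes"},
    {"age": "under_60", "wheezing": "yes", "short_breath": "no"},
]
labels = ["high", "high", "low", "low", "high", "low"]

model = train_naive_bayes(records, labels)
query = {"age": "over_60", "wheezing": "yes", "short_breath": "yes"}
posterior = predict_proba(model, query)
print(posterior)
```

Changing one value in `query` and calling `predict_proba` again is a simple form of the “what if” query mentioned above: it shows how the estimated likelihood shifts when a single attribute changes.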


Cancer Prediction, Machine Learning, Data Classification.







Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.