Logistic Regression for Breast Cancer Analysis

Bhoomi Sharma, Abhimanyu Abhimanyu, Anuradha Anuradha, Yogita Gigras

Abstract


In this study, logistic regression is applied to mammogram data to diagnose breast cancer. The aim is to identify the clinical factors that contribute most to a higher probability of breast cancer. The sample data set is taken from the UC Irvine repository and modeled using logistic regression. A 10-fold cross-validation is applied to the training data set to avoid overfitting. The data set contains mammogram samples collected in a survey conducted by a radiologist. The classification table for 450 samples shows a correct classification rate of 96.6% for mammograms. The result is then compared with 30 validated samples, for which the correct classification rate is 68.9%. The simulation results indicate that the logistic regression model is able to map relationships among attributes and give more accurate classification.
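The workflow summarized in the abstract (fit a logistic regression, estimate accuracy with 10-fold cross-validation, then read the coefficients as odds ratios to see which factors raise the predicted probability) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic stand-ins for the mammogram attributes, and the helper names (`fit_logistic`, `k_fold_accuracy`, `make_synthetic`), the learning rate, and the epoch count are all assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    # Classify as positive when the predicted probability is at least 0.5.
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0


def k_fold_accuracy(X, y, k=10, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


def make_synthetic(n=200, seed=1):
    # Hypothetical two-feature data: NOT the UC Irvine mammogram data set.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.5 if label else -1.5
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y


X, y = make_synthetic()
fold_scores = k_fold_accuracy(X, y, k=10)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")

# exp(coefficient) is the odds ratio per unit increase of a feature;
# values above 1 mark factors that raise the predicted probability.
w_full = fit_logistic(X, y)
odds_ratios = [math.exp(wj) for wj in w_full[1:]]
print("odds ratios:", [round(o, 2) for o in odds_ratios])
```

Averaging accuracy over held-out folds gives an estimate that is less optimistic than accuracy measured on the training samples themselves, which is the kind of gap the abstract reports between the 450-sample and 30-sample results.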

Keywords


Breast Cancer, Mammograms, Prediction, Logistic Regression, Factors and Accuracy.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.