Feature Selection with Naïve Bayes Classifier

Dr. E. Chandra, K. Nandhini

Abstract


Predicting student performance is an essential task for every educational institution. Academic excellence alone does not determine that performance: attributes such as aptitude, attitude, communication skills, technological skills, interpersonal skills and problem-solving ability should also be taken into account to predict the real excellence of a student. Together these attributes form a heterogeneous dataset that mixes categorical, integer and other data types. The resulting high-dimensional dataset hampers the classification process. Because the task is one of prediction, the classification algorithms of data mining are used. Decision tree algorithms are among the fine-grained classification methods that can improve prediction accuracy. The first phase of the work collects a database of values covering a wide cross-section of cross-functional attributes. The second phase plays a vital role in effective classification by narrowing the data down to a selection of predictive attributes. This phase uses feature extraction techniques to reduce the high-dimensional dataset to a low-dimensional one. The third phase applies Naive Bayes and decision tree induction methods to classify the data. The scalability of these methods has been improved by perception-based learning. There is also a school of thought that classification and data mining can be carried out without any dimensionality reduction technique such as feature extraction. This work compares the results obtained by both processes and studies their prediction accuracy. The approach is not limited to predicting excellence in the student domain; it can be applied to any kind of domain.
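The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name a selection criterion, so ranking attributes by information gain is an assumption made here, as are the toy student records, the attribute order (aptitude, communication, an unrelated group code) and every function name. The sketch trains a categorical Naive Bayes classifier with Laplace smoothing twice, once on all attributes and once on the selected subset, so that the two processes can be compared.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, col):
    """Reduction in label entropy obtained by splitting on column `col`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[col]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def select_features(rows, labels, k):
    """Indices of the k columns with the highest information gain (assumed criterion)."""
    ranked = sorted(range(len(rows[0])),
                    key=lambda c: information_gain(rows, labels, c), reverse=True)
    return sorted(ranked[:k])

def train_naive_bayes(rows, labels, cols):
    """Count class frequencies and per-class value frequencies for the chosen columns."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (column, label) -> value frequencies
    domains = defaultdict(set)           # column -> distinct values seen in training
    for row, label in zip(rows, labels):
        for c in cols:
            value_counts[(c, label)][row[c]] += 1
            domains[c].add(row[c])
    return class_counts, value_counts, domains, list(cols)

def predict(model, row):
    """Return the class with the highest log-posterior, using Laplace smoothing."""
    class_counts, value_counts, domains, cols = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for c in cols:
            seen = value_counts[(c, label)][row[c]]
            score += math.log((seen + 1) / (n + len(domains[c])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(model, row) == label for row, label in zip(rows, labels))
    return hits / len(rows)

# Hypothetical records: (aptitude, communication, unrelated group code).
rows = [
    ("high", "good", "a"), ("high", "poor", "b"), ("high", "good", "b"),
    ("low", "good", "a"), ("low", "poor", "b"), ("low", "poor", "a"),
]
labels = ["pass", "pass", "pass", "fail", "fail", "fail"]

selected = select_features(rows, labels, k=1)
full_model = train_naive_bayes(rows, labels, cols=range(len(rows[0])))
reduced_model = train_naive_bayes(rows, labels, cols=selected)

print("selected columns:", selected)
print("accuracy, all attributes:", accuracy(full_model, rows, labels))
print("accuracy, selected attributes:", accuracy(reduced_model, rows, labels))
```

The accuracies printed here are measured on the training records only, which is enough to show the mechanics; a real comparison of the two processes would need held-out data or cross-validation.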


Keywords


Data Mining, Decision Tree, Feature Extraction, Performance Prediction

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.