
Analyzing Data Mining Algorithms using Car Dataset

R. Deepa Lakshmi, N. Radha


The car manufacturing sector occupies a prime position in the development of the automobile industry. This paper describes and tests a data mining application for the car manufacturing domain, using data retrieved from the UCI Machine Learning Repository. The aim is to build a classifier that classifies future objects more reliably: given a new input instance described by attributes such as car type, body style, horsepower and fuel, it should predict the car data for that instance. Such analysis gives the car market a basis for more accurate forecasts. The physical characteristics of a car, viz. aspiration, number of doors, body-style, normalized losses, car-type, drive-wheels, engine-location, wheel-base, curb-weight, horse-power, bore, stroke, city-mpg, highway-mpg, price, engine size, etc., are considered to determine its performance. Developing such a classifier, though a voluminous task, is therefore essential in the car manufacturing realm. Machine learning techniques can help integrate computer-based systems that predict car quality and improve the efficiency of the system. The classification models were trained on 214 data instances. The classifiers' predictions were evaluated using 10-fold cross-validation and the results were compared.
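The evaluation protocol described above, training a classifier and scoring it with 10-fold cross-validation, can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' WEKA setup: the toy rows, the labelling rule, and every name in the block (`make_toy_cars`, `CategoricalNaiveBayes`, `k_fold_indices`, `cross_validate`) are hypothetical, and a simple Naive Bayes over nominal attributes stands in for the classifiers compared in the paper.

```python
import math
import random
from collections import Counter, defaultdict


def k_fold_indices(n, k, seed=0):
    """Shuffle row indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


class CategoricalNaiveBayes:
    """Naive Bayes for nominal attributes with Laplace (add-one) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(class, attribute index)] -> Counter of attribute values
        self.value_counts = defaultdict(Counter)
        self.values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for j, v in enumerate(row):
                self.value_counts[(label, j)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(count / total)
            for j, v in enumerate(row):
                seen = self.value_counts[(label, j)][v]
                score += math.log((seen + 1) / (count + len(self.values[j])))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_validate(rows, labels, k=10, seed=0):
    """Return overall accuracy when each fold is held out exactly once."""
    correct = 0
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(rows)) if i not in held_out]
        model = CategoricalNaiveBayes().fit(
            [rows[i] for i in train], [labels[i] for i in train]
        )
        correct += sum(model.predict(rows[i]) == labels[i] for i in test_idx)
    return correct / len(rows)


def make_toy_cars(n=60, seed=1):
    """Invented stand-in data; NOT the UCI records used in the paper."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        fuel = rng.choice(["gas", "diesel"])
        body = rng.choice(["sedan", "hatchback", "wagon"])
        drive = rng.choice(["fwd", "rwd", "4wd"])
        rows.append((fuel, body, drive))
        # Hypothetical labelling rule, chosen only so the example is learnable.
        labels.append("high" if drive == "rwd" else "low")
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_cars()
    print("10-fold accuracy:", cross_validate(rows, labels, k=10))
```

Comparing several classifiers in this scheme would mean calling `cross_validate` once per model on the same folds, so that differences in accuracy reflect the models and not the split.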


Machine learning techniques, Naive Bayes, J48, BF Trees, Decision trees, Car market, Data mining, WEKA classification.







This work is licensed under a Creative Commons Attribution 3.0 License.