
Data Preprocessing Methods and Unified Framework for a Cardiac Database

R. Kavitha Kumar, Dr. R. M. Chandrasekar


Real-world data is dirty. If the data is not of adequate quality, mining it cannot produce correct results. Data quality is commonly measured by accuracy, completeness, consistency, timeliness, believability, value added, interpretability, and accessibility. Preprocessing is essential to satisfying these measures. In this paper we propose improved algorithms for data cleaning, integration, and discretization. Our goal is to provide the best-suited algorithms for cleaning the cardiology data set.


Data preprocessing, cardiology data set, data, data mining.








This work is licensed under a Creative Commons Attribution 3.0 License.