Solving Imbalanced Data Problem with a New Approach

T. Deepthi, M.A. Jabbar, A. Chandrasekhar Sharma

Abstract


In today’s real-world domains, data is growing at a rapid, even exponential, rate. Domains such as security, the Internet, banking, marketing, and finance have seen continuous expansion of data. This raw data is very difficult to understand and analyze, so tools, techniques, and methodologies are needed to make the decision-making process easier. Many knowledge discovery techniques are available, but the problem of imbalanced data remains a great challenge in both academia and industry. The imbalanced data problem concerns how various algorithms can be applied to imbalanced data and what performance levels they achieve.

Keywords


Assessment Metrics, Classification, Imbalanced Data, Synthetic Sampling Methods, Support Vector Machines

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.