
Review of Adaptive Data Stream Classification

Mayura B. Shinde, Hetal V. Gandhi

Abstract


Data stream classification is a method of mining knowledge from continuously arriving data points. It is a classification and prediction task for evolving data streams. A data stream is an unending flow of data that is generated continuously at a rapid rate. For non-stationary data, data stream classification faces several challenges: infinite length, concept drift, concept evolution, and feature evolution. Because data streams are of infinite length, traditional multi-pass learning algorithms are not applicable, as they would require large amounts of storage space and training time. Concept drift occurs when the class definition of some instances changes over time. Concept evolution is the emergence of a new class as the stream progresses. Concept drift and concept evolution may also occur at the same time. Feature evolution occurs when the feature space changes as new stream instances arrive; the feature space of the classification model and that of the new unlabelled data then differ, which reduces classification accuracy. Given these problems, it is challenging to learn a classification model that remains consistent with the current concept. This paper discusses different approaches to solving these issues in data stream classification.
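To make the infinite-length and concept-drift constraints concrete, the following is a minimal sketch, not a method taken from the paper. It assumes a synthetic one-dimensional stream whose class boundary moves partway through, and it uses a simple 1-nearest-neighbour rule over a bounded sliding window as a stand-in for an adaptive classifier. All names (`make_stream`, `nearest_label`, `prequential_accuracy`) and all parameter values are hypothetical and chosen only for illustration.

```python
from collections import deque


def make_stream(n=400, drift_at=200):
    """Deterministic 1-D stream; the class boundary moves at `drift_at`."""
    for i in range(n):
        x = (i * 37 % 100) / 100.0                 # repeatable value in [0, 1)
        boundary = 0.3 if i < drift_at else 0.7    # concept drift happens here
        yield x, int(x > boundary)


def nearest_label(memory, x):
    """1-nearest-neighbour prediction over stored (x, label) pairs."""
    if not memory:
        return 0                                   # arbitrary default before any training
    return min(memory, key=lambda pair: abs(pair[0] - x))[1]


def prequential_accuracy(stream, window, freeze_after=None):
    """Single-pass test-then-train evaluation with bounded memory.

    Each instance is first used for prediction and then for training.
    If `freeze_after` is set, training stops after that many instances,
    which imitates a static model that never adapts.
    """
    memory = deque(maxlen=window)                  # oldest instances are forgotten
    correct = total = 0
    for i, (x, y) in enumerate(stream):
        correct += int(nearest_label(memory, x) == y)
        total += 1
        if freeze_after is None or i < freeze_after:
            memory.append((x, y))
    return correct / total


adaptive = prequential_accuracy(make_stream(), window=50)
static = prequential_accuracy(make_stream(), window=50, freeze_after=50)
print(f"adaptive: {adaptive:.2f}, static: {static:.2f}")
```

In this sketch the sliding window addresses the infinite-length problem by storing at most `window` instances and reading the stream once, and it handles concept drift by letting instances labelled under the old concept age out. The frozen variant keeps predicting with the outdated boundary, so its accuracy falls after the drift point. Concept evolution and feature evolution are not covered by this example.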

Keywords


Data Stream, Ensemble Learning, Outliers, Novel Class






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.