Effect of Knowledge Representations on Performance of Classification Algorithms

Manasi Gyanchandani, R. N. Yadav, J. L. Rana

Abstract


Many classification algorithms have been evaluated on the Knowledge Discovery and Data Mining 1999 (KDDCUP ’99) dataset. All of these evaluations use test data that contains the attacks present in the training data plus some additional attacks. Most multilevel classifiers place a signature-based Intrusion Detection System (IDS) at the first level and a statistical IDS at the next level. Such multilevel classifiers perform better only if the two IDS are complementary. Signature-based IDS perform better for known attacks, so statistical IDS are expected to perform better for new attacks, i.e. novelty attacks. This paper evaluates the performance of a feature selection algorithm, different classification algorithms, and classifier combinations using bagging, boosting, and stacking on the KDD’99 dataset, for novelty attacks as well as for the original data. It is found that the statistical IDS performs better for novelty attacks. The paper also evaluates the impact of knowledge representations on the performance of network-based IDS by comparing different classification algorithms under different numbers of attack classes, namely the 41-class, 5-class, and 2-class knowledge representations.
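To make the idea of coarser knowledge representations concrete, the following is a minimal Python sketch, not the authors' implementation. It collapses a few raw KDD’99 labels into a 5-class and a 2-class representation and computes the probability of detection and false alarm rate for the 2-class case. The function names are hypothetical, the label table is a small illustrative subset rather than the full mapping, and the metric definitions (TP/(TP+FN) and FP/(FP+TN)) are the conventional ones, assumed here because the abstract does not state them.

```python
# Illustrative subset of KDD'99 labels mapped to the usual five categories.
# Assumption: this is NOT the complete mapping used in the paper.
FIVE_CLASS = {
    "normal": "normal",
    "smurf": "dos",
    "neptune": "dos",
    "back": "dos",
    "portsweep": "probe",
    "ipsweep": "probe",
    "nmap": "probe",
    "satan": "probe",
    "guess_passwd": "r2l",
    "warezclient": "r2l",
    "buffer_overflow": "u2r",
    "rootkit": "u2r",
}


def to_five_class(label):
    """Map a raw label (e.g. 'smurf.') to one of five coarse classes."""
    # Raw KDD'99 labels end with a period, so strip it before the lookup.
    return FIVE_CLASS.get(label.rstrip("."), "unknown")


def to_two_class(label):
    """Map a raw label to 'normal' or 'attack'."""
    return "normal" if label.rstrip(".") == "normal" else "attack"


def detection_and_false_alarm(y_true, y_pred):
    """Return (probability of detection, false alarm rate) for 2-class labels.

    Assumed definitions: PD = TP / (TP + FN), FAR = FP / (FP + TN),
    with 'attack' as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == "attack" and p == "attack" for t, p in pairs)
    fn = sum(t == "attack" and p == "normal" for t, p in pairs)
    fp = sum(t == "normal" and p == "attack" for t, p in pairs)
    tn = sum(t == "normal" and p == "normal" for t, p in pairs)
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return pd, far


# Made-up records for illustration only; these are not results from the paper.
raw_truth = ["smurf.", "normal.", "rootkit.", "normal.", "ipsweep.", "normal.", "normal."]
predicted = ["attack", "normal", "normal", "attack", "attack", "normal", "normal"]

truth_2class = [to_two_class(label) for label in raw_truth]
pd, far = detection_and_false_alarm(truth_2class, predicted)
```

The same classifier output can be scored under each representation by relabeling the ground truth first, which is one way to compare the effect of the number of classes on reported performance.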

Keywords


Probability of Detection, False Alarm Rate, Novelty Detection, Error due to Variance

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.