
Provisional Exceptional Property Pair Extraction Using Generic Approach

R. Krishnaveni

Abstract


Eliminating noisy data and discovering attributes and properties are the major considerations of the proposed method. From the overall population, the system identifies a subpopulation effectively. The subpopulation and its exceptional property pair are treated as outliers. To detect outliers effectively, the proposed PEP algorithm applies a provisional model that identifies the exceptional property pair using a best-fit method. Several outlier detection methods have been introduced for particular domains and applications, but those techniques were more generic and suffer from a confidentiality problem. The proposed approach implements a genetic model-based algorithm, named GENEX, together with the PEP algorithm to detect subpopulation scores for both numerical and categorical datasets. In addition, the system applies the best-fit method to find the best class based on the score and label. The proposed algorithm can reduce computation cost and improve accuracy by applying suitable data mining and pruning techniques. The experimental results provide mild and extreme outlier ranges with best-fit values.
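The abstract does not say how the mild and extreme outlier ranges are computed. As an illustration only, the sketch below uses one common convention, Tukey's fences (1.5 and 3 times the interquartile range). It is not the PEP or GENEX algorithm; the function names `tukey_fences` and `classify` and the sample data are hypothetical.

```python
from statistics import quantiles


def tukey_fences(values):
    """Return the (inner, outer) fence pairs derived from the interquartile range.

    Assumption: mild/extreme ranges follow Tukey's convention, which the
    abstract does not confirm.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)  # beyond these: at least mild
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)  # beyond these: extreme
    return inner, outer


def classify(values):
    """Label each value as 'normal', 'mild', or 'extreme'."""
    inner, outer = tukey_fences(values)
    labelled = []
    for v in values:
        if v < outer[0] or v > outer[1]:
            labelled.append((v, "extreme"))
        elif v < inner[0] or v > inner[1]:
            labelled.append((v, "mild"))
        else:
            labelled.append((v, "normal"))
    return labelled


if __name__ == "__main__":
    # Hypothetical numerical attribute with two unusual values.
    data = [10, 12, 11, 13, 12, 11, 14, 13, 12, 18, 40]
    for value, label in classify(data):
        print(value, label)
```

A value between the inner and outer fences is reported as a mild outlier, and a value beyond the outer fences as an extreme one. Categorical attributes would need a different score, for example one based on value frequency.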

Keywords


Generic Approach, Anomaly Detection, Outlier Detection, PEP





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.