Privacy Preservation in Data Mining Using Hybrid Approach

S. Deepajothi, Dr. S. Selvarajan

Abstract


Data sharing has become common nowadays, and the amount of information is growing exponentially. Data mining is the extraction of large amounts of useful information from massive databases. It is considered the core of the KDD (Knowledge Discovery in Databases) process. Privacy is at risk when personal and sensitive information about a person is leaked, and hence privacy must be preserved. Privacy Preservation in Data Mining (PPDM) is the area of data mining that protects sensitive data or information from unauthorized disclosure. Many privacy-preserving algorithms exist to protect the privacy of data, such as k-anonymity, l-diversity, t-closeness, and slicing. In this paper, we propose a new hybrid approach for preserving the privacy of data. Our approach, which combines task-independent privacy preservation with k-anonymization, efficiently preserves the sensitive data and provides higher accuracy than the above-mentioned algorithms achieve when applied separately to the datasets.
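
As an illustration of the k-anonymity property named in the abstract, the following minimal Python sketch generalizes one quasi-identifier and checks whether a table is k-anonymous. It is not the authors' hybrid method. The records, the field names (`age`, `zip`, `diagnosis`), and the helper functions `generalize_age` and `is_k_anonymous` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Replace an exact age with a coarse range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    is shared by at least k records."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical toy table: 'age' and 'zip' are quasi-identifiers,
# 'diagnosis' is the sensitive attribute.
records = [
    {"age": 31, "zip": "60601", "diagnosis": "flu"},
    {"age": 34, "zip": "60601", "diagnosis": "asthma"},
    {"age": 38, "zip": "60601", "diagnosis": "flu"},
    {"age": 45, "zip": "60602", "diagnosis": "diabetes"},
    {"age": 47, "zip": "60602", "diagnosis": "flu"},
]

# Generalize the exact ages into ten-year ranges; other fields are unchanged.
generalized = [{**r, "age": generalize_age(r["age"])} for r in records]

raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

In this toy table every exact age is unique, so the raw records are not 2-anonymous, while the generalized records fall into groups of three and two and therefore satisfy k = 2 but not k = 3.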


Keywords


Data Mining, KDD, PPDM

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.