

Preserving Privacy by Quantizing
Abstract
Advances in data mining have made it possible to extract sensitive information from published data. The web, by providing a platform for data publishing, together with the growth of automated software technologies, has aggravated the problem of personal privacy. Sensitive data must therefore be anonymized before being published on the web. A number of anonymization methods have been proposed, including data partitioning, data swapping, generalization, suppression, randomization, perturbation, and secure multiparty computation. This paper discusses a perturbation method: the domain values of the private table are grouped using clustering algorithms, and each value is then represented by its cluster head to anonymize the table. This replacement decreases the utility of the published data, so care must be taken to maintain a balance between utility and privacy during anonymization. F-measure and distortion are the metrics deployed to measure the utility of the perturbed data.

This work is licensed under a Creative Commons Attribution 3.0 License.