Novel Algorithms-- K-Gen and L-Gen for Implementation of k-Anonymity and l-Diversity Properties

Ganesh Yernally; Dr. Andhe Pallavi

Abstract

The collection of personal data by research organizations, and the sharing of this data with other organizations for business purposes, have increased tremendously. Individuals' medical data is the most sensitive of the private data that is shared. Although explicit identifiers such as names and ID numbers are removed from shared data to protect individual privacy, medical data released to research organizations is still susceptible to linking attacks, which can compromise patients' privacy. The k-anonymity property is used to prevent linking attacks: in a k-anonymized dataset, each record is indistinguishable from at least k−1 other records with respect to certain "identifying" attributes. Because k-anonymity cannot prevent attribute disclosure, the notion of l-diversity goes beyond it by requiring that each equivalence class has at least l well-represented values for each sensitive attribute. In this paper we implement a system in VB.Net that generates k-anonymous and distinct l-diverse tables. We propose two algorithms, K-Gen and L-Gen, which implement the system using generalization index values. The generalization index values control the level of generalization for each "identifying" attribute. Experimental results show that the proposed algorithms achieve lower discernibility values than existing methods, with comparable information loss. The implemented system protects patients' identities through an efficient implementation of the k-anonymity and distinct l-diversity properties, and provides faster data release through an intuitive GUI. It also offers a cost-effective solution to organizations that hold patient data, as they need not buy proprietary black-box products to protect the data they release.
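The properties named in the abstract can be made concrete with a small check. The following Python sketch is not the paper's K-Gen or L-Gen algorithm (the system itself is written in VB.Net and is not reproduced here). It only tests whether an already generalized table is k-anonymous and distinct l-diverse, and computes a discernibility value as the sum of squared equivalence-class sizes, which is the usual definition when no records are suppressed. The column names and sample rows are assumptions made up for illustration.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same values on all quasi-identifier columns."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row)
    return classes


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_distinct_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(
        len({row[sensitive] for row in group}) >= l
        for group in classes.values()
    )


def discernibility(rows, quasi_identifiers):
    """Sum of squared class sizes (no suppressed records assumed); lower is better."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return sum(len(group) ** 2 for group in classes.values())


# Hypothetical, already generalized sample table (illustrative data only).
rows = [
    {"age": "20-29", "zip": "560**", "disease": "flu"},
    {"age": "20-29", "zip": "560**", "disease": "asthma"},
    {"age": "30-39", "zip": "561**", "disease": "flu"},
    {"age": "30-39", "zip": "561**", "disease": "flu"},
]
QI = ["age", "zip"]

print(is_k_anonymous(rows, QI, 2))                    # every class has 2 records
print(is_distinct_l_diverse(rows, QI, "disease", 2))  # second class has only "flu"
print(discernibility(rows, QI))
```

In this sample the table is 2-anonymous but not distinct 2-diverse: everyone in the second class has the same disease, so an attacker who links a person to that class learns the sensitive value. This is the attribute disclosure that l-diversity is meant to prevent.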

Keywords

Generalization Index Values (GIV), K-Gen, L-Gen.
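The abstract states only that generalization index values control the level of generalization for each identifying attribute; it does not describe the hierarchies or how the indices are encoded. As a rough illustration under that limitation, the Python sketch below assumes one integer level per attribute. The age and ZIP hierarchies, the function names, and the sample record are all hypothetical and are not taken from the paper.

```python
def generalize_age(age, level):
    """Hypothetical age hierarchy: 0 = exact, 1 = 10-year range,
    2 = 20-year range, 3 or more = fully suppressed."""
    if level == 0:
        return str(age)
    if level >= 3:
        return "*"
    width = 10 * level
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code, level):
    """Hypothetical ZIP hierarchy: mask the last `level` characters with '*'."""
    level = min(level, len(zip_code))
    return zip_code[: len(zip_code) - level] + "*" * level


def apply_giv(row, giv):
    """Return a copy of `row` with each identifying attribute generalized
    to the level given in `giv` (attribute name -> integer level)."""
    out = dict(row)
    out["age"] = generalize_age(row["age"], giv["age"])
    out["zip"] = generalize_zip(row["zip"], giv["zip"])
    return out


# Illustrative record and index values (assumptions, not data from the paper).
record = {"age": 34, "zip": "560012", "disease": "flu"}
generalized = apply_giv(record, {"age": 1, "zip": 2})
print(generalized)
```

Raising an index coarsens that attribute, which tends to enlarge equivalence classes (helping k-anonymity) at the cost of more information loss. An algorithm in this style would search for index values that satisfy the privacy property with as little generalization as possible.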

This work is licensed under a Creative Commons Attribution 3.0 License.
