
A Review on Cloud Data Mining and Data Integrity

R. Sumithra, Dr. Sujni Paul, Dr.V. Thavavel


This paper reviews how data mining is used in cloud computing, as discussed in various research contributions. Cloud computing has become a viable mainstream solution for data processing, storage and distribution. It promises on-demand, scalable, pay-as-you-go compute and storage capacity. To analyze “big data” on clouds, it is important to study data mining strategies based on the cloud computing paradigm from both theoretical and practical viewpoints. The purpose of this paper is to study a strategy for data mining on the cloud. Hadoop, an open-source implementation of the MapReduce framework, is very useful for distributed data mining. When data is outsourced to a mining process in a cloud environment, the integrity of the data should be maintained. The paper also discusses cloud computing security issues and the solutions proposed in various research papers.

[…] by many people in order to improve Web service levels and address the existing Web services requested by people. The backbone of this solution is the (private) UDDI registry. Earlier Web mining services used the WSDL-S approach, which suffered from many semantic problems. Because WSDL-S is a lightweight approach, it fails to reach the efficiency levels required of a Web mining service. To overcome this issue, I propose a new solution using OWL-S upper ontologies, which is a full solution for achieving an efficient Web mining service. A matching algorithm is designed in the OWL-S approach; it specifies the semantic matching between a service request and a service description, and performs semantic-based Web data mining by combining the Semantic Web and Web mining. Software that implements a given matching algorithm is called a matchmaking engine. Practical implementation of this OWL-S approach in the Semantic Web not only makes Web mining easier to achieve but can also improve its effectiveness. Here I give background on the Semantic Web and Web mining. Finally, I propose to build a semantic-based Web mining model within an Agent framework.



Cloud Computing, Association Rule Mining, MapReduce, Data Integrity, Outsourcing, Security Model







This work is licensed under a Creative Commons Attribution 3.0 License.