
Ontology Based Effective Semantic Information Retrieval for Big Data

G. Bhavani, S. Sangeetha, S. Sivakumari


The huge amount of data stored on the Internet is useful only if it can be accessed as information rather than as raw data. Big data today faces several challenges, such as searching, analysing, sharing, storage, transfer, visualization and querying. Among these, semantic retrieval is a major one. To address these problems, an approach based on the Hadoop Distributed File System (HDFS) has been proposed. It performs semantic analysis over a large volume of documents (Big data) to find the best-matched source document, from the collected set of source documents, for the same virtual document. In the Hadoop file system, the semantic analysis is done using a Dual Walk based ranking model to provide the best-matched documents, and the resulting documents are filtered with a Top-K algorithm based on the frequency of the entities in the source document. However, the existing system still has issues with ontological indexing, and hence the accuracy of semantic information retrieval is reduced. To overcome this, the proposed work focuses on ontological indexing to retrieve highly relevant, semantically matched information. Ontology based information retrieval improves the relevance of the results by filtering out unrelated terms in the documents. The documents are clustered based on the ontology, and the input query is examined for semantics and expanded using the domain ontology. Thus the accuracy of semantic information retrieval is increased and the searching complexity is reduced significantly. The experimental results indicate that the proposed system performs better than the existing system.
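
The two steps named in the abstract, query expansion with a domain ontology and Top-K filtering by term frequency, can be illustrated with a minimal Python sketch. It is not the authors' implementation and it does not reproduce the Dual Walk based ranking model. The toy ontology, the sample documents, and the function names `expand_query` and `top_k_documents` are all hypothetical, and plain term counts stand in for entity frequency.

```python
from collections import Counter

# Hypothetical toy ontology: each concept maps to related terms.
# A real domain ontology would be far richer (classes, relations, synonyms).
TOY_ONTOLOGY = {
    "hadoop": {"hdfs", "mapreduce"},
    "retrieval": {"search", "ranking"},
}


def expand_query(query, ontology):
    """Return the query terms plus any related terms the ontology lists for them."""
    terms = set(query.lower().split())
    expanded = set(terms)
    for term in terms:
        expanded |= ontology.get(term, set())
    return expanded


def top_k_documents(documents, query_terms, k):
    """Score each document by how often the query terms occur; keep the k best."""
    scores = {}
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        scores[doc_id] = sum(counts[term] for term in query_terms)
    # Sort by descending score; ties are broken by document id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    # Documents that match no query term are dropped.
    return [doc_id for doc_id, score in ranked[:k] if score > 0]


# Hypothetical sample collection, for illustration only.
documents = {
    "d1": "hdfs stores blocks and mapreduce processes them",
    "d2": "ranking and search over web pages",
    "d3": "cooking recipes for pasta",
}

query_terms = expand_query("Hadoop retrieval", TOY_ONTOLOGY)
result = top_k_documents(documents, query_terms, k=2)
print(result)
```

In this sketch neither "hadoop" nor "retrieval" appears literally in any document, so the unexpanded query would match nothing; the expansion step is what lets `d1` and `d2` be retrieved, while the unrelated `d3` is filtered out.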


Big Data, HDFS, Information Retrieval, Ontology




This work is licensed under a Creative Commons Attribution 3.0 License.