
Efficient XML Keyword Search

Swati Thawari Tonge, Rashmi Phalnikar

Abstract


eXtensible Markup Language (XML) is a semi-structured text format designed to describe data using custom tags. Custom tags make an XML document self-describing, so that it is easily understood by both humans and machines. XML is now a standard format for data exchange between applications and is widely used in the configuration files of enterprise applications. The increasing preference for storing and transmitting data in XML has created a need to search XML documents for useful information. XPath and XQuery are powerful structured query languages for retrieving information from XML documents, but they are difficult for non-expert users to learn, and this complexity restricts the use of XML databases. Keyword search allows such users to retrieve information without understanding the syntax of a complex query language or the schema of the database. Along with this ease of retrieval, keyword search raises challenges such as returning meaningful results, identifying the intention of the search, resolving keyword ambiguity, and handling an enormous number of results. This paper presents an efficient keyword search method based on clustering and relevance ranking. An experiment has been conducted to show the effectiveness of the proposed method.
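
The contrast between a structured query and a keyword query can be illustrated with a minimal sketch. It is not the method proposed in the paper: the sample document, the `keyword_search` helper, and the rule of returning the deepest elements that contain every keyword are all assumptions made for illustration, and the sketch performs no clustering or relevance ranking.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """
<library>
  <book>
    <title>XML Basics</title>
    <author>Smith</author>
  </book>
  <book>
    <title>Database Systems</title>
    <author>Jones</author>
  </book>
</library>
"""

root = ET.fromstring(DOC)

# Structured query: the user must already know the schema
# (that books have "author" and "title" children).
xpath_titles = [
    book.findtext("title")
    for book in root.findall("./book[author='Smith']")
]


def keyword_search(node, keywords):
    """Return the deepest elements whose subtree text contains every keyword.

    Hypothetical helper: a naive substring match, not the paper's algorithm.
    """
    text = " ".join(node.itertext()).lower()
    if not all(keyword.lower() in text for keyword in keywords):
        return []
    hits = []
    for child in node:
        hits.extend(keyword_search(child, keywords))
    # If no child covers all keywords, this node is itself a smallest match.
    return hits or [node]


# Keyword query: no knowledge of tag names or nesting is required.
keyword_hits = [node.tag for node in keyword_search(root, ["xml", "smith"])]

print(xpath_titles)
print(keyword_hits)
```

The structured query depends on knowing the tag names, whereas the keyword query only supplies terms and leaves it to the search procedure to decide which element is a meaningful result.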

Keywords


Keyword Search, XML, Natural Language Processing, Information Retrieval, Clustering, Relevance Ranking






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.