Min Hash Clustering Algorithm for Extraction of HTML Tags from Social Media
A. Arasu and H. Garcia-Molina, “Extracting Structured Data from Web Pages,” Proc. ACM SIGMOD, pp. 337-348, 2003.
C.-H. Chang and S.-C. Lui, “IEPAD: Information Extraction Based on Pattern Discovery,” Proc. Int’l Conf. World Wide Web (WWW-10), pp. 223-231, 2001.
C.-H. Chang, M. Kayed, M.R. Girgis, and K.A. Shaalan, “Survey of Web Information Extraction Systems,” IEEE Trans. Knowledge and Data Eng., vol. 18, no. 10, pp. 1411-1428, Oct. 2006.
V. Crescenzi, G. Mecca, and P. Merialdo, “RoadRunner: Towards Automatic Data Extraction from Large Web Sites,” Proc. Int’l Conf. Very Large Databases (VLDB), pp. 109-118, 2001.
C.-N. Hsu and M. Dung, “Generating Finite-State Transducers for Semi-Structured Data Extraction from the Web,” J. Information Systems, vol. 23, no. 8, pp. 521-538, 1998.
N. Kushmerick, D. Weld, and R. Doorenbos, “Wrapper Induction for Information Extraction,” Proc. 15th Int’l Joint Conf. Artificial Intelligence (IJCAI), pp. 729-735, 1997.
A.H.F. Laender, B.A. Ribeiro-Neto, A.S. Silva, and J.S. Teixeira, “A Brief Survey of Web Data Extraction Tools,” SIGMOD Record, vol. 31, no. 2, pp. 84-93, 2002.
B. Liu, R. Grossman, and Y. Zhai, “Mining Data Records in Web Pages,” Proc. Int’l Conf. Knowledge Discovery and Data Mining (KDD), pp. 601-606, 2003.
I. Muslea, S. Minton, and C. Knoblock, “A Hierarchical Approach to Wrapper Induction,” Proc. Third Int’l Conf. Autonomous Agents (AA ’99), 1999.
K. Simon and G. Lausen, “ViPER: Augmenting Automatic Information Extraction with Visual Perceptions,” Proc. Int’l Conf. Information and Knowledge Management (CIKM), 2005.
J. Wang and F.H. Lochovsky, “Data Extraction and Label Assignment for Web Databases,” Proc. Int’l Conf. World Wide Web (WWW-12), pp. 187-196, 2003.
Y. Yamada, N. Craswell, T. Nakatoh, and S. Hirokawa, “Testbed for Information Extraction from Deep Web,” Proc. Int’l Conf. World Wide Web (WWW-13), pp. 346-347, 2004.
W. Yang, “Identifying Syntactic Differences between Two Programs,” Software—Practice and Experience, vol. 21, no. 7, pp. 739-755, 1991.
Y. Zhai and B. Liu, “Web Data Extraction Based on Partial Tree Alignment,” Proc. Int’l Conf. World Wide Web (WWW-14), pp. 76-85, 2005.
H. Zhao, W. Meng, Z. Wu, V. Raghavan, and C. Yu, “Fully Automatic Wrapper Generation for Search Engines,” Proc. Int’l Conf. World Wide Web (WWW), 2005.
H. Zhao, W. Meng, Z. Wu, V. Raghavan, and C. Yu, “Automatic Extraction of Dynamic Record Sections from Search Engine Result Pages,” Proc. Int’l Conf. Very Large Databases (VLDB), pp. 989-1000, 2006.
This work is licensed under a Creative Commons Attribution 3.0 License.