Open Access Open Access  Restricted Access Subscription or Fee Access

Determine the Properties of Objects to Maximum Clearance

K. Venkatesh Sharma, K.Hanumanth Rao, P. Nagaraj

Abstract


In recent years, there has been significant interest in the development of ranking functions and efficient top-k retrieval algorithms to help users in ad hoc search and retrieval in databases (e.g., buyers searching for products in a catalog). We introduce a complementary problem: How to guide a seller in selecting the best attributes of a new tuple (e.g., a new product) to highlight so that it stands out in the crowd of existing competitive products and is widely visible to the pool of potential buyers. We develop several formulations of this problem. Although the problems are NP-complete, we give several exact and approximation algorithms that work well in practice. One type of exact algorithms is based on Integer Programming (IP) formulations of the problems. Another class of exact methods is based on maximal frequent item set mining algorithms. The approximation algorithms are based on greedy heuristics. A detailed performance study illustrates the benefits of our methods on real and synthetic data.

Keywords


Data Mining, Knowledge and Data Engineering Tools and Techniques, Marketing, Mining Methods and Algorithms, Retrieval Models.

Full Text:

PDF

References


S. Agrawal, S. Chaudhuri, G. Das, and F. Gionis, “Automated Ranking of Database Query Results,” Proc. Conf. Innovative Data Systems Research (CIDR), 2003.

R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A.I. Verkamo, “Fast Discovery of Association Rules,” Advances in Knowledge Discovery and Data Mining, U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds., pp. 307-328, AAAI/ MIT Press, 1996.

R.J. Bayardo Jr., “Efficiently Mining Long Patterns from Data-bases,” Proc. SIGMOD Conf., vol. 1, pp. 85-93, 1998.

S. Borzsonyi, D. Kossmann, and K. Stocker, “The Skyline Operator,” Proc. Int’l Conf. Data Eng. (ICDE ’01), 2001.

D. Burdick, M. Calimlim, and J. Gehrke, “MAFIA: A Maximal Frequent Item Set Algorithm for Transactional Databases,” Proc. Int’l Conf. Data Eng. (ICDE), 2001.

S. Brin and L. Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” Proc. World Wide Web (WWW) Conf., 1998.

S. Chaudhuri, G. Das, V. Hristidis, and G. Weikum, “Probabilistic Ranking of Database Query Results,” Proc. Int’l Conf. Very Large Data Bases (VLDB), 2004.

G. Das, V. Hristidis, N. Kapoor, and S. Sudarshan, “Ordering the Attributes of Query Results,” Proc. SIGMOD Conf., 2006.

I. Good, “The Population Frequencies of Species and the Estimation of Population Parameters,” Biometrika, vol. 40, pp. 237-264, 1953.

I. Guyon and A. Elisseeff, “An Introduction to Variable and Feature Selection,” J. Machine Learning Research, vol. 3, pp. 1157-1182, Mar. 2003.

M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, 1979.

D. Gunopulos, R. Khardon, H. Mannila, S. Saluja, H. Toivonen, and R.S. Sharm, “Discovering All Most Specific Sentences,” ACM Trans. Database Systems, vol. 28, no. 2, pp. 140-174, 2003.

M. Gori and I. Witten, “The Bubble of Web Visibility,” Comm. ACM, vol. 48, no. 3, pp. 115-117, Mar. 2005.

K. Gouda and M.J. Zaki, “Efficiently Mining Maximal Frequent Itemsets,” Proc. Int’l Conf. Data Mining (ICDM), 2001.

J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without Candidate Generation,” Proc. SIGMOD Conf., pp. 1-12, 2000.

J. Han, J. Wang, Y. Lu, and P. Tzvetkov, “Mining Top-k Frequent Closed Patterns without Minimum Support,” Proc. Int’l Conf. Data Mining (ICDM), 2002.

J.-P. Huang, C.-T. Yang, and C.-H. Fu, “A Genetic Algorithm Based Searching of Maximal Frequent Itemsets,” Proc. Int’l Conf. Applied Informatics (ICAI), 2004.

J. Kleinberg, C. Papadimitriou, and P. Raghavan, “A Microeco-nomic View of Data Mining,” Data Mining and Knowledge Discovery, vol. 2, no. 4, pp. 311-324, Dec. 1998.

D. Kossmann, F. Ramsak, and S. Rost, “Shooting Stars in the Sky: An Online Algorithm for Skyline Queries,” Proc. Int’l Conf. Very Large Data Bases (VLDB), 2002.

C. Li, B.C. Ooi, A.K.H. Tung, and S. Wang, “DADA: A Data Cube for Dominant Relationship Analysis,” Proc. SIGMOD Conf., 2006.


Refbacks

  • There are currently no refbacks.


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.