Integrating E-Commerce and Data Mining: Architecture and Challenges
Abstract
We show that the e-commerce domain can provide all the right ingredients for successful data mining and claim that it is a killer domain for data mining. We describe an integrated architecture, based on our experience, for supporting this integration. The architecture can dramatically reduce the pre-processing, cleaning, and data understanding effort often documented to take 80% of the time in knowledge discovery projects. We emphasize the need for data collection at the application server layer (not the web server) in order to support logging of data and metadata that is essential to the discovery process. We describe the data transformation bridges required from the transaction processing systems and customer event streams (e.g., click streams) to the data warehouse. We detail the mining workbench, which needs to provide multiple views of the data through reporting, data mining algorithms, visualization, and OLAP. We conclude with a set of challenges.
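As a rough illustration of the bridge from customer event streams to the data warehouse, the following minimal Python sketch rolls events logged at the application server up into one row per session. The `ClickEvent` schema, the event type names, and the `to_session_facts` helper are assumptions made for this example; they are not taken from the architecture described in the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClickEvent:
    """One event logged by the application server (hypothetical schema)."""
    session_id: str
    timestamp: float                   # seconds since epoch
    event_type: str                    # e.g. "page_view", "add_to_cart", "purchase"
    product_id: Optional[str] = None   # metadata a plain web server log would lack
    amount: float = 0.0                # order value, used for purchase events


def to_session_facts(events):
    """Bridge step: aggregate raw events into one warehouse row per session."""
    by_session = defaultdict(list)
    for ev in events:
        by_session[ev.session_id].append(ev)

    facts = []
    for session_id, evs in sorted(by_session.items()):
        evs.sort(key=lambda e: e.timestamp)
        facts.append({
            "session_id": session_id,
            "duration_s": evs[-1].timestamp - evs[0].timestamp,
            "page_views": sum(e.event_type == "page_view" for e in evs),
            "cart_adds": sum(e.event_type == "add_to_cart" for e in evs),
            "revenue": sum(e.amount for e in evs if e.event_type == "purchase"),
        })
    return facts


# Illustrative data only.
events = [
    ClickEvent("s1", 100.0, "page_view", "p42"),
    ClickEvent("s1", 130.0, "add_to_cart", "p42"),
    ClickEvent("s1", 190.0, "purchase", "p42", 19.99),
    ClickEvent("s2", 105.0, "page_view", "p7"),
]
session_facts = to_session_facts(events)
```

Because the events are recorded by the application rather than reconstructed from web server logs, fields such as the session and product identifiers are available directly, which is the kind of pre-processing saving the abstract refers to.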
This work is licensed under a Creative Commons Attribution 3.0 License.