A Modeling Approach for Datamining and Predictive Modeling Decision Tree Ensembles

N. Poongodi, B. Firdaus Begam

Abstract


Data-mining classification and predictive modeling algorithms based on bootstrapping techniques reuse a source data set repeatedly to create a family of predictive and classification models that can be said to render a "holographic" view of the modeled data. The resulting classification and prediction performance is superior to that of single-model approaches. These holographic approaches are applied in an industrial setting: text mining of warranty claims at a major international car, truck, and heavy equipment manufacturer. This paper explains the methods used, how they work, and how they perform when applied to text mining of warranty claims. Combined text and quantitative data models are developed, tested, and validated with the goal of achieving "better-than-human" classification performance on warranty claims.
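
As a rough illustration of how bootstrapping reuses one data set to build a family of models, the sketch below bags one-split decision stumps and combines them by majority vote. It is not the authors' implementation: the stumps stand in for full decision trees, the toy `claims` data (a single numeric feature with "ok" and "defect" labels) is invented, and all function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(points):
    """Fit a one-split stump to (x, label) pairs by minimizing training errors."""
    overall = majority([y for _, y in points])
    xs = sorted({x for x, _ in points})
    # Candidate thresholds are midpoints between consecutive distinct x values.
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for t in thresholds:
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        left_label = majority(left) if left else overall
        right_label = majority(right) if right else overall
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return t, left_label, right_label


def predict_stump(stump, x):
    """Classify x with a single stump."""
    t, left_label, right_label = stump
    return left_label if x <= t else right_label


def bagged_fit(points, n_models, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the same data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Sample with replacement, same size as the source data set.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models


def bagged_predict(models, x):
    """Combine the family of models by majority vote."""
    return majority([predict_stump(m, x) for m in models])


# Hypothetical toy data: low feature values are "ok", high values are "defect".
claims = [(float(x), "ok") for x in range(10)]
claims += [(float(x), "defect") for x in range(10, 20)]
models = bagged_fit(claims, n_models=25, seed=42)
```

Each stump sees a slightly different resample, so the thresholds differ from model to model; the vote averages out that variation, which is the intuition behind the ensemble outperforming any single model.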

Keywords


Bootstrap, Classification, Holographic, Predictive, Text Mining.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.