
Comparative Analysis of Techniques to Predict Fault Proneness

Ranbir Singh, Seema Bagla


Software quality and reliability are essential parts of the software development process. Fault proneness is a measure that helps programmers predict the fault-prone areas of a project during testing or development. This knowledge can be very beneficial in improving software quality. Software quality estimation models can be broadly divided into classification models and prediction models. Classification techniques predict the probability that a fault occurs but cannot predict the number of faults. Count models, such as the Poisson regression model and the zero-inflated Poisson regression model, can provide both a qualitative classification and a quantitative prediction of software quality. In this paper we review count models and classification models to bring to light the techniques most often used by researchers and academicians.
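As an illustration of how a count model can yield both outputs, the following minimal sketch evaluates a zero-inflated Poisson distribution for a single module. It is not taken from the paper. The function names, the module names, the parameter values (`lam` for the Poisson rate, `pi` for the zero-inflation weight) and the 0.5 threshold are all assumptions chosen for the example. In practice both parameters would be estimated by regression on software metrics.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson with rate lam and zero weight pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise either from the "perfect" state or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def expected_faults(lam, pi):
    """Quantitative prediction: expected number of faults, (1 - pi) * lam."""
    return (1.0 - pi) * lam


def prob_faulty(lam, pi):
    """Probability that the module contains at least one fault."""
    return 1.0 - zip_pmf(0, lam, pi)


def classify(lam, pi, threshold=0.5):
    """Qualitative classification from the same model (threshold is assumed)."""
    return "fault-prone" if prob_faulty(lam, pi) >= threshold else "not fault-prone"


# Hypothetical per-module parameters (lam, pi); not data from any study.
modules = {"parser": (2.5, 0.1), "logger": (0.3, 0.6)}
for name, (lam, pi) in modules.items():
    print(name, round(expected_faults(lam, pi), 2), classify(lam, pi))
```

Setting `pi` to 0 reduces the sketch to the ordinary Poisson model, so the same two functions cover both count models named above.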


Software Quality, Fault Proneness, Count Models, Classification Models, Analysis







Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.