
A Model for Early Prediction of Faults in Software Systems

Raman Goyal, Parvinder S. Sandhu, Amanpreet S. Brar

Abstract


The quality of a software component can be measured in terms of fault-proneness data. Quality estimates are made using fault-proneness data available from previously developed projects of a similar type, together with training data consisting of software measurements. Different techniques have been proposed to predict faulty modules in software, including statistical methods, machine learning methods, neural network techniques, and clustering techniques. Predicting faults early in the software life cycle can be used to improve software process control and achieve high software reliability. The aim of the proposed approach is to investigate whether metrics available in the early life cycle (i.e., requirement metrics), metrics available in the late life cycle (i.e., code metrics), and the combination of the two can be used to identify fault-prone modules using a decision-tree-based model with K-means clustering as a preprocessing technique. This approach has been tested with the CM1 real-time defect datasets of NASA software projects. The high accuracy of the testing results shows that the proposed model can be used to predict the fault proneness of software modules early in the software life cycle.
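As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch runs K-means as a preprocessing step and then fits a tree-based classifier. Everything in it is an assumption made for illustration: the abstract does not say how the clustering output feeds the decision tree, so the sketch simply appends each module's cluster index as an extra feature; the tree is reduced to a single split (a decision stump); and the eight modules with their two metrics (lines of code, cyclomatic complexity) are invented toy values, not CM1 data.

```python
import random

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centers[c])))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the final cluster centers."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers

def fit_stump(rows, targets):
    """One-level decision tree: the (feature, threshold, label) split
    with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for above in (0, 1):  # label predicted when the value exceeds t
                errors = sum((above if r[f] > t else 1 - above) != y
                             for r, y in zip(rows, targets))
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    return best[1:]

def predict(stump, row):
    """Apply the single learned split to one feature row."""
    f, t, above = stump
    return above if row[f] > t else 1 - above

# Hypothetical toy data: (lines_of_code, cyclomatic_complexity) per module,
# with 1 marking a fault-prone module. Not taken from the CM1 dataset.
modules = [(10, 1), (12, 2), (15, 2), (11, 1),
           (90, 14), (95, 15), (100, 17), (88, 13)]
faulty = [0, 0, 0, 0, 1, 1, 1, 1]

# Preprocessing (assumed): cluster the modules, then append the cluster index.
centers = kmeans(modules, k=2)
augmented = [m + (nearest(m, centers),) for m in modules]

# Classification: fit the stump on the augmented features.
stump = fit_stump(augmented, faulty)
predictions = [predict(stump, row) for row in augmented]
print(predictions)
```

The sketch evaluates on its own training data only to show the data flow; a real study would hold out test modules and would likely use a full decision tree learner rather than a single split.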


Keywords


Clustering, Decision Tree, K-means, Software Quality.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.