
Detecting Neglected Conditions in Software through Data Mining

A.V.K. Shanthi, Dr.G. Mohan Kumar


Recent studies have shown that neglected conditions are a common class of software defect and can have dramatic consequences for the operation of software. Neglected conditions can compromise the accuracy of a software product or even cause widespread disruption to its quality. Several solutions have been proposed. However, these solutions share a common limitation: they are based on rules that must be known beforehand, and violations of these rules are deemed to be neglected conditions. Because policies typically differ from one software product to another, these approaches are limited in the scope of mistakes they can detect. In this paper, we address the problem of detecting neglected conditions in software using data mining techniques. We apply association rule mining to the program files of a software product to discover local, product-specific policies. Deviations from these local policies are potential neglected conditions. In our evaluation, we focused on three aspects of the program files: syntax, interfaces, and function calls. Advanced data mining techniques are used to discover implicit conditional rules in a code base and to discover rule violations that indicate neglected conditions. More interestingly, our empirical study indicates that the approach can also discover ordering rules, which constrain the order of function calls, and the corresponding rule violations.
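
The general idea of mining local rules and flagging deviations can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the algorithm's details, so the sketch assumes each function in a code base has already been reduced to a set of items (function calls and guarding conditions), and it mines only pairwise rules. The names `mine_rules`, `find_violations`, and `code_base`, the item labels, and the thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=3, min_confidence=0.75):
    """Mine pairwise association rules lhs -> rhs.

    transactions maps a function name to the set of items found in it.
    Support is the number of functions containing both items;
    confidence is support divided by the number containing lhs.
    """
    item_count = Counter()
    pair_count = Counter()
    for items in transactions.values():
        item_count.update(items)
        pair_count.update(combinations(sorted(items), 2))

    rules = []
    for (a, b), both in pair_count.items():
        if both < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, confidence))
    return rules


def find_violations(transactions, rules):
    """Report functions that contain a rule's lhs but lack its rhs."""
    return sorted(
        (name, lhs, rhs)
        for name, items in transactions.items()
        for lhs, rhs, _ in rules
        if lhs in items and rhs not in items
    )


# Hypothetical code base: four functions check the allocated pointer, one does not.
code_base = {
    "load_a": {"call:malloc", "check:ptr!=NULL", "call:free"},
    "load_b": {"call:malloc", "check:ptr!=NULL", "call:free"},
    "load_c": {"call:malloc", "check:ptr!=NULL", "call:free"},
    "load_d": {"call:malloc", "check:ptr!=NULL", "call:free"},
    "load_e": {"call:malloc", "call:free"},
}

rules = mine_rules(code_base)
violations = find_violations(code_base, rules)
for name, lhs, rhs in violations:
    print(f"{name}: has {lhs} but is missing {rhs}")
```

In this toy data, the rule "a function that calls `malloc` also checks the pointer" is learned from the code base itself rather than supplied in advance, and `load_e` is reported as a potential neglected condition. Set-based items carry no ordering information, so detecting ordering rules would need a sequence-aware representation.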


Neglected Conditions; Data Mining; Software Defects







Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.