Arama Sonuçları

Listeleniyor 1 - 1 / 1
  • Yayın
    Software defect prediction using Bayesian networks and kernel methods
    (Işık Üniversitesi, 2012-07-01) Okutan, Ahmet; Yıldız, Olcay Taner; Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Doktora Programı
    There are lots of different software metrics discovered and used for defect prediction in the literature. Instead of dealing with so many metrics, it would be practical and easy if we could determine the set of metrics that are most important and focus on them more to predict defectiveness. We use Bayesian modelling to determine the influential relationships among software metrics and defect proneness. In addition to the metrics used in Promise data repository, We define two more metrics, i.e. NOD for the number of developers and LOCQ for the source code quality. We wxtract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modeling, We learn both the marginal defect proneness probability of the whole software system and the set of most effective metrics. Our experiments on nine open source Promise data repository data sets show that respense for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most efective metrics whereas coupling between objets (CBO), weighted method per class (WMC), and lack of cohesion of methods (LCOM) are less efective metris on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are unstustworthy. On tthe other hand, based on the experiments on Poi, Tomcat, and Xalan data sets, We observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness.However, futher investigation involving a greater number of projects, is need to confirm our findings. Furthermore, we propose a novel technique for defect prediction that uses plagiarism detection tools. Although the defect prediction problem haz been researched for a long time, the results achieved are not so bright. We use kernel programming to model the relationship between source code similarity and defectiveness. Each value in the kernel matrix shows how much parallelism exit between the corresponding files ib the kernel matrix shows how much parallelism exist between the corresponding files in the software system chosen. Our experiments on 10 real world datasets indicate that support vector machines (SVM) with a precalculated kernel matrix performs better than the SVM with the usual linear and RBF kernels and generates comparable results with the famous defect prediction methods like linear logistic regression and J48 in terms of the area under the curve (AUC).Furthermore, we observed that when the amount of similarity among the files of a software system is high, then the AUC found by the SVM with precomputed kernel can be used to predict the number of defects in the files or classes of a software system, because we observe a relationship between source code similarity and the number of defects. Based on the results of our analysis, the developers can focus on more defective modules rather than on less or non defective ones during testing activities. The experiments on 10 Promise datasets indicate that while predicting the number of defects, SVM with a precomputed kernel performs as good as the SVM with the usual linear and RBF kernels, in terms of the root mean square error (RMSE). The method proposed is also comparable with other regression methods like linear regression and IBK. The results of these experiments suggest that source code similarity is a good means of predicting both defectiveness and the number of defects in software modules.