Search Results
Showing 1 - 6 of 6
Publication: Parallel proposition bank construction for Turkish (Işık Üniversitesi, 2019-04-02). Ak, Koray; Yıldız, Olcay Taner. Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Doktora Programı.

PropBank is a bank of propositions: a hand-annotated corpus of predicate-argument information and semantic roles. It aims to provide an extensive dataset for enhancing NLP applications such as information retrieval, machine translation, information extraction, and question answering by adding a semantic information layer to the syntactic annotation. This added semantic layer enables refinements to syntactic parsers, which increases efficiency and improves application performance. The aim of this thesis is to construct a proposition bank for the Turkish language. Only preliminary studies had been carried out on a Turkish PropBank, so this study is one of the pioneering efforts for the language. In this study, a hand-annotated Turkish PropBank is constructed from the translation of the parallel English PropBank corpus; other PropBank studies for Turkish are examined and compared with the constructed proposition bank; and automatic PropBank construction for Turkish from both parallel sentence trees and phrase sentences is analyzed, with automatic proposition banks generated for Turkish.

Publication: Facial expression recognition based on facial anatomy (Işık Üniversitesi, 2013-06-06). Benli, Kristin Surpuhi; Eskil, Mustafa Taner. Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Doktora Programı.

In this thesis we propose to determine the underlying muscle forces that compose a facial expression under the constraints of facial anatomy. Muscular activities are novel features that are highly representative of facial expressions. We model the human face with a generic 3D wireframe model that embeds all major muscles. The input to our expression recognition system is a video with a marked set of landmark points on the first frame.
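The core computation of this thesis, recovering muscle activation levels from observed vertex motion as a constrained least-squares problem (described in full later in the abstract), can be sketched as follows. The matrix shapes and data are invented for illustration, and SciPy's nonnegative least squares stands in for whatever constrained solver the thesis used:

```python
import numpy as np
from scipy.optimize import nnls

# Rows: observed vertex-displacement equations; columns: per-muscle
# displacement basis (how one unit of each muscle's activation moves
# the vertices). Shapes are illustrative: 60 equations, 4 muscles.
rng = np.random.default_rng(0)
basis = rng.normal(size=(60, 4))                    # over-determined: 60 > 4
true_activation = np.array([0.8, 0.0, 0.3, 0.5])    # nonnegative by anatomy
displacement = basis @ true_activation

# Constrained least squares: minimize ||basis @ a - displacement||
# subject to a >= 0 (muscles can only contract, not push).
activation, residual = nnls(basis, displacement)
print(np.round(activation, 3))   # recovers the nonnegative activation levels
```

With noise-free synthetic data the solver recovers the activations exactly; with real tracked vertices the residual measures how well the anatomical muscle model explains the observed deformation.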
We use these points and a semi-automatic fitting algorithm to register the 3D face model to the subject's face. The influence regions of the facial muscles are estimated and projected onto the image plane to determine feature points. These points are tracked on the image plane using an optical flow algorithm. We estimate the rigid body transformation of the head through a greedy search algorithm. This stage enables us to align the 3D face model with the subject's head in consecutive frames of the video. We use ray tracing from the perspective reference point through the image plane to estimate the new coordinates of the model vertices. The estimated vertex coordinates indicate how the subject's face deforms in the progression of an expression. The relative motion of the model vertices gives us an over-determined linear system of equations whose unknown parameters are the muscle activation levels. This system of equations is solved using constrained least squares optimization. The muscle-activity-based features are evaluated on a classification problem of seven basic facial expressions. We demonstrate the representative power of muscle force based features on four classifiers: Linear Discriminant Analysis, Naive Bayes, k-Nearest Neighbor and Support Vector Machine. The best performance on the seven-expression classification problem, including the neutral expression, was 87.1%, obtained with the Support Vector Machine. The results we attained in this study are close to the human recognition ceiling of 87-91.7% and comparable with state-of-the-art algorithms in the literature.

Publication: Software defect prediction using Bayesian networks and kernel methods (Işık Üniversitesi, 2012-07-01). Okutan, Ahmet; Yıldız, Olcay Taner. Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Doktora Programı.

Many different software metrics have been proposed and used for defect prediction in the literature.
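This thesis's first contribution, a Bayesian network over such metrics from which a marginal defect proneness probability can be read off, can be illustrated with a tiny hand-specified network. The structure, metric choices, and probability tables below are invented for illustration and are not the thesis's learned model:

```python
# Hypothetical two-metric Bayesian network: Defect depends on RFC and LOC.
# All probabilities are made up for illustration.
p_rfc_high = 0.4                      # P(RFC = high)
p_loc_high = 0.3                      # P(LOC = high)
# Conditional probability table: P(defect | rfc_high, loc_high)
cpt = {
    (True,  True):  0.80,
    (True,  False): 0.55,
    (False, True):  0.40,
    (False, False): 0.10,
}

# Marginal defect proneness: sum out all metric configurations.
p_defect = sum(
    cpt[(rfc, loc)]
    * (p_rfc_high if rfc else 1 - p_rfc_high)
    * (p_loc_high if loc else 1 - p_loc_high)
    for rfc in (True, False)
    for loc in (True, False)
)
print(round(p_defect, 4))   # → 0.364
```

In the thesis both the network structure and the tables are learned from Promise data; here exact enumeration suffices because the toy network has only two parents.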
Instead of dealing with so many metrics, it would be practical and easy if we could determine the set of metrics that are most important and focus on them to predict defectiveness. We use Bayesian modelling to determine the influential relationships among software metrics and defect proneness. In addition to the metrics used in the Promise data repository, we define two more metrics: NOD for the number of developers and LOCQ for source code quality. We extract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modelling, we learn both the marginal defect proneness probability of the whole software system and the set of most effective metrics. Our experiments on nine open source Promise data repository data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics, whereas coupling between objects (CBO), weighted methods per class (WMC), and lack of cohesion of methods (LCOM) are less effective metrics on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. On the other hand, based on the experiments on the Poi, Tomcat, and Xalan data sets, we observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings. Furthermore, we propose a novel technique for defect prediction that uses plagiarism detection tools. Although the defect prediction problem has been researched for a long time, the results achieved so far are not impressive. We use kernel programming to model the relationship between source code similarity and defectiveness.
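Classifying directly from a precomputed similarity matrix, as the thesis does with an SVM over source-code similarity, can be sketched with a dependency-free kernel perceptron. This is a simpler stand-in for the SVM with a precalculated kernel, and the toy points and linear kernel below merely imitate pairwise file-similarity scores:

```python
import numpy as np

def train_kernel_perceptron(K, y, epochs=20):
    """Learn dual coefficients from a precomputed kernel matrix K (n x n)
    and labels y in {-1, +1}; a stand-in for an SVM with a precomputed kernel."""
    alpha = np.zeros(len(y))
    for _ in range(epochs):
        for i in range(len(y)):
            if np.sign(alpha @ K[:, i]) != y[i]:   # misclassified -> update
                alpha[i] += y[i]
    return alpha

# Toy "similarity" kernel: a linear kernel over 2-D points standing in
# for pairwise source-file similarity scores.
X = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])           # defective vs. non-defective
K = X @ X.T                            # precomputed kernel matrix

alpha = train_kernel_perceptron(K, y)
pred = np.sign(alpha @ K)              # predict from kernel rows alone
print(pred.astype(int).tolist())       # → [1, 1, -1, -1]
```

The key property shared with the thesis's setup is that the learner never sees raw features, only the kernel matrix, so any similarity measure (such as a plagiarism-detection score) can be plugged in.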
Each value in the kernel matrix shows how much similarity exists between the corresponding files in the chosen software system. Our experiments on 10 real world datasets indicate that a support vector machine (SVM) with a precalculated kernel matrix performs better than the SVM with the usual linear and RBF kernels, and generates results comparable with well-known defect prediction methods such as linear logistic regression and J48 in terms of the area under the curve (AUC). Furthermore, we observed that when the amount of similarity among the files of a software system is high, the AUC found by the SVM with the precomputed kernel can be used to predict the number of defects in the files or classes of the software system, because we observe a relationship between source code similarity and the number of defects. Based on the results of our analysis, developers can focus on the more defective modules rather than on less defective or non-defective ones during testing activities. The experiments on 10 Promise datasets indicate that, while predicting the number of defects, the SVM with a precomputed kernel performs as well as the SVM with the usual linear and RBF kernels in terms of root mean square error (RMSE). The proposed method is also comparable with other regression methods such as linear regression and IBk. The results of these experiments suggest that source code similarity is a good means of predicting both defectiveness and the number of defects in software modules.

Publication: KeNet: a comprehensive Turkish WordNet and its applications in text clustering (Işık Üniversitesi, 2018-06-07). Ehsani, Razieh; Yıldız, Olcay Taner. Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Doktora Programı.

In this thesis, we summarize the methodology and the results of our efforts to construct a comprehensive WordNet for Turkish. Most languages have access to comprehensive language resources.
Traditional resources like bilingual dictionaries, monolingual dictionaries, thesauri and lexicons are developed by lexicographers. As computer processing of languages gains popularity, a new set of resources becomes necessary. One such resource is WordNet, which was initially constructed for the English language at Princeton University. A WordNet contains much of the information found in a classic dictionary, but it also contains additional relationship information. These relations go beyond synonymy and capture, for example, that one word "is-a" or "is-a-part-of" another. These semantic relations are used in many text analysis tasks. A WordNet also categorizes words under common concepts, called synsets. As a result, a WordNet is a comprehensive machine-readable dictionary and a useful language resource for text analysis and other research based on human language. Ours is not the first WordNet for Turkish. The previous WordNet is part of the BalkaNet project, a multilingual WordNet covering Turkish and the Balkan languages. BalkaNet contains only the words common to these languages; as such, it does not contain all Turkish words and suffers from the disadvantages of the top-down construction method. The BalkaNet project has not been updated or expanded in recent years. In this work we construct a Turkish WordNet from scratch using a bottom-up method. In general there are two methods for constructing WordNets: the bottom-up method creates the WordNet from scratch, while the top-down approach translates existing WordNets. We use the Turkish Contemporary Dictionary (CDT), an online Turkish dictionary provided by the Turkish Language Institute. The bottom-up approach has its own difficulties, since constructing a WordNet from scratch requires more resources and a great deal of effort. In this work, we extract synonyms from CDT and ask experts to match common meanings for pairs of synonyms.
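Turning expert-matched synonym pairs into synsets amounts to grouping words into connected components of the synonymy graph. A minimal sketch with invented Turkish word pairs (not CDT data), using union-find:

```python
# Group synonym pairs into synsets (connected components of the
# synonymy graph). The word pairs are invented examples.
pairs = [("araba", "otomobil"), ("otomobil", "taşıt"),
         ("ev", "konut"), ("hızlı", "süratli")]

parent = {}

def find(w):
    # Union-find with path compression.
    parent.setdefault(w, w)
    while parent[w] != w:
        parent[w] = parent[parent[w]]
        w = parent[w]
    return w

for a, b in pairs:
    parent[find(a)] = find(b)   # union the two words' components

synsets = {}
for w in list(parent):
    synsets.setdefault(find(w), set()).add(w)

for synset in synsets.values():
    print(sorted(synset))
```

Real synset construction also has to keep word senses apart (a polysemous word belongs to several synsets), which is exactly why the thesis relies on expert annotators to match meanings rather than raw transitive closure.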
We developed an application which makes the annotation step easier and more accurate. We also use two groups of annotators to measure inter-annotator agreement. We used automatic approaches to extract semantic relations from the Turkish Wikipedia (Vikipedi) and Vikisözlük. We processed CDT to extract candidate synonyms and used rule-based approaches to find synonym sets. There is no thesaurus for Turkish, so as an application we constructed a thesaurus automatically and measured its accuracy against our manually constructed synsets. We named our WordNet "KeNet". Finally, in this thesis we developed a novel approach to represent a text document in a vector space using WordNet semantic relations. This part of the thesis is an application of KeNet. We used this representation for text documents and implemented two different clustering algorithms over the resulting vectors. We tested our method on Turkish Wikipedia articles whose domains are labeled by Wikipedia.

Publication: Morpholex Turkish: a morphological Lexicon for Turkish (European Language Resources Association (ELRA), 2022-06-25). Arıcan, Bilge Nas; Kuzgun, Aslı; Marşan, Büşra; Aslan, Deniz Baran; Sanıyar, Ezgi; Cesur, Neslihan; Kara, Neslihan; Kuyrukçu, Oğuzhan; Özçelik, Merve; Yenice, Arife Betül; Doğan, Merve; Oksal, Ceren; Ercan, Gökhan; Yıldız, Olcay Taner.

MorphoLex is a study in which the roots, prefixes and suffixes of words are analyzed. With MorphoLex, many words can be analyzed according to certain rules and a useful database can be created. Because Turkish is an agglutinative language with a rich morphological structure, it yields analyses and results that differ from previous MorphoLex studies.
In this study, we describe the process of creating a database of 48,472 words and report the results arising from these differences in language structure.

Publication: Retinal disease diagnosis in OCT scans using a foundational model (Springer Science and Business Media Deutschland GmbH, 2025). Nazlı, Muhammet Serdar; Turkan, Yasemin; Tek, Faik Boray; Toslak, Devrim; Bulut, Mehmet; Arpacı, Fatih; Öcal, Mevlüt Celal.

This study examines the feasibility and performance of using single OCT slices from the OCTA-500 dataset to classify DR (diabetic retinopathy) and AMD (age-related macular degeneration) with a pre-trained transformer-based model (RETFound). The experiments revealed the effective adaptation capability of the pre-trained model to the retinal disease classification problem. We further explored the impact of using different slices from the OCT volume, assessing the sensitivity of the results to the choice of a single slice (e.g., the "middle slice") and whether analyzing both horizontal and vertical cross-sectional slices could improve outcomes. However, deep neural networks are complex systems that do not directly indicate whether they have learned and generalized the disease appearance as human experts do. The original dataset lacked disease localization annotations, so we collected new disease classification and localization annotations from independent experts for a subset of OCTA-500 images. We compared RETFound's explainability-based localization outputs with these newly collected annotations and found that the region attributions aligned well with the expert annotations. Additionally, we assessed the agreement and variability between the experts and RETFound in classifying disease conditions. The Kappa values, ranging from 0.35 to 0.69, indicated moderate agreement among experts and between the experts and the model. The transformer-based RETFound model, using single or multiple OCT slices, is an efficient approach to diagnosing AMD and DR.
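The agreement statistic reported in this abstract (Kappa values of 0.35 to 0.69) is conventionally Cohen's kappa, which corrects observed agreement for the agreement expected by chance. A small sketch with invented annotations (not the study's data):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same class independently.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    return (observed - expected) / (1 - expected)

# Invented annotations over three classes: DR / AMD / normal.
expert = ["DR", "DR", "AMD", "normal", "AMD", "DR", "normal", "AMD"]
model  = ["DR", "AMD", "AMD", "normal", "AMD", "DR", "DR", "AMD"]
print(round(cohens_kappa(expert, model), 3))   # → 0.61
```

Values around 0.4 to 0.6 are usually read as moderate agreement, which matches the range the study reports between experts and between experts and RETFound.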