Search Results

Listing 1 - 6 of 6
  • Publication
    Adaptive locally connected recurrent unit (ALCRU)
    (Springer Science and Business Media Deutschland GmbH, 2025-07-03) Özçelik, Şuayb Talha; Tek, Faik Boray
    Research has shown that adaptive locally connected neurons outperform their fully connected (dense) counterparts, motivating this study on the development of the Adaptive Locally Connected Recurrent Unit (ALCRU). ALCRU modifies the Simple Recurrent Neuron Model (SimpleRNN) by incorporating spatial coordinate spaces for input and hidden state vectors, facilitating the learning of parametric local receptive fields. These modifications add four trainable parameters per neuron, resulting in a minor increase in computational complexity. ALCRU is implemented using standard frameworks and trained with back-propagation-based optimizers. We evaluate the performance of ALCRU using diverse benchmark datasets, including IMDb for sentiment analysis, AdditionRNN for sequence modelling, and the Weather dataset for time-series forecasting. Results show that ALCRU achieves accuracy and loss metrics comparable to GRU and LSTM while consistently outperforming SimpleRNN. In particular, experiments with longer sequence lengths on AdditionRNN and increased input dimensions on IMDb highlight ALCRU’s superior scalability and efficiency in processing complex data sequences. In terms of computational efficiency, ALCRU demonstrates a considerable speed advantage over gated models like LSTM and GRU, though it is slower than SimpleRNN. These findings suggest that adaptive local connectivity enhances both the accuracy and efficiency of recurrent neural networks, offering a promising alternative to standard architectures.
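The receptive-field idea described above can be sketched as a single recurrent step: each hidden unit masks the input with a Gaussian window whose centre and width are trainable per-neuron parameters. This is an illustrative sketch only; the names (`centers`, `widths`, `coords`) are assumptions, not ALCRU's published formulation.

```python
import numpy as np

def alcru_like_step(x, h_prev, W_in, W_rec, centers, widths, coords):
    """One step of a hypothetical locally connected recurrent cell.

    Each hidden unit j attenuates the input with a Gaussian receptive
    field centred at centers[j] with scale widths[j] over 1-D input
    coordinates `coords` (illustrative stand-ins for the per-neuron
    adaptive parameters described in the abstract).
    """
    # mask[j, i]: how strongly unit j attends to input position i
    mask = np.exp(-((coords[None, :] - centers[:, None]) ** 2)
                  / (2.0 * widths[:, None] ** 2))
    local_in = (W_in * mask) @ x          # locally weighted input drive
    return np.tanh(local_in + W_rec @ h_prev)
```

In a full model the centres and widths would be updated by the same backpropagation-based optimizer as the weights, which is where the four extra trainable parameters per neuron mentioned in the abstract would enter.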
  • Publication
    An approach to analyse Turkish syntax at morphosyntactic level
    (Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, 2025-01-20) Özenç, Berke; Solak, Ercan; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı; Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
    Syntactic analysis allows us to examine sentence structure in various ways. Constituency parsing is one such method of syntactic analysis. It defines sentence structure as hierarchical relationships between words or phrases and represents them in tree form. Constituency parsing employs a constituency grammar, which defines how constituents combine to form other constituents. In this grammar, every syntactic structure, from the sentence down to the words, is represented by constituents. Although this approach is designed to focus on universal aspects of languages, English has always been its primary focus. Consequently, the constituency approach misses the details that morphology contributes to the syntax of morphologically rich languages. In this study, we implement an extension to constituency parsing that overcomes the challenges of parsing a morphologically rich language (MRL). We propose ideas tailored to Turkish, yet they can be applied to other languages with similar morphology. Our extension enables constituency parsing to start at the morpheme level. Thus, we involve morphemic structures in the parsing process and express their syntactic effects on the overall structure. We implement our approach by extending the CYK (Cocke-Younger-Kasami) algorithm. During parsing, we utilize extra rules to transfer the ambiguity in the morphology into the parse. In addition, we designed a morpheme-focused constituent set for Turkish. This set involves affixes, stems, and phrases headed by a stem. We demonstrate our work with a mini treebank and the grammar generated from it.
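Since the extension builds on CYK, a minimal CYK recognizer for a grammar in Chomsky normal form may help fix ideas; the morpheme-level rules described in the abstract would be layered on top of a table-filling loop like this one.

```python
from itertools import product

def cyk_recognize(words, lexical, binary, start="S"):
    """CYK recognizer for a grammar in Chomsky normal form.

    lexical: dict word -> set of nonterminals (A -> word rules)
    binary:  dict (B, C) -> set of nonterminals (A -> B C rules)
    """
    n = len(words)
    # table[i][j] holds the nonterminals that derive words[i:j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = set(lexical.get(w, set()))
    for span in range(2, n + 1):            # span length
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):           # split point
                for B, C in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary.get((B, C), set())
    return start in table[0][n - 1]
```

For morpheme-level parsing, the input tokens would be morphemes rather than words, with extra unary and binary rules expressing the morphemic ambiguity the abstract mentions.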
  • Publication
    TURSpider: a Turkish Text-to-SQL dataset and LLM-based study
    (Institute of Electrical and Electronics Engineers Inc., 2024-11-25) Kanburoğlu, Ali Buğra; Tek, Faik Boray
    This paper introduces TURSpider, a novel Turkish Text-to-SQL dataset developed through human translation of the widely used Spider dataset, aimed at addressing the current lack of complex, cross-domain SQL datasets for the Turkish language. TURSpider incorporates a wide range of query difficulties, including nested queries, to create a comprehensive benchmark for Turkish Text-to-SQL tasks. The dataset enables cross-language comparison and significantly enhances the training and evaluation of large language models (LLMs) in generating SQL queries from Turkish natural language inputs. We fine-tuned several Turkish-supported LLMs on TURSpider and evaluated their performance in comparison to state-of-the-art models like GPT-3.5 Turbo and GPT-4. Our results show that fine-tuned Turkish LLMs demonstrate competitive performance, with one model even surpassing GPT-based models on execution accuracy. We also apply the Chain-of-Feedback (CoF) methodology to further improve model performance, demonstrating its effectiveness across multiple LLMs. This work provides a valuable resource for Turkish NLP and addresses specific challenges in developing accurate Text-to-SQL models for low-resource languages.
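Execution accuracy, the metric on which a fine-tuned model surpassed the GPT-based models, compares the result sets of the predicted and gold queries. A simplified, order-insensitive sketch (not the official Spider evaluation script) might look like:

```python
import sqlite3

def execution_match(db_conn, gold_sql, pred_sql):
    """Order-insensitive execution accuracy check.

    Two queries match if they return the same multiset of rows when run
    against the same database; a prediction that fails to execute counts
    as a miss. This is a simplified sketch of the metric, not the
    official Spider evaluator.
    """
    try:
        gold = db_conn.execute(gold_sql).fetchall()
        pred = db_conn.execute(pred_sql).fetchall()
    except sqlite3.Error:
        return False
    return sorted(map(repr, gold)) == sorted(map(repr, pred))
```

Sorting row representations makes the comparison robust to differing `ORDER BY` clauses, which the stricter exact-match metric would penalize.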
  • Publication
    Large language model based automated translation of natural language to SQL
    (Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, 2025-01-22) Kanburoğlu, Ali Buğra; Tek, Faik Boray; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı; Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
    The field of Text-to-SQL, which involves converting natural language into SQL queries, has seen significant advancements, but challenges remain, particularly for low-resource languages like Turkish. This thesis introduces three key contributions to address these challenges. Our first contribution is the development and open-access release of TUR2SQL, the first cross-domain Turkish Text-to-SQL dataset, which consists of 10,809 natural language sentences paired with their corresponding SQL queries. We evaluate the performance of SQLNet, a deep learning model specifically designed for this task, and one of the most successful Large Language Models (LLMs), ChatGPT, on this dataset. The results demonstrate the superior performance of ChatGPT. The second major contribution is the construction and publicly available release of TURSpider, the most extensive Turkish Text-to-SQL dataset. TURSpider is built by translating the widely used cross-domain Spider dataset from English to Turkish. This dataset includes complex queries with varying difficulty levels, facilitating the training and comparison of large language models for Turkish Text-to-SQL tasks. Our comparative analysis shows that fine-tuned Turkish LLMs achieve competitive performance, with some models surpassing OpenAI models in query accuracy. To further enhance performance, we apply the Chain-of-Feedback (CoF) methodology, demonstrating its effectiveness across multiple models. Finally, we explore the Mixture-of-Agents (MoA) framework, which combines outputs from multiple models to improve the performance of open-source LLMs for Text-to-SQL tasks. By integrating MoA with the CoF technique, we propose MoAF-SQL, an approach that significantly improves performance, particularly on complex queries. Our experiments show that MoAF-SQL achieves competitive results, highlighting its potential to enhance the Text-to-SQL capabilities of open-source LLMs.
  • Publication
    Object recognition with competitive convolutional neural networks
    (Işık Üniversitesi, 2023-06-12) Erkoç, Tuğba; Eskil, M. Taner; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı; Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
    In recent years, Artificial Intelligence (AI) has achieved impressive results, often surpassing human capabilities in tasks involving language comprehension and visual recognition. Among these, computer vision has experienced remarkable progress, largely due to the introduction of Convolutional Neural Networks (CNNs). CNNs are inspired by the hierarchical structure of the visual cortex and are designed to detect patterns, objects, and complex relationships within visual data. One key advantage is their ability to learn directly from pixel values without the need for domain expertise, which has contributed to their popularity. These networks are trained using supervised backpropagation, a process that calculates gradients of the network’s parameters (weights and biases) with respect to the loss function. While backpropagation enables impressive performance with CNNs, it also presents certain drawbacks. One such drawback is the requirement for large amounts of labeled data. When the available data samples are limited, the gradients estimated from this limited information may not accurately capture the overall data behavior, leading to suboptimal parameter updates. However, obtaining a sufficient quantity of labeled data poses a challenge. Another drawback is the requirement of careful configuration of hyperparameters, including the number of neurons, learning rate, and network architecture. Finding optimal values for these hyperparameters can be a time-consuming process. Furthermore, as the complexity of the task increases, the network architecture becomes deeper and more complex. To effectively train the shallow layers of the network, one must increase the number of epochs and experiment with solutions to prevent vanishing gradients. Complex problems often require a greater number of epochs to learn the intricate patterns and features present in the data. 
It’s important to note that while CNNs aim to mimic the structure of the visual cortex, the brain’s learning mechanism does not necessarily involve back-propagation. Although CNNs incorporate the layered architecture of the visual cortex, the reliance on backpropagation introduces an artificial learning procedure that may not align with the brain’s actual learning process. Therefore, it is crucial to explore alternative learning paradigms that do not rely on backpropagation. In this dissertation study, a unique approach to unsupervised training for CNNs is explored, setting it apart from previous research. Unlike other unsupervised methods, the proposed approach eliminates the reliance on backpropagation for training the filters. Instead, we introduce a filter extraction algorithm capable of extracting dataset features by processing images only once, without requiring data labels or backward error updates. This approach operates on individual convolutional layers, gradually constructing them by discovering filters. To evaluate the effectiveness of this backpropagation-free algorithm, we design four distinct CNN architectures and conduct experiments. The results demonstrate the promising performance of training without backpropagation, achieving impressive classification accuracies on different datasets. Notably, these outcomes are attained using a single network setup without any data augmentation. Additionally, our study reveals that the proposed algorithm eliminates the need to predefine the number of filters per convolutional layer, as the algorithm automatically determines this value. Furthermore, we demonstrate that filter initialization from a random distribution is unnecessary when backpropagation is not employed during training.
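The single-pass, label-free filter discovery described above could be approximated as follows. This is a hypothetical sketch of the general idea only (a patch becomes a new filter when it is sufficiently dissimilar to every filter found so far, so the filter count emerges from the data), not the dissertation's actual algorithm; the threshold and stride are illustrative assumptions.

```python
import numpy as np

def extract_filters(images, k=5, sim_threshold=0.9):
    """Single-pass, backpropagation-free filter discovery (sketch).

    Scans each grayscale image once, collecting k x k patches as filters.
    A patch is kept only if its cosine similarity to every existing
    filter stays below sim_threshold, so the number of filters is
    determined automatically rather than predefined.
    """
    filters = []
    for img in images:
        H, W = img.shape
        for r in range(0, H - k + 1, k):
            for c in range(0, W - k + 1, k):
                patch = img[r:r + k, c:c + k].astype(float).ravel()
                norm = np.linalg.norm(patch)
                if norm < 1e-8:          # skip flat/empty patches
                    continue
                patch /= norm
                if all(abs(patch @ f) < sim_threshold for f in filters):
                    filters.append(patch)
    if not filters:
        return np.empty((0, k, k))
    return np.array(filters).reshape(-1, k, k)
```

The key property mirrored here is that the data are seen only once, no labels or backward error updates are involved, and the layer width falls out of the similarity test rather than a hyperparameter search.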
  • Publication
    Deep learning-based analysis of retinal OCT scans for detection of Alzheimer’s disease
    (Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, 2026-01-23) Turkan, Yasemin; Tek, Faik Boray; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı; Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
    Alterations in retinal layer thickness have been associated with neurodegenerative diseases such as Alzheimer’s disease (AD). These structural changes can be measured using a noninvasive imaging technology called Optical Coherence Tomography (OCT). Previous research has mostly focused on the statistical associations between segmented retinal layer thickness and AD derived from OCT or OCTA devices. Unlike conventional medical image classification tasks, early detection is more challenging than diagnosis because the imaging precedes the clinical diagnosis by several years. Deep learning (DL), particularly through convolutional neural networks (CNNs) and transfer learning, has demonstrated strong performance in image-based disease detection tasks. However, the application of DL directly to unsegmented raw OCT B-scan images for early AD detection remains underexplored. Therefore, in this thesis, we address this research gap by proposing a deep learning-based approach that uses raw OCT images for early Alzheimer’s disease detection. Related studies in the literature have relied heavily on private, in-situ cohorts that lack interoperability. In contrast, the UK Biobank (2022) offers a unique resource for investigating the associations between retinal structure and systemic health, comprising over 85,000 OCT scans linked to cognitive and health-related data. Between the initial scan period (2010–2015) and July 2023, 539 participants in the dataset were diagnosed with AD. Although the UK Biobank is somewhat limited by the absence of OCTA scans, we utilized this dataset to detect early AD using OCT scans. After a rigorous data-exclusion process, this study used a targeted 4-year window, selecting participants diagnosed with AD within 4 years of their baseline assessments. The AD group was matched by age, sex, eye, and instance with a randomly selected balanced Healthy Control group (N = 30).
We first evaluated the predictive value of isolated 2D B-scans using pretrained deep learning architectures. In these tests, the ResNet-34 model achieved a mean AUC of 0.624 ± 0.060. Saliency map analysis of these B-scans highlighted the critical importance of the central macular region, whereas peripheral areas showed negligible contribution to the model’s decision. To overcome the limitations of isolated B-scans and leverage 3D information, we generated a 3D-informed en-face thickness projection map from the OCT B-scans. This pipeline was optimized to focus on the diagnostically relevant 3 mm inner macular region, effectively filtering out peripheral noise. Our study of thickness maps identified the Ganglion Cell Layer (GCL) as the most significant indicator of preclinical AD. The VGG-19 model, trained on GCL thickness maps with a year-weighted loss function, achieved a peak mean AUC of 0.750 ± 0.037. Notably, the traditional clinical benchmark, the Retinal Nerve Fiber Layer (RNFL), exhibited negligible predictive value in this pre-symptomatic cohort. We also developed a Multi-Modal Soft-Voting Ensemble model to further increase predictive accuracy and emulate clinical decision-making. This model integrates structural insights from B-scans and GCIPL thickness maps with clinical and demographic data. The ensemble approach achieved the highest mean AUC of 0.85 and significantly outperformed individual modalities. Furthermore, an ablation study using only image modalities (B-scans and thickness maps) yielded an AUC of 0.84. This result highlights the strong complementary value of combined structural data. Longitudinal sensitivity analysis also established a “diagnostic horizon” for retinal biomarkers. We observed that predictive accuracy is highest between 4 and 8 years prior to clinical diagnosis. However, these signals progressively converge toward baseline by the 12-year mark. 
When benchmarked against the current literature, our framework outperformed existing baselines for the diagnosis of symptomatic Mild Cognitive Impairment (MCI). This demonstrates its robustness in the more challenging task of preclinical prediction. Consequently, it establishes a viable pathway for integrating retinal imaging into the early diagnostic pipeline for Alzheimer’s disease.
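The soft-voting ensemble mentioned in the abstract is, in its generic form, a weighted average of each model's class-probability outputs followed by an argmax. The sketch below shows that generic mechanism, not the thesis's exact multi-modal configuration; the weights are illustrative.

```python
import numpy as np

def soft_vote(prob_lists, weights=None):
    """Soft-voting ensemble over per-model class probabilities.

    prob_lists: (n_models, n_classes) predicted probabilities.
    weights:    optional per-model weights; uniform if omitted.
    Returns the winning class index and the averaged probabilities.
    """
    probs = np.asarray(prob_lists, dtype=float)
    if weights is None:
        weights = np.ones(len(probs))
    weights = np.asarray(weights, dtype=float)
    avg = (weights[:, None] * probs).sum(axis=0) / weights.sum()
    return int(np.argmax(avg)), avg
```

In a multi-modal setting such as the one described, each row would come from a different modality (B-scans, thickness maps, clinical data), so the average pools complementary structural and demographic evidence.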