Işık University Institutional Academic Repository :: DSpace Angular

Search Results

Showing 1 - 10 of 13

Using uncertainty metrics in adversarial machine learning as an attack and defense tool
(Işık Üniversitesi, 2022-12-19) Tuna, Ömer Faruk; Eskil, Mustafa Taner; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı
Deep Neural Network (DNN) models are widely renowned for their resistance to random perturbations. However, researchers have found that these models are in fact extremely vulnerable to deliberately crafted and seemingly imperceptible perturbations of the input, known as adversarial samples. Adversarial attacks have the potential to substantially compromise the security of DNN-powered systems and pose high risks, especially in areas where security is a top priority. Numerous studies have been conducted in recent years to defend against these attacks and to develop more robust architectures resistant to adversarial threats. In this thesis, we leverage various uncertainty metrics obtained from MC-Dropout estimates of the model to develop new attack and defense ideas. On the defense side, we propose a new adversarial detection mechanism and an uncertainty-based defense method to increase the robustness of DNN models against adversarial evasion attacks. On the attack side, we use the quantified epistemic uncertainty obtained from the model’s final probability outputs, along with the model’s own loss function, to generate effective adversarial samples. We have experimentally evaluated and verified the efficacy of our proposed approaches on standard computer vision datasets.
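The abstract above does not say which uncertainty metrics the thesis uses, so the following is only a minimal sketch of one common way to quantify uncertainty from MC-Dropout outputs: predictive entropy is split into an expected-entropy (aleatoric) term and a mutual-information (epistemic) term. The function names and the sample probability vectors are hypothetical and chosen for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mc_dropout_uncertainty(samples):
    """Decompose uncertainty from T stochastic forward passes.

    samples: list of T probability vectors, one per dropout-enabled pass.
    Returns (total, aleatoric, epistemic), all in nats.
    """
    t = len(samples)
    n_classes = len(samples[0])
    mean_probs = [sum(s[c] for s in samples) / t for c in range(n_classes)]
    total = entropy(mean_probs)                       # predictive entropy
    aleatoric = sum(entropy(s) for s in samples) / t  # expected entropy
    epistemic = total - aleatoric                     # mutual information
    return total, aleatoric, epistemic

# Hypothetical softmax outputs from three passes over the same input.
agreeing = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
disagreeing = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]

print(mc_dropout_uncertainty(agreeing))
print(mc_dropout_uncertainty(disagreeing))
```

When the passes agree, the epistemic term is close to zero; when they disagree, it grows, which is the kind of signal that a detector or an attack objective could build on.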
English to Turkish machine translation using synchronous grammars
(Işık Üniversitesi, 2022-06-14) Görgün, Onur; Tüysüz Erman, Ayşegül; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı
Machine translation (MT) has been one of the hot topics in NLP research over recent years. However, most of the related studies have been done for specific languages, and there are a limited number of comprehensive studies for languages with free word order, such as Turkish. English-Turkish is also one of the least frequently studied language pairs in translation due to the morphological and syntactic gaps between the two languages. This also makes it hard to build parallel corpora, which are crucial for the machine translation task. This thesis aims to present the first statistical syntax tree-based machine translation approach for the English-Turkish language pair, along with a parallel corpus for translation tasks. We construct an English-Turkish parallel treebank of approximately 17K sentences by following a three-phased approach: manual transformation of English trees from Penn Treebank (PTB) by constraining the translated trees to the reordering of the children and gloss replacement; morphological analysis of the translated gloss; and morphological enrichment of the target tree. For translation consistency, we also developed a set of tools. We also apply the transformation schema to the closed domain and build an 8.3K-sentence corpus. We use both corpora in the machine translation task. In our experiments, we obtained a 12.8 BLEU score in the open domain and a 26.8 BLEU score in the closed domain. We also evaluate both corpora intrinsically through perplexity analysis. The results show that our corpus construction process is repeatable and that machine translation using the small corpus looks promising.
Adaptive locally connected recurrent unit (ALCRU)
(Springer Science and Business Media Deutschland GmbH, 2025-07-03) Özçelik, Şuayb Talha; Tek, Faik Boray
Research has shown that adaptive locally connected neurons outperform their fully connected (dense) counterparts, motivating this study on the development of the Adaptive Locally Connected Recurrent Unit (ALCRU). ALCRU modifies the Simple Recurrent Neuron Model (SimpleRNN) by incorporating spatial coordinate spaces for input and hidden state vectors, facilitating the learning of parametric local receptive fields. These modifications add four trainable parameters per neuron, resulting in a minor increase in computational complexity. ALCRU is implemented using standard frameworks and trained with back-propagation-based optimizers. We evaluate the performance of ALCRU using diverse benchmark datasets, including IMDb for sentiment analysis, AdditionRNN for sequence modelling, and the Weather dataset for time-series forecasting. Results show that ALCRU achieves accuracy and loss metrics comparable to GRU and LSTM while consistently outperforming SimpleRNN. In particular, experiments with longer sequence lengths on AdditionRNN and increased input dimensions on IMDb highlight ALCRU’s superior scalability and efficiency in processing complex data sequences. In terms of computational efficiency, ALCRU demonstrates a considerable speed advantage over gated models like LSTM and GRU, though it is slower than SimpleRNN. These findings suggest that adaptive local connectivity enhances both the accuracy and efficiency of recurrent neural networks, offering a promising alternative to standard architectures.
Transforming tourism experience: AI-based smart travel platform
(Association for Computing Machinery, 2023) Yöndem, Meltem Turhan; Özçelik, Şuayb Talha; Caetano, Inés; Figueiredo, José; Alves, Patrícia; Marreiros, Goreti; Bahtiyar, Hüseyin; Yüksel, Eda; Perales, Fernando
In this paper, we propose the development of a novel personalized tourism platform incorporating artificial intelligence (AI) and augmented reality (AR) technologies to enhance the smart tourism experience. The platform utilizes various data sources, including travel history, user activity, and personality assessments, combined with machine learning algorithms to generate tailored travel recommendations for individual users. We implemented fundamental requirements for the platform: secure user identification using blockchain technology and provision of personalized services based on user interests and preferences. By addressing these requirements, the platform aims to increase tourist satisfaction and improve the efficiency of the tourism industry. In collaboration with various universities and companies, this multinational project aims to create a versatile platform that can seamlessly integrate new smart tourism units, providing an engaging, educational, and enjoyable experience for users.
Semantic relation extraction by enriching word embeddings exploiting Turkish morphology
(Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, 2025-03-18) Ercan, Gökhan; Yıldız, Olcay Taner; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı
Distributed representations (DR) are used to capture semantic and syntactic patterns in language by analyzing the distributional relationships of words within textual data. The modeling methods that produce DR are based on the assumption (distributional hypothesis) that "words that occur in the same context tend to have similar meanings," which is inherent to the nature of language. These modeling methods, due to their unsupervised nature, can be trained without human judgment input, allowing researchers to train on large datasets at relatively low cost. Although word-based models perform effectively for languages with limited vocabularies, such as English, they exhibit considerable inefficiency when applied to morphologically rich languages with unlimited vocabularies, such as Turkish. We observed that n-gram and statistical segmentation methods, which are commonly used in subword modeling to address the issues of out-of-vocabulary and rare words, are highly sensitive to orthographic similarity. Consequently, these methods struggle to distinguish between unrelated concepts (e.g., shrink - shrine). Moreover, we noted that the impact of morphological segmentation methods on these types of problems has shown inconsistent results in the literature. This thesis aims to make conceptual assumptions and improvements concerning different types of semantic relationships (e.g., relatedness and similarity), to model the role of language morphology as an input in subword DR models, and to develop the dataset generation methodologies and evaluation methods to measure this effect. Within the scope of the study, different models and segmentation methods were empirically tested, the AnlamVer and OSimUnr datasets were produced, and the task of relatedness classification and associated evaluation methods were proposed to measure the noise introduced by segmentation to the model.
Our experiments demonstrate that morphological segmentation produces significantly less noise compared to n-gram-based methods and can lead to substantial performance improvements depending on the nature of the task.
An approach to analyse Turkish syntax at morphosyntactic level
(Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, 2025-01-20) Özenç, Berke; Solak, Ercan; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı; Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
Syntactic analysis allows us to analyse sentence structure in various ways. Constituency parsing is one of these ways. This parsing method defines sentence structure as hierarchical relationships between words or phrases and represents them in tree form. Constituency parsing employs a constituency grammar, which defines how constituents combine to form other constituents. In this grammar, any syntactic structure from the sentence down to the words is represented by constituents. Although this approach is designed to focus on universal aspects of languages, English has always been its focus. As a result, the constituency approach misses the details that morphology contributes to the syntax of morphologically rich languages. In this study, we implement an extension to constituency parsing that overcomes the challenges in parsing morphologically rich languages (MRLs). We propose ideas tailored to Turkish, yet they can be applied to any language similar to Turkish. Our extension enables constituency parsing to start at the morpheme level. Thus, we involve morphemic structures in the parsing process and express their syntactic effects on the structure. We implement this by extending the CYK (Cocke-Younger-Kasami) algorithm. During parsing, we utilize extra rules to transfer the ambiguity in morphology to the parsing. In addition, we designed a morpheme-focused constituency set for Turkish. This set involves affixes, stems and phrases headed by a stem. We demonstrate our work with a mini treebank and the grammar generated from it.
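For orientation, here is a minimal sketch of the standard CYK recognizer that the abstract says the work extends. It is not the authors' extension: the rule format, the toy morpheme-level grammar, and category names such as `Stem`, `Plural`, and `LocP` are hypothetical, chosen only to show parsing that starts from morphemes rather than words.

```python
from itertools import product

def cyk_recognize(tokens, unary_rules, binary_rules, start="S"):
    """Return True if the CNF grammar derives tokens from the start symbol.

    unary_rules: dict mapping a terminal to the set of nonterminals producing it.
    binary_rules: dict mapping (B, C) to the set of A with a rule A -> B C.
    """
    n = len(tokens)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive tokens[i : i + j + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = set(unary_rules.get(tok, ()))
    for span in range(2, n + 1):          # length of the covered span
        for i in range(n - span + 1):     # start position of the span
            for split in range(1, span):  # length of the left part
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for b, c in product(left, right):
                    table[i][span - 1] |= binary_rules.get((b, c), set())
    return start in table[0][n - 1]

# Hypothetical toy grammar over morphemes: ev + ler + de ("in the houses").
unary = {"ev": {"Stem"}, "ler": {"Plural"}, "de": {"Locative"}}
binary = {("Stem", "Plural"): {"NounPl"}, ("NounPl", "Locative"): {"LocP"}}

print(cyk_recognize(["ev", "ler", "de"], unary, binary, start="LocP"))
print(cyk_recognize(["ler", "ev", "de"], unary, binary, start="LocP"))
```

Because the table stores sets of categories, an ambiguous morpheme can simply map to several nonterminals in `unary`, which is one plausible way for morphological ambiguity to carry over into parsing.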
Grammar or crammer? the role of morphology in distinguishing orthographically similar but semantically unrelated words
(Institute of Electrical and Electronics Engineers Inc., 2025) Ercan, Gökhan; Yıldız, Olcay Taner
We show that n-gram-based distributional models fail to distinguish unrelated words due to the noise in semantic spaces. This issue remains hidden in conventional benchmarks but becomes more pronounced when orthographic similarity is high. To highlight this problem, we introduce OSimUnr, a dataset of nearly one million English and Turkish word-pairs that are orthographically similar but semantically unrelated (e.g., grammar - crammer). These pairs are generated through a graph-based WordNet approach and morphological resources. We define two evaluation tasks - unrelatedness identification and relatedness classification - to test semantic models. Our experiments reveal that FastText, with default n-gram segmentation, performs poorly (below 5% accuracy) in identifying unrelated words. However, morphological segmentation overcomes this issue, boosting accuracy to 68% (English) and 71% (Turkish) without compromising performance on standard benchmarks (RareWords, MTurk771, MEN, AnlamVer). Furthermore, our results suggest that even state-of-the-art LLMs, including Llama 3.3 and GPT-4o-mini, may exhibit noise in their semantic spaces, particularly in highly synthetic languages such as Turkish. To ensure dataset quality, we leverage WordNet, MorphoLex, and NLTK, covering fully derivational morphology supporting atomic roots (e.g., '-co_here+ance+y' for 'coherency'), with 405 affixes in Turkish and 467 in English.
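To make the orthographic-overlap problem concrete, the sketch below extracts FastText-style character n-grams (lengths 3 to 6, with `<` and `>` as word-boundary markers) and measures how many are shared between two words. It is an illustration only: the helper names are hypothetical, the extra whole-word token that FastText also uses is omitted, and Jaccard overlap is a stand-in here, not the evaluation measure used in the paper.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return {
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    }

def jaccard(a, b):
    """Jaccard overlap between two non-empty n-gram sets."""
    return len(a & b) / len(a | b)

grammar = char_ngrams("grammar")
crammer = char_ngrams("crammer")
shared = grammar & crammer

print(sorted(shared))
print(jaccard(grammar, crammer))
```

Any n-gram that appears in `shared` contributes the same subword vector to both words, so two unrelated words can be pulled toward each other in the embedding space purely because of their spelling.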
TURSpider: a Turkish Text-to-SQL dataset and LLM-based study
(Institute of Electrical and Electronics Engineers Inc., 2024-11-25) Kanburoğlu, Ali Buğra; Tek, Faik Boray
This paper introduces TURSpider, a novel Turkish Text-to-SQL dataset developed through human translation of the widely used Spider dataset, aimed at addressing the current lack of complex, cross-domain SQL datasets for the Turkish language. TURSpider incorporates a wide range of query difficulties, including nested queries, to create a comprehensive benchmark for Turkish Text-to-SQL tasks. The dataset enables cross-language comparison and significantly enhances the training and evaluation of large language models (LLMs) in generating SQL queries from Turkish natural language inputs. We fine-tuned several Turkish-supported LLMs on TURSpider and evaluated their performance in comparison to state-of-the-art models like GPT-3.5 Turbo and GPT-4. Our results show that fine-tuned Turkish LLMs demonstrate competitive performance, with one model even surpassing GPT-based models on execution accuracy. We also apply the Chain-of-Feedback (CoF) methodology to further improve model performance, demonstrating its effectiveness across multiple LLMs. This work provides a valuable resource for Turkish NLP and addresses specific challenges in developing accurate Text-to-SQL models for low-resource languages.
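Execution accuracy, mentioned above, is commonly computed by running the predicted and the gold query against the same database and comparing the returned rows. The sketch below shows that idea with Python's built-in `sqlite3` module; the toy table, the query pairs, and the helper names are hypothetical, and the benchmark's actual evaluation script may handle row order, value types, and other details differently.

```python
import sqlite3

def execution_match(conn, predicted_sql, gold_sql):
    """True if both queries run and return the same multiset of rows."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as a miss
    gold = conn.execute(gold_sql).fetchall()
    # Sort by repr so that row order does not affect the comparison.
    return sorted(predicted, key=repr) == sorted(gold, key=repr)

def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    return sum(execution_match(conn, p, g) for p, g in pairs) / len(pairs)

# Hypothetical toy database and model outputs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ayşe", 31), ("Can", 24)])
pairs = [
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    ("SELECT nam FROM singer", "SELECT name FROM singer"),  # misspelled column
]
accuracy = execution_accuracy(conn, pairs)
print(accuracy)
```

Comparing results instead of query text gives credit to predictions that are written differently from the gold query but return the same answer, as in the first pair.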
Large language model based automated translation of natural language to SQL
(Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, 2025-01-22) Kanburoğlu, Ali Buğra; Tek, Faik Boray; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı; Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
The field of Text-to-SQL, which involves converting natural language into SQL queries, has seen significant advancements, but challenges remain, particularly for low-resource languages like Turkish. This thesis introduces three key contributions to address these challenges. Our first contribution is the development and open-access release of TUR2SQL, the first cross-domain Turkish Text-to-SQL dataset, which consists of 10,809 natural language sentences paired with their corresponding SQL queries. We evaluate the performance of SQLNet, a deep learning model specifically designed for this task, and one of the most successful Large Language Models (LLMs), ChatGPT, on this dataset. The results demonstrate the superior performance of ChatGPT. The second major contribution is the construction and publicly available release of TURSpider, the most extensive Turkish Text-to-SQL dataset. TURSpider is built by translating the widely used cross-domain Spider dataset from English to Turkish. This dataset includes complex queries with varying difficulty levels, facilitating the training and comparison of large language models for Turkish Text-to-SQL tasks. Our comparative analysis shows that fine-tuned Turkish LLMs achieve competitive performance, with some models surpassing OpenAI models in query accuracy. To further enhance performance, we apply the Chain-of-Feedback (CoF) methodology, demonstrating its effectiveness across multiple models. Finally, we explore the Mixture-of-Agents (MoA) framework, which combines outputs from multiple models to improve the performance of open-source LLMs for Text-to-SQL tasks. By integrating MoA with the CoF technique, we propose MoAF-SQL, an approach that significantly improves performance, particularly on complex queries. Our experiments show that MoAF-SQL achieves competitive results, highlighting its potential to enhance the Text-to-SQL capabilities of open-source LLMs.
Object recognition with competitive convolutional neural networks
(Işık Üniversitesi, 2023-06-12) Erkoç, Tuğba; Eskil, M. Taner; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı; Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
In recent years, Artificial Intelligence (AI) has achieved impressive results, often surpassing human capabilities in tasks involving language comprehension and visual recognition. Among these, computer vision has experienced remarkable progress, largely due to the introduction of Convolutional Neural Networks (CNNs). CNNs are inspired by the hierarchical structure of the visual cortex and are designed to detect patterns, objects, and complex relationships within visual data. One key advantage is their ability to learn directly from pixel values without the need for domain expertise, which has contributed to their popularity. These networks are trained using supervised backpropagation, a process that calculates gradients of the network’s parameters (weights and biases) with respect to the loss function. While backpropagation enables impressive performance with CNNs, it also presents certain drawbacks. One such drawback is the requirement for large amounts of labeled data. When the available data samples are limited, the gradients estimated from this limited information may not accurately capture the overall data behavior, leading to suboptimal parameter updates. However, obtaining a sufficient quantity of labeled data poses a challenge. Another drawback is the requirement of careful configuration of hyperparameters, including the number of neurons, learning rate, and network architecture. Finding optimal values for these hyperparameters can be a time-consuming process. Furthermore, as the complexity of the task increases, the network architecture becomes deeper and more complex. To effectively train the shallow layers of the network, one must increase the number of epochs and experiment with solutions to prevent vanishing gradients. Complex problems often require a greater number of epochs to learn the intricate patterns and features present in the data. 
It’s important to note that while CNNs aim to mimic the structure of the visual cortex, the brain’s learning mechanism does not necessarily involve back-propagation. Although CNNs incorporate the layered architecture of the visual cortex, the reliance on backpropagation introduces an artificial learning procedure that may not align with the brain’s actual learning process. Therefore, it is crucial to explore alternative learning paradigms that do not rely on backpropagation. In this dissertation study, a unique approach to unsupervised training for CNNs is explored, setting it apart from previous research. Unlike other unsupervised methods, the proposed approach eliminates the reliance on backpropagation for training the filters. Instead, we introduce a filter extraction algorithm capable of extracting dataset features by processing images only once, without requiring data labels or backward error updates. This approach operates on individual convolutional layers, gradually constructing them by discovering filters. To evaluate the effectiveness of this backpropagation-free algorithm, we design four distinct CNN architectures and conduct experiments. The results demonstrate the promising performance of training without backpropagation, achieving impressive classification accuracies on different datasets. Notably, these outcomes are attained using a single network setup without any data augmentation. Additionally, our study reveals that the proposed algorithm eliminates the need to predefine the number of filters per convolutional layer, as the algorithm automatically determines this value. Furthermore, we demonstrate that filter initialization from a random distribution is unnecessary when backpropagation is not employed during training.
