Sarcasm detection in text using deep neural networks

Gümüşçekiçci, Gizem

dc.contributor.advisor	Dehkharghani, Rahim	en_US
dc.contributor.author	Gümüşçekiçci, Gizem	en_US
dc.contributor.other	Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı	en_US
dc.contributor.other	Işık University, School of Graduate Studies, Computer Science Engineering Master Program	en_US
dc.date.accessioned	2024-03-14T17:27:44Z
dc.date.available	2024-03-14T17:27:44Z
dc.date.issued	2024-02-25
dc.identifier.citation	Gümüşçekiçci, G. (2024). Sarcasm detection in text using deep neural networks. İstanbul: Işık Üniversitesi Lisansüstü Eğitim Enstitüsü.	en_US
dc.identifier.uri	https://hdl.handle.net/11729/5920
dc.description	Text in English ; Abstract: English and Turkish	en_US
dc.description	Includes bibliographical references (leaves 54-56)	en_US
dc.description	ix, 57 leaves	en_US
dc.description.abstract	Sarcasm is a form of irony that is generally used to express negative opinions. It poses a linguistic challenge because of its figurative nature: the intended meaning contradicts the literal interpretation. Sarcasm is widely used in daily life and on many social platforms. Detecting sarcasm in written text is a challenging process that has captured the interest of many researchers, and sarcasm detection has therefore become a crucial task in the Natural Language Processing (NLP) field. This thesis explores the concept of sarcasm and its importance in existing sarcasm research. The automatic process of sarcasm detection involves dataset selection, preprocessing steps, and the selection of proper approaches, including rule-based methods, Machine Learning (ML), Deep Learning (DL) and Transformer architectures. This study surveys previous research on sarcasm detection, specifically examining the datasets, methodologies and performance. It attempts to detect sarcasm automatically by applying various ML, DL, transformer and hybrid neural network architectures to news headlines datasets. To overcome the dataset and performance limitations of existing approaches, we propose several methodologies for detecting sarcastic text, focusing mostly on DL, hybrid neural networks and transformer architectures. We combine appropriate architectures with several hand-crafted features and different word embedding models. To further improve the performance of our proposed methods and to enhance the existing news headlines dataset, we propose several modifications. We contribute to the existing dataset by applying augmentation to increase its size, which helps the proposed models overcome dataset limitations and improves their performance. Our methodologies correctly identify sarcasm with an F1 score of 97.68%.	en_US
dc.description.abstract	Alaycılık, genellikle olumsuz görüşlerin ifade edilmesinde kullanılan bir ironi biçimidir. Alaycılık, amaçlanan anlamın gerçek yorumla çeliştiği mecazi doğası nedeniyle dilsel bir zorluk teşkil etmektedir. Alaycılık günlük yaşamımızda ve birçok sosyal platformda yaygın olarak kullanılmaktadır. Yazılı metinlerde alaycılığın tespit edilmesi birçok araştırmacının ilgisini çeken zorlu bir süreçtir. Dolayısıyla alaycılık, Doğal Dil İşleme (NLP) alanında çok önemli bir görev haline geldi. Bu tez çalışması alaycılık kavramını ve bu kavramın mevcut alaycılık araştırmaları üzerindeki önemini incelemektedir. Otomatik alaycılık algılama süreci, veri kümesi seçimini, ön işleme adımlarını ve kural tabanlı yöntemler, Makine Öğrenimi (ML), Derin Öğrenme (DL) ve Transformer mimarileri dahil olmak üzere uygun yaklaşımların seçilmesini içerir. Bu çalışma, özellikle veri kümesini, metodolojiyi ve performansı inceleyerek alaycılığın tespitine ilişkin önceki araştırmaları incelemektedir. Bu tez çalışması, haber başlıkları veri seti üzerinde çeşitli ML, DL ve transformatör ve hibrit sinir ağı mimarilerini kullanarak alaycılığı otomatik olarak tespit etmeye çalışmaktadır. Mevcut yaklaşımlardaki veri kümesi ve performans sınırlamalarının üstesinden gelmek için, çoğunlukla DL, hibrit sinir ağları ve transformatör mimarilerine odaklanan alaycı metinleri tespit etmek için çeşitli yöntemler öneriyoruz. Uygun mimarileri, farklı kelime temsil modellerini kullanarak çeşitli el yapımı özelliklerle birleştiriyoruz. Önerilen yöntemlerimizin performansını daha da genişletmek ve mevcut haber başlıkları veri setini geliştirmek için çeşitli değişiklikler önerdik. Önerilen modellerin performansının veri kümesi sınırlamalarının üstesinden gelmesine yardımcı olmak amacıyla veri kümesi boyutunu artırmak için büyütme uygulayarak mevcut veri kümesine katkıda bulunuyoruz. Metodolojilerimiz alaycılığı %97,68 F1 puanıyla doğru bir şekilde tespit edebiliyor.	en_US
dc.description.tableofcontents	Introduction to Sarcasm	en_US
dc.description.tableofcontents	Sarcasm in Social Platforms	en_US
dc.description.tableofcontents	Aspects of Sarcasm	en_US
dc.description.tableofcontents	Sarcasm Detection	en_US
dc.description.tableofcontents	Sarcasm Detection Studies on the News Headlines Dataset	en_US
dc.description.tableofcontents	Sarcasm Detection Studies on Other Datasets	en_US
dc.description.tableofcontents	EXPLORATORY DATA ANALYSIS	en_US
dc.description.tableofcontents	PROPOSED METHODOLOGY	en_US
dc.description.tableofcontents	Dataset Preprocessing	en_US
dc.description.tableofcontents	Feature Engineering	en_US
dc.description.tableofcontents	Data Augmentation	en_US
dc.description.tableofcontents	Framework of Proposed Methodology	en_US
dc.description.tableofcontents	Word Embedding	en_US
dc.description.tableofcontents	Word2Vec	en_US
dc.description.tableofcontents	Activation Functions	en_US
dc.description.tableofcontents	ReLU	en_US
dc.description.tableofcontents	Hyperbolic Tangent (Tanh)	en_US
dc.description.tableofcontents	Sigmoid	en_US
dc.description.tableofcontents	Loss Functions	en_US
dc.description.tableofcontents	Callback Functions	en_US
dc.description.tableofcontents	Early Stopping	en_US
dc.description.tableofcontents	Reduce Learning Rate	en_US
dc.description.tableofcontents	Model Checkpoint	en_US
dc.description.tableofcontents	Classification Models	en_US
dc.description.tableofcontents	SVM	en_US
dc.description.tableofcontents	Decision Tree	en_US
dc.description.tableofcontents	Random Forest	en_US
dc.description.tableofcontents	Convolutional Neural Network (CNN)	en_US
dc.description.tableofcontents	Bidirectional Long Short-Term Memory (BiLSTM)	en_US
dc.description.tableofcontents	BERT Transformer	en_US
dc.description.tableofcontents	EXPERIMENTAL EVALUATION	en_US
dc.description.tableofcontents	Quantitative Results	en_US
dc.description.tableofcontents	Machine Learning Models	en_US
dc.description.tableofcontents	Deep Learning Models	en_US
dc.description.tableofcontents	Transformer Models	en_US
dc.description.tableofcontents	Pre-processing stages	en_US
dc.description.tableofcontents	Features presented in (Jariwala, 2020)	en_US
dc.description.tableofcontents	Summary of Sarcasm Detection Studies on the News Headlines Dataset	en_US
dc.description.tableofcontents	Summary of Sarcasm Detection Studies on Other Datasets	en_US
dc.description.tableofcontents	Sarcastic, non-sarcastic data example	en_US
dc.description.tableofcontents	Class distributions in v1 dataset	en_US
dc.description.tableofcontents	Class distributions in v2 dataset	en_US
dc.description.tableofcontents	Most used methodologies and useful libraries for pre-processing stages	en_US
dc.description.tableofcontents	Pre-processing applied to news headlines datasets	en_US
dc.description.tableofcontents	Textual Data Augmentation Example	en_US
dc.description.tableofcontents	Top 15 Category Labels	en_US
dc.description.tableofcontents	Differences of Binary cross entropy and Sparse categorical cross entropy	en_US
dc.description.tableofcontents	Results for ML models	en_US
dc.description.tableofcontents	Results for DL models	en_US
dc.description.tableofcontents	Results for Transformer Models	en_US
dc.description.tableofcontents	Google Scholar Search Results	en_US
dc.description.tableofcontents	Frequency of Model Implementation in Sarcasm Detection Studies	en_US
dc.description.tableofcontents	Existing Sarcasm Datasets Used in Studies	en_US
dc.description.tableofcontents	Sample Dataset version1 (v1)	en_US
dc.description.tableofcontents	Sample Dataset version2 (v2)	en_US
dc.description.tableofcontents	Character length frequency distributions in headlines	en_US
dc.description.tableofcontents	Word length density in headlines	en_US
dc.description.tableofcontents	Frequency of the length of each headline in the dataset	en_US
dc.description.tableofcontents	Word Cloud Representation of Non-sarcastic and Sarcastic Headlines	en_US
dc.description.tableofcontents	Proposed Sarcasm Detection Classifier Framework	en_US
dc.description.tableofcontents	Sentiment Polarity Label Pipeline	en_US
dc.description.tableofcontents	News Categorization Label Pipeline	en_US
dc.description.tableofcontents	Dataset v2 with handcrafted features	en_US
dc.description.tableofcontents	Category distribution of the news headlines	en_US
dc.description.tableofcontents	Framework of the Proposed Sarcasm Detector	en_US
dc.description.tableofcontents	ReLU graph	en_US
dc.description.tableofcontents	Tanh graph	en_US
dc.description.tableofcontents	Sigmoid graph	en_US
dc.description.tableofcontents	SVM Linearly Separable and Not Linearly Separable Example	en_US
dc.description.tableofcontents	Transformer Model Architecture	en_US
dc.language.iso	en	en_US
dc.publisher	Işık Üniversitesi	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	Sarcasm	en_US
dc.subject	News headlines	en_US
dc.subject	Sarcasm classification	en_US
dc.subject	Transformers	en_US
dc.subject	Text augmentation	en_US
dc.subject	Alaycılık	en_US
dc.subject	Haber manşetleri	en_US
dc.subject	Alaycılık sınıflandırması	en_US
dc.subject	Metin arttırma	en_US
dc.subject.lcc	QA76.9.N38 G86 2024
dc.subject.lcsh	Irony in literature.	en_US
dc.subject.lcsh	Irony -- Detection.	en_US
dc.subject.lcsh	Natural language processing (Computer science).	en_US
dc.subject.lcsh	Text data mining.	en_US
dc.title	Sarcasm detection in text using deep neural networks	en_US
dc.title.alternative	Derin sinir ağları kullanarak metin içinde alaycılık tespiti	en_US
dc.type	Master Thesis	en_US
dc.department	Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı	en_US
dc.department	Işık University, School of Graduate Studies, Computer Science Engineering Master Program	en_US
dc.authorid	0000-0002-9502-7817
dc.relation.publicationcategory	Tez	en_US
dc.institutionauthor	Gümüşçekiçci, Gizem	en_US