Extracting meaningful information student surveys with NLP

Pourjalil, Kajal

dc.authorid	0009-0000-5832-3062
dc.contributor.advisor	Ekin, Emine	en_US
dc.contributor.author	Pourjalil, Kajal	en_US
dc.contributor.other	Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı	en_US
dc.contributor.other	Işık University, School of Graduate Studies, Master’s Program in Computer Engineering	en_US
dc.date.accessioned	2025-05-30T11:15:25Z
dc.date.available	2025-05-30T11:15:25Z
dc.date.issued	2025-01-29
dc.department	Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı	en_US
dc.department	Işık University, School of Graduate Studies, Master’s Program in Computer Engineering	en_US
dc.description	Text in English ; Abstract: English and Turkish	en_US
dc.description	Includes bibliographical references (leaves 39-45)	en_US
dc.description	xii, 46 leaves	en_US
dc.description.abstract	This thesis applied NLP techniques to analyze and summarize bilingual student feedback collected via end-of-semester surveys. The dataset, which contained open-ended responses in both English and Turkish, required a model adept at preserving linguistic nuances across languages. The Llama 2-7b-hf model, which had been trained explicitly for text generation, was selected for its capability to produce coherent and contextually relevant summaries. Data preprocessing involved organizing metadata such as department, semester, course name, and section number, separating comments by word count, and removing personal identifiers to ensure privacy. Shorter comments (fewer than ten words) were grouped and summarized using a pipeline from the Transformers library, while for longer comments the model was fine-tuned with metadata-specific prompts for detailed summarization. To further enhance the analysis, sentiment classification was performed using the “cardiffnlp/twitter-roberta-base-sentiment” model, categorizing feedback as negative, neutral, or positive. Evaluation criteria included expert reviews, contextual relevance, and logical consistency with the dataset’s sentiment distribution. Compared to previous models, the Llama 2 model demonstrated superior performance in generating complete, coherent summaries while preserving the overall intent and tone of the comments. Ultimately, this research highlighted the effectiveness of LLMs in processing multilingual educational data and their potential to provide actionable insights for improving course content and student experiences.	en_US
dc.description.abstract	Bu tez, dönem sonu anketleri aracılığıyla toplanan iki dilli öğrenci geri bildirimlerini analiz etmek ve özetlemek için NLP tekniklerini uyguladı. İngilizce ve Türkçe dillerinde açık uçlu yanıtlar içeren veri kümesi, diller arası dilsel nüansları koruyabilen bir modele ihtiyaç duymuştu. Metin üretimi için özel olarak eğitilmiş Llama 2-7b-hf modeli, tutarlı ve bağlamsal olarak uygun özetler üretebilme yeteneği nedeniyle seçilmişti. Veri ön işleme aşaması, bölüm, dönem, ders adı ve şube numarası gibi üstverileri düzenlemeyi, yorumları kelime sayılarına göre ayırmayı ve gizliliği sağlamak için kişisel kimlik bilgilerini kaldırmayı içermekteydi. On kelimeden kısa yorumlar, Transformers kütüphanesinden bir ardışık düzen kullanılarak gruplandırılıp özetlenirken, daha uzun yorumlar ayrıntılı özetleme için üstveri odaklı istemlerle ince ayar yapılmıştı. Analizi daha da geliştirmek amacıyla, “cardiffnlp/twitter-roberta-base-sentiment” modeli kullanılarak duygu sınıflandırması gerçekleştirilmiş ve geri bildirimler olumsuz, tarafsız ve olumlu olmak üzere üç farklı kategoriye ayrılmıştı. Değerlendirme metrikleri arasında uzman incelemeleri, bağlamsal uygunluk ve veri kümesinin duygu dağılımıyla mantıksal tutarlılık yer almıştı. Önceki modellere kıyasla, Llama 2 modeli, yorumların genel niyetini ve tonunu koruyarak daha eksiksiz ve tutarlı özetler üretmede üstün performans sergilemişti. Sonuç olarak, bu araştırma, LLM'lerin çok dilli eğitim verilerini işlemedeki etkinliğini ve ders içeriğini geliştirmek için uygulanabilir içgörüler sağlamadaki potansiyelini net bir şekilde vurgulamıştı. Bu çalışmanın sonuçları, gelecekteki araştırmalar için de yol gösterici olacaktı.	en_US
dc.identifier.citation	Pourjalil, K. (2025). Extracting meaningful information student surveys with NLP. İstanbul: Işık Üniversitesi Lisansüstü Eğitim Enstitüsü.	en_US
dc.identifier.uri	https://hdl.handle.net/11729/6422
dc.institutionauthor	Pourjalil, Kajal	en_US
dc.institutionauthorid	0009-0000-5832-3062
dc.language.iso	en	en_US
dc.publisher	Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü	en_US
dc.relation.publicationcategory	Tez	en_US
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	NLP	en_US
dc.subject	Llama 2	en_US
dc.subject	Survey	en_US
dc.subject	Summarization	en_US
dc.subject	Multilingual analysis	en_US
dc.subject	Anket	en_US
dc.subject	Özetleme	en_US
dc.subject	Üretken AI	en_US
dc.subject.lcc	QA76.9.N38 P68 2025
dc.subject.lcsh	Natural language processing (Computer science).	en_US
dc.subject.lcsh	Natural language processing (Computer science) -- Research.	en_US
dc.title	Extracting meaningful information student surveys with NLP	en_US
dc.title.alternative	NLP kullanarak öğrenci anketlerinden anlamlı bilgiler çıkarmak	en_US
dc.type	Master Thesis	en_US
dspace.entity.type	Publication	en_US
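
The abstract describes splitting comments at a ten-word threshold and classifying sentiment as negative, neutral, or positive. The sketch below illustrates those two steps only; it is not the thesis code. The function names, the sample comments, and the constant SHORT_COMMENT_MAX_WORDS are hypothetical, and the label mapping assumes the LABEL_0/LABEL_1/LABEL_2 output convention listed on the model card of cardiffnlp/twitter-roberta-base-sentiment.

```python
from collections import Counter

# Threshold taken from the abstract ("fewer than ten words"); the name is hypothetical.
SHORT_COMMENT_MAX_WORDS = 10

# Assumed mapping, following the model card of cardiffnlp/twitter-roberta-base-sentiment.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}


def split_by_word_count(comments, limit=SHORT_COMMENT_MAX_WORDS):
    """Return (short, long): short comments have fewer than `limit` whitespace-separated words."""
    short, long_ = [], []
    for text in comments:
        (short if len(text.split()) < limit else long_).append(text)
    return short, long_


def sentiment_distribution(predictions):
    """Count sentiment names from pipeline-style dicts such as {"label": "LABEL_2", "score": 0.9}."""
    return Counter(LABEL_NAMES.get(p["label"], p["label"]) for p in predictions)


def classify(comments):
    """Run the sentiment model. Requires the `transformers` package and downloads model weights."""
    from transformers import pipeline  # lazy import so the helpers above work without it

    classifier = pipeline(
        "sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment"
    )
    # truncation=True keeps long comments within the model's input limit.
    return classifier(list(comments), truncation=True)


if __name__ == "__main__":
    # Made-up sample comments, not taken from the thesis dataset.
    sample = [
        "Great course",
        "The lectures were clear but the homework load was far too heavy overall",
    ]
    short, long_ = split_by_word_count(sample)
    print(len(short), "short /", len(long_), "long")
```

Counting words with a plain whitespace split is a simplification; the thesis may have tokenized comments differently.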

Files

Original bundle

Name: Extracting_meaningful_information_student_surveys_with_NLP.pdf
Size: 868.19 KB
Format: Adobe Portable Document Format

Collection

School of Graduate Studies Thesis Collection