Prosodic, morphological and lexical feature extraction of Turkish broadcast news data

Revidi, İzel D.

dc.contributor.advisor	Güz, Ümit	en_US
dc.contributor.author	Revidi, İzel D.	en_US
dc.contributor.other	Işık Üniversitesi, Fen Bilimleri Enstitüsü, Elektronik Mühendisliği Yüksek Lisans Programı	en_US
dc.date.accessioned	2016-06-08T06:07:31Z
dc.date.available	2016-06-08T06:07:31Z
dc.date.issued	2014-06-05
dc.identifier.citation	Revidi, İ. D. (2014). Prosodic, morphological and lexical feature extraction of Turkish broadcast news data. İstanbul: Işık Üniversitesi Fen Bilimleri Enstitüsü.	en_US
dc.identifier.uri	https://hdl.handle.net/11729/958
dc.description	Text in English ; Abstract: English and Turkish	en_US
dc.description	Includes bibliographical references (leaves 92-95)	en_US
dc.description	xii, 115 leaves	en_US
dc.description.abstract	Sentence segmentation from speech is part of a process that aims to enrich the unstructured stream of words output by standard speech recognizers. Its role is to find the sentence units in this stream of words. Sentence segmentation is a preliminary step toward speech understanding: once the sentence boundaries are detected, further syntactic and/or semantic analysis can be performed on the sentences. Speech recognizer output usually lacks the textual cues to these units, such as headers, paragraphs, sentence punctuation, and capitalization. However, speech provides additional cues: prosodic features such as pitch, energy, pause, and word duration; morphological features such as whether a word is a verb, noun, or adjective; and lexical features. These prosodic, morphological, and lexical features provide complementary information for segmenting speech into sentences. Our goal is to examine the extraction and use of prosodic features, as done in previous work, together with morphological and lexical features, for spoken language processing of Turkish with open-source tools.	en_US
dc.description.abstract	Sentence segmentation is part of a process that aims to enrich the content of the words output by an automatic speech recognition system. It takes on the task of identifying complete sentences in the incoming word stream and forms the stage that precedes speech understanding. Once the sentence boundaries are found, syntactic and/or semantic analysis can be performed on the sentences. The output of an automatic speech recognition system generally lacks textual cues such as headings, paragraphs, punctuation, and capitalization. However, speech inherently carries prosodic features such as energy, pause information, and word duration; morphological features such as whether a word is a verb, noun, or adjective; and lexical features. These prosodic, morphological, and lexical features provide complementary information for sentence segmentation. The aim of this study is, in addition to the extraction and use of prosodic features carried out in previous work, to extract and use morphological and lexical features for spoken Turkish with open-source tools.	en_US
dc.description.tableofcontents	Introduction	en_US
dc.description.tableofcontents	Related Works	en_US
dc.description.tableofcontents	Automatic Speech Recognition	en_US
dc.description.tableofcontents	Definition	en_US
dc.description.tableofcontents	ASR With Turkish Spoken Language	en_US
dc.description.tableofcontents	Start-Up	en_US
dc.description.tableofcontents	Modeling	en_US
dc.description.tableofcontents	Hidden Markov Toolkit	en_US
dc.description.tableofcontents	Prosodic Features	en_US
dc.description.tableofcontents	Features	en_US
dc.description.tableofcontents	Prosodic Feature Extraction	en_US
dc.description.tableofcontents	Morphological Features	en_US
dc.description.tableofcontents	Morphological Processes	en_US
dc.description.tableofcontents	Combining Morphemes	en_US
dc.description.tableofcontents	Morphological Feature Extraction	en_US
dc.description.tableofcontents	Lexical Features	en_US
dc.description.tableofcontents	N-gram Usage	en_US
dc.description.tableofcontents	Sentence Segmentation	en_US
dc.description.tableofcontents	Approach	en_US
dc.description.tableofcontents	Software Usage (Icsiboost)	en_US
dc.description.tableofcontents	Experiments and Conclusion	en_US
dc.description.tableofcontents	Overview	en_US
dc.description.tableofcontents	Experiments	en_US
dc.description.tableofcontents	Conclusion	en_US
dc.language.iso	eng	en_US
dc.publisher	Işık Üniversitesi	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject.lcc	TK7882.S65 R48 2014
dc.subject.lcsh	Speech processing systems.	en_US
dc.subject.lcsh	Speech synthesis.	en_US
dc.subject.lcsh	Automatic speech recognition.	en_US
dc.title	Prosodic, morphological and lexical feature extraction of Turkish broadcast news data	en_US
dc.title.alternative	Türkçe haber verisinden bürünsel, biçimsel ve sözcüksel özelliklerin çıkarımı	en_US
dc.type	masterThesis	en_US
dc.contributor.department	Işık Üniversitesi, Fen Bilimleri Enstitüsü, Elektronik Mühendisliği Yüksek Lisans Programı	en_US
dc.relation.publicationcategory	Thesis	en_US
dc.contributor.institutionauthor	Revidi, İzel D.	en_US

Files in this item:

Name: 958.pdf
Size: 3.802Mb
Format: PDF
Description: Master Thesis

This item appears in the following collection(s).

FBE - Tez Koleksiyonu | Elektronik Mühendisliği / Electronics Engineering [39]
Contains the thesis collection of the Electronics Engineering Master's program.

Unless otherwise stated, this item's license is: info:eu-repo/semantics/openAccess