
dc.contributor.author: Dalva, Doğan [en_US]
dc.contributor.author: Güz, Ümit [en_US]
dc.contributor.author: Gürkan, Hakan [en_US]
dc.date.accessioned: 2019-05-22T00:14:38Z
dc.date.available: 2019-05-22T00:14:38Z
dc.date.issued: 2018
dc.identifier.citation: Dalva, D., Güz, Ü. & Gürkan, H. (2018). Extension of conventional co-training learning strategies to three-view and committee-based learning strategies for effective automatic sentence segmentation. Paper presented at the 2018 IEEE Spoken Language Technology Workshop (SLT), 750-755. doi:10.1109/SLT.2018.8639533 [en_US]
dc.identifier.isbn: 9781538643341
dc.identifier.isbn: 9781538643334
dc.identifier.isbn: 9781538643358
dc.identifier.issn: 2639-5479
dc.identifier.other: WOS:000463141800104
dc.identifier.uri: https://hdl.handle.net/11729/1594
dc.identifier.uri: http://dx.doi.org/10.1109/SLT.2018.8639533
dc.description.abstract: The objective of this work is to develop effective multi-view semi-supervised machine learning strategies for the sentence boundary classification problem when only small sets of sentence-boundary-labeled data are available. We propose three-view and committee-based learning strategies that incorporate co-training algorithms with agreement, disagreement, and self-combined learning strategies, using prosodic, lexical, and morphological information. We compare the experimental results of the proposed three-view and committee-based learning strategies with those of other semi-supervised learning strategies in the literature, namely self-training and co-training with agreement, disagreement, and self-combined strategies. The experimental results show that the proposed multi-view learning strategies can substantially improve sentence segmentation performance, since the data sets can be represented by three redundantly sufficient and disjoint feature sets. We show that the proposed strategies substantially improve the average performance for both Turkish and English spoken language when only a small set of manually labeled data is available. [en_US]
dc.description.sponsorship: This material is based upon work supported by the Scientific and Technological Research Council of Turkey (TUBITAK) (Project Numbers: 107E182 and 111E228), the Isik University Scientific Research Project Fund (Project Numbers: 09A301 and 14A201), and a J. William Fulbright Post-Doctoral Research Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isversionof: 10.1109/SLT.2018.8639533
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Boosting [en_US]
dc.subject: Co-training [en_US]
dc.subject: Sentence segmentation [en_US]
dc.subject: Semi-supervised learning [en_US]
dc.subject: Prosody [en_US]
dc.subject: Speech [en_US]
dc.subject: Learning algorithms [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Supervised learning [en_US]
dc.subject: Data models [en_US]
dc.subject: Semisupervised learning [en_US]
dc.subject: Feature extraction [en_US]
dc.subject: Training [en_US]
dc.subject: Tools [en_US]
dc.subject: Task analysis [en_US]
dc.subject: Learning (artificial intelligence) [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Speech processing [en_US]
dc.subject: Multiview learning strategies [en_US]
dc.subject: Disjoint feature sets [en_US]
dc.subject: Manually labeled data [en_US]
dc.subject: Sentence boundary classification problem [en_US]
dc.subject: Sentence boundary labeled data [en_US]
dc.subject: Committee-based learning strategies [en_US]
dc.subject: Prosodic information [en_US]
dc.subject: Lexical information [en_US]
dc.subject: Morphological information [en_US]
dc.subject: Self-combined strategies [en_US]
dc.subject: Automatic sentence segmentation [en_US]
dc.subject: Conventional co-training learning [en_US]
dc.subject: Multiview semisupervised machine learning [en_US]
dc.subject: Turkish spoken languages [en_US]
dc.subject: English spoken languages [en_US]
dc.title: Extension of conventional co-training learning strategies to three-view and committee-based learning strategies for effective automatic sentence segmentation [en_US]
dc.type: conferenceObject [en_US]
dc.description.version: Publisher's Version [en_US]
dc.relation.journal: 2018 IEEE Spoken Language Technology Workshop (SLT) [en_US]
dc.contributor.department: Işık Üniversitesi, Mühendislik Fakültesi, Elektrik-Elektronik Mühendisliği Bölümü [en_US]
dc.contributor.department: Işık University, Faculty of Engineering, Department of Electrical-Electronics Engineering [en_US]
dc.contributor.authorID: 0000-0002-4597-0954
dc.identifier.startpage: 750
dc.identifier.endpage: 755
dc.peerreviewed: Yes [en_US]
dc.publicationstatus: Published [en_US]
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member [en_US]
dc.contributor.institutionauthor: Dalva, Doğan [en_US]
dc.contributor.institutionauthor: Güz, Ümit [en_US]

