A multilayer annotated corpus for Turkish
Yıldız, Olcay Taner
In this paper, we present the first multilayer annotated corpus for Turkish, which is a low-resourced agglutinative language. Our dataset consists of 9,600 sentences translated from the Penn Treebank Corpus. Annotated layers contain syntactic and semantic information including morphological disambiguation of words, named entity annotation, shallow parse, sense annotation, and semantic role label annotation.
Showing items related by title, author, creator and subject.
Akçakaya, Sinan; Yıldız, Olcay Taner (IEEE, 2018-01-01) This paper reports our efforts in constructing a sense-labeled Turkish corpus with respect to Turkish Language Institution's dictionary, using the traditional method of manual tagging. We tagged a pre-built parallel ...
Ak, Koray (Işık Üniversitesi, 2019-04-02) PropBank is a bank of propositions that contains a hand-annotated corpus of predicate-argument information and semantic roles or arguments. It aims to provide an extensive dataset for enhancing NLP applications such as ...
Ak, Koray; Toprak, Cansu; Esgel, Volkan; Yıldız, Olcay Taner (Tubitak Scientific & Technical Research Council, 2018) This paper describes our approach to developing the Turkish PropBank by adopting the semantic role-labeling guidelines of the original PropBank and using the translation of the English Penn-TreeBank as a resource. We discuss ...