Unsupervised morphological analysis using tries
Citation
Ak, K. & Yıldız, O. T. (2012). Unsupervised morphological analysis using tries. Paper presented at the Computer and Information Sciences II, 69-75. doi:10.1007/978-1-4471-2155-8_8
Abstract
This article presents an unsupervised morphological analysis algorithm that segments words into roots and affixes. The algorithm relies on word occurrences in a given dataset. The target languages are English, Finnish, and Turkish, but the algorithm can segment words from any language, given a wordlist of words and their occurrences acquired from a corpus. In each iteration, the algorithm divides words with respect to their occurrences and constructs a new trie for the remaining affixes. Preliminary experimental results on the three languages show that our novel algorithm performs better than most previous algorithms.
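
The iterative idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the splitting criterion, so this sketch assumes a simple one (cut each word at its longest proper prefix that is itself a wordlist entry with at least `min_count` occurrences). The names `TrieNode`, `build_trie`, `split_word`, `segment`, `min_count`, and `max_iters` are all hypothetical.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}
        self.occurrences = 0  # occurrences of the entry ending exactly at this node


def build_trie(counts):
    """Build a trie from a mapping of string -> occurrence count."""
    root = TrieNode()
    for text, n in counts.items():
        node = root
        for ch in text:
            node = node.children.setdefault(ch, TrieNode())
        node.occurrences += n
    return root


def split_word(root, word, min_count=1):
    """Assumed heuristic: cut at the longest proper prefix that is itself
    an entry with at least min_count occurrences. Returns (head, tail)."""
    node = root
    cut = None
    for i, ch in enumerate(word[:-1], start=1):
        node = node.children.get(ch)
        if node is None:
            break
        if node.occurrences >= min_count:
            cut = i
    if cut is None:
        return word, ""
    return word[:cut], word[cut:]


def segment(word_counts, max_iters=3, min_count=1):
    """Split words, then rebuild a trie over the leftover affixes and repeat."""
    result = {w: [] for w in word_counts}
    pending = {w: w for w in word_counts}  # word -> part not yet segmented
    for _ in range(max_iters):
        if not pending:
            break
        counts = Counter()
        for w, rest in pending.items():
            counts[rest] += word_counts[w]
        trie = build_trie(counts)  # new trie for the remaining material
        next_pending = {}
        for w, rest in pending.items():
            head, tail = split_word(trie, rest, min_count)
            result[w].append(head)
            if tail:
                next_pending[w] = tail
        pending = next_pending
    for w, rest in pending.items():  # leftovers after the last iteration
        result[w].append(rest)
    return result


if __name__ == "__main__":
    toy = {"walk": 10, "walks": 4, "walked": 5, "talk": 8, "talked": 3}
    for word, parts in segment(toy).items():
        print(word, "->", "+".join(parts))
```

On a toy wordlist, the first pass separates roots such as `walk` from their endings, and the second pass operates on a trie built only from those endings. A real system would need a more careful, occurrence-based cut criterion than the prefix test assumed here.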