Searching for the optimal ordering of classes in rule induction

Ata, Sezin

dc.contributor.advisor	Yıldız, Olcay Taner	en_US
dc.contributor.author	Ata, Sezin	en_US
dc.contributor.other	Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı	en_US
dc.date.accessioned	2016-05-31T07:13:08Z
dc.date.available	2016-05-31T07:13:08Z
dc.date.issued	2012-09-19
dc.identifier.citation	Ata, S. (2012). Searching for the optimal ordering of classes in rule induction. İstanbul: Işık Üniversitesi Fen Bilimleri Enstitüsü.	en_US
dc.identifier.uri	https://hdl.handle.net/11729/884
dc.description	Text in English ; Abstract: English and Turkish	en_US
dc.description	Includes bibliographical references (leaves 47-50)	en_US
dc.description	x, 51 leaves	en_US
dc.description.abstract	In this thesis, we work on rule induction algorithms, primarily Ripper. These algorithms solve a K>2 class problem by transforming it into a sequence of K-1 two-class problems. As a heuristic, these algorithms learn classes in order of increasing prior probability. Although the heuristic works well in practice, there is much room for improvement. We propose two algorithms for that purpose. The first algorithm, Forward Ordering Search (FOS), starts with the ordering given by the heuristic and searches for better orderings by swapping consecutive classes. For a dataset with K classes, the ordering space contains K! orderings. Since FOS is an example of Steepest Ascent Hill Climbing (Gradient Search), starting from the heuristic ordering is only guaranteed to reach a local maximum in the search space. To improve performance, we use 10 random initial orderings, as in Random-Restart (Steepest Ascent) Hill Climbing. The best result among the 10 random initial orderings is the result of Random-Restart FOS. The second algorithm, Pairwise Error Approximation (PEA), transforms the ordering search problem into an optimization problem and uses the solution of the optimization algorithm to extract the optimal ordering. In this algorithm, the number of random orderings used to construct the optimization problem is a parameter, and we try several values of this parameter to see its effect on performance. We compare our algorithms with the original Ripper on 13 datasets from the UCI repository [1]. Experimental results show that, in general, our algorithms produce rule sets that are significantly better than those produced by Ripper proper, and that the numbers of rules and conditions in the produced rule sets are comparable with those of Ripper proper. Even though the accuracy of Random-Restart FOS is better than that of FOS, the time complexity of the algorithm is far worse than that of FOS.
The average error estimation results of PEA support the consistency of our pairwise assumption and show the relationship between accuracy and the number of random orderings used to extract the optimal ordering.	en_US
dc.description.abstract	In this thesis, we work on the CN2 and Ripper rule induction algorithms. What these algorithms have in common is that they classify datasets with K>2 classes by converting the task into K-1 two-class problems. As a heuristic, these algorithms learn the classes in order of increasing prior probability. In our work, we investigate how the performance of rule induction algorithms changes depending on the class ordering. For this purpose, we propose two algorithms. The first algorithm, FOS (Forward Ordering Search), starts with the ordering given by the heuristic. It iteratively compares orderings formed by swapping adjacent classes, as long as better performance is obtained. Because this search is an example of Steepest Ascent Hill Climbing, it finds only a local optimum in the full search space. For datasets with K>8 classes, the full search space is larger than 8!. Therefore, to improve performance, we run FOS from 10 different random initial orderings, as in Random-Restart Hill Climbing. The best of these results determines the result of Random-Restart FOS. The second algorithm we propose, Pairwise Error Approximation (PEA), converts the ordering search problem into an optimization problem by using the class pairs of the orderings. We use the solution of this problem to find the optimal ordering. We generate random orderings as the parameters of the optimization problem and, using various numbers of random orderings, observe the effect of the number of orderings on performance. We compare the results of our algorithms with the Ripper rule induction algorithm on 13 datasets. The results show that, in general, the orderings we find produce rule sets that are better in terms of performance and complexity.
We also observe that, although the performance of Random-Restart FOS is better than that of FOS, the complexity of the algorithm is many times higher than that of FOS. Finally, the average estimation error results we compute for PEA support the consistency of the assumption that led us to construct the algorithm and show the relationship between accuracy and the number of random orderings.	en_US
dc.description.tableofcontents	Introduction	en_US
dc.description.tableofcontents	Rule Induction Algorithms	en_US
dc.description.tableofcontents	Search Direction	en_US
dc.description.tableofcontents	Top-down	en_US
dc.description.tableofcontents	Bottom-up	en_US
dc.description.tableofcontents	Bi-directional	en_US
dc.description.tableofcontents	Search Strategy	en_US
dc.description.tableofcontents	Hill-Climbing	en_US
dc.description.tableofcontents	Beam search	en_US
dc.description.tableofcontents	Best-First	en_US
dc.description.tableofcontents	Stochastic	en_US
dc.description.tableofcontents	Pruning	en_US
dc.description.tableofcontents	Pre-pruning	en_US
dc.description.tableofcontents	Post-pruning	en_US
dc.description.tableofcontents	Survey	en_US
dc.description.tableofcontents	Ripper	en_US
dc.description.tableofcontents	Proposed Algorithms	en_US
dc.description.tableofcontents	Motivation	en_US
dc.description.tableofcontents	Forward Ordering Search (FOS)	en_US
dc.description.tableofcontents	Pairwise Error Approximation (PEA)	en_US
dc.description.tableofcontents	Theory	en_US
dc.description.tableofcontents	Algorithm	en_US
dc.description.tableofcontents	Experiments	en_US
dc.description.tableofcontents	Setup	en_US
dc.description.tableofcontents	Motivation	en_US
dc.description.tableofcontents	FOS Results	en_US
dc.description.tableofcontents	PEA Results	en_US
dc.description.tableofcontents	Conclusion	en_US
dc.language.iso	eng	en_US
dc.publisher	Işık Üniversitesi	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject.lcc	QA76.9.A43 A83 2012
dc.subject.lcsh	Computer algorithms.	en_US
dc.subject.lcsh	Data structures (Computer science)	en_US
dc.title	Searching for the optimal ordering of classes in rule induction	en_US
dc.title.alternative	Kural çıkarımda optimal sınıf sıralamasını arama	en_US
dc.type	masterThesis	en_US
dc.contributor.department	Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı	en_US
dc.relation.publicationcategory	Tez	en_US
dc.contributor.institutionauthor	Ata, Sezin	en_US
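
The FOS and Random-Restart FOS procedures described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the names `forward_ordering_search`, `random_restart_fos`, `toy_error`, and `TARGET` are hypothetical, and `toy_error` is a stand-in objective. In the thesis setting the error of an ordering would presumably come from training and evaluating Ripper with that class ordering, which is assumed here and not reproduced.

```python
import random
from typing import Callable, Sequence, Tuple

Ordering = Tuple[int, ...]


def forward_ordering_search(
    initial: Sequence[int], error: Callable[[Ordering], float]
) -> Tuple[Ordering, float]:
    """Steepest-ascent hill climbing over class orderings.

    Neighbours are the K-1 orderings obtained by swapping two consecutive
    classes. The search moves to the best neighbour while it lowers the error.
    """
    current = tuple(initial)
    current_err = error(current)
    while True:
        best, best_err = current, current_err
        for i in range(len(current) - 1):
            cand = list(current)
            cand[i], cand[i + 1] = cand[i + 1], cand[i]
            cand_t = tuple(cand)
            cand_err = error(cand_t)
            if cand_err < best_err:
                best, best_err = cand_t, cand_err
        if best == current:  # no neighbour improves: local optimum reached
            return current, current_err
        current, current_err = best, best_err


def random_restart_fos(
    classes: Sequence[int],
    error: Callable[[Ordering], float],
    restarts: int = 10,
    seed: int = 0,
) -> Tuple[Ordering, float]:
    """Run the search from several random initial orderings and keep the best."""
    rng = random.Random(seed)
    results = []
    for _ in range(restarts):
        start = list(classes)
        rng.shuffle(start)
        results.append(forward_ordering_search(start, error))
    return min(results, key=lambda r: r[1])


# Hypothetical stand-in objective: number of class pairs placed in a different
# relative order than in TARGET. A real objective would evaluate a rule learner.
TARGET: Ordering = (2, 0, 3, 1)


def toy_error(ordering: Ordering) -> float:
    rank = {c: i for i, c in enumerate(TARGET)}
    r = [rank[c] for c in ordering]
    return float(
        sum(1 for i in range(len(r)) for j in range(i + 1, len(r)) if r[i] > r[j])
    )


if __name__ == "__main__":
    print(forward_ordering_search((0, 1, 2, 3), toy_error))
    print(random_restart_fos(range(4), toy_error))
```

With this toy objective every local optimum is also global, so restarts add nothing; they matter only when the objective has several local optima, which is the situation the abstract describes.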

Files in this item:

Name: 11729-884.pdf
Size: 314.1Kb
Format: PDF
Description: MasterThesis

This item appears in the following collection(s).

FBE - Tez Koleksiyonu | Bilgisayar Mühendisliği / Computer Engineering [73]
Contains the thesis collection of the Computer Engineering Master's program.

Unless otherwise stated, the license for this item is: info:eu-repo/semantics/openAccess