Unreasonable effectiveness of last hidden layer activations for adversarial robustness

Tuna, Ömer Faruk; Çatak, Ferhat Özgür; Eskil, Mustafa Taner

dc.contributor.author	Tuna, Ömer Faruk	en_US
dc.contributor.author	Çatak, Ferhat Özgür	en_US
dc.contributor.author	Eskil, Mustafa Taner	en_US
dc.date.accessioned	2022-10-26T18:25:01Z
dc.date.available	2022-10-26T18:25:01Z
dc.date.issued	2022
dc.identifier.citation	Tuna, Ö. F., Çatak, F. Ö. & Eskil, M. T. (2022). Unreasonable effectiveness of last hidden layer activations for adversarial robustness. Paper presented at the 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 1098-1103. doi:10.1109/COMPSAC54236.2022.00172	en_US
dc.identifier.isbn	9781665488105
dc.identifier.isbn	9781665488112
dc.identifier.issn	0730-3157	en_US
dc.identifier.uri	https://hdl.handle.net/11729/5093
dc.identifier.uri	http://dx.doi.org/10.1109/COMPSAC54236.2022.00172
dc.description.abstract	In standard Deep Neural Network (DNN) based classifiers, the general convention is to omit the activation function in the last (output) layer and directly apply the softmax function on the logits to get the probability scores of each class. In this type of architecture, the loss value of the classifier against any output class is directly proportional to the difference between the final probability score and the label value of the associated class. Standard white-box adversarial evasion attacks, whether targeted or untargeted, mainly try to exploit the gradient of the model loss function to craft adversarial samples and fool the model. In this study, we show both mathematically and experimentally that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases, preventing attackers from exploiting the model's loss function to craft adversarial samples. We experimentally verified the efficacy of our approach on the MNIST (Digit) and CIFAR10 datasets. Detailed experiments confirmed that our approach substantially improves robustness against gradient-based targeted and untargeted attack threats. We also showed that the increased non-linearity at the output layer has some additional benefits against other attack methods such as the DeepFool attack.	en_US
dc.language.iso	en	en_US
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	en_US
dc.relation.ispartof	2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC)	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Adversarial machine learning	en_US
dc.subject	Deep neural networks	en_US
dc.subject	Robustness	en_US
dc.subject	Trustworthy AI	en_US
dc.subject	Chemical activation	en_US
dc.subject	Multilayer neural networks	en_US
dc.subject	Activation functions	en_US
dc.subject	Hidden layers	en_US
dc.subject	Loss functions	en_US
dc.subject	Machine-learning	en_US
dc.subject	Network-based	en_US
dc.subject	Output layer	en_US
dc.subject	White box	en_US
dc.subject	Object detection	en_US
dc.subject	Deep learning	en_US
dc.subject	IOU	en_US
dc.title	Unreasonable effectiveness of last hidden layer activations for adversarial robustness	en_US
dc.type	Conference Object	en_US
dc.description.version	Publisher's Version	en_US
dc.department	Işık Üniversitesi, Mühendislik ve Doğa Bilimleri Fakültesi, Bilgisayar Mühendisliği Bölümü	en_US
dc.department	Işık University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering	en_US
dc.authorid	0000-0002-6214-6262
dc.authorid	0000-0003-0298-0690
dc.identifier.startpage	1098
dc.identifier.endpage	1103
dc.peerreviewed	Yes	en_US
dc.publicationstatus	Published	en_US
dc.relation.publicationcategory	Conference Item - International - Institutional Faculty Member	en_US
dc.institutionauthor	Tuna, Ömer Faruk	en_US
dc.institutionauthor	Eskil, Mustafa Taner	en_US
dc.indekslendigikaynak	Web of Science	en_US
dc.indekslendigikaynak	Scopus	en_US
dc.indekslendigikaynak	Conference Proceedings Citation Index- Science (CPCI-S)	en_US
dc.identifier.wosquality	N/A	en_US
dc.identifier.wos	WOS:000855983300164
dc.identifier.scopus	2-s2.0-85136991056	en_US
dc.identifier.doi	10.1109/COMPSAC54236.2022.00172
dc.identifier.scopusquality	N/A	en_US
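
The gradient-zeroing effect described in the abstract can be illustrated with a small numerical sketch. This is not the authors' code. It assumes, purely for illustration, that the output-layer activation is a sigmoid and that the "temperature" multiplies the pre-activation before the sigmoid; the record does not specify either choice. The function names (`grad_plain`, `grad_with_sigmoid`) and the example values are hypothetical.

```python
import math

def softmax(z):
    # Subtract the maximum for numerical stability.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)

def grad_plain(u, y):
    # Conventional setup: softmax applied directly to the logits u.
    # The cross-entropy gradient w.r.t. the logits is (p - y).
    p = softmax(u)
    return [pi - yi for pi, yi in zip(p, y)]

def grad_with_sigmoid(u, y, temperature):
    # Assumed setup: logits are sigmoid(temperature * u), then softmax.
    # Chain rule: dL/du_i = (p_i - y_i) * temperature * s_i * (1 - s_i).
    s = [sigmoid(temperature * v) for v in u]
    p = softmax(s)
    return [(pi - yi) * temperature * si * (1.0 - si)
            for pi, yi, si in zip(p, y, s)]

# Hypothetical pre-activations and a one-hot label for class 0.
u = [2.0, -1.0, 0.5]
y = [1.0, 0.0, 0.0]

plain_max = max(abs(g) for g in grad_plain(u, y))
saturated_max = max(abs(g) for g in grad_with_sigmoid(u, y, 50.0))
print(plain_max, saturated_max)
```

Under these assumptions, the factor `s * (1 - s)` shrinks exponentially as the temperature grows, faster than the linear `temperature` factor increases, so the gradient reaching the pre-activations (and hence the input) becomes many orders of magnitude smaller than in the plain softmax case.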

Files in this item:

Name: Unreasonable_effectiveness_of_ ...
Size: 570.3Kb
Format: PDF
Description: Publisher's Version

This item appears in the following collection(s):

Conference Papers Collection | Department of Computer Engineering [14]
Contains the conference papers collection of the Department of Computer Engineering.
Scopus Indexed Conference Papers Collection [488]
WOS Indexed Conference Papers Collection [412]
