Unreasonable effectiveness of last hidden layer activations for adversarial robustness
dc.authorid | 0000-0002-6214-6262 | |
dc.authorid | 0000-0003-0298-0690 | |
dc.contributor.author | Tuna, Ömer Faruk | en_US |
dc.contributor.author | Çatak, Ferhat Özgür | en_US |
dc.contributor.author | Eskil, Mustafa Taner | en_US |
dc.date.accessioned | 2022-10-26T18:25:01Z | |
dc.date.available | 2022-10-26T18:25:01Z | |
dc.date.issued | 2022 | |
dc.department | Işık Üniversitesi, Mühendislik ve Doğa Bilimleri Fakültesi, Bilgisayar Mühendisliği Bölümü | en_US |
dc.department | Işık University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering | en_US |
dc.description.abstract | In standard Deep Neural Network (DNN) based classifiers, the general convention is to omit the activation function in the last (output) layer and directly apply the softmax function to the logits to obtain the probability scores of each class. In this type of architecture, the loss value of the classifier against any output class is directly proportional to the difference between the final probability score and the label value of the associated class. Standard white-box adversarial evasion attacks, whether targeted or untargeted, mainly try to exploit the gradient of the model's loss function to craft adversarial samples and fool the model. In this study, we show both mathematically and experimentally that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases, preventing attackers from exploiting the model's loss function to craft adversarial samples. We experimentally verified the efficacy of our approach on the MNIST (Digit) and CIFAR10 datasets. Detailed experiments confirmed that our approach substantially improves robustness against gradient-based targeted and untargeted attack threats. We also showed that the increased non-linearity at the output layer has some additional benefits against other attack methods such as the DeepFool attack. | en_US |
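The abstract describes the defense only at a high level; below is a minimal, hypothetical PyTorch sketch of the idea it states: a bounded activation (here tanh) scaled by a high temperature T is applied at the output layer, so once the scaled logits sit in the saturated regime the activation's derivative is near zero and gradient-based attackers obtain almost no input gradient. The architecture, temperature value, and names are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class TemperedOutputNet(nn.Module):
    """Toy MNIST-style classifier (hypothetical sketch) that applies a
    bounded activation (tanh) scaled by a high temperature T to the
    logits before the usual softmax cross-entropy. Once T*z is in
    tanh's saturated region, d(tanh(T*z))/dz = T * sech^2(T*z) is close
    to 0, so the loss gradient w.r.t. the input vanishes and
    gradient-based attacks (e.g. FGSM, PGD) get no useful signal."""

    def __init__(self, in_features=784, hidden=128, n_classes=10, temperature=100.0):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )
        self.temperature = temperature

    def forward(self, x):
        z = self.backbone(x)
        # Bounded, high-temperature activation at the output layer;
        # softmax is applied inside nn.CrossEntropyLoss during training.
        return torch.tanh(self.temperature * z)


if __name__ == "__main__":
    model = TemperedOutputNet()
    x = torch.rand(1, 1, 28, 28, requires_grad=True)  # dummy MNIST-sized input
    y = torch.tensor([3])                             # arbitrary target label
    loss = nn.CrossEntropyLoss()(model(x), y)
    loss.backward()
    # With tanh saturated, the input gradient an attacker would use for
    # FGSM (sign of x.grad) is typically near zero everywhere.
    print(loss.item(), x.grad.abs().max().item())
```

Running the sketch typically prints an input-gradient magnitude close to zero, which illustrates the abstract's claim that the attacker's loss-gradient signal is suppressed; it is a demonstration of the mechanism under the stated assumptions, not the paper's exact experimental setup.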
dc.description.version | Publisher's Version | en_US |
dc.identifier.citation | Tuna, Ö. F., Çatak, F. Ö. & Eskil, M. T. (2022). Unreasonable effectiveness of last hidden layer activations for adversarial robustness. Paper presented at the 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 1098-1103. doi:10.1109/COMPSAC54236.2022.00172 | en_US |
dc.identifier.doi | 10.1109/COMPSAC54236.2022.00172 | |
dc.identifier.endpage | 1103 | |
dc.identifier.isbn | 9781665488105 | |
dc.identifier.isbn | 9781665488112 | |
dc.identifier.issn | 0730-3157 | |
dc.identifier.scopus | 2-s2.0-85136991056 | |
dc.identifier.scopusquality | N/A | |
dc.identifier.startpage | 1098 | |
dc.identifier.uri | https://hdl.handle.net/11729/5093 | |
dc.identifier.uri | http://dx.doi.org/10.1109/COMPSAC54236.2022.00172 | |
dc.identifier.wos | WOS:000855983300164 | |
dc.identifier.wosquality | N/A | |
dc.indekslendigikaynak | Web of Science | en_US |
dc.indekslendigikaynak | Scopus | en_US |
dc.indekslendigikaynak | Conference Proceedings Citation Index – Science (CPCI-S) | en_US |
dc.institutionauthor | Tuna, Ömer Faruk | en_US |
dc.institutionauthor | Eskil, Mustafa Taner | en_US |
dc.institutionauthorid | 0000-0002-6214-6262 | |
dc.institutionauthorid | 0000-0003-0298-0690 | |
dc.language.iso | en | en_US |
dc.peerreviewed | Yes | en_US |
dc.publicationstatus | Published | en_US |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
dc.relation.ispartof | 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC) | en_US |
dc.relation.publicationcategory | Conference Object - International - Institutional Academic Staff | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Adversarial machine learning | en_US |
dc.subject | Deep neural networks | en_US |
dc.subject | Robustness | en_US |
dc.subject | Trustworthy AI | en_US |
dc.subject | Chemical activation | en_US |
dc.subject | Multilayer neural networks | en_US |
dc.subject | Activation functions | en_US |
dc.subject | Hidden layers | en_US |
dc.subject | Loss functions | en_US |
dc.subject | Machine-learning | en_US |
dc.subject | Network-based | en_US |
dc.subject | Output layer | en_US |
dc.subject | White box | en_US |
dc.subject | Object detection | en_US |
dc.subject | Deep learning | en_US |
dc.subject | IOU | en_US |
dc.title | Unreasonable effectiveness of last hidden layer activations for adversarial robustness | en_US |
dc.type | Conference Object | en_US |
dspace.entity.type | Publication |
Files
Original bundle
- Name:
- Unreasonable_effectiveness_of_last_hidden_layer_activations_for_adversarial_robustness.pdf
- Size:
- 570.35 KB
- Format:
- Adobe Portable Document Format
- Description:
- Publisher's Version
License bundle
- Name:
- license.txt
- Size:
- 1.44 KB
- Format:
- Item-specific license agreed upon to submission
- Description: