Unreasonable effectiveness of last hidden layer activations for adversarial robustness

Date

2022

Author

Tuna, Ömer Faruk
Çatak, Ferhat Özgür
Eskil, Mustafa Taner

Citation

Tuna, Ö. F., Çatak, F. Ö. & Eskil, M. T. (2022). Unreasonable effectiveness of last hidden layer activations for adversarial robustness. Paper presented at the 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 1098-1103. doi:10.1109/COMPSAC54236.2022.00172

Abstract

In standard Deep Neural Network (DNN) based classifiers, the general convention is to omit the activation function in the last (output) layer and apply the softmax function directly to the logits to obtain the probability score of each class. In this type of architecture, the loss value of the classifier against any output class is directly proportional to the difference between the final probability score and the label value of the associated class. Standard white-box adversarial evasion attacks, whether targeted or untargeted, mainly try to exploit the gradient of the model loss function to craft adversarial samples and fool the model. In this study, we show both mathematically and experimentally that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases, preventing attackers from exploiting the model's loss function to craft adversarial samples. We experimentally verified the efficacy of our approach on the MNIST (Digit) and CIFAR10 datasets. Detailed experiments confirmed that our approach substantially improves robustness against gradient-based targeted and untargeted attack threats. We also showed that the increased non-linearity at the output layer has additional benefits against some other attack methods, such as the DeepFool attack.
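
The gradient-zeroing effect described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The choice of tanh as the output activation, the way the temperature T enters (as a multiplier, tanh(T * z)), and the example logits are all assumptions made for illustration only.

```python
import numpy as np

def softmax(v):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(v - np.max(v))
    return e / e.sum()

def grad_plain(z, y):
    # Conventional setup: softmax applied directly to the logits z.
    # The cross-entropy gradient w.r.t. z is (probability - label).
    return softmax(z) - y

def grad_with_activation(z, y, temperature):
    # Assumed variant: softmax applied to a = tanh(temperature * z).
    # Chain rule: dL/dz = (p - y) * temperature * (1 - a^2).
    a = np.tanh(temperature * z)
    p = softmax(a)
    return (p - y) * temperature * (1.0 - a ** 2)

# Hypothetical logits and one-hot label for a 3-class problem.
z = np.array([2.0, -1.0, 0.5])
y = np.array([1.0, 0.0, 0.0])

g_plain = grad_plain(z, y)
g_low = grad_with_activation(z, y, temperature=1.0)
g_high = grad_with_activation(z, y, temperature=50.0)

print("plain softmax      :", g_plain)
print("tanh, temperature 1 :", g_low)
print("tanh, temperature 50:", g_high)
```

In this sketch the saturated tanh drives the factor (1 - a^2) toward zero faster than the temperature grows, so a gradient-based attacker receives almost no signal from the loss. Logits very close to zero would not saturate, so the example only covers the saturated case.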

Source

2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC)

URI

https://hdl.handle.net/11729/5093
http://dx.doi.org/10.1109/COMPSAC54236.2022.00172

Collections

  • Conference Paper Collection | Department of Computer Engineering [4]
  • Scopus-Indexed Conference Paper Collection [452]
  • WoS-Indexed Conference Paper Collection [353]