Adversarial Image Detection Based on Bayesian Neural Layers

Publisher

National University of Colombia

Abstract

Although Deep Neural Networks (DNNs) have repeatedly shown excellent performance, they are known to be vulnerable to adversarial attacks, that is, inputs altered by human-imperceptible perturbations. Multiple adversarial defense strategies have been proposed to alleviate this issue, but their practicality is often limited: they tend to be inefficient or to handle only specific attacks. In this paper, we analyze the performance of Bayesian Neural Networks (BNNs) endowed with flexible approximate posterior distributions for detecting adversarial examples. Furthermore, we study how robust the detection method is when Bayesian layers are placed only at the top of the DNN or throughout it, in order to determine the role of the network's hidden layers, and we compare the results with those of deterministic networks. We show that BNNs offer a powerful and practical method of detecting adversarial examples in comparison with deterministic approaches. Finally, we discuss the impact of having well-calibrated models as detectors and how non-Gaussian priors enhance detection performance.
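
The abstract does not state which statistic the detector thresholds, so the following is only an illustrative sketch of one common way uncertainty from a Bayesian network can be used for detection. It assumes that the BNN has already produced softmax outputs from several stochastic forward passes (weights sampled from the approximate posterior), and it scores each input by predictive entropy and mutual information. The function names `uncertainty_scores` and `flag_adversarial`, the array layout, and the threshold value are hypothetical and are not taken from the paper.

```python
import numpy as np

def uncertainty_scores(probs, eps=1e-12):
    """Score inputs from Monte Carlo softmax samples.

    probs: array of shape (T, N, C) holding T stochastic forward passes
    for N inputs over C classes (assumed layout, not from the paper).
    Returns (predictive_entropy, mutual_information), each of shape (N,).
    """
    probs = np.asarray(probs, dtype=float)
    mean_probs = probs.mean(axis=0)  # predictive distribution, shape (N, C)
    # Entropy of the averaged prediction: total uncertainty.
    predictive_entropy = -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)
    # Average entropy of the individual samples: data uncertainty.
    expected_entropy = -np.sum(probs * np.log(probs + eps), axis=-1).mean(axis=0)
    # The difference measures disagreement between posterior samples.
    mutual_information = predictive_entropy - expected_entropy
    return predictive_entropy, mutual_information

def flag_adversarial(probs, threshold):
    """Flag inputs whose predictive entropy exceeds a chosen threshold."""
    entropy, _ = uncertainty_scores(probs)
    return entropy > threshold

# Synthetic illustration: input 0 gets the same confident prediction in every
# pass, input 1 gets confident but mutually contradictory predictions.
consistent = np.tile([0.98, 0.01, 0.01], (5, 1))
contradictory = np.array([
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
])
mc_probs = np.stack([consistent, contradictory], axis=1)  # shape (5, 2, 3)
flags = flag_adversarial(mc_probs, threshold=0.5)  # 0.5 is an arbitrary choice
```

In practice the threshold would be tuned on held-out clean and adversarial data. A deterministic network yields a single forward pass, so its mutual information is zero by construction and only the entropy score remains available.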
