Journal of Chongqing University of Technology (Natural Science), 2023, Vol. 37, Issue (10): 220-228.

• Information · Computer •

A Data-Free Universal Adversarial Attack with Weighted Activation Maximization

Yang Wu (杨武), Liu Yiran (刘依然), Feng Xin (冯欣)

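  1. (College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054)
  • Publication date: 2023-11-20 Release date: 2023-11-20
  • About the authors: Yang Wu, male, Ph.D., professor, works mainly on information retrieval, Email: yangwu@cqut.edu.cn; corresponding author Ming Di (明镝), male, Ph.D., lecturer, works mainly on machine learning and optimization, Email: diming@cqut.edu.cn.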

A data-free universal adversarial attack via weighted activation maximization

  • Online: 2023-11-20 Published: 2023-11-20

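Abstract: Adversarial examples produced by adversarial attacks can alter the predictions of neural networks in image classification tasks. Because adversarial examples are hard to perceive and are transferable, meaning that the same adversarial example can mislead models with different architectures, crafting adversarial perturbations and generating adversarial examples is of great value for tasks such as detecting model defects. Data-free universal adversarial attacks proposed in recent years craft perturbations without any data, solely by maximizing the activation values of all convolutional layers. This is closer to how models are actually deployed, but it ignores the differences among the features extracted by different convolutional layers, so the resulting adversarial examples transfer poorly. This paper proposes a data-free universal attack based on weighted activation maximization, which assigns a weight to each convolutional layer and exploits the influence of each layer's activation values on the universal perturbation to improve the transferability of adversarial examples. Experiments on the ImageNet validation set show that the weighted activation maximization attack achieves good attack performance compared with other methods; ablation experiments show that a universal adversarial perturbation can learn general features from shallow convolutional layers and therefore transfers better.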

Keywords: image classification, adversarial attack, weighted activation maximization, transferability

Abstract: Adversarial examples generated by adversarial attacks can seriously affect the predictions of convolutional neural networks in image classification tasks. Because adversarial examples are hard to detect and are transferable (a single adversarial example can mislead models with different architectures), crafting adversarial perturbations and generating adversarial examples is important for detecting model defects. Existing data-free universal adversarial attacks craft perturbations without any data by maximizing the activation values of all convolutional layers, which suits real-world applications; however, the resulting adversarial examples transfer poorly because the differences among the features extracted by different convolutional layers are rarely considered. This paper proposes a data-free universal adversarial attack with Weighted Activation Maximization (WAM), which assigns a weight to each convolutional layer and increases the weight of the activation values from shallow convolutional layers, which extract generalized features. Experiments on the ImageNet validation set show that the WAM attack performs better than other data-free universal methods. Additionally, ablation experiments verify that the universal adversarial perturbation can learn generic features from shallow convolutional layers and achieve better transferability.
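
The abstract does not state the objective function, so the following is only a minimal sketch of what a weighted activation-maximization objective might look like. It assumes the loss is the negative weighted sum of the log L2 norms of each convolutional layer's activations, computed by feeding the perturbation alone to the model, and that the perturbation is kept within an L-infinity budget. The function names, the layer weights, and the toy activation values are all hypothetical and are not taken from the paper.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of activation values."""
    return math.sqrt(sum(v * v for v in values))


def weighted_activation_loss(layer_activations, layer_weights, tiny=1e-12):
    """Negative weighted sum of log activation norms (assumed loss form).

    layer_activations: one flat list of activations per convolutional layer,
        obtained by feeding only the perturbation to the model (no data).
    layer_weights: one non-negative weight per layer (hypothetical values).
    With positive weights, larger activation norms give a lower loss.
    """
    if len(layer_activations) != len(layer_weights):
        raise ValueError("need exactly one weight per layer")
    return -sum(
        w * math.log(l2_norm(acts) + tiny)
        for acts, w in zip(layer_activations, layer_weights)
    )


def project_linf(delta, eps):
    """Clip every perturbation entry to [-eps, eps] (L-infinity budget)."""
    return [max(-eps, min(eps, d)) for d in delta]


# Toy activations for three layers, ordered shallow to deep (made-up numbers).
activations = [[3.0, 4.0], [1.0, 0.0], [0.5, 0.5]]
uniform = weighted_activation_loss(activations, [1.0, 1.0, 1.0])
shallow_heavy = weighted_activation_loss(activations, [2.0, 1.0, 0.5])
clipped = project_linf([0.5, -0.5, 0.01], eps=0.1)
```

With all weights equal to 1, this sketch reduces to an unweighted objective that treats every layer alike. The second weight vector emphasizes the shallow layer, in the spirit of the abstract's claim that shallow layers carry more general features. In a real attack the activations would come from a trained network and the perturbation would be updated by gradient steps, each followed by the projection.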

CLC number:

  • TP391.41