Journal of Chongqing University of Technology (Natural Science) ›› 2024, Vol. 38 ›› Issue (2): 161-169.

• Information and Computer Science •

Research on a Few-Shot Image Classification Algorithm Based on Improved DPGN

王玲,孙莹,王鹏,白燕娥   

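  1. School of Computer Science and Technology, Changchun University of Science and Technology
  • Publication date: 2024-03-22 Release date: 2024-03-22
  • About the author: 王玲 (Wang Ling), female, Ph.D., lecturer; research interests: machine vision and image processing. E-mail: wangling0912@cust.edu.cn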

Research on image classification algorithm with few-shot based on improved DPGN

  • Online: 2024-03-22 Published: 2024-03-22

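Summary: DPGN (distribution propagation graph network) is a deep-learning-based few-shot image classification algorithm that can classify images when data are sparse, but its classification accuracy still needs improvement. Taking DPGN as the object of study, this paper proposes the SFOD_DPGN (SinAM_FRN_layer_ODConv_DM&EMD_distribution propagation graph network) algorithm. An attention mechanism is integrated into the residual blocks of the backbone network ResNet12; the combination of batch normalization and the ReLU activation function in ResNet12 is replaced by filter response normalization combined with the thresholded linear unit activation function; in the classifier module, omni-dimensional dynamic convolution replaces ordinary convolution; and the Mahalanobis distance and Earth Mover's Distance replace the L2 distance metric. Experiments on the CUB-200-2011 dataset show that, on the 5way-1shot and 5way-5shot classification tasks, SFOD_DPGN improves accuracy over DPGN by about 7.97% and 2.66%, respectively.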

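Keywords: deep learning, few-shot image classification, attention mechanism, omni-dimensional dynamic convolution, Mahalanobis distance, Earth Mover's Distance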
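The two backbone changes (attention inside the residual blocks, and FRN with TLU in place of batch normalization with ReLU) can be sketched in NumPy as follows. This is a minimal illustration based on the commonly published formulations of SimAM and FRN/TLU, not the paper's implementation: the function names `simam` and `frn_tlu`, the `(C, H, W)` layout, and the default values of `e_lambda` and `eps` are assumptions, and where these operations sit inside each ResNet12 residual block is not shown.

```python
import numpy as np

def simam(x, e_lambda=1e-4):
    """Parameter-free SimAM attention for one feature map of shape (C, H, W).

    Assumes H * W > 1. e_lambda is a small regularizer (assumed default).
    """
    n = x.shape[1] * x.shape[2] - 1
    # Squared deviation of every position from its channel mean.
    d = (x - x.mean(axis=(1, 2), keepdims=True)) ** 2
    # Per-channel variance estimate over the spatial positions.
    var = d.sum(axis=(1, 2), keepdims=True) / n
    # Inverse energy: positions that stand out from their channel score higher.
    e_inv = d / (4.0 * (var + e_lambda)) + 0.5
    # A sigmoid turns the energy into a 3-D weight, one per (channel, row, col).
    return x * (1.0 / (1.0 + np.exp(-e_inv)))

def frn_tlu(x, gamma, beta, tau, eps=1e-6):
    """Filter response normalization followed by a thresholded linear unit.

    x has shape (C, H, W); gamma, beta, tau have shape (C, 1, 1) and would be
    learned per channel in a real network.
    """
    # Mean of squares over the spatial extent; no mean subtraction, no batch axis.
    nu2 = (x ** 2).mean(axis=(1, 2), keepdims=True)
    x_hat = x / np.sqrt(nu2 + eps)
    # TLU: max(y, tau) with a learned threshold instead of ReLU's fixed zero.
    return np.maximum(gamma * x_hat + beta, tau)

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))
ones, zeros = np.ones((4, 1, 1)), np.zeros((4, 1, 1))
out = frn_tlu(simam(feat), ones, zeros, np.full((4, 1, 1), -0.5))
print(out.shape)
```

Because FRN normalizes each channel of each image independently, its statistics do not depend on how many images are in the training batch, which is the property the abstract relies on.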

Abstract:

The distribution propagation graph network (DPGN) is a few-shot image classification algorithm based on deep learning. However, DPGN completely ignores semantic information, which is important for fine-grained classification, and its classification accuracy suffers as a result. This paper proposes a new few-shot learning algorithm based on DPGN: the SinAM-FRN_layer-ODConv-DM&EMD_Distribution Propagation Graph Network (SFOD_DPGN).

First, to address the limited ability of DPGN's feature extraction module to extract image features, the SimAM attention mechanism is integrated into four residual blocks of the feature extraction network ResNet12. SimAM generates three-dimensional weights for feature maps across both the spatial and channel dimensions and applies these weights to the feature maps, enabling the improved ResNet12 to learn richer image features.

Second, because the normalization method of ResNet12 is affected by the number of images selected in training, the combination of batch normalization and the ReLU activation function in the main path of each residual block is replaced by filter response normalization (FRN) combined with the thresholded linear unit (TLU) activation function. Because FRN does not subtract the mean, its activations can carry an arbitrary bias far from zero, and pairing FRN with ReLU lets this bias harm training. This paper therefore applies TLU after FRN. With this change, the SFOD_DPGN algorithm improves classification accuracy while maintaining inference speed.

Then, the classifier module of DPGN is optimized. To address its poor classification performance, omni-dimensional dynamic convolution (ODConv) replaces the ordinary convolution in the classifier module. ODConv uses a linear combination of n convolutional kernels and a parallel strategy to introduce multidimensional attention for dynamic weighting, making the convolution operation dependent on the input. This improves the robustness of the SFOD_DPGN algorithm.

Finally, DPGN uses the L2 distance in the classifier module, which easily causes errors when calculating the distance between samples. Given the characteristics of the distance measures, the Mahalanobis distance (MD) is suitable for calculating the distance between samples (point graphs), whereas the Earth Mover's Distance (EMD) is more suitable for calculating the distance between distribution graphs. This paper replaces L2 with MD and EMD to improve the classifier's ability to measure the distance between samples, which improves the classification accuracy of the SFOD_DPGN algorithm.
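To make the contrast with L2 concrete, the following NumPy sketch computes a Mahalanobis distance between two feature vectors and a one-dimensional EMD between two histograms. It is an illustration under stated assumptions, not the paper's classifier: the function names `mahalanobis` and `emd_1d`, the `ridge` term, and the restriction of EMD to histograms on shared, unit-spaced bins are all choices made here, and the abstract does not say how the covariance is estimated or how EMD is applied to the distribution graph.

```python
import numpy as np

def mahalanobis(u, v, cov, ridge=0.0):
    """Mahalanobis distance between vectors u and v under covariance cov.

    ridge (assumed helper) adds a small diagonal term so that a covariance
    estimated from very few samples stays invertible.
    """
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    cov = np.asarray(cov, dtype=float) + ridge * np.eye(diff.size)
    # Solve cov @ z = diff rather than forming the explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

def emd_1d(p, q):
    """Earth Mover's Distance between two histograms on the same unit-spaced bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    # In 1-D, EMD equals the summed absolute difference of the two CDFs.
    return float(np.abs(np.cumsum(p - q)).sum())

a, near, far = [1, 0, 0], [0, 1, 0], [0, 0, 1]
# L2 cannot tell "near" from "far": both differ from a in exactly two bins.
print(np.linalg.norm(np.subtract(a, near)), np.linalg.norm(np.subtract(a, far)))
# EMD grows with how far the mass has to move.
print(emd_1d(a, near), emd_1d(a, far))
# With an identity covariance, Mahalanobis reduces to L2.
print(mahalanobis([1, 2], [4, 6], np.eye(2)))
```

The histogram example shows why a transport-based distance can suit distribution-valued nodes: L2 treats every mismatched bin alike, while EMD accounts for how far apart the mismatched bins are.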

Experiments on the CUB-200-2011 dataset show that the SFOD_DPGN algorithm outperforms the DPGN algorithm on the 5way-1shot and 5way-5shot classification tasks, improving accuracy by 7.97% and 2.66%, respectively. Ablation experiments are also performed for each part to verify the effect of the improved ResNet12 and the classifier module. Compared with the DPGN algorithm, integrating the SimAM attention mechanism into ResNet12 improves accuracy by 2.77% and 1.16% on the 5way-1shot and 5way-5shot tasks, respectively. After the normalization method and activation function of ResNet12 are also improved, accuracy is 5.00% and 2.04% higher, respectively. After the ordinary convolution is further replaced with ODConv, accuracy is up by 7.25% and 2.42%, respectively. These results demonstrate that every improvement contributes to the classification accuracy of the SFOD_DPGN algorithm.
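The reported gains can be broken down per component with simple subtraction. The sketch below assumes the ablation stages are cumulative (each stage keeps the previous changes, as the wording "further" suggests) and attributes the remainder between the ODConv stage and the full SFOD_DPGN result to the MD and EMD replacement; both are readings of the abstract, not statements from it. The names `gains`, `steps`, and `incremental` are illustrative.

```python
# Cumulative accuracy gains over DPGN, in the percentages reported above.
gains = {
    "5way-1shot": [2.77, 5.00, 7.25, 7.97],
    "5way-5shot": [1.16, 2.04, 2.42, 2.66],
}
# Stage labels; the last one is inferred as "full model minus ODConv stage".
steps = ["SimAM", "FRN+TLU", "ODConv", "MD&EMD"]

def incremental(cumulative):
    """Turn cumulative gains into the gain added at each stage."""
    out, prev = [], 0.0
    for g in cumulative:
        out.append(round(g - prev, 2))
        prev = g
    return out

for task, cum in gains.items():
    print(task, dict(zip(steps, incremental(cum))))
```

Under these assumptions, the backbone changes and ODConv account for most of the 5way-1shot gain, and every stage adds less on 5way-5shot than on 5way-1shot.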

CLC number:

  • TP391