Journal of Chongqing University of Technology (Natural Science) ›› 2024, Vol. 38 ›› Issue (1): 169-179.

• Information and Computer Science •

Multimodal hateful meme recognition combined with a graph convolutional network

Liu Xudong (刘旭东), Yang Liang (杨亮), Zhang Dongyu (张冬瑜), Lin Hongfei (林鸿飞)

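  1. School of Computer Science and Technology, Dalian University of Technology; School of Software, Dalian University of Technology
  • Publication date: 2024-02-07 Release date: 2024-02-07
  • About the authors: Liu Xudong, male, master's student, working mainly on natural language processing and multimodal sentiment analysis, Email: liuxd1997@mail.dlut.edu.cn; corresponding author Lin Hongfei, male, professor, working mainly on natural language processing, sentiment analysis, and opinion mining, Email: hflin@dlut.edu.cn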

Research on multimodal hate meme recognition based on graph convolutional network

  • Online: 2024-02-07 Published: 2024-02-07

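Summary: Existing meme recognition methods often overlook the role of web entities. To address this, a meme recognition method combined with a graph convolutional network is proposed. Web entity information is extracted from the image, and a graph convolutional network fuses the web entity modality with the text modality; an external lexicon is used to measure the relationship between web entities and the meme text from multiple perspectives in order to build a cross-domain graph. An attention module lets the text and image modalities interact, and self-distillation improves the model's use of information. Experimental results show that the method reaches accuracies of 76.03% and 73.9% on the Hateful Memes and MAMI datasets, respectively, outperforming existing SOTA models.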

Keywords: meme recognition, web entity recognition, implicit sentiment analysis, graph convolutional network

Abstract: Memes combine images and text and are often used to spread hate speech and rumors among Internet users. They frequently draw on web entities such as popular figures, events, or historical figures to express hateful sentiment. These implicit emotional expressions merit academic attention, yet most existing meme recognition methods ignore web entities. To address this problem, this paper proposes a meme recognition method based on a graph convolutional network. Specifically, the web entity information contained in the image is first extracted. The web entity modality and the text modality are then fused by a graph convolutional network; when building the cross-domain graph, an external dictionary is used to measure the relationship between the web entities and the meme text from multiple perspectives. Next, the text and image modalities interact through an attention module. Finally, self-distillation is employed to improve the model's use of information. Experiments on the Hateful Memes and MAMI datasets yield accuracies of 76.03% and 73.9%, respectively, outperforming existing SOTA models.
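
The graph-convolution fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the layer, so the sketch assumes the widely used symmetric normalization D^{-1/2}(A + I)D^{-1/2} with non-negative edge weights, and the toy graph, features, and weights below are hypothetical.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Degrees are positive because of the self-loops (weights assumed >= 0).
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weight)
    return [[max(0.0, v) for v in row] for row in projected]

# Hypothetical cross-domain graph: nodes 0 and 1 are meme-text tokens,
# node 2 is a web entity linked only to token 1. Edge weights are made up.
adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-d node features
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the output shows pure propagation
fused = gcn_layer(adj, features, weight)
```

After one layer, each text-token row of `fused` mixes in features from its graph neighbours, including the web-entity node where an edge exists. In the paper, the edge weights come from the external dictionary; here they are simply set to 1.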

CLC Number:

  • TP39