Journal of Chongqing University of Technology (Natural Science) ›› 2023, Vol. 37 ›› Issue (12): 267-275.

• Intelligent Technology •

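Research on isolated word sign language recognition algorithm based on SlowFast network

HUANG Tongyuan, TAN Yu, ZHU Jinjiang

  1. Liangjiang College of Artificial Intelligence, Chongqing University of Technology
  • Online: 2024-02-04 Published: 2024-02-04
  • About the author: HUANG Tongyuan, male, associate professor; his research focuses on intelligent information processing and machine learning. E-mail: tyroneh@cqut.edu.cn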

Keywords: isolated word sign language recognition, attention mechanism, SlowFast, keyframe extraction

Abstract: Because of motion blur, information redundancy, and the diversity of individual signing styles, current isolated word sign language recognition methods remain limited in recognition accuracy, robustness to background interference, and recognition speed. To address these issues, a novel sign language recognition method based on the SlowFast network and enhanced hand attention (EHA-SlowFast) is proposed. First, the method employs YOLOv5 and DeepSort to detect and track the hands, thereby increasing the model’s focus on hand information. Second, the Focal loss function is adopted in the backbone network to improve the model’s classification ability. Third, the SlowFast network structure is improved and a channel-spatial attention mechanism is introduced to increase the weight of hand information and suppress interference from background noise. In addition, a keyframe extraction algorithm is proposed that greatly improves efficiency at the cost of some accuracy. Experimental results show that EHA-SlowFast achieves a Top-5 accuracy of 97.79% on the DEVISIGN-D dataset, outperforming other state-of-the-art sign language recognition algorithms.
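The abstract names the Focal loss but does not state its form or hyperparameters. As a minimal sketch, assuming the standard multi-class definition FL(p_t) = -α (1 - p_t)^γ log(p_t), where p_t is the softmax probability of the true class, the loss can be written in plain Python as follows. The function names and the default values `gamma=2.0` and `alpha=1.0` are illustrative assumptions, not settings taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample (assumed standard form, not the paper's code).

    logits: raw class scores for a single clip
    target: index of the true sign class
    gamma:  focusing parameter; larger values down-weight easy samples more
    alpha:  global weighting factor (hypothetical default)
    """
    p_t = softmax(logits)[target]
    p_t = max(p_t, 1e-12)  # guard against log(0) after underflow
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, i.e. the focal loss with gamma = 0 and alpha = 1."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)


if __name__ == "__main__":
    easy = [5.0, 0.0, 0.0]  # true class 0 is predicted confidently
    hard = [0.0, 5.0, 0.0]  # true class 0 is predicted poorly
    print("easy sample:", focal_loss(easy, 0), "vs CE", cross_entropy(easy, 0))
    print("hard sample:", focal_loss(hard, 0), "vs CE", cross_entropy(hard, 0))
```

With `gamma=0.0` the modulating factor equals 1 and the function reduces to ordinary cross-entropy. Raising `gamma` shrinks the contribution of confidently classified samples, so training concentrates on the hard ones.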

CLC Number:

  • TP183