Journal article, KSII Transactions on Internet and Information Systems, published 2025.04

GIG-CAM+M: A Class Activation Mapping Method Incorporating Guided Integrated Gradients and Multi-scale Strategy

Yanfei Gao (Shanxi Finance & Taxation College, China); Xiongwei Miao (Shanxi Intelligent Big Data Industry Technology Innovation Research Institute, China); Guoye Zhang (Shanxi Provincial Digital Government Service Center, China)

Vol. 19, No. 4, pp. 1122-1139

Abstract

The interpretability of convolutional neural networks has garnered widespread attention, with class activation mapping (CAM)-based methods emerging as a prominent research direction. Integrated Grad-CAM is a widely used backpropagation-based CAM method, but its use of a linear path introduces noise during the integration process. To address this issue, we propose GIG-CAM, which replaces the linear path with an adaptive path. Unlike previous methods that require path specification, GIG-CAM dynamically determines the next input in the path based on saliency maps. Additionally, to enhance the resolution of saliency maps, we introduce a novel multi-scale fusion method, which recursively optimizes saliency maps at smaller scales using saliency maps at larger scales. This preserves the localization capability of the original-scale saliency maps while enhancing their resolution. Experimental results on the VOC2012 and ILSVRC2012 datasets demonstrate that GIG-CAM with fusion (GIG-CAM(F)) outperforms existing methods, achieving the highest scores in the Pointing Game (82.80% and 85.90% on ResNet50 for VOC2012 and ILSVRC2012, respectively) and Energy-Based Pointing Game (62.41% and 59.69%, respectively). Furthermore, GIG-CAM(F) achieves the lowest Drop% (22.59% and 17.04%) and highest Increase% (31.00% and 21.95%), validating its superior interpretability. Our results highlight the effectiveness of GIG-CAM in improving the quality and reliability of saliency maps, making it a robust solution for enhancing deep model transparency.
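
The abstract reports four evaluation metrics without defining them. The sketch below assumes the formulations commonly used in the CAM literature: Pointing Game (does the saliency maximum fall inside the ground-truth box), Energy-Based Pointing Game (share of saliency energy inside the box), Drop% (average relative loss in class confidence when the model sees only the salient region), and Increase% (share of images whose confidence rises). The function names, the inclusive `(x_min, y_min, x_max, y_max)` box format, and the use of NumPy are illustrative assumptions, not details taken from the paper, whose exact protocol may differ.

```python
import numpy as np


def pointing_game_hit(saliency, box):
    """True if the location of the saliency maximum lies inside the box.

    box is (x_min, y_min, x_max, y_max), inclusive pixel coordinates
    (an assumed format for this sketch).
    """
    y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
    x_min, y_min, x_max, y_max = box
    return bool(x_min <= x <= x_max and y_min <= y <= y_max)


def energy_pointing_game(saliency, box):
    """Fraction of the non-negative saliency energy that lies inside the box."""
    s = np.clip(saliency, 0.0, None)
    total = s.sum()
    if total == 0:
        return 0.0
    x_min, y_min, x_max, y_max = box
    return float(s[y_min:y_max + 1, x_min:x_max + 1].sum() / total)


def drop_and_increase(full_scores, masked_scores):
    """Return (Drop%, Increase%) over a batch of images.

    full_scores:   class confidence on the original images (must be > 0).
    masked_scores: class confidence on images masked by the saliency map.
    """
    full = np.asarray(full_scores, dtype=float)
    masked = np.asarray(masked_scores, dtype=float)
    drop = np.maximum(0.0, full - masked) / full   # relative drop, floored at 0
    increase = masked > full                       # confidence went up
    return float(100.0 * drop.mean()), float(100.0 * increase.mean())


# Toy example: a 4x4 saliency map with two active pixels.
saliency = np.zeros((4, 4))
saliency[1, 2] = 3.0   # inside the box
saliency[3, 0] = 1.0   # outside the box
box = (1, 0, 3, 2)

hit = pointing_game_hit(saliency, box)
energy = energy_pointing_game(saliency, box)
drop_pct, increase_pct = drop_and_increase([0.8, 0.5], [0.4, 0.6])
```

Under these definitions, higher Pointing Game, Energy-Based Pointing Game, and Increase% values are better, while a lower Drop% is better, which matches the direction of the comparisons in the abstract. Dataset-level scores are obtained by averaging over all evaluated images.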

Publisher: Korean Society for Internet Information (한국인터넷정보학회)
Classification: Computer Science
