Through the Static: Demystifying Malware Visualization via Explainability
Journal:
arXiv
Published Date:
Mar 4, 2025
Abstract
Security researchers grapple with the surge of malicious files, necessitating
swift identification and classification of malware strains for effective
protection. Visual classifiers and in particular Convolutional Neural Networks
(CNNs) have emerged as vital tools for this task. However, issues of robustness
and explainability, common in other high-risk domains such as medicine and
autonomous vehicles, remain understudied in current literature. Although deep
learning visualization classifiers presented in research obtain great results
without the need for expert feature extraction, they have not been properly
studied in terms of their replicability. Additionally, the literature is not
clear on how these types of classifiers arrive at their answers. Our study
addresses these gaps by replicating six CNN models and exploring their
pitfalls. We employ Class Activation Maps (CAMs), like GradCAM and HiResCAM, to
assess model explainability. We evaluate the CNNs' performance and
interpretability on two standard datasets, MalImg and Big2015, and a newly
created dataset called VX-Zoo. With these tools, we investigate the
underlying factors contributing to different interpretations of inputs across
the different models, empowering human researchers to discern patterns crucial
for identifying distinct malware families and explain why CNN models arrive at
their conclusions. Beyond highlighting the patterns found in the
interpretability study, we use the extracted heatmaps to enhance the
performance and explanation quality of Visual Transformer classifiers. This approach
yields substantial improvements in F1 score, ranging from 2% to 8%, across the
datasets compared to benchmark values.
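
For readers unfamiliar with the two CAM variants named above, the following is a minimal NumPy sketch of how GradCAM and HiResCAM heatmaps are commonly computed from one convolutional layer's activations and the gradients of a class score with respect to them. It is an illustration only, not the authors' implementation: the function names, the `(K, H, W)` array layout, and the final ReLU on the HiResCAM map (added here so the two maps are directly comparable) are assumptions.

```python
import numpy as np

def grad_cam(activations, gradients):
    """GradCAM sketch: weight each channel by its spatially averaged gradient.

    Both inputs are assumed to have shape (K, H, W): K feature maps of size H x W.
    """
    weights = gradients.mean(axis=(1, 2))              # one weight per channel, shape (K,)
    cam = np.tensordot(weights, activations, axes=1)   # weighted sum over channels, shape (H, W)
    return np.maximum(cam, 0.0)                        # keep only positive evidence

def hires_cam(activations, gradients):
    """HiResCAM sketch: multiply gradients and activations element-wise, then sum channels."""
    cam = (gradients * activations).sum(axis=0)        # shape (H, W)
    return np.maximum(cam, 0.0)                        # ReLU is an assumption made for comparability

# Hypothetical toy input: one channel, two spatial positions with opposite gradients.
toy_activations = np.array([[[1.0, 1.0]]])
toy_gradients = np.array([[[1.0, -1.0]]])
toy_grad_cam = grad_cam(toy_activations, toy_gradients)    # averaged gradient is 0, so the map is flat
toy_hires_cam = hires_cam(toy_activations, toy_gradients)  # the per-position gradient sign is preserved
```

The toy input shows why the two methods can highlight different regions: averaging the gradients over space, as GradCAM does, can cancel out location-specific evidence that the element-wise product in HiResCAM retains.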