A malware classification method based on directed API call relationships.

Journal: PLoS ONE

Abstract

In response to the growing complexity of network threats, researchers are increasingly turning to machine learning and deep learning to build malware detection models. Many existing methods that classify malware from Application Programming Interface (API) call sequences overlook the structural information inherent in those sequences. Approaches that do consider the structure of API calls typically rely on the Graph Convolutional Network (GCN) framework, which tends to neglect the sequential nature of API interactions. To address these limitations, we propose a malware classification method that leverages the directed relationships within API sequences. Our approach models each API sequence as a directed graph that carries node attributes, structural information, and directional relationships. To capture these features, we introduce First-order and Second-order Graph Convolutional Networks (FSGCN), which approximate the operations of a directed graph convolutional network (DGCN). The directed graph embeddings produced by the FSGCN are then transformed into grayscale images and classified with a Convolutional Neural Network (CNN). To mitigate the effects of imbalanced datasets, we apply the Synthetic Minority Over-sampling Technique (SMOTE) so that underrepresented classes receive adequate attention during training. We evaluate the method on two real-world malware datasets; the results show that it outperforms traditional and graph-based malware classification techniques.
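
The graph-construction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the FSGCN formulas, so the proximity definitions below (a symmetrized adjacency for first-order proximity, shared-predecessor and shared-successor weights for second-order proximity) are assumptions based on one common formulation for directed graphs. The API trace, the function names, and the min-max grayscale scaling are likewise hypothetical, and the graph convolution, SMOTE, and CNN stages are omitted.

```python
from collections import Counter


def build_directed_graph(api_sequence):
    """Return (nodes, adjacency), where adjacency[i][j] counts calls i -> j."""
    nodes = sorted(set(api_sequence))
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    adjacency = [[0.0] * n for _ in range(n)]
    # Each consecutive pair (a, b) in the trace becomes a directed edge a -> b.
    pairs = Counter(zip(api_sequence, api_sequence[1:]))
    for (src, dst), count in pairs.items():
        adjacency[index[src]][index[dst]] = float(count)
    return nodes, adjacency


def first_order(adjacency):
    """Assumed first-order proximity: average of A and its transpose."""
    n = len(adjacency)
    return [[(adjacency[i][j] + adjacency[j][i]) / 2 for j in range(n)]
            for i in range(n)]


def second_order_in(adjacency):
    """Assumed in-proximity: i and j are close if a node k calls both."""
    n = len(adjacency)
    out_degree = [sum(row) for row in adjacency]
    return [[sum(adjacency[k][i] * adjacency[k][j] / out_degree[k]
                 for k in range(n) if out_degree[k] > 0)
             for j in range(n)] for i in range(n)]


def second_order_out(adjacency):
    """Assumed out-proximity: i and j are close if both call a node k."""
    n = len(adjacency)
    in_degree = [sum(adjacency[v][k] for v in range(n)) for k in range(n)]
    return [[sum(adjacency[i][k] * adjacency[j][k] / in_degree[k]
                 for k in range(n) if in_degree[k] > 0)
             for j in range(n)] for i in range(n)]


def to_grayscale(matrix):
    """Min-max scale a real-valued matrix to integer pixels in [0, 255]."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant matrices
    return [[round(255 * (v - lo) / span) for v in row] for row in matrix]


# Hypothetical API trace, used only for illustration.
trace = ["CreateFileW", "ReadFile", "WriteFile", "ReadFile", "CloseHandle"]
nodes, A = build_directed_graph(trace)
F = first_order(A)
S_in = second_order_in(A)
S_out = second_order_out(A)
image = to_grayscale(F)
```

In this toy trace, `CloseHandle` and `WriteFile` never call each other, yet both are called by `ReadFile`, so they receive a nonzero entry in `S_in`. A plain symmetrized adjacency would miss that relationship, which is the kind of directional information the abstract says the method is designed to retain.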

Authors

  • Cuihua Ma
    College of Information Science and Technology, Hainan Normal University, Haikou, Hainan, China.
  • Zhenwan Li
    College of Information Science and Technology, Hainan Normal University, Haikou, Hainan, China.
  • Haixia Long
    Department of Information Science and Technology, Hainan Normal University, Haikou 571158, China. myresearch_hainnu@163.com.
  • Anas Bilal
    College of Information Science and Technology, Hainan Normal University, Haikou, China.
  • Xiaowen Liu
    School of Informatics and Computing, Indiana University-Purdue University Indianapolis, Indianapolis, Indiana 46202, United States.