LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs
Journal:
arXiv
Published Date:
May 30, 2025
Abstract
Large Language Models (LLMs) are being used extensively for cybersecurity
purposes, one of which is the detection of vulnerable code. Compression and
fine-tuning techniques are being developed to improve efficiency and
effectiveness, respectively. However, both require substantial
computational effort. In this vein, we analyse how Linear Probes (LPs) can be
used to estimate the performance of a compressed LLM at an
early phase -- before fine-tuning. We also show their suitability for setting the
cut-off point when applying layer-pruning compression. Our approach, dubbed
$LPASS$, is applied to BERT and Gemma for the detection of 12 of MITRE's Top 25
most dangerous vulnerabilities on 480k C/C++ samples. LPs can be computed in
142.97 s and yield two key findings: (1) 33.3\% and 72.2\% of the layers of
BERT and Gemma, respectively, can be removed with no loss of precision; (2) LPs
provide an early estimate of the post-fine-tuning and post-compression model
effectiveness, with 3\% and 8.68\% as the lowest and average precision errors,
respectively.
$LPASS$-based LLMs outperform the state of the art, reaching 86.9\% accuracy
in multi-class vulnerability detection. Interestingly, $LPASS$-based compressed
versions of Gemma outperform the original model by up to 1.6\% in F1-score
while saving 29.4\% and 23.8\% of training and inference time, respectively,
and 42.98\% of model size.
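
The idea of probing each layer and using the probe scores to choose a layer-pruning cut-off can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the hidden states are synthetic Gaussian features rather than activations extracted from BERT or Gemma, the probe is a plain logistic regression trained by gradient descent, the task is binary rather than multi-class, and the cut-off rule (keep the earliest layer whose probe precision is within a tolerance of the best) is a hypothetical stand-in for whatever criterion $LPASS$ actually uses. The names `train_probe`, `probe_precision`, `pick_cutoff` and `synthetic_layer` are likewise hypothetical.

```python
import math
import random


def train_probe(X, y, epochs=100, lr=0.5):
    """Fit a logistic-regression probe (linear weights + bias) by batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_precision(w, b, X, y):
    """Precision of the probe on held-out data: TP / (TP + FP)."""
    tp = fp = 0
    for x, label in zip(X, y):
        if sum(wi * xi for wi, xi in zip(w, x)) + b > 0:
            if label == 1:
                tp += 1
            else:
                fp += 1
    return tp / (tp + fp) if tp + fp else 0.0


def pick_cutoff(scores, tolerance=0.01):
    """Assumed rule: earliest layer (1-based) whose score is within `tolerance` of the best."""
    best = max(scores)
    for layer, score in enumerate(scores, start=1):
        if score >= best - tolerance:
            return layer


def synthetic_layer(separation, rng, n=200, dim=4):
    """Hypothetical stand-in for one layer's hidden states: two Gaussian classes."""
    y = [rng.randint(0, 1) for _ in range(n)]
    X = [[rng.gauss(separation * (2 * label - 1), 1.0) for _ in range(dim)] for label in y]
    return X, y


rng = random.Random(0)
separations = [0.0, 0.3, 0.8, 3.0, 3.0, 3.0]  # made-up: deeper layers separate classes better
scores = []
for sep in separations:
    X, y = synthetic_layer(sep, rng)
    w, b = train_probe(X[:100], y[:100])  # fit the probe on the first half
    scores.append(probe_precision(w, b, X[100:], y[100:]))  # score it on the second half

cutoff = pick_cutoff(scores)
removed = (len(scores) - cutoff) / len(scores)
print(f"probe precision per layer: {[round(s, 3) for s in scores]}")
print(f"keep layers 1..{cutoff}, prune {removed:.1%} of layers")
```

In a real setting, `synthetic_layer` would be replaced by the hidden states that each layer of the LLM produces for the code samples. The sketch reflects why the probes are cheap: they are small linear models trained on frozen activations, so no fine-tuning of the LLM is needed to obtain the per-layer scores.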