Abstract:
Cyber Threat Intelligence (CTI) reports are valuable sources of information for understanding adversarial behaviors and malware functionality. However, their lack of consistency and structure makes it challenging for security analysts to interpret, correlate, and apply them effectively. Structuring the data in a common format, such as the MITRE ATT&CK v17.1 framework, is crucial for integrating CTI into detection and response processes. This article assesses the extent to which Large Language Models (LLMs) - GPT (OpenAI), Claude (Anthropic), and Gemini (Google) - can extract malware descriptions from natural-language CTI reports and map them to specific MITRE ATT&CK techniques. To achieve this, a set of publicly available CTI reports that already contained verified MITRE ATT&CK technique labels was used, serving as ground truth for evaluating the outputs of each model. The performance of the LLMs was measured using standard evaluation metrics: Precision, Recall, and F1-score. While the models made mistakes, such as technique confusion and context loss, the results indicate strong potential for using LLMs in structured threat intelligence mapping. Their ability to reduce manual effort and improve consistency could address a major gap in today's cyber threat analysis workflows.
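The evaluation described above compares each model's predicted technique set against verified ground-truth labels. A minimal sketch of how Precision, Recall, and F1 can be computed per report under that setup is shown below; the technique IDs are illustrative placeholders, not values from the study's dataset.

```python
def technique_metrics(predicted: set[str], ground_truth: set[str]) -> tuple[float, float, float]:
    """Set-based Precision, Recall, and F1 for ATT&CK technique extraction."""
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: analyst-verified labels vs. an LLM's output.
ground_truth = {"T1059", "T1566", "T1027"}
predicted = {"T1059", "T1566", "T1105"}

p, r, f1 = technique_metrics(predicted, ground_truth)
```

Scores can then be averaged across all reports to obtain the per-model figures the article reports.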