Abstract:
Cyber Threat Intelligence (CTI) reports are valuable sources of information for understanding adversarial behaviors and malware functionality. However, their lack of consistency and structure makes it challenging for security analysts to interpret, correlate, and apply them effectively. Structuring the data in a common format, such as the MITRE ATT&CK v17.1 framework, is crucial for integrating CTI into detection and response processes. This article assesses the extent to which Large Language Models (LLMs) - GPT (OpenAI), Claude (Anthropic), and Gemini (Google) - can extract malware descriptions from natural-language CTI reports and map them to specific MITRE ATT&CK techniques. To achieve this, a set of publicly available CTI reports that already contained verified MITRE ATT&CK technique labels was used, serving as ground truth for evaluating the outputs of each model. The performance of the LLMs was measured using standard evaluation metrics: Precision, Recall, and F1-score. While the models made mistakes, such as technique confusion and context loss, the results indicate strong potential for using LLMs in structured threat intelligence mapping. Their ability to reduce manual effort and improve consistency could address a major gap in today's cyber threat analysis workflows.
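The evaluation described above compares each model's predicted technique set against verified ground-truth labels. A minimal sketch of how Precision, Recall, and F1 can be computed per report under that setup is shown below; the technique IDs are illustrative placeholders, not values from the study's dataset.

```python
def technique_metrics(predicted: set[str], ground_truth: set[str]) -> tuple[float, float, float]:
    """Set-based Precision, Recall, and F1 for ATT&CK technique extraction."""
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: analyst-verified labels vs. an LLM's output.
ground_truth = {"T1059", "T1566", "T1027"}
predicted = {"T1059", "T1566", "T1105"}

p, r, f1 = technique_metrics(predicted, ground_truth)
```

Scores can then be averaged across all reports to obtain the per-model figures the article reports.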