Gamifying AI evaluation: How well do chatbots perform in cybersecurity challenges?

GHITA, Alexandru-Andrei; RUGHINIȘ, Răzvan; ȚURCANU, Dinu

dc.contributor.author	GHITA, Alexandru-Andrei
dc.contributor.author	RUGHINIȘ, Răzvan
dc.contributor.author	ȚURCANU, Dinu
dc.date.accessioned	2026-02-14T15:12:03Z
dc.date.available	2026-02-14T15:12:03Z
dc.date.issued	2025
dc.identifier.citation	GHITA, Alexandru-Andrei; Răzvan RUGHINIȘ and Dinu ȚURCANU. Gamifying AI evaluation: How well do chatbots perform in cybersecurity challenges? In: 25th International Conference on Control Systems and Computer Science, CSCS 2025, Bucharest, Romania, 27-30 May, 2025. Politehnica Bucharest. Institute of Electrical and Electronics Engineers, 2025, pp. 620-625. ISBN 979-8-3315-7344-7, eISBN 979-8-3315-7343-0, ISSN 2379-0474, eISSN 2379-0482.	en_US
dc.identifier.isbn	979-8-3315-7344-7
dc.identifier.isbn	979-8-3315-7343-0
dc.identifier.issn	2379-0474
dc.identifier.issn	2379-0482
dc.identifier.uri	https://doi.org/10.1109/CSCS66924.2025.00097
dc.identifier.uri	https://repository.utm.md/handle/5014/35198
dc.description	Access full text: https://doi.org/10.1109/CSCS66924.2025.00097	en_US
dc.description.abstract	As general-purpose AI chatbots become more advanced, their potential as assistants in cybersecurity learning is gaining attention. This paper evaluates the effectiveness of the most popular chatbots in solving Capture The Flag (CTF) challenges, a widely used gamified approach to developing security skills. From cryptography and reverse engineering to web exploitation and forensics, we analyze how well these AI models navigate real-world CTF problems compared to human participants. Through a student-centered lens, we highlight the strengths, limitations, and occasional hilarity of AI-assisted problem-solving. Can AI crack the code, or will it get stuck in a loop of confusion? Our findings provide insights into the evolving role of general-purpose AI chatbots in cybersecurity education.	en_US
dc.language.iso	en	en_US
dc.publisher	Institute of Electrical and Electronics Engineers	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	chatbot	en_US
dc.subject	cybersecurity	en_US
dc.subject	evaluation	en_US
dc.title	Gamifying AI evaluation: How well do chatbots perform in cybersecurity challenges?	en_US
dc.type	Article	en_US