Evaluating LLMs for automated requirement and test case generation in railway signaling systems

OȚELEA, Ionuț-Gabriel; PINTEA, Bogdan; RUGHINIȘ, Răzvan Victor; TÎRȘU, Valentina

dc.contributor.author	OȚELEA, Ionuț-Gabriel
dc.contributor.author	PINTEA, Bogdan
dc.contributor.author	RUGHINIȘ, Răzvan Victor
dc.contributor.author	TÎRȘU, Valentina
dc.date.accessioned	2026-02-17T18:29:14Z
dc.date.available	2026-02-17T18:29:14Z
dc.date.issued	2025
dc.identifier.citation	OȚELEA, Ionuț-Gabriel; Bogdan PINTEA; Răzvan Victor RUGHINIȘ and Valentina TÎRȘU. Evaluating LLMs for automated requirement and test case generation in railway signaling systems. In: 24th RoEduNet International Conference Networking in Education and Research, Chisinau, Republic of Moldova, 17-19 September, 2025. Universitatea Politehnică din Bucureşti. IEEE, 2025, pp. 1-6. ISBN 979-8-3315-5714-0, eISBN 979-8-331-55713-3, ISSN 2068-1038, eISSN 2247-5443.	en_US
dc.identifier.isbn	979-8-3315-5714-0
dc.identifier.isbn	979-8-331-55713-3
dc.identifier.issn	2068-1038
dc.identifier.issn	2247-5443
dc.identifier.uri	https://doi.org/10.1109/RoEduNet68395.2025.11208370
dc.identifier.uri	https://repository.utm.md/handle/5014/35278
dc.description	Access full text: https://doi.org/10.1109/RoEduNet68395.2025.11208370	en_US
dc.description.abstract	Large Language Models (LLMs) have shown potential in supporting requirements engineering through automation, especially in regulated and safety-critical domains. This paper evaluates the capabilities of three well-known LLMs (GPT-4, Claude, Gemini) in transforming user requirements into structured product requirements and corresponding test cases within the context of railway signaling. A custom dataset of client requirements, inspired by realistic signaling scenarios, was developed to enable consistent evaluation across models. Each model's outputs were assessed using defined metrics, including completeness, correctness, consistency, and traceability. The comparative results highlight variations in quality and structure of the generated artifacts, with specific strengths observed for different tasks. While all three models demonstrate promise, their reliability and consistency vary, and human oversight remains essential. This study provides practical insights into the applicability of current LLMs for augmenting early-stage requirements and verification workflows in critical systems engineering.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE (Institute of Electrical and Electronics Engineers)	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	requirements engineering	en_US
dc.subject	requirements transformation	en_US
dc.title	Evaluating LLMs for automated requirement and test case generation in railway signaling systems	en_US
dc.type	Article	en_US