\"Not Aligned\" is Not \"Malicious\": Being Careful about Hallucinations of Large Language Models' Jailbreak

Publication
Proceedings of the 31st International Conference on Computational Linguistics, COLING 2025, Abu Dhabi, UAE, January 19-24, 2025