Deepfake Voice Detection for Underrepresented Languages: A Romanian Case Study
Authors
Marco Olescu
NTT Data Romania
Robert-Alin Bălășoiu
Ioana-Daniela Ivașcu
Cantemir Mihu
Robert-Andrei Muntean
Sorana-Ioana Resiga
Abstract
The rapid advancement of artificial intelligence and its application to deepfake media present a significant global security concern. Although the problem is widespread, effective detection solutions are notably absent for less commonly used languages. This paper proposes a potential solution for identifying generated audio in Romanian. The solution centres on an SVM-based algorithm, previously shown to perform effectively in English-language tests, adapted using a dataset tailored specifically to Romanian. The resulting language-specific model differentiates authentic from synthetic Romanian audio better than general-purpose, Anglocentric systems. The approach therefore offers an effective strategy for developing localised detection solutions for a language with limited resources.