DGLR Publication Database - Detail View

Author(s):
S. Ternus, K.K.R. Nareddy, J. Niebling, A. Papenfuss
Abstract:
Automatic Speech Recognition (ASR) has seen significant advances in aviation, particularly in Air Traffic Control (ATC). Intra-cockpit communication between pilots, however, has remained largely unexplored despite its central role in teamwork and decision-making. This paper takes an application-oriented perspective and examines how openly available state-of-the-art ASR models perform when applied to intra-cockpit communication without any domain-specific adaptation. We evaluate OpenAI’s Whisper (Large-v3 and its turbo variant), Wav2Vec2-XLSR-53 as a base model with fine-tuned English, German and multilingual versions, and Meta’s Massively Multilingual Speech (MMS) model. Using a dataset of 409 manually transcribed speech segments collected from simulator flights, the paper classifies cockpit communication into six categories and assesses performance using Word Error Rate (WER) for each model and category. Results show that Whisper Large consistently achieves the lowest average error rates and demonstrates strong multilingual handling, though it is prone to outliers and occasional hallucinations. Wav2Vec-based models, while less accurate overall, avoid generative errors; monolingual fine-tuned models work better in language-specific contexts, and multilingual variants adapt to code-switching in some cases. The findings highlight trade-offs between consistency, multilingual capability, and computational cost, and point to the potential of domain-specific fine-tuning to improve the handling of specialized terminology. These insights provide a foundation for applying ASR to cockpit communication in both human factors research and future Human–AI Teaming (HAT) applications.
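The abstract names Word Error Rate per model and per category as the evaluation metric but does not say how it was computed. As a minimal sketch, the standard definition is WER = (substitutions + deletions + insertions) / number of reference words, obtained from a word-level edit distance. The function names, the lowercase-and-split normalization, and the choice to average per-segment scores within a category are assumptions made here for illustration, not the authors' procedure.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def mean_wer_by_category(segments):
    """Average per-segment WER for each category (hypothetical helper).

    `segments` is an iterable of (category, reference, hypothesis) tuples.
    """
    scores = defaultdict(list)
    for category, reference, hypothesis in segments:
        scores[category].append(word_error_rate(reference, hypothesis))
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}
```

Because insertions count as errors, WER can exceed 1.0 when a model emits many more words than were spoken, which is one way the hallucinations and outliers mentioned above could inflate a category average.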
Event:
Deutscher Luft- und Raumfahrtkongress 2025, Augsburg
Publisher, place:
Deutsche Gesellschaft für Luft- und Raumfahrt - Lilienthal-Oberth e.V., Bonn, 2025
Media type:
Conference Paper
Language:
English
Format:
21.0 x 29.7 cm, 10 pages
URN:
urn:nbn:de:101:1-2511071313335.889758741105
DOI:
10.25967/650258
Keywords:
Automatic Speech Recognition, Cockpit Communication, Human–AI Teaming
Availability:
Download - Please note the terms of use of this document: Copyright protected
Comment:
Citation:
Ternus, S.; Nareddy, K.K.R.; et al. (2025): Automatic Speech Recognition in the Cockpit: A Comparative Study of ASR Models for Pilot Communication. Deutsche Gesellschaft für Luft- und Raumfahrt - Lilienthal-Oberth e.V.. (Text). https://doi.org/10.25967/650258. urn:nbn:de:101:1-2511071313335.889758741105.
Published on:
7 November 2025