Automatic Speech Recognition in the Cockpit: A Comparative Study of ASR Models for Pilot Communication

S. Ternus; K.K.R. Nareddy; J. Niebling; A. Papenfuss

doi:10.25967/650258

DGLR Publication Database - Detail View

Title:

Automatic Speech Recognition in the Cockpit: A Comparative Study of ASR Models for Pilot Communication

Author(s):

S. Ternus, K.K.R. Nareddy, J. Niebling, A. Papenfuss

Abstract:

Automatic Speech Recognition (ASR) has seen significant advances in aviation, particularly in Air Traffic Control (ATC). However, intra-cockpit communication between pilots has remained largely unexplored despite its central role in teamwork and decision-making. This paper takes an application-oriented perspective and examines how openly available state-of-the-art ASR models perform when applied to intra-cockpit communication without any domain-specific adaptation. We evaluate OpenAI’s Whisper (Large-v3 and its turbo variant), Wav2Vec2-XLSR-53 as a base model with fine-tuned English, German, and multilingual versions, and Meta’s Massively Multilingual Speech (MMS) model. Using a dataset of 409 manually transcribed speech segments collected from simulator flights, we classify cockpit communication into six categories and assess performance using Word Error Rate (WER) for each model and category. Results show that Whisper Large consistently achieves the lowest average error rates and demonstrates strong multilingual handling, though it is prone to outliers and occasional hallucinations. Wav2Vec-based models, while less accurate overall, avoid generative errors; monolingual fine-tuned models work better in language-specific contexts, and multilingual variants can adapt to code-switching in some cases. The findings highlight trade-offs between consistency, multilingual capability, and computational effort, and point to the potential of domain-specific fine-tuning to improve the handling of specialized terminology. These insights provide a foundation for applying ASR to cockpit communication in both human factors research and future Human–AI Teaming (HAT) applications.
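The abstract reports Word Error Rate per model and per communication category. As a rough illustration of that metric (not the authors' evaluation code), the sketch below computes WER as the word-level edit distance divided by the number of reference words, then averages it per (model, category) pair. The text normalization (lowercasing, whitespace splitting), the per-segment averaging, and the names `word_error_rate` and `mean_wer_by_group` are assumptions made for this example; the abstract does not specify them.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    # prev[j] holds the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def mean_wer_by_group(samples):
    """Average per-segment WER for each (model, category) pair.

    `samples` is an iterable of (model, category, reference, hypothesis)
    tuples; this layout is hypothetical, chosen only for the example.
    """
    scores = defaultdict(list)
    for model, category, reference, hypothesis in samples:
        scores[(model, category)].append(word_error_rate(reference, hypothesis))
    return {key: sum(vals) / len(vals) for key, vals in scores.items()}
```

Because insertions count as errors, a transcript with extra generated words can score above 1.0, which is one way hallucinated output can show up as outliers in per-segment WER. For example, `word_error_rate("gear down", "gear down thank you for watching")` counts four insertions against two reference words.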

Event:

Deutscher Luft- und Raumfahrtkongress 2025, Augsburg

Publisher, Place:

Deutsche Gesellschaft für Luft- und Raumfahrt - Lilienthal-Oberth e.V., Bonn, 2025

Media type:

Conference Paper

Language:

English

Format:

21.0 x 29.7 cm, 10 pages

URN:

urn:nbn:de:101:1-2511071313335.889758741105

DOI:

10.25967/650258

Keywords:

Automatic Speech Recognition, Cockpit Communication, Human–AI Teaming

Availability:

Download - Please note the terms of use for this document: Copyright protected

Citation:

Ternus, S.; Nareddy, K.K.R.; et al. (2025): Automatic Speech Recognition in the Cockpit: A Comparative Study of ASR Models for Pilot Communication. Deutsche Gesellschaft für Luft- und Raumfahrt - Lilienthal-Oberth e.V. (Text). https://doi.org/10.25967/650258. urn:nbn:de:101:1-2511071313335.889758741105.

Published on:

7 November 2025

