Public and State transportation aircraft are fitted with two crash-survivable flight recorders, also known as “black boxes”: the Cockpit Voice Recorder (CVR) and the Flight Data Recorder (FDR). Both need to be retrieved and analyzed by air accident authorities in the event of an incident or accident. The audio service of the BEA (Bureau d’Enquêtes et d’Analyses pour la sécurité de l’aviation civile) and RESEDA are the French authorities in charge of CVR investigations, for civil and State aircraft, respectively. CVR contents are “manually” transcribed by specialized investigators (a.k.a. audio analysts) for the benefit of the safety investigation.
In a CVR recording, the causes of speech intelligibility degradation are numerous. In particular, the CVR design itself generates a significant amount of superimposed (a.k.a. mixed) speech signals over the simultaneously recorded audio channels. Moreover, in the event of an aircraft accident or incident, superimposed speech signals are more likely to occur, since voice and cockpit sound activities become denser, which may lead to the loss of crucial information for the safety investigators. In our recent work [1], we reverse-engineered the CVR audio mixing model (see Fig. 2) and found that state-of-the-art blind source separation (BSS) algorithms could be applied. BSS is a generic problem that aims to estimate unknown source signals from observed ones when the propagation channels from the sources to the sensors are also unknown [2]. We noticed that classical BSS algorithms could help the audio analyst transcribe a CVR recording. In particular, allowing the audio analyst to listen to the outputs of different methods significantly helped them in their task. However, there remained some cases where these classical techniques were not helpful.
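As an informal illustration of this generic setting, the short Python sketch below simulates an instantaneous linear mixture of two synthetic sources and separates it with FastICA, a classical independent component analysis algorithm [2]. The synthetic sources, the mixing matrix, and the use of scikit-learn are illustrative assumptions: this toy model is not the CVR mixing model studied in [1].

    import numpy as np
    from sklearn.decomposition import FastICA

    # Two synthetic sources (crude stand-ins for speech), 2 s at 8 kHz.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 2.0, 16000)
    s1 = np.sin(2 * np.pi * 5 * t)            # slow sinusoidal source
    s2 = np.sign(np.sin(2 * np.pi * 13 * t))  # square-wave source
    S = np.c_[s1, s2] + 0.05 * rng.standard_normal((t.size, 2))

    # Hypothetical mixing matrix (unknown in a real BSS problem): each
    # recorded channel observes a different mixture of the sources.
    A = np.array([[1.0, 0.6],
                  [0.4, 1.0]])
    X = S @ A.T  # observed channels, shape (n_samples, n_channels)

    # Blind separation: recover the sources from X alone, without knowing A.
    ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
    S_hat = ica.fit_transform(X)  # estimated sources
    A_hat = ica.mixing_           # estimated mixing matrix

As with any BSS method, the sources are only recovered up to a permutation and a scaling factor, an indeterminacy that is harmless for a listening task such as CVR transcription.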
The objective of this Ph.D. thesis is two-fold.
This Ph.D. thesis is funded within the “BLeRIOT” ANR ASTRID project (Jan. 2025 – Dec. 2027). The BLeRIOT consortium is a balanced group of research laboratories, located in Toulouse (IRIT) and Longuenesse (LISIC), and of the French authorities in charge of aircraft accident and incident investigations (BEA and RESEDA, both located near Paris). The selected Ph.D. student is expected to start his/her Ph.D. thesis at the same time as the project, in early 2025.
The Ph.D. thesis will take place at the new LISIC site in Longuenesse. This site currently hosts 7 permanent researchers, 1 postdoctoral researcher, and 5 Ph.D. students (many of them working on BSS). LISIC is located in the heart of the “Caps et Marais d’Opale” Regional Natural Park, close to Lille, England, Belgium, and Northern Europe. Its premises are next to a student residence and close to all amenities. The recruited Ph.D. student will work in close collaboration with all the BLeRIOT partners, in particular with the BEA and RESEDA audio analysts. The monthly gross salary is 2.2 k€. The position also includes health insurance and retirement contributions.
Recently graduated, or about to graduate, in data science (signal and image processing, computer science with a focus on artificial intelligence / machine learning, applied mathematics), you are curious and very comfortable with programming (Matlab, Python). You read and speak English fluently. You also have the communication skills to explain your work to non-experts in your field, e.g., during project meetings. Although not compulsory, speaking French, as well as a first experience in low-rank approximation (e.g., matrix or tensor decomposition, blind source separation, dictionary learning), will be appreciated. Applicants must be French, or citizens of a Member State of the European Union, of a State forming part of the European Economic Area, or of the Swiss Confederation.
To apply, please send an e-mail to [gilles.delmaire, matthieu.puigt] [at] univ-littoral.fr, attaching the documents supporting your application.
Applications will be reviewed on a rolling basis until the position is filled.
[1] Matthieu Puigt, Benjamin Bigot, and Hélène Devulder. Introducing the “cockpit party problem”: Blind source separation enhances aircraft cockpit speech transcription. Journal of the Audio Engineering Society, to appear.
[2] Pierre Comon and Christian Jutten, editors. Handbook of Blind Source Separation: Independent Component Analysis and Applications. Elsevier, 2010.
[3] DeLiang Wang and Jitong Chen. Supervised speech separation based on deep learning: An overview. IEEE/ACM Trans. Audio, Speech, Language Process., 26(10):1702–1726, Oct. 2018.
[4] Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, and Tara Sainath. Deep learning for audio signal processing. IEEE J. Sel. Topics Signal Process., 13(2):206–219, May 2019.
[5] Romain Couillet, Denis Trystram, and Thierry Ménissier. The submerged part of the AI-ceberg. IEEE Signal Process. Mag., 39(5):10–17, 2022.
[6] Clément Dorffer, Matthieu Puigt, Gilles Delmaire, and Gilles Roussel. Informed nonnegative matrix factorization methods for mobile sensor network calibration. IEEE Trans. Signal Inf. Process. Netw., 4(4):667–682, 2018.
[7] Gilles Delmaire, Mahmoud Omidvar, Matthieu Puigt, Frédéric Ledoux, Abdelhakim Limem, Gilles Roussel, and Dominique Courcot. Informed weighted non-negative matrix factorization using αβ-divergence applied to source apportionment. Entropy, 21(3):253, 2019.
[8] Sarah Roual, Claude Sensiau, and Gilles Chardon. Informed source separation for turbofan broadband noise using non-negative matrix factorization. In Forum Acusticum 2023, 2023.
[9] Laetitia Loncan, Luis B. De Almeida, José M. Bioucas-Dias, Xavier Briottet, Jocelyn Chanussot, Nicolas Dobigeon, Sophie Fabre, Wenzhi Liao, Giorgio A. Licciardi, Miguel Simões, et al. Hyperspectral pansharpening: A review. IEEE Geosci. Remote Sens. Mag., 3(3):27–46, 2015.