Announcement

10 December 2024

Classical GAS - Classical methods for Generative Audio Synthesis

Category: Intern

Signal-based audio synthesis methods such as those described in [1] and [2] have seen renewed interest since the introduction of the Differentiable Digital Signal Processing (DDSP) package [3]. The DDSP approach uses a neural architecture to produce the control parameters of a signal-based synthesis method.
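As a rough illustration of that idea, the sketch below implements a small harmonic (additive) synthesizer in NumPy that is driven entirely by frame-rate control parameters. It is not the DDSP package's API: the function names `upsample_controls` and `harmonic_synth`, the 16 kHz sample rate, and the hand-written control curves are all assumptions made for the example. In a DDSP-style model, a neural network would predict these controls instead.

```python
import numpy as np


def upsample_controls(frames, n_samples):
    """Linearly interpolate frame-rate controls up to audio rate (hypothetical helper)."""
    frames = np.asarray(frames, dtype=float)
    frame_pos = np.linspace(0.0, 1.0, num=frames.shape[0])
    sample_pos = np.linspace(0.0, 1.0, num=n_samples)
    if frames.ndim == 1:
        return np.interp(sample_pos, frame_pos, frames)
    columns = [np.interp(sample_pos, frame_pos, frames[:, k])
               for k in range(frames.shape[1])]
    return np.stack(columns, axis=1)


def harmonic_synth(f0_frames, amp_frames, harmonic_frames, n_samples, sample_rate=16000):
    """Sum of harmonics of f0, weighted by a per-frame harmonic distribution."""
    f0 = upsample_controls(f0_frames, n_samples)            # (n_samples,)
    amp = upsample_controls(amp_frames, n_samples)          # (n_samples,)
    dist = upsample_controls(harmonic_frames, n_samples)    # (n_samples, n_harmonics)

    harmonic_numbers = np.arange(1, dist.shape[1] + 1)
    freqs = f0[:, None] * harmonic_numbers[None, :]         # Hz, per sample and harmonic

    # Silence harmonics at or above Nyquist to avoid aliasing, then renormalize.
    dist = np.where(freqs < sample_rate / 2, dist, 0.0)
    dist = dist / np.maximum(dist.sum(axis=1, keepdims=True), 1e-8)

    # Integrate instantaneous frequency to obtain the phase of each harmonic.
    phases = 2.0 * np.pi * np.cumsum(freqs / sample_rate, axis=0)
    return amp * np.sum(dist * np.sin(phases), axis=1)


# Stand-in controls (assumed values): a 220 Hz note with a decaying amplitude
# and 1/k harmonic weights. A trained network would output these instead.
n_frames, n_harmonics = 50, 8
f0_frames = np.full(n_frames, 220.0)
amp_frames = np.exp(-np.linspace(0.0, 5.0, n_frames))
harmonic_frames = np.tile(1.0 / np.arange(1, n_harmonics + 1), (n_frames, 1))

audio = harmonic_synth(f0_frames, amp_frames, harmonic_frames, n_samples=16000)
```

Because every operation here is differentiable with respect to the controls, the same computation written in an automatic-differentiation framework can be trained end to end with the network that predicts them.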

Keywords

Audio synthesis, C/C++ programming, generative models, frugal AI, deep learning

Location

Annecy, France.

Context

In contrast to DDSP-style signal-based synthesis [3], diffusion models lack interpretability and intuitive control, require intensive training, and suffer from long inference times. Yet training a DDSP-inspired generative model remains challenging because of the unstable gradients produced by signal-based audio synthesis methods. In particular, the success of such training depends on (i) the signal-based synthesis method, (ii) the control neural architecture, and (iii) the training methodology.
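One commonly cited source of difficulty is that a waveform loss is highly oscillatory with respect to oscillator frequency. Whether this is the instability meant above is an assumption. The NumPy sketch below scans the mean squared error between a 440 Hz target and candidate sinusoids; the sample rate, window length, and frequency grid are arbitrary choices for illustration.

```python
import numpy as np

sample_rate = 16000
t = np.arange(1024) / sample_rate          # 64 ms analysis window
target_freq = 440.0
target = np.sin(2.0 * np.pi * target_freq * t)


def waveform_loss(freq):
    """Mean squared error between the target and a sinusoid at `freq` (Hz)."""
    return np.mean((np.sin(2.0 * np.pi * freq * t) - target) ** 2)


freqs = np.linspace(100.0, 1000.0, 901)    # candidate frequencies on a 1 Hz grid
losses = np.array([waveform_loss(f) for f in freqs])

# Finite-difference estimate of d(loss)/d(freq) along the grid.
grads = np.gradient(losses, freqs)

# Each sign change marks a local minimum or maximum of the loss curve.
sign_changes = int(np.sum(np.sign(grads[:-1]) != np.sign(grads[1:])))
best_freq = freqs[np.argmin(losses)]

print(f"global minimum at {best_freq:.1f} Hz, {sign_changes} gradient sign changes")
```

The global minimum sits at the target frequency, but the many gradient sign changes along the scan mean that gradient descent started far from 440 Hz tends to stall in a nearby local minimum. This is one reason the choice of synthesis method, loss, and training procedure matters.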

In [4], a GAN-based synthesizer (StyleWaveGAN, SWG) has shown promising results on percussion synthesis with improved expressivity in control, opening the way to instrument synthesis using DDSP-inspired models.

Project summary

During this internship, DDSP-based generative audio models will be studied. After a comprehensive study of DDSP-inspired architectures on piano and drum datasets [6,7], the student will extend the work proposed in [4] to more complex signal-based synthesizers. In particular, the scalability of the controls will be studied, with the aim of achieving a perceptual control method similar to [5]. The internship will result in the implementation of a DDSP model as a virtual instrument in a Digital Audio Workstation (DAW) or on an embedded system.

Candidate profile

The candidate should be enrolled in an M2 or engineering degree program in one or more of the following fields: signal and image processing, computer science, embedded systems. The candidate should have strong programming abilities as well as good written and oral communication skills. A strong interest in, or experience with, audio signal processing will be appreciated.

Environment

The position can start at any time from February 2025, for a duration of up to 6 months. The candidate will be based in Annecy. The internship will be hosted in the LISTIC laboratory, with regular meetings and exchanges with researchers from the project.

Contact

Antoine Lavault (antoine.lavault@univ-smb.fr) - LISTIC, Annecy
Yassine Mhiri (yassine.mhiri@univ-smb.fr) - LISTIC, Annecy

Application procedure

Send a detailed CV and a cover letter to antoine.lavault@univ-smb.fr and yassine.mhiri@univ-smb.fr

References

[1] Serra, Xavier. “Musical Sound Modeling with Sinusoids plus Noise.” (1997).

[2] Chowning, John. “The Synthesis of Complex Audio Spectra by Means of Frequency Modulation.” Journal of The Audio Engineering Society 21 (1973): 526-534.

[3] Jesse Engel, Lamtharn (Hanoi) Hantrakul, Chenjie Gu, Adam Roberts. "DDSP: Differentiable Digital Signal Processing." International Conference on Learning Representations. 2020.

[4] Antoine Lavault. Generative Adversarial Networks for Synthesis and Control of Drum Sounds, Sorbonne Université, 2023.

[5] Antoine Lavault, Axel Roebel, Matthieu Voiry. StyleWaveGAN: Style-Based Synthesis of Drum Sounds with Extensive Controls Using Generative Adversarial Networks. 19th Sound and Music Computing Conference, 2022.

[6] Gillet, Olivier and Gaël Richard. “ENST-Drums: an extensive audio-visual database for drum signals processing.” International Society for Music Information Retrieval Conference (2006).

[7] Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, and Douglas Eck. "Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset." International Conference on Learning Representations, 2019.

[8] Renault, Lenny et al. “DDSP-Piano: A Neural Sound Synthesizer Informed by Instrument Knowledge.” Journal of the Audio Engineering Society (2023): n. pag.
