Employing RAVE and vschaos2 neural audio models in larger compositions + seed conservation in arrays

  • Published Aug 28, 2024
  • In this video, I'm showcasing the patch and configuration that led to the track "Tritt Nochmal Zu" from my Saatgut Proxy/ Saatgut Proxy Reflux release in late 2023/early 2024. The patch combines vschaos2 and RAVE model decoder layers with predefined seeds for mocking latent embeddings, plus a randomized harmonic pad and a minimal kick drum synthesizer.
    The models were trained on a dataset selected from my own release material.
    martsman.bandc...
    www.ninaprotoc...
    ---
    RAVE is "A variational autoencoder for fast and high-quality neural audio synthesis" created by Antoine Caillon and Philippe Esling of Artificial Creative Intelligence and Data Science (ACIDS) at IRCAM, Paris.
    vschaos2 is a vintage-flavoured neural audio synthesis package by Axel Chemla Romeu Santos. It is based on unsupervised/(semi-)supervised training of spectral information using variational auto-encoders.
    RAVE on GitHub: github.com/aci...
    nn~ on GitHub: github.com/aci...
    vschaos2 on GitHub: github.com/aci...
    To train models on Colab or Kaggle, you can use these Jupyter notebooks I've set up: github.com/dev...
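The seed conservation mentioned in the title, reusing predefined seeds to mock latent embeddings for the decoder, can be sketched outside the patch as follows. This is a minimal illustration, not the actual Max/nn~ setup: the latent dimensionality, trajectory length, and smoothing are hypothetical choices, since the real shapes depend on the trained RAVE/vschaos2 model.

```python
import numpy as np

def mock_latents(seed: int, latent_dim: int = 8, steps: int = 64) -> np.ndarray:
    """Generate a reproducible latent trajectory from a fixed seed.

    Conserving the seed 'conserves' the gesture: the decoder receives
    an identical latent sequence on every run of the patch.
    """
    rng = np.random.default_rng(seed)
    raw = rng.standard_normal((steps, latent_dim))
    # Moving-average smoothing so the latent path drifts rather than
    # jumps, which tends to sound less erratic through a neural decoder.
    kernel = np.ones(8) / 8.0
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, raw
    )

trajectory = mock_latents(seed=23)  # same seed -> same trajectory, every run
```

In a patch like the one shown, each of the `latent_dim` columns would be streamed into one latent input of the decoder; changing the seed swaps in a different but equally repeatable gesture.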

Comments • 7

  • @fermiLiquidDrinker 6 months ago +2

    This sounds fucking sick

  • @alchemist.D 6 months ago +1

    nice patch 😊

  • @federicoinzerillo5333 6 months ago +1

    I tried Colab, but its limit on compute units is too low, and the prices seem a bit prohibitive to me. I tried Kaggle instead, but I can't find a way to export the model. Is there a short tutorial on this?

    • @martsm_n 6 months ago +1

      My Kaggle notebook comes with instructions on training and export - maybe that helps for starters? github.com/devstermarts/Notebooks/blob/main/RAVE_Training_Template--Kaggle.ipynb
      I'll look into creating some kind of walkthrough video at some point.

    • @federicoinzerillo5333 6 months ago

      @@martsm_n thank you for this and for all the knowledge you're spreading :)

    • @federicoinzerillo5333 6 months ago

      @@martsm_n Little update: I managed to run, resume and export the training properly. However, I have some questions about the training parameters. What's the best way to achieve the minimum possible latency with the best resolution? In your experience, how do the different architectures, regularizations and augmentations influence the final result? And what are the "--override" parameters shown in the notebook you shared? Are there only CAPACITY and PHASE_1_DURATION?

    • @martsm_n 6 months ago +1

      @@federicoinzerillo5333 I can recommend joining the RAVE Discord channel for discussing this kind of detail.
      The short (and potentially unsatisfying) answer is: it all depends. Personally, I find V1 the most robust for my interests and use case. I also achieve satisfying results with training lengths well below the recommended number of training steps and with datasets that would be considered small (1-2 h).
      You can override potentially every setting defined in the .gin configuration files in /configs.
      Hope that helps.
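The override mechanism discussed above maps "KEY=value" pairs from the command line onto settings that the .gin configuration files define as defaults. A minimal sketch of that idea, as plain Python rather than RAVE's actual gin-based implementation (the function name and config dict are my own illustration):

```python
def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Apply 'KEY=value' override strings on top of config-file defaults.

    Any key present in the config can be rebound, mirroring how RAVE
    lets you override gin parameters such as CAPACITY or
    PHASE_1_DURATION without editing the files in /configs.
    """
    out = dict(config)
    for item in overrides:
        key, _, value = item.partition("=")
        if key not in out:
            raise KeyError(f"unknown setting: {key}")
        # Coerce the string to the type of the default value.
        out[key] = type(out[key])(value)
    return out

# Hypothetical defaults standing in for values read from a .gin file:
defaults = {"CAPACITY": 64, "PHASE_1_DURATION": 200000}
cfg = apply_overrides(defaults, ["CAPACITY=32", "PHASE_1_DURATION=100000"])
```

So CAPACITY and PHASE_1_DURATION are just the two overrides the notebook happens to expose; any other parameter defined in the configuration files could be passed the same way.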