19. Melody generation with transformers - Generative Music AI

  • Published Nov 16, 2024

Comments • 23

  • @tomastellechea
    @tomastellechea 4 months ago +1

    I'm waiting for part 3 of the course!! Thanks!!

  • @aniket-shirke
    @aniket-shirke 10 months ago +1

    This was a great tutorial on applying Transformers for Symbolic Music Generation! Can you please share pointers on how it can be done for raw audio directly?

  • @_NickTech
    @_NickTech 6 months ago +1

    Thanks a lot for providing this code for free! And for all the comments! Really helpful ❤

  • @mauricioalfaro9406
    @mauricioalfaro9406 11 months ago +1

    Thank you very much Valerio, fantastic work.

  • @3bnuri
    @3bnuri 3 months ago

    Amazing content, keep up the good work bro!

  • @jdavibedoya
    @jdavibedoya 9 months ago

    Once again, a wonderful video. I am always infinitely grateful to Valerio for making this knowledge more engaging and digestible. I have two questions that might be a bit naive:
    1. In the 'sinusoidal_position_encoding' function, shouldn't the sines and cosines be interleaved?
    2. I understand that during training, it would be redundant to use the same input for both the encoder and decoder. But, in this case, doesn't it create a discrepancy between training and inference? Because in the predictions at 51:05, the same input is used for both.
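
On question 1: the original Transformer paper interleaves sines and cosines across the embedding dimensions, while many implementations instead concatenate all the sines followed by all the cosines, which is what the video's 'sinusoidal_position_encoding' appears to do. The two differ only by a fixed permutation of the embedding axis, which the downstream weights can absorb, so both work in practice. A minimal NumPy sketch of the two variants (function names are illustrative, not the video's exact code):

```python
import numpy as np

def concatenated_position_encoding(num_positions, d_model):
    """Sines in the first half of the embedding, cosines in the
    second half (the concatenated style)."""
    positions = np.arange(num_positions)[:, np.newaxis]    # (P, 1)
    dims = np.arange(d_model // 2)[np.newaxis, :]          # (1, D/2)
    angles = positions / np.power(10000.0, 2 * dims / d_model)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def interleaved_position_encoding(num_positions, d_model):
    """Sin on even embedding indices, cos on odd ones, as in
    'Attention Is All You Need'."""
    positions = np.arange(num_positions)[:, np.newaxis]
    dims = np.arange(d_model // 2)[np.newaxis, :]
    angles = positions / np.power(10000.0, 2 * dims / d_model)
    encoding = np.zeros((num_positions, d_model))
    encoding[:, 0::2] = np.sin(angles)   # even columns get sines
    encoding[:, 1::2] = np.cos(angles)   # odd columns get cosines
    return encoding
```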

  • @gabrielmehra1626
    @gabrielmehra1626 10 months ago +1

    Awesome video Valerio!
    Question: I'm using this Transformer architecture with the keras.model.save() and .load_model() functions so I can quickly test the model on many different starting sequences. However, I'm running into some issues with the Transformer's call() function that we are overriding - it is preventing me from successfully calling .load_model() because it says it is getting unexpected arguments to the call() function. All I'm trying to do is essentially "pickle" (or save) this model to a file and then run it on many different starting note sequences to test it.
    Have you run into this sort of issue before? Thanks!

    • @guillaume6757
      @guillaume6757 6 months ago

      Hey, I'm encountering the same issue, did you find out how to resolve it? Thanks!
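
A common cause of that load_model() error: Keras rebuilds the model from its serialized config and invokes it through the standard call(inputs, training=None) signature, so a call() overridden with extra positional arguments (decoder input, masks, ...) no longer matches. One fix is to implement get_config() and register the custom classes for serialization; a simpler route is to skip full-model saving and persist weights only, rebuilding the architecture in code. A sketch of the weights-only route, where the Transformer class, its import path, its constructor arguments, and the call() signature are all placeholders for whatever your implementation actually uses:

```python
import tensorflow as tf
# from melodygenerator import Transformer  # hypothetical: wherever your class lives

# Rebuild the architecture in code, then restore weights only. This
# sidesteps full-model deserialization entirely. All arguments below
# are placeholders matching whatever your Transformer actually takes.
model = Transformer(num_layers=2, d_model=64, num_heads=2,
                    d_feedforward=128, input_vocab_size=130,
                    target_vocab_size=130, dropout_rate=0.1)

# One forward pass on dummy data so every variable is created before
# the checkpoint is written or restored.
dummy = tf.zeros((1, 10), dtype=tf.int64)
_ = model(dummy, dummy, training=False)  # match your own call() signature

model.save_weights("transformer_weights.h5")   # after training
model.load_weights("transformer_weights.h5")   # in a fresh session
```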

  • @cavacoparis
    @cavacoparis 11 months ago +1

    Hi Valerio, thanks a lot for the awesome content!
    A question about the train step function (around 1:00:00): could the target_input sequence be the same as the encoder_input one, and only the target_real sequence be shifted by 1 token?
    I mean, to follow your example, I would do encoder_input = target_input = [1, 2, 3, 4] and target_real = [2, 3, 4, 5]? It would be more consistent with the inference phase, in which both the encoder and decoder inputs are the same. Thanks again!

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  11 months ago +2

      It would be redundant. If you go for that approach, it may be best to just use a decoder-only architecture. We use the encoder to condition the generation on something different from the target input. In the simple case of the video, this coincides with the shifted version of the sequence.
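
To make the shift discussed in this thread concrete, here is a minimal teacher-forcing train step. Padding/look-ahead masks are omitted for brevity, and the model's call signature is an assumption, not the video's exact code:

```python
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(model, encoder_input, target):
    # Teacher forcing: the decoder is fed the target shifted right and
    # must predict the target shifted left, one token ahead.
    target_input = target[:, :-1]   # e.g. [1, 2, 3, 4]
    target_real = target[:, 1:]     # e.g. [2, 3, 4, 5]
    with tf.GradientTape() as tape:
        # Assumed signature: model(encoder_input, decoder_input, training)
        logits = model(encoder_input, target_input, training=True)
        loss = loss_fn(target_real, logits)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss
```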

  • @wisdom1223
    @wisdom1223 11 months ago

    Awesome video!

  • @bilzebug
    @bilzebug 4 months ago

    I believe you need to use Python 3.10 to get version 2.13 of TensorFlow; otherwise, the Keras text preprocessor is deprecated.

  • @SuperMaDBrothers
    @SuperMaDBrothers 6 months ago +1

    Why not use a decoder-only stack like GPT?

  • @mitswadas
    @mitswadas 8 months ago

    Hey @Valerio, how can we play the music generated by this model? At the end, the output is just notation.

  • @michelesantoro4731
    @michelesantoro4731 11 months ago

    Thank you Valerio for this incredible course! I found it super inspiring. Just one question: why didn't you cover GANs? Did you simply not have time, or do you believe they are not as good as the methods you presented for music?

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  11 months ago

      Time constraints are definitely one reason. Also, they are significantly less capable than transformers for music generation, and way harder to train ;)

  • @wisdom1223
    @wisdom1223 11 months ago

    Again, thank you Valerio for this innovative work, but what if the dataset is a MIDI file?

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  11 months ago

      If the dataset is in MIDI format, you'll have to map it onto a textual encoding.
      If you mean you only have 1 MIDI file, I'm afraid you can't do much :D
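
A sketch of that MIDI-to-text mapping using music21, in the spirit of the time-series encoding used earlier in the course; the symbol set (MIDI pitch numbers for onsets, '_' for held steps, 'r' for rests) is illustrative, not a standard:

```python
import music21 as m21

def midi_to_text(path, step=0.25):
    """Map a monophonic MIDI file onto a textual encoding with one
    symbol per sixteenth-note time step (step=0.25 quarter notes)."""
    song = m21.converter.parse(path)
    symbols = []
    for event in song.flatten().notesAndRests:
        if isinstance(event, m21.note.Note):
            symbol = str(event.pitch.midi)   # e.g. "60" for middle C
        else:
            symbol = "r"                     # treat anything else as a rest
        num_steps = max(1, int(event.duration.quarterLength / step))
        symbols.append(symbol)
        symbols.extend(["_"] * (num_steps - 1))  # hold for remaining steps
    return " ".join(symbols)
```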

  • @musescore1983
    @musescore1983 11 months ago

    Hi Valerio, where can I listen to some music you have generated / created with your tools? I am curious.

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  11 months ago +1

      The last things I've done in this space are not public-facing, unfortunately. Some of it has been for music tech companies; other stuff is still in the making ;)

  • @m.a.7768
    @m.a.7768 8 months ago

    Good for understanding the theory, and that's it!!!... Don't bother if you want to grasp the coding part through hands-on experience. You won't be able to hear the output...
    Fantastic "tutorial!" without practical feedback.

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  8 months ago

      ?

    • @9sirgato9
      @9sirgato9 7 months ago +1

      If you ACTUALLY followed the tutorial, you end up with a text sequence that is easily transformed into MIDI. In my case, I just transformed the text sequence to MIDI using Music21, so I can hear the output in any DAW.
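
For anyone looking for the concrete decoding step described above, here is a music21 sketch that inverts that kind of textual encoding back into a MIDI file, assuming the space-separated pitch / '_' / 'r' symbols sketched earlier; all names are illustrative:

```python
import music21 as m21

def text_to_midi(encoded, out_path="melody.mid", step=0.25):
    """Rebuild notes and rests from the symbol sequence and write a
    MIDI file that any DAW can open."""
    stream = m21.stream.Stream()
    symbols = encoded.split()
    i = 0
    while i < len(symbols):
        # Count how many '_' symbols extend the current event.
        length = 1
        while i + length < len(symbols) and symbols[i + length] == "_":
            length += 1
        quarter_length = length * step
        if symbols[i] == "r":
            stream.append(m21.note.Rest(quarterLength=quarter_length))
        else:
            # music21 accepts an integer pitch as a MIDI note number.
            stream.append(m21.note.Note(int(symbols[i]),
                                        quarterLength=quarter_length))
        i += length
    stream.write("midi", fp=out_path)

# Usage: text_to_midi("60 _ _ _ 62 _ r _ 64 _ _ _")
```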