Audio processing in Python with Feature Extraction for machine learning

  • Published on Nov 2, 2024

Comments • 28

  • @ritanovitasari9653
    @ritanovitasari9653 10 months ago +1

    Hello sir, the tutorial is easy to understand because the explanation is very clear, but I ran into a problem at this section: "'ls' is not recognized as an internal or external command,
    operable program or batch file."
    Please help; why is this command not recognized? What do I need to install?
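
    Nothing extra needs to be installed for this: 'ls' is a Unix shell command that Windows' cmd.exe does not provide. In a notebook on Windows, !dir lists files instead, or the listing can be done from Python directly, as in this minimal sketch (the folder name "audio_data" is only a placeholder for whatever folder the tutorial uses):

        from pathlib import Path

        # "audio_data" is a placeholder; point this at the folder from the tutorial.
        for f in Path("audio_data").glob("*.wav"):
            print(f.name)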

  • @Minos818
    @Minos818 2 years ago +2

    👍 Electroacoustic music composer thanks you! 😀

    • @650AILab
      @650AILab  2 years ago

      Appreciate your comment, happy to hear that.

  • @drjfilix3192
    @drjfilix3192 2 years ago +2

    Thanks for your very interesting video!
    A question: if I have to align 2 tracks with different BPM (one 108 and the other 120 BPM), what can I do?
    Do I raise the first or lower the second? Or do I take them both to 119 BPM? And will the beat grid be constant for the 2 files?

    • @650AILab
      @650AILab  2 years ago +1

      I believe so. Feature extraction is based on the input content, so it's best to match the BPM per channel for feature similarity. Thanks for the comment, appreciate it very much.
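
    A minimal sketch of one way to do the tempo matching with librosa's time_stretch, using the 108 and 120 BPM figures from the question (the file name is a placeholder):

        import librosa

        # Placeholder file name for the 108 BPM track.
        y, sr = librosa.load("track_108bpm.wav")

        # rate > 1.0 speeds the audio up, rate < 1.0 slows it down,
        # so stretching by 120/108 brings the 108 BPM track up to 120 BPM.
        y_matched = librosa.effects.time_stretch(y, rate=120.0 / 108.0)

    Re-running beat tracking on the stretched signal should then give a beat grid consistent with the 120 BPM track.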

  • @tsegayebiresaw6306
    @tsegayebiresaw6306 11 months ago

    Amazing video, thank you, I learned a lot!! But I want to save the extracted features to a CSV file; how can I do it? Did you provide the source code?
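
    A minimal sketch of one way to write librosa features out to CSV with pandas (the file name and the choice of MFCC features here are only placeholders):

        import pandas as pd
        import librosa

        # Placeholder file name; compute 13 MFCCs and average them over time
        # so the whole clip becomes a single row of features.
        y, sr = librosa.load("example.wav")
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        row = {f"mfcc_{i}": v for i, v in enumerate(mfcc.mean(axis=1))}

        pd.DataFrame([row]).to_csv("features.csv", index=False)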

  • @bzaz228
    @bzaz228 1 year ago

    Which part of the tutorial do you think would be best for genre determination? I am trying to build a model/application that determines sub-genres of Electronic music, but my technical music knowledge is limited.

    • @650AILab
      @650AILab  1 year ago

      Thanks for the comment, appreciate it.
      You would have to use a combination of features to determine the genre; no single feature is deterministic for the genre of an audio fragment.
      I hope the following article can help you further in this regard:
      www.clairvoyant.ai/blog/music-genre-classification-using-cnn
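
    A minimal sketch of combining several librosa features into one vector per file, which a classifier could then be trained on (the feature set chosen here is only an example, not the video's exact recipe):

        import numpy as np
        import librosa

        def extract_feature_vector(path):
            # Average each feature over time so every file yields one fixed-length vector.
            y, sr = librosa.load(path)
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
            chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)
            centroid = librosa.feature.spectral_centroid(y=y, sr=sr).mean(axis=1)
            rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr).mean(axis=1)
            zcr = librosa.feature.zero_crossing_rate(y).mean(axis=1)
            return np.concatenate([mfcc, chroma, centroid, rolloff, zcr])

    Vectors like these (or mel-spectrogram images, as in the linked CNN article) can be fed to a classifier trained on labelled sub-genre examples.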

  • @hesamgh1515
    @hesamgh1515 1 year ago +1

    Thank you for your video. Which method is the best solution for processing a signal to recognize a human voice saying something like "hello"?

    • @650AILab
      @650AILab  1 year ago

      You would need to experiment with the mel spectrogram and other functions to get started.
      Please take a look at the following blog, where human voice is recognized and sampled from a longer audio recording:
      towardsdatascience.com/voice-classification-using-deep-learning-with-python-6eddb9580381
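
    A minimal sketch of computing the mel spectrogram mentioned above with librosa (the file name is a placeholder; speech and keyword models are typically trained on features like this):

        import numpy as np
        import librosa

        # Placeholder file name for a short speech clip.
        y, sr = librosa.load("speech_clip.wav")

        # Mel spectrogram on a decibel scale: shape is (n_mels, n_frames).
        S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
        S_db = librosa.power_to_db(S, ref=np.max)
        print(S_db.shape)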

  • @ripudamansingh1761
    @ripudamansingh1761 1 year ago +1

    Thank you sir, the video was very helpful.

    • @650AILab
      @650AILab  1 year ago

      Appreciate your comment, thank you so much.

  • @educationgist7798
    @educationgist7798 2 years ago +1

    Good Knowledge

  • @mohamadilhamfahrizisofyan5087
    @mohamadilhamfahrizisofyan5087 8 months ago

    When I try the command
    plt.figure(figsize=(14, 5))
    librosa.display.waveshow(music_array2, alpha=0.1)
    plt.vlines(beat_times, -1, 1, color='r')
    plt.ylim(-1, 1)
    it says: '_process_plot_var_args' object has no attribute 'prop_cycler'
    Can you help me?
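
    This error usually comes from pairing an older librosa with a newer matplotlib (matplotlib 3.8 removed the prop_cycler attribute that older librosa.display code relied on); upgrading librosa, or plotting the waveform directly with matplotlib as in this sketch, typically avoids it:

        import numpy as np
        import matplotlib.pyplot as plt

        # Assumes music_array2, sr and beat_times exist as in the tutorial
        # (sr is the sample rate returned by librosa.load).
        plt.figure(figsize=(14, 5))
        times = np.arange(len(music_array2)) / sr   # sample index -> seconds
        plt.plot(times, music_array2, alpha=0.1)
        plt.vlines(beat_times, -1, 1, color='r')
        plt.ylim(-1, 1)
        plt.show()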

  • @ACFilms09
    @ACFilms09 2 years ago +1

    Very good 👍👍

  • @shubhamkapoor5152
    @shubhamkapoor5152 1 year ago +1

    Hey, the dataset I have is in gz format. How do I work with that? Could you give me some pointers?

    • @650AILab
      @650AILab  1 year ago

      Thanks for your comment, appreciate it.
      I believe that librosa can't read audio files from an in-memory buffer so unpacking the gz to temporary files is one option.
      Please read the following solution and hope it helps you:
      stackoverflow.com/questions/50202350/how-do-i-read-in-wav-files-in-gz
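
    A minimal sketch of the temporary-file route suggested above, using Python's gzip module and then loading the result with librosa (the .gz file name is a placeholder):

        import gzip
        import shutil
        import tempfile
        import librosa

        gz_path = "clip.wav.gz"   # placeholder file name

        # Decompress the .gz into a temporary .wav file, then load it as usual.
        with gzip.open(gz_path, "rb") as src, \
             tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as dst:
            shutil.copyfileobj(src, dst)
            tmp_wav = dst.name

        y, sr = librosa.load(tmp_wav)
        print(y.shape, sr)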

  • @abugslife2461
    @abugslife2461 1 year ago +1

    Hi sir, great video. I learned a lot. Is it possible to use the output from the feature extraction to create AI music composition? Or what else can I do with feature extraction besides creating genre prediction, transcription, and classification programs? I plan on doing feature extraction for my ML project and am looking to see what are the options. Thank you so much

    • @650AILab
      @650AILab  1 year ago +1

      Thanks for your comment. You can extract features, embed them together with the music notes, and then play them using libraries such as tone.js.
      Another option is to apply the extracted features with generative AI music-creation libraries to create the music as well.

  • @sandymlgenai
    @sandymlgenai 2 years ago +1

    How can I divide audio data into several equal-sized chunks with padding? (I'm dividing the data into chunks to apply a DCT to every chunk.)

    • @650AILab
      @650AILab  1 year ago +1

      One option is to use the pydub library and its make_chunks method to create same-size chunks; you can look at the documentation (a short sketch of this approach follows at the end of this thread).
      The other option is to create the chunks manually and then add padding. The following link has full working code for you to follow:
      codereview.stackexchange.com/questions/221790/converts-audio-files-to-fixed-size-chunks-and-the-chunks-to-spectrogram-images
      Hope the above gives you what you are looking for. Thanks for the comment.

    • @sandymlgenai
      @sandymlgenai 1 year ago +1

      Thank you so much for your response. I really appreciate it.

    • @sandymlgenai
      @sandymlgenai 1 year ago +1

      I wish to quantize the data after applying the DCT, but I couldn't find a quantization table for audio data. Is there a way to do the quantization the right way?
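
    A minimal sketch of the pydub make_chunks approach mentioned in the reply above, with the last chunk padded out to the full length with silence (the file name and chunk length are placeholders):

        from pydub import AudioSegment
        from pydub.utils import make_chunks

        chunk_ms = 2000                              # chunk length in milliseconds
        audio = AudioSegment.from_file("input.wav")  # placeholder file name

        chunks = make_chunks(audio, chunk_ms)

        # pydub lengths are in milliseconds; the last chunk is usually shorter,
        # so pad it with silence to reach the full chunk length.
        last = chunks[-1]
        if len(last) < chunk_ms:
            chunks[-1] = last + AudioSegment.silent(
                duration=chunk_ms - len(last), frame_rate=audio.frame_rate)

        for i, c in enumerate(chunks):
            c.export(f"chunk_{i}.wav", format="wav")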

  • @harshbhagwani523
    @harshbhagwani523 1 year ago +1

    Is this video enough for audio processing?

    • @650AILab
      @650AILab  1 year ago

      Thanks for the comment, appreciate it. Yes, you can follow the full tutorial and it will get you from start to finish.

  • @rs9130
    @rs9130 2 years ago

    Thanks for the tutorial.
    How can I convert it to time-series data with respect to frames,
    e.g. (time_step, feature_dim)?
    If my loaded audio data shape is (67032,),
    can I reshape it to (12, 5586)  # feature size
    and then repeat the data with 3 time steps to create (3, 10, 5586)?
    I want to use this for an LSTM model.

    • @650AILab
      @650AILab  2 years ago

      Sorry, I did not understand your objective in reshaping your data. Do you want to feature-engineer it? What is the logic/concept you would want to use for your feature engineering?
      If I know what you want, I can better answer your question. Thanks for the comment.