Exploring OpenAI's New GPT-4o Audio Preview Model: The Future of AI Audio Processing

แชร์
ฝัง
  • เผยแพร่เมื่อ 23 ธ.ค. 2024

ความคิดเห็น • 19

  • @BartSlodyczka
    @BartSlodyczka  2 หลายเดือนก่อน

    🗂 GET ALL THE CODE FILES: bartslodyczka.gumroad.com/l/jeznwq
    📋 Take This Quick Survey: forms.gle/otAr1xUamgyYZE5y7
    📺Realtime API Tutorial Series: th-cam.com/play/PLi7jtY2ZZqRYE8Lvw4MuLHTZPYTA4jZHQ.html&si=7DAE9z7YtQlMrzrd

  • @AngeloXification
    @AngeloXification 2 หลายเดือนก่อน +1

    Instant subscription, then I saw you build and provide resources. Excellent content.

    • @BartSlodyczka
      @BartSlodyczka  2 หลายเดือนก่อน

      Thanks legend 🤝

  • @derherrdirector
    @derherrdirector หลายเดือนก่อน

    You are an absolute legend! You should have millions of subscribers

    • @BartSlodyczka
      @BartSlodyczka  หลายเดือนก่อน

      haha! thank you my man!

  • @alexanderkingstam5164
    @alexanderkingstam5164 2 หลายเดือนก่อน

    You are very pedagogic and explaining very well. Thanks for sharing!

    • @BartSlodyczka
      @BartSlodyczka  2 หลายเดือนก่อน

      thank you very much, appreciate this comment 🙏

  • @GiovanneAfonso
    @GiovanneAfonso 2 หลายเดือนก่อน

    very well structured video and test, great work! hope you do more videos

    • @BartSlodyczka
      @BartSlodyczka  2 หลายเดือนก่อน

      thanks legend! Will do 💪

  • @pixelperfectpravin
    @pixelperfectpravin 2 หลายเดือนก่อน

    Most onpoint video 😍 i appreciate you

    • @BartSlodyczka
      @BartSlodyczka  2 หลายเดือนก่อน +1

      thanks man! I appreciate you too 💪

  • @Rhiever
    @Rhiever หลายเดือนก่อน

    If you’re just performing audio to text, is it necessary to specify both text and audio modalities? Will the model just ignore the audio file if you don’t specify both modalities?

    • @BartSlodyczka
      @BartSlodyczka  หลายเดือนก่อน

      I haven't tested if the model will ignore it and yeah also not sure if you need to specify both. Made this code a couple weeks back and can't recall from the top of my head 🙏

  • @yurijmikhassiak7342
    @yurijmikhassiak7342 2 หลายเดือนก่อน

    Thanks. How is that different from whisper voice to text? For voice to text usecase?
    The price difference is 10x. Is it faster? Is Quality better? The price looks stull very high. Like 20$/ hour of voice conversation. Almost, the cost of hiring humans for talking).

    • @BartSlodyczka
      @BartSlodyczka  2 หลายเดือนก่อน

      Haven't done any work with whisper voice to text so i cant say, but in the demo I show this new audio model recognise abstract sounds and not just speech. So if whisper is cheaper for now, then you might stick with that for speech to text. Whereas for more dynamic sound recognition, you can use this audio model

  • @vsigal
    @vsigal 2 หลายเดือนก่อน

    is it doing diarizarion? separation voices - voice1 - voice2 etc?

    • @BartSlodyczka
      @BartSlodyczka  2 หลายเดือนก่อน +1

      I just tested using short audio with 2 speakers talking to each other. I asked for a transcript of the convo broken down by speaker and it gave me the below:
      **Speaker 1:** So, Erin, in your email you said you wanted to talk about the exam.
      **Speaker 2:** Yeah, um, I've just never taken a class with so many different readings. I've managed to keep up with all the assignments, but I'm not sure how to... how to...
      **Speaker 1:** How to review everything?
      **Speaker 2:** Yeah. In other classes I've had, there's usually just one book to review, not three different books. Plus all those other text excerpts and videos...

    • @vsigal
      @vsigal 2 หลายเดือนก่อน

      @@BartSlodyczka wow wow, I will try. thank you