Deep Learning with PolyAI: The Multilayered Anatomy of an AI Voice Assistant

  • Published on 26 Aug 2024
  • In this episode of 'Deep Learning with PolyAI,' we welcome Shawn Wen, co-founder and CTO at PolyAI. Shawn gives an in-depth overview of the AI tech stack needed to build high-quality AI voice assistants. Inspired by Andreessen Horowitz's recent publication on AI voice agents, the discussion covers the key components of this complex system: speech recognition, voice activity detection, the application of generative AI models, and the integration of these technologies into practical applications. Shawn also explores the challenges of managing latency, how the input affects the choice of speech recognition model, and the future of end-to-end AI systems. Join us as we unravel the complexities behind creating and optimizing effective voice AI solutions!
    00:18 Understanding the AI Tech Stack
    00:49 Building and Buying Voice Assistants
    02:10 Speech Recognition Challenges
    07:29 Voice Activity Detection (VAD)
    10:31 Generative AI and Guardrails
    15:58 Tooling and Function Calls
    22:39 Future of End-to-End Models
    #ai #voiceai #texttospeech #asr #aitechnology #deeplearning

Comments • 1

  • @CoolM-ff2um, several months ago

    Delete poly ai boring game