DIY Alexa: Create Your Own Voice Assistant with ESP32 & TensorFlow Lite!

  • Published on 20 May 2024
  • We've been building towards this project in the previous set of videos. And we're now ready to build our very own DIY Alexa!
    All the code for this project is on GitHub - github.com/atomic14/diy-alexa
    What are we building - 1:15
    Wake Word Detection - 2:27
    Command Recognition - 11:47
    Digging into the code - 16:30
    What's life all about Marvin? - 21:52
    To detect the wake words we stream audio from either an I2S microphone or from the ADC. The wake word detector looks at a 1-second window of audio. The spectrogram of the audio is calculated and fed into a TensorFlow Lite model.
    Once we detect the wake word we stream the audio up to wit.ai to recognise the user's intent.
    It works surprisingly well for such a small model, though there are improvements that could be made with more training data.
    I'll leave the access token for wit.ai live for as long as I can, but at some point, you will need to generate your own wit.ai application.
    Let me know how you get on in the comments!
    Related Videos:
    Audio Input
    • ICS-43434 A replacemen...
    • ESP32 Audio Input Show...
    • ESP32 Audio Input Usin...
    Audio Output
    • ESP32 Audio Output wit...
    And TensorFlow Lite for machine learning
    • TensorFlow Lite With P...
    Components you could use:
    MAX98357 - amzn.to/3cg88Z5
    TinyPico - amzn.to/3vVoONp
    INMP441 I2S Microphone: amzn.to/3cicuiv
    ICS-43434 I2S Microphone: www.tindie.com/products/21519/
    ESP32 Dev board: amzn.to/3gb6fyc
    Analogue Audio Amplifier: amzn.to/3pxkEJr
    Speakers: amzn.to/3pjWFgq
    ---
    Want to help support the channel? I'm accepting coffee on ko-fi.com/atomic14
  • Science & Technology
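
The description says the wake word detector looks at a 1-second window of streamed audio. A minimal sketch of that windowing step follows. The 16 kHz sample rate, the 100 ms chunk size, and the `AudioWindow` class are assumptions made for illustration; they are not taken from the project's firmware.

```python
from collections import deque

SAMPLE_RATE = 16000  # assumed sample rate, not stated in the description


class AudioWindow:
    """Keeps only the most recent second of audio samples."""

    def __init__(self, sample_rate=SAMPLE_RATE):
        # A deque with maxlen silently drops the oldest samples when full.
        self.samples = deque(maxlen=sample_rate)

    def push(self, chunk):
        self.samples.extend(chunk)

    def is_full(self):
        return len(self.samples) == self.samples.maxlen

    def snapshot(self):
        return list(self.samples)


window = AudioWindow()
# Simulate 1.5 s of audio arriving in 100 ms chunks (sample values are just counters).
for start in range(0, 24000, 1600):
    window.push(range(start, start + 1600))
latest = window.snapshot()  # the last 16000 samples, ready for the spectrogram step
```

Each time a new chunk arrives, the snapshot would be handed to the spectrogram and model stages, so the detector always sees the latest second of sound.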

Comments • 221

  • @atomic14
    @atomic14  2 years ago +12

    Interested in ESP32 Audio: th-cam.com/play/PL5vDt5AALlRfGVUv2x7riDMIOX34udtKD.html
    Looking for all my ESP32 projects: th-cam.com/play/PL5vDt5AALlRdN2KyL30l8j7kLCxhDUrNw.html

  • @clydealcott3379
    @clydealcott3379 2 years ago +2

    Thank you so much for this awesome and very educational video... I got my ESP32 recently...
    It's time to roll along.!👍

  • @trevorwslee
    @trevorwslee a year ago +3

    What an insightful project! I really hope to be able to adapt your idea (including some code snippets and likely the TensorFlow model) and come up with my own little ESP32 experiment.

  • @engrwaqas2904
    @engrwaqas2904 2 years ago +1

    Absolutely amazing, great work.

  • @marcush.6632
    @marcush.6632 2 years ago

    You are an absolute genius in my eyes.....

  • @trueintellect
    @trueintellect 3 years ago +8

    I'm so glad I found your channel!! This is really cool. You've helped free me from my Raspberry Pi dependence.

    • @atomic14
      @atomic14  3 years ago +3

      The ESP32 is an amazing device. Really powerful.

    • @engineerdanny7569
      @engineerdanny7569 2 years ago

      Same case😊

  • @jeffzor
    @jeffzor 2 years ago +3

    Thank you for the learning opportunity, master!

  • @JohnLauerGplus
    @JohnLauerGplus 3 years ago +2

    Wow. Nice work here.

  • @jorgemota879
    @jorgemota879 2 years ago +1

    Amazing, Fantastic thank you very much, really a great project

  • @photopicker
    @photopicker 2 years ago +2

    Very educational. Coming from an ESP32 background I found it very helpful to create a real target for the AI modeling tools. Great introduction.

    • @digitronix532
      @digitronix532 7 months ago

      How to interpret esp32 and this program

  • @mariomedina
    @mariomedina a year ago

    Got it working! Now I need to learn how to change the activation word, and how to add multiple activation words that activate different code

  • @7Trident3
    @7Trident3 3 years ago +13

    Wow!! I didn't think the esp32 had the guts for any AI stuff! Great video!!

    • @atomic14
      @atomic14  3 years ago +4

      It's definitely starting to push the limits - but I think it's easy to forget just how powerful the ESP32 is. One of the problems is the size of the models which can get quite large (relative to the amount of RAM we have to play with). Processing time is also a factor especially when trying to do real time as in this project.

  • @tektronix475
    @tektronix475 3 years ago +12

    wow, your alexa version, got me speechless.

    • @digitronix532
      @digitronix532 7 months ago

      How to upload program to esp32

  • @paulsimpson9544
    @paulsimpson9544 3 years ago +3

    Really fascinating. Thank you so much for sharing.

    • @paulsimpson9544
      @paulsimpson9544 3 years ago

      I've a follow on question if you don't mind.. I see you using the Arduino framework, but also have the esp IDF icon in platform Io. Do you have any particular preference? I'm considering switching to the IDF as I'm already using xtimers. I like the idea of know control, but also like the easy access to the Arduino ecosystem of libraries..

    • @atomic14
      @atomic14  3 years ago

      I've been mixing in quite a lot of functions from the IDF with my Arduino code. But it seems the IDF that comes with Arduino is now quite out of date. I've been trying to get Arduino working as a component in the IDF so I can use the latest IDF but still take advantage of the Arduino ecosystem, but I've not had much luck. For my Asteroids game I did it all in the IDF - mainly because I wanted to use the PSRAM with malloc and there's no way to do that when using Arduino. But I really missed simple things like uploading firmware OTA - especially with my custom board not having a USB port...
      I think, unless there's a compelling reason (APIs that aren't available from the IDF when using Arduino) then I'd be tempted to stick with Arduino. If you aren't using any libraries or you can easily port them over then IDF is definitely worth giving a go. But, I don't think there are any huge advantages to it.

  • @OnePunchHeizou
    @OnePunchHeizou 2 years ago +1

    this channel was really helpful to understand many edge ai related concepts, thank you @atomic14.

    • @atomic14
      @atomic14  2 years ago

      Thanks for the kind words - much appreciated!

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 he is using terminal/cmd_prompt for that.

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 I think anything should work for this purpose. Preferably use Linux.

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 i used this video for reference. these commands work on linux/windows, i dont know about mac terminal.

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 bro clone this project git repository, in data u will find all the audio files.
      go to that directory and try using these commands.

  • @gxbs2318
    @gxbs2318 4 months ago

    I'm going to apply it to two IoT devices I have running. Excellent video.

  • @iotan09
    @iotan09 3 years ago +2

    How kind you are, thanks for sharing.

    • @atomic14
      @atomic14  3 years ago +2

      No problem at all, it's a privilege to be able to give something back to the community.

  • @maul6117
    @maul6117 8 months ago

    do you have to watch these in a certain order? is there a playlist for just the diy Alexa project?

  • @ChrisHalden007
    @ChrisHalden007 3 years ago +1

    Amazing!!!! Will definitely give it a try. Thank you

    • @atomic14
      @atomic14  3 years ago

      Let us know how you get on!

    • @digitronix532
      @digitronix532 7 months ago

      How to integrate esp32 and this program

  • @OMNI_INFINITY
    @OMNI_INFINITY a month ago

    Thanks! Seems I should make a touchscreen voice AI app

  • @maul6117
    @maul6117 8 months ago

    is there a step by step video for the hardware build?

  • @sasisekharmg7823
    @sasisekharmg7823 3 years ago +2

    Amazing work!

    • @atomic14
      @atomic14  3 years ago +2

      Thank you! Cheers!

    • @digitronix532
      @digitronix532 7 months ago

      How to upload to esp32

  • @CongTrading
    @CongTrading 2 years ago

    Thanks for sharing the great work, subscribed.

    • @atomic14
      @atomic14  2 years ago

      Thanks for the sub!

  • @WagnerUlisses
    @WagnerUlisses 3 years ago +1

    Very cool!

  • @kavishchattoor1729
    @kavishchattoor1729 7 months ago

    sorry i know this might be late but I am replicating a similar project. Did you use the ESP32 to capture the audio signal? My esp32 doesn't have enough memory to capture enough data.

  • @alphoncemutabuzi6949
    @alphoncemutabuzi6949 2 years ago

    Thanks a lot, brother.

  • @gitaran24
    @gitaran24 3 years ago +1

    Absolutely amazing. You do great things, you are smart. It challenges me to make one.

    • @atomic14
      @atomic14  3 years ago

      You should definitely go for it - report back on how you get on.

  • @aisolutions834
    @aisolutions834 3 years ago +1

    Hi There!
    Nice Work, Is it possible to run a TensorFlow object detection model like MobileNET on ESP32? OpenMV has this capability using TFLite library, but I am interested in running object detection on ESP32 which is very low cost compared, thanks!

  • @sambidpradhan32
    @sambidpradhan32 3 years ago +2

    This is awesome.. thinking to implement this on a custom dataset, and this model looks light weight as well.. can be implemented in real time I guess

    • @atomic14
      @atomic14  3 years ago +2

      It's amazing what you can do with quite a small model. I have seen that the micro-speech example in the main TensorFlow codebase is now available for the ESP32 - might be worth taking a look at that as well.

  • @nielspaulin2647
    @nielspaulin2647 a year ago

    EXCELLENT!

  • @YigalBZ
    @YigalBZ 3 years ago +1

    Great video and project. This is my next project. Thank you !

    • @atomic14
      @atomic14  3 years ago +1

      Let us know how you get on!

  • @your.free.electrons
    @your.free.electrons 2 years ago +1

    Hey, this one's awesome :')

  • @edgull_tlt
    @edgull_tlt 2 years ago

    Thanks for the video. It was interesting.

  • @erikpratama7685
    @erikpratama7685 2 years ago +2

    Hello, nice project, can i use esp 32 cam??

  • @guilhermevini65
    @guilhermevini65 2 years ago

    Amazing !!!

  • @pruthvirajvenkatesha6897
    @pruthvirajvenkatesha6897 2 years ago +2

    Thanks for this! Amazing work! I have a few questions, and it would be helpful if you could reply. Can we use this procedure to build the same for the ESP32-S3? It seems you used the Arduino framework, which I checked and is not up yet on VS Code. Are there any other approaches to build this firmware on the ESP32-S3?
    Also, do we have info on the number of computations in the KWS model? Based on a few algorithm papers validated on the Google speech dataset, it is always a trade-off between accuracy and total computations, so I wanted to know the procedure used to select an algorithm.
    Last question: can we build any TFLite model using the TFLM framework?

  • @ernstgennial7064
    @ernstgennial7064 3 years ago +2

    Very interesting!

    • @atomic14
      @atomic14  3 years ago

      Glad you think so!

  • @naafff1
    @naafff1 3 years ago +4

    I thought you were gonna be using a Raspberry Pi. Speechless... I'm gonna make one like you.

    • @naafff1
      @naafff1 3 years ago

      ​@Taylor Van i have got many messages like these. They ask you for money and once you give, they dont give you the account you wanted to hack

  • @user-ux2oq6yd2c
    @user-ux2oq6yd2c 2 months ago +1

    Can you integrate with ChatGPT? Would be super amazing!

  • @ehabelbwab1783
    @ehabelbwab1783 a month ago +1

    You should mix the audio samples with background noise outside of training and then use them for training, because adding the _background folder to the training data is a bad choice.

  • @Techn0man1ac
    @Techn0man1ac 2 years ago

    Thank you very much

  • @kingsleybaros2095
    @kingsleybaros2095 4 months ago

    At the end please what are you uploading as code in the esp 32 that will run your entire system

  • @devisnugroho
    @devisnugroho 2 years ago

    What kind of software do you use at 3:58, where the waveform and spectrogram appear in real time?

  • @ankitthealchemist
    @ankitthealchemist 3 years ago +2

    Hey! great work dude!! could we implement the simple command like "turn off the light" offline, just like the wake word detection?

    • @atomic14
      @atomic14  3 years ago +2

      I'm looking at this right now - it is a more difficult problem than the simple wake word detection. The model needs to have an output for each possible command word which means it is a larger model so will take longer to run on the ESP32. Hopefully, I'll be able to do another video soon showing it working - though just to be clear, this would be very limited commands - like: "on", "off", "left", "right" etc...
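
The point about the model needing one output per command word can be sketched as follows. This is only an illustration of turning per-word model outputs into a decision; the label list, the `_unknown` class, and the 0.8 threshold are assumptions, not details of the author's model.

```python
import math

COMMANDS = ["on", "off", "left", "right", "_unknown"]  # example labels only


def softmax(logits):
    """Convert raw model outputs into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_command(logits, threshold=0.8):
    """Return the most likely command, or None if the model is not confident."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if COMMANDS[best] == "_unknown" or probs[best] < threshold:
        return None
    return COMMANDS[best]


confident = pick_command([0.1, 6.0, 0.2, 0.1, 0.3])  # one output clearly dominates
unsure = pick_command([1.0, 1.1, 0.9, 1.0, 1.0])      # outputs are nearly equal
```

Every extra command adds another output to the final layer, which is one reason a command model grows beyond a single wake word detector.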

  • @SinanAkkoyun
    @SinanAkkoyun 3 years ago +2

    Wow wtf!!!!!! 😍😍😍😍😍😍

  • @emilianotl3572
    @emilianotl3572 2 years ago

    do you know if i can use dialogflow to control devices that are connected to google home?

  • @EricSouzarys
    @EricSouzarys 2 years ago

    Do you think it's possible to train the model so it can detect a ringtone?

  • @chockman3833
    @chockman3833 3 years ago +2

    I had to login to my other account to give this video another like, this was incredible!
    How hard would it be to extend the model to have some amount of offline NLP so we don’t have to rely on Facebook?

    • @atomic14
      @atomic14  3 years ago

      I'm having a look at that right now, got slightly sidetracked looking at building an AGC. It's possible to a limited extent, the command dataset does contain some other words that we can try using. Getting performance from a small enough model looks doable. Hopefully should have something up this week.

  • @shufnagl
    @shufnagl 3 years ago +3

    Hi, as others already mentioned...great work, great video. BTW, would it make sense to use other ESP32 Hardware with included Mic/Speaker like Atom Echo?

    • @atomic14
      @atomic14  3 years ago +1

      I don't see why not - you may need to modify the code to use whatever pins and interface the Atom Echo uses for the microphone and speaker. It should work really well.

    • @shufnagl
      @shufnagl 3 years ago +1

      @@atomic14 My AtomEcho arrived and I will give you feedback about the results. BTW, where should we discuss the technical aspects? TH-cam or Git? Thx

    • @atomic14
      @atomic14  3 years ago

      Probably best on GitHub as we can share code snippets a bit more easily.

    • @shufnagl
      @shufnagl 3 years ago

      @@atomic14 Should I create a separate branch (to avoid polluting your code)?

    • @atomic14
      @atomic14  3 years ago +2

      @@shufnagl You'll need to fork the repository and then you can do pull requests back to my code - there's a good guide here - github.com/firstcontributions/first-contributions Looking forward to seeing what you do!

  • @ahlamhusni6258
    @ahlamhusni6258 a year ago

    What is the distance for the microphone to be able to catch the voice ?

  • @SonuRauniyar
    @SonuRauniyar 3 years ago

    Pretty cool stuff :). I want to make my own wake-up word detection system using a custom audio dataset. Let's say my wake-up word is "Hey Marvin", which I assume is longer than 1 second. How many data points would be enough to train the model? And since I will use the Google speech dataset to add noise for better accuracy, do you think the time frame of 1 second will matter here?

    • @spacecdr
      @spacecdr 2 years ago

      A linux terminal with alsa and curl installed! "software"...😂

  • @digitronix532
    @digitronix532 7 months ago

    Kindly help me in Programming ESP32 ...how to integrate python program and ESP 32

  • @DJ1TJOO
    @DJ1TJOO a year ago

    Can this work with a normal sound sensor that just has an analog output?

  • @user-sr9ss3xd4q
    @user-sr9ss3xd4q 4 months ago

    Excellent. I really enjoy the contents of the channel. I suggest you make a video about Picovoice Rhino on the ESP32.

    • @atomic14
      @atomic14  4 months ago

      Looks interesting, but I don't think it works on the ESP32 yet - might need a more powerful processor.

  • @thomasob42
    @thomasob42 3 years ago

    Can this project be implemented using arduino BLE 33 Sense?

  • @AryanKapur0605
    @AryanKapur0605 2 years ago

    Hi! Can I use ESP 32 Cam instead of ESP32? Thanks!

  • @devmishra4131
    @devmishra4131 2 years ago

    hi sir, I have one more doubt that what at this timing 16:13 you used as the terminal, I tried many ways to run the link(I have used my own recording, saved in desktop and pasted the path) which I got from my wit.ai account in my window's terminal, but it didn't work. And I also tried to find many other ways to do that, but nothing worked. So, please reply as soon as possible.

  • @dariovicenzo8139
    @dariovicenzo8139 2 years ago

    Great video! What I don't understand (I'm at the basics of TF) is why we need to use a cloud AI service when we are trying to make an edge device. In other words, we are losing the advantage of building an edge system if we need a cloud service. I could instead skip the lite model and do everything in the cloud, using the ESP32 as an audio transmitter. I hope I understood the purpose of the Facebook service correctly. Thanks.

    • @atomic14
      @atomic14  2 years ago

      Hi Dario, that is a very good question. One of the issues with using the ESP32 as an audio transmitter and doing the wake detection in the cloud is privacy concerns - you really want the user to be in charge of when the device is actively listening and sending your data to a third party service. So you really want the device doing the wake word detection and only sending audio data to the internet once the wake word has been detected. Currently, doing full intent recognition on the edge is too difficult on a device like the ESP32 - however, there is software for the raspberry pi that looks very promising - rhasspy.readthedocs.io/en/latest/
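
The split described in this reply, where wake word detection runs on the device and audio only goes to the cloud afterwards, can be sketched as a small gating loop. The function names, the fixed number of uploaded chunks, and the stand-in detector are all hypothetical; the real firmware is C++ and decides when to stop streaming differently.

```python
def run_assistant(chunks, detect_wake_word, send_to_cloud, command_chunks=3):
    """Only forward audio to the cloud after the wake word is heard locally."""
    remaining = 0   # chunks still to upload for the current command
    uploaded = []
    for chunk in chunks:
        if remaining > 0:
            uploaded.append(send_to_cloud(chunk))
            remaining -= 1
        elif detect_wake_word(chunk):
            remaining = command_chunks  # start streaming from the next chunk
        # Otherwise the chunk is discarded and never leaves the device.
    return uploaded


# Stand-ins for the real detector and uploader, for illustration only.
audio = ["noise", "marvin", "turn", "on", "lights", "noise", "noise"]
sent = run_assistant(audio, lambda c: c == "marvin", lambda c: c)
# sent == ["turn", "on", "lights"]
```

Audio before the wake word and after the command is dropped on the device, which is the privacy property the reply is describing.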

  • @jspark4171
    @jspark4171 5 months ago

    Your answer was very helpful to me. Thank you very much.

    • @atomic14
      @atomic14  5 months ago

      Thanks! Very much appreciated!

  • @rolyantrauts2304
    @rolyantrauts2304 3 years ago

    Also many thanks as didn't realise tensorflow was and will do the job on an ESP32. There is a lack of opensource linux beamforming algs, which you have probably just solved.
    Esp32 is so relatively cheap that a distributed microphone array where the mic with highest keyword match is used for that ASR session.
    Vosk has a streaming API alphacephei.com/vosk/ just needs a streaming RTP protocol with current keyword match info and no beamforming needed as nearest mic automatically used...

    • @atomic14
      @atomic14  3 years ago +1

      Sounds interesting - the only issue is that you may start to hit performance problems when processing multiple microphones at once. Currently the wake word detection takes around 100ms so you may start running out of CPU time with more than one or two microphones. You might also hit memory issues with the audio buffering - though using a WROVER module might fix this.

    • @rolyantrauts2304
      @rolyantrauts2304 3 years ago

      @@atomic14 I dunno thought I would ask you as a total noob with ESP32 but on linux irrespective of process power we still lack opensource beamforming. The pulseaudio addition just doesn't work, don't think it ever did prob hence why upstream its been dropped from webrtc.
      What I am thinking is that we are not 'processing' multiple microphones at once the I2S data for mono is just doubled and the L/R hi/lo word select is not used.
      A single channel would be fed into a delay buffer and then I guess just summed with the inverse of the current value of the other channel?
      It is really a single channel in a short delay ring buffer of the speed of sound distance and what is present on the other I2S is just subtracted.
      For a noob staring blankly at a $5 AliExpress WROVER after a brief journey through the documentation, it makes me curious whether you could do it with the 2 cores, but to be honest I haven't a clue how yet :)
      I cannot even work out if HTTP streams are client-only, or whether you can create a server stream, or whether you could present AMR-WB on a port?!?
      Just got my fingers crossed it might pique your interest.
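
One reading of the idea in this comment, delaying one channel in a short buffer and subtracting the other channel from it, is a two-microphone delay-and-subtract arrangement. The sketch below only illustrates that arithmetic on sample lists; the function name and the fixed two-sample delay are assumptions, and it says nothing about whether an ESP32 can do this in real time.

```python
def delay_and_subtract(left, right, delay):
    """Delay `left` by `delay` samples, then subtract `right` from it."""
    out = []
    for i in range(len(right)):
        delayed = left[i - delay] if i >= delay else 0
        out.append(delayed - right[i])
    return out


# A sound that reaches the left mic 2 samples before the right mic.
source = [0, 3, -1, 4, 1, -5, 2, 0, 0, 0]
left = source
right = [0, 0] + source[:-2]

# With the buffer delay matched to the arrival delay, that sound cancels out.
residual = delay_and_subtract(left, right, delay=2)
```

Sound arriving with a different inter-microphone delay would not cancel, so the chosen delay sets the direction that gets suppressed.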

    • @rolyantrauts2304
      @rolyantrauts2304 3 years ago

      @@atomic14 PS the lack of beamforming was that each ESP32 could be a streaming KWS to a central ASR.
      Broadcast from KW to silence with some metadata of KW hit score and a central ASR would be able to use best KW hit score so an array of esp32s could be a distributed array with best and nearest always used.
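
The selection step proposed here, where a central server uses whichever satellite reported the best keyword score, could look like the sketch below. The device names and scores are made up, and `pick_satellite` is a hypothetical helper rather than part of any existing project.

```python
def pick_satellite(hits):
    """hits maps a device name to its wake word score; the highest score wins."""
    if not hits:
        return None
    return max(hits, key=hits.get)


best = pick_satellite({"kitchen": 0.71, "hall": 0.93, "office": 0.64})
# best == "hall"
```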

  • @Gauthamphongalkar
    @Gauthamphongalkar 2 years ago +2

    Marvelous content, thank you very much!

    • @Gauthamphongalkar
      @Gauthamphongalkar 2 years ago +1

      @@sltechgalaxy1677 I'm not sure to which you are pointing.. to play audio.. if you are on Linux you can use aplay

    • @Gauthamphongalkar
      @Gauthamphongalkar 2 years ago +1

      @@sltechgalaxy1677 aplay is utility of Linux.. you can't use such in windows.. in windows you can try playing in RAW format in VLC

    • @Gauthamphongalkar
      @Gauthamphongalkar 2 years ago +1

      @@sltechgalaxy1677 yes.. also read about ALSA

  • @Nerdsking
    @Nerdsking 4 months ago

    It would be more interesting (and useful) if there was a way to merge this with another ESP32 project that runs ChatGPT, so it could be not only a DIY Alexa, but also a general assistant.

  • @devmishra4131
    @devmishra4131 2 years ago +2

    I am pursuing mechanical engineering from Stanford batch of 2023, and your video is pretty good.
    I had one query, can we use PAM8403 instead of MAX43434 for the output.

  • @TechnicalShubhamofficial
    @TechnicalShubhamofficial a year ago

    Hey can you tell me how to program the esp 32 and where is the final code

  • @55cancri_e76
    @55cancri_e76 a year ago

    Hi sir,
    Thank you for the great video.
    My teammates and I are trying to make a similar project to yours. I would like to ask how you linked the Python code with the C code. Also, how did you upload the code to the ESP32? Was it the C code or the Python code?

  • @francegall-web9819
    @francegall-web9819 3 years ago

    Mr. atomic14 really impressive. Since you are very good at programming can you help us reprogram the HLK-V20 speech recognition? It is a very cheap chip - three dollars - which provides offline speech recognition, but its manufacturer does not explain how it is programmed. (There is also the SU-10A which is the same from a different manufacturer.)

  • @khaoulakanna4227
    @khaoulakanna4227 2 years ago +1

    Can this be done in a language other than English?

  • @amarjeetkumarfor
    @amarjeetkumarfor 6 months ago

    Can I have circuit diagram, please

  • @JernD
    @JernD 3 years ago

    This is probably a silly question, but why did you take the log(audiodata) after audio normalization? Would it be superior to swap those operations?

    • @atomic14
      @atomic14  3 years ago

      Hey John, definitely not a silly question, the audio is normalised and then we calculate the spectrogram of the normalised audio. The log operation is applied to the spectrogram output. The spectrogram can end up with some very large values and the log operation brings them down into a more sensible range for the neural network to train against.
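
The order of operations in this reply (normalise the audio, compute the spectrogram, then take the log of the spectrogram) can be sketched in plain Python. This is an illustration under assumptions: the exact normalisation, frame size, step, and `epsilon` are not given in the thread, and a naive DFT stands in for the FFT a real pipeline would use.

```python
import cmath
import math


def normalise(samples):
    """Remove the mean and scale so the largest absolute sample is 1.0 (assumed scheme)."""
    mean = sum(samples) / len(samples)
    centred = [s - mean for s in samples]
    peak = max(abs(s) for s in centred) or 1.0
    return [s / peak for s in centred]


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (an FFT would be used in practice)."""
    n = len(frame)
    bins = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        bins.append(abs(acc) ** 2)
    return bins


def log_spectrogram(samples, frame_size=64, step=32, epsilon=1e-6):
    """Normalise first, take the spectrogram, then apply log to the spectrogram values."""
    audio = normalise(samples)
    frames = [audio[i:i + frame_size]
              for i in range(0, len(audio) - frame_size + 1, step)]
    # epsilon avoids log(0) for empty bins
    return [[math.log10(p + epsilon) for p in power_spectrum(f)] for f in frames]


# A 1 kHz tone sampled at 8 kHz: with 64-sample frames its energy lands in bin 8.
tone = [1000 * math.sin(2 * math.pi * 1000 * t / 8000) for t in range(256)]
spec = log_spectrogram(tone)
```

Without the log, the tone's bin would be roughly a thousand while the empty bins sit near zero; after the log the values span a few units, which is the "more sensible range" the reply refers to.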

  • @dicle6714
    @dicle6714 2 years ago

    I can't compile this application with Arduino IDE. I made the necessary file edits.

  • @prof.tahseen6104
    @prof.tahseen6104 2 years ago

    the voice from those meme videos 😂

  • @apoorvanavin3300
    @apoorvanavin3300 2 months ago

    Which language does this work in? Python?

  • @fiottovotre7202
    @fiottovotre7202 a year ago

    How can I navigate the dataset plz? Actually, I can't find it

  • @dreyreis
    @dreyreis 3 days ago

    Is it possible to use a pre-trained voice model and install it on a device (like a model of a famous person, perhaps)? If so, how would we do this?

    • @atomic14
      @atomic14  3 days ago

      The ESP32 isn’t really powerful enough to do that locally. But there are APIs that you can call that will do Text To Speech (TTS). And some of them offer custom voices.

  • @DayanandKushwaha-ef6oi
    @DayanandKushwaha-ef6oi 4 months ago

    i am not getting audio output please help ...

  • @ei23de
    @ei23de 3 years ago

    The following question may fall below the standard of your channel, but since you introduced me to Jupyter Notebook, I have to know which software you are using for presentations.
    This is not PowerPoint, is it?

    • @atomic14
      @atomic14  3 years ago +1

      I use a bit of a mix for videos - I'm on a Mac so use Keynote (the Mac equivalent of Powerpoint). I've been trying to learn the manim library which is what the guy who does 3Blue1Brown uses. I've also got my own homegrown animation library that I use for some things - but it's definitely not really ready. I've used Apple Motion for a couple of videos, there is quite a learning curve with it and I'm nowhere near proficient.

    • @ei23de
      @ei23de 3 years ago

      @@atomic14 I like your video style, it looks professional.

    • @atomic14
      @atomic14  3 years ago

      @@ei23de Thanks!

  • @Yakroo108
    @Yakroo108 3 months ago

    👍👍👍

  • @THEbonny95
    @THEbonny95 a year ago

    Can't do this on Google Assistant?

  • @alo1236546
    @alo1236546 3 years ago

    Any plans for TinyML?

  • @devmishra4131
    @devmishra4131 2 years ago

    I followed all of your processes and really found it amazing and helpful! But I have a doubt: how are we going to upload this code to the ESP32 or ESP8266? You don't have any .ino file, so you must not be using Arduino for that. What IDE are you using? If it is VS Code, what settings did you use? Please tell, it would really help everyone.

    • @atomic14
      @atomic14  2 years ago +1

      I'm using PlatformIO, just install VSCode and download the PlatformIO plugin.

    • @devmishra4131
      @devmishra4131 2 years ago

      @@atomic14 Thanks a lot sir for your reply, it means a lot to me.
      Looking forward to a successful test!!

    • @data_resources
      @data_resources 2 years ago

      @@devmishra4131 can you explain how you did it

    • @digitronix532
      @digitronix532 7 months ago

      How to upload program to esp32

  • @Pavana_sai
    @Pavana_sai 3 years ago +1

    HI, wonderful project.
    im interested to build the same project. can you help me

  • @gsge
    @gsge 3 years ago

    Apart from your vast knowledge of hardware and software you are the best teacher to make quite complicated subject very easy to understand for newbie like me.
    Is it possible to bypass cloud service like wit.ai to host it on local Raspberry for totally local solution ?
    Thank you.

    • @atomic14
      @atomic14  3 years ago

      Yes - there's a solution called Rhasspy - rhasspy.readthedocs.io/en/latest - I think in theory you should be able to swap out Wit.ai for it. The code for decoding the response will probably need to change, but it looks doable.

    • @gsge
      @gsge 3 years ago

      @@atomic14 Thank you.

    • @digitronix532
      @digitronix532 7 months ago

      How to upload program to esp32

  • @faizabdulchakim8796
    @faizabdulchakim8796 3 years ago +1

    This is the ESP32-S2-Saola-1, right? Is it possible to use another type of ESP32?

    • @atomic14
      @atomic14  3 years ago +1

      Definitely - pretty much any ESP32 dev board will work - I'm not using any special features.

  • @ajanthahimali8491
    @ajanthahimali8491 2 years ago +1

    Can you simplify the firmware code, please? It's very difficult to understand.

  • @techs5564
    @techs5564 21 days ago

    my unit doesn't respond to "marvin" what shall i do?

  • @rafaelmatos8754
    @rafaelmatos8754 4 months ago

    How do you get so many examples of the word Marvin?

    • @atomic14
      @atomic14  4 months ago +1

      Weirdly, it was in the training data. I guess the people who compiled the audio samples were fans of Douglas Adams.

  • @keithsummers2842
    @keithsummers2842 3 years ago

    You didn't really mention the size of the project. What is the expected memory footprint of the Flashed program?

    • @atomic14
      @atomic14  3 years ago

      It uses about 1.1 MB of flash. When running, memory is tight: making the HTTPS connection to Wit.ai leaves about 30K of RAM.

    • @keithsummers2842
      @keithsummers2842 3 years ago

      @@atomic14 I'm working on a project right now where just Wi-Fi and BLE implemented is soaking up about 1.5M of flash. As long as the entire project remains below about 3M then OTA continues to be possible in the WROOM-32 with 16M flash. I was most concerned about OTA memory space. Thank you for the response and the excellent video post here on TH-cam.

    • @atomic14
      @atomic14  3 years ago

      @@keithsummers2842 No problem - thanks and good luck with your project!

    • @keithsummers2842
      @keithsummers2842 3 years ago

      @@atomic14 You seem to be very knowledgeable. Could I hire you for consultations just to keep us on track with our project? I can be reached at Keith@SSLEDLighting.com

  • @data_resources
    @data_resources 2 years ago

    Hello i followed your instructions and i did almost all the project but am having trouble getting the output sound when i give the commands

  • @ei23de
    @ei23de 2 years ago

    Hey, i hope you don't mind if i mention this video (and your channel) in one of my future videos?

    • @atomic14
      @atomic14  2 years ago +1

      Go for it :)

    • @ei23de
      @ei23de 2 years ago

      @@atomic14 th-cam.com/video/-Hfow7KMCK8/w-d-xo.html
      (but it's german language...)

  • @typingcat
    @typingcat 2 years ago +1

    I need an off-the-grid system, not one using a voice recognition service from Facebook. Who knows what Zuckerberg is going to do with your data. Also, as I see in the demo, there is quite a significant delay, like 3 seconds. One of the reasons why I want to create my own is that I don't like the delay of Google Home.
    I don't know how other people use voice assistants, but I have found that they are dumb. Not really "A.I.", but just scripted responders written by some programmers. So, I don't really try to "speak" to it, but just say some fixed-structure phrases that I know it will understand, like "turn on the light", etc. In short, all I need is speech to text. If I could get a string like "turn on the light", I could parse it and turn on the light myself. Is the ESP32 powerful enough to convert speech to text on its own?
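
The "parse it myself" step this comment describes is simple once a transcript exists. A minimal sketch, assuming the speech-to-text stage has already produced a string and that commands follow one fixed pattern; `parse_command` and the pattern are hypothetical.

```python
import re

# Matches fixed-structure phrases such as "turn on the light" or "turn off the fan".
PATTERN = re.compile(r"^turn (on|off) the (\w+)$")


def parse_command(text):
    """Return (device, turn_on) for a recognised phrase, else None."""
    match = PATTERN.match(text.strip().lower())
    if match is None:
        return None
    state, device = match.groups()
    return device, state == "on"


result = parse_command("Turn on the light")   # ("light", True)
ignored = parse_command("what time is it")    # None
```

The hard part remains producing the transcript on the device; this only covers what happens after that.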

  • @izigoldenberg218
    @izigoldenberg218 2 years ago +1

    Is there any chance this could work with ESP8266 instead of the ESP32?

    • @atomic14
      @atomic14  2 years ago +2

      I think that might be difficult - it is pretty much pushing the limits of the ESP32.

  • @legal_hack5626
    @legal_hack5626 2 years ago +1

    Everything is OK, but...
    for file_name in tqdm(get_files("_problem_noise_"), desc="Processing problem noise"):
        process_problem_noise(file_name, words.index("_background"))
    In these lines you are processing noise, but I don't have a dataset of problem noise. Where can I download it? I have downloaded the Google speech dataset but there is no _problem_noise_ folder. What can I do now?

    • @atomic14
      @atomic14  2 years ago +1

      Hi there, the problem noise files are optional (as are the mar sound files). I just recorded some additional audio of my office noises that seemed to be confusing the neural network. You can either add a folder yourself and record some audio, or you can comment out that section of the notebook.
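As an alternative to commenting the section out, the loop can be guarded so it is skipped when the folder is missing. This is only a sketch: the helper `get_files_if_present` and the `speech_data` root folder are assumptions for illustration, not names taken from the notebook, and the notebook's own `process_problem_noise` call appears only as a comment.

```python
from pathlib import Path


def get_files_if_present(folder, root="speech_data"):
    """Return sorted .wav paths under root/folder, or [] if the folder is missing."""
    directory = Path(root) / folder
    if not directory.is_dir():
        return []
    return sorted(str(path) for path in directory.glob("*.wav"))


if __name__ == "__main__":
    problem_noise_files = get_files_if_present("_problem_noise_")
    if not problem_noise_files:
        print("No _problem_noise_ recordings found, skipping this step")
    for file_name in problem_noise_files:
        # In the notebook this is where the real call would go:
        # process_problem_noise(file_name, words.index("_background"))
        print("would process", file_name)
```

With this guard, the step runs unchanged once you record your own noise samples into the folder.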

    • @legal_hack5626
      @legal_hack5626 2 years ago +1

      @@atomic14 Thanks

  • @yaowang4490
    @yaowang4490 3 years ago

    Hello, can you tell me how to import the project into VS Code? I look forward to your reply. Thank you.

    • @atomic14
      @atomic14  3 years ago

      Hi Yeo, you'll need the PlatformIO extension installed and then you just open the folder the project is in.

    • @yaowang4490
      @yaowang4490 3 years ago

      @@atomic14 I have successfully run your project, but I don't know whether the INMP441 is working. How can I print the data from the INMP441? Thank you.

  • @tryssss
    @tryssss 3 years ago

    Question: is the access key still OK?

    • @atomic14
      @atomic14  3 years ago +1

      I think the one in GitHub should still be valid. But if not, it's pretty easy to set up a new one.

  • @OMNI_INFINITY
    @OMNI_INFINITY several months ago

    Found where rabbit AI maybe started

  • @ei23de
    @ei23de 3 years ago +2

    This is super great!
    We should do some kind of collaboration!
    Some time ago I tried out Rhasspy with a Raspberry Pi as an offline Alexa (I call it "Axel"; you can see it in my "DIY Open Source Home Automation with a Raspberry Pi [EN]" video).
    Rhasspy is great, but I need some kind of satellite hardware like this, or an ESP32 Audio Kit, which I saw as quite a challenge.
    But you obviously did it right away!

    • @atomic14
      @atomic14  3 years ago +2

      I had a quick look at Rhasspy and you could easily modify my code to talk to it. I have a few projects to complete but will come back to it when I have some more time.

    • @ruifreitas7475
      @ruifreitas7475 3 years ago

      @@atomic14 This is something I was looking into when I saw your video. Great timing. Passing audio commands from the ESP32 to Rhasspy (via MQTT or not) to be recognized, triggering intents or actions in Home Assistant, would be great. Thank you for sharing this. community.rhasspy.org/

    • @synesthesiam
      @synesthesiam 3 years ago +1

      @@atomic14 Rhasspy author here. Your project looks awesome! I'd be very interested in collaborating, so feel free to ping me whenever :)

    • @ei23de
      @ei23de 3 years ago +1

      @@synesthesiam What great people here!
      Thank you for Rhasspy!
      I'm currently working on my DIY Video Doorbell (ESP32 Cam) and Face Detection with OpenCV. You know the drill.
      That video will be finished soon, and after that my smart door lock will get some spotlight... but after that, I will definitely spend time on this! This is super exciting and needs more attention.
      Hope I'll find time for this soon.

  • @luisfelipesaldivar5100
    @luisfelipesaldivar5100 3 years ago

    I have a question. Maybe it's stupid or too obvious for you, but how are the files uploaded to the ESP32, and which ones? I don't really understand how it's done.
    And can I use the Arduino IDE to upload the code?

    • @atomic14
      @atomic14  3 years ago

      Hi Luis, I'm using PlatformIO for the project - it's a lot better than the Arduino IDE when you have a lot of files to manage. You can upload directly using PlatformIO; it will handle it all for you.

    • @luisfelipesaldivar5100
      @luisfelipesaldivar5100 3 years ago

      @@atomic14 I've downloaded PlatformIO and loaded the "FIRMWARE" folder into it, but it gives me 27 errors when I try to upload to my ESP32. I searched for the errors that appeared, but I don't understand them.

    • @atomic14
      @atomic14  3 years ago

      @@luisfelipesaldivar5100 Hi Luis, check the platformio.ini file. It may be that upload_port and monitor_port have been set to the wrong values. You can delete these entries or change them to the correct ones.

    • @data_resources
      @data_resources 2 years ago

      Hey, I was wondering how you upload the files to the ESP32?

  • @digitallifetanzania2373
    @digitallifetanzania2373 1 year ago

    Can it answer any questions?

  • @yaowang4490
    @yaowang4490 3 years ago

    I have installed PlatformIO in VS Code, but when I open the diy-alexa-master folder, VS Code prompts me "this is not a PlatformIO project (should contain a platformio.ini file)". Please tell me how to get started with your project. Looking forward to your reply. Thank you.

    • @atomic14
      @atomic14  3 years ago

      Ah sorry - you need to be in the "firmware" folder. That should fix your problems.

    • @yaowang4490
      @yaowang4490 3 years ago

      @@atomic14 😄 Would it be convenient for you to give me your contact information?
      I still can't import your project into VS Code. I feel a little frustrated. My setup is VS Code + ESP32-WROOM-32U (hardware) + PlatformIO.

    • @atomic14
      @atomic14  3 years ago

      @@yaowang4490 Easiest way is to raise an issue on the GitHub repo - I can help you from there and there are other people who will be able to help as well.

    • @yaowang4490
      @yaowang4490 3 years ago

      @@atomic14 OK. Following your instructions, I successfully imported your project, but when I clicked the "ESP-IDF build project" button, I got this error: CMake Error: The source directory "C:/Users/liu/Downloads/diy-alexa-master/diy-alexa-master/firmware" does not appear to contain CMakeLists.txt.

    • @atomic14
      @atomic14  3 years ago

      @@yaowang4490 I think you are trying to import it as an ESP-IDF project. It's a PlatformIO project. You just need to open it up in VSCode - no need to import or do anything like that. Just make sure you have the PlatformIO plugin installed in VSCode and open the firmware folder. platformio.org/ As I say, these kinds of questions are much easier as a GitHub issue where we can share images and code snippets.

  • @San_DIY
    @San_DIY 1 year ago

    Does this translate languages?