Hey, so here's my current AI assistant setup in HA, and my thoughts on the product: For STT I use Google Cloud speech-to-text. For commands I use GPT-4o-mini. For TTS I use an ElevenLabs V2-Fast voice. It's quite fast; the main slowdown is ElevenLabs. It's completely free for basic usage if you decide to use the Gemini API instead of GPT-4o-mini. I tried to use Whisper, a local LLM, and local TTS, but with my server hardware it was quite slow and not precise enough, so I made the trade-off of my privacy for my ease of use. Looking to build an AI server soon and take back control of all that. Also, the thing presented in this video has absolutely no use. From my understanding it's just a mic and a bad speaker with fancy LEDs and volume control. For the same price I'd recommend buying or repurposing an old tablet, setting up the HA app, and having fun with your own Google Hub/Echo Show (with admittedly a worse mic, but fixable at very low cost). This thing has absolutely no hardware power and relies on other hardware for a local LLM, or on the cloud like everything else; this is NOT at all an entry point to AI assistants like advertised. Don't hesitate to correct me on this or ask questions :)
Honestly, as an owner of 6 Amazon Echo Show 5 devices, the "shortcomings" of this device are exactly why I want it. I hate, hate, HATE when I want to turn a light on or off and my Echo wants to carry on a conversation with me. I want my Alexa devices to control my devices with basic commands and nothing more. There's no way to do this without the stupid thing trying to sell me crap at every turn. I want to 100% disable music streaming with Alexa and I can't. I paid quite a bit of money for these devices to barely do what I want them to, plus a whole host of extracurricular activities that I want no part of. Too bad they don't offer a higher-priced tier that lets me actually just control my devices. This device will do all I need it to do right out of the box: basic voice commands for devices like lights & fans for when I can't use automations. I can't wait! Viva freedom!!
Watched a bunch of your videos nonstop for a few months. Just ordered a Raspberry Pi 5 8GB with the SSD add-on board; in a few days I can start Home Assistant projects!
I built one of these with a Pi Zero and mic HAT. Fun and easy experience. Glad this is here now, as my home server is built and I was just about to order a bunch of Pi Zero 2s to start building these. I'm buying a house soon and gonna go nuts lol
The TTS and STT add-ons can be run on a separate device as Docker containers. I have mine set up to run on my Synology NAS in Synology's Container Manager application. That lets me run HA on dedicated hardware (HA Yellow) but also get more horsepower for the speech add-ons.
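For anyone wanting to replicate this split, here is a minimal docker-compose sketch of the speech add-ons as standalone Wyoming containers (the model and voice names are just common defaults, and the volume paths are placeholders; adjust both for your NAS):

```yaml
# Sketch: Whisper (STT) and Piper (TTS) off-loaded to another box.
# In Home Assistant, add two Wyoming integrations pointing at
# <nas-ip>:10300 (Whisper) and <nas-ip>:10200 (Piper).
services:
  whisper:
    image: rhasspy/wyoming-whisper
    command: --model tiny-int8 --language en
    ports:
      - "10300:10300"
    volumes:
      - ./whisper-data:/data
    restart: unless-stopped
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium
    ports:
      - "10200:10200"
    volumes:
      - ./piper-data:/data
    restart: unless-stopped
```

Swapping `tiny-int8` for a larger Whisper model trades speed for accuracy, which is the main reason to host these on beefier hardware in the first place.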
Same with the conversation agent. You can run that in the cloud, as shown. But you can also set up Ollama on a completely different system (PC) and let HA handle things with the limited agent (on/off/dim) while handing questions it can't figure out to the faster system as a fallback. Any system with a reasonable GPU can handle pretty complex models.
Nice video, I’m so excited for all the great hardware coming out for voice with HA. Most of the shortcomings can be fixed several ways - either through the intent scripts you tried or using a custom sentence trigger. But the easiest is an LLM integration, it really is great at understanding the context of your request.
I’m sure this thing is definitely going to improve rapidly. I remember when they released support for the M5 Stack Atom Echo. At first it was horrible and now it’s pretty damn good!
@@chrish6373 Absolutely! The evolution of voice control and smart home technology is truly impressive. It's amazing how quickly these systems can improve. The M5 Stack Atom Echo is a great example: early versions often have their quirks, but with continuous updates and community feedback, they can become incredibly efficient and reliable.
HA has a fallback option that lets it go to an LLM (ChatGPT or a local Ollama for instance) should the built-in voice functionality be unable to grasp what you want, perhaps that could help with making voice interpretation more natural. Voice on a Raspberry Pi is kind of a fool's errand, although a Pi 5 has enough horsepower to at least keep delays down to a second or two; you really want beefier gear.
I believe both Jeff Geerling and Network Chuck did some videos on how to set this up. They've also posted recent reviews of the HA Nabu Casa so they should be easy to find from there, as they mention doing exactly this. :)
@@FawziBreidi I built a conversation agent that happens to use the free version of Google Gemini, and has the personality of Marvin the Paranoid Android from Hitchhiker's Guide to the Galaxy. This is easy to do, since you can specify your own prompt prefix for the AI conversation agent. Marvin is great to chat with, even in the context of turning lights on and off, which he is sure to remind you is completely beneath him and his skills. I'm endlessly amused by it, though I suspect my wife would grow tired of it very quickly. But since you can specify different voice pipelines per voice assistant, I can have the one in my home office be infused with the Marvin (Genuine People) personality, while the others can be more boring.
@@FawziBreidi You can also do general questions locally with Ollama if you have a system with an older GPU lying around. I threw a Quadro M2000 into my HA box just this month to set up Ollama, which I'm talking with via an ESP32-S3-BOX (similar to this device). It gets most questions right, can keep context across a few questions, and replies within 2 to 5 seconds depending on the question. It can even control exposed entities if you use a model with tooling (like the llama3.2 model). That's a bit more setup, but it's quite snappy. What I'm waiting for next (or may work on this break) is the ability to specify multiple models, and enable either cascading or multiple parallel LLMs to try to get answers. Kind of like what was done with the very limited local model (on/off, etc.): if Ollama can't answer it, have it reply "I need to ask the internet" and then query ChatGPT. That's the ideal mix, since >80% can be done locally, and the things it can't do (current events, etc.) can use the internet. Far less reliance on third parties, and/or them getting my personal query/usage data that way.
Given your optimal use case, could a stop-gap video address updated information for actionable notifications in Home Assistant? As a new HA user, I love the capabilities but am struggling with outdated info and guides. Otherwise, great video!!
Love it Love it Love it! And, yes, I'm referring to your review vid. I agree that the hardware looks awesome, and the feature set is what I'm looking for. I currently have an Amazon Echo and/or an Alexa-enabled device (ecobees, etc) in every room. Nabu Casa exposes all HA devices to Alexa, and we use her purely for voice control and for whole home audio. So, this looks like a perfect substitution if/when needed (which might be soon depending on what the Alexa upgrade brings). But yes, the software needs to be updated. Like you, I have faith in the community and in Nabu Casa that this will happen in time.
So basically you're saying you need to straight up run it through your home-hosted GPT so that it understands normal sentences; that's really not a big deal.
It may be that average users would expect it to understand more than it does out of the box, but you can still make it recognize different commands. Give entities one or more aliases, and then it will know that you mean that one when you say it that way.
I am interested to see how this looks in another six months, and what advances they will make in the processing. Gonna keep an eye on this one. Would be a great thing for privacy conscious people with disabilities.
I've been playing with ESPHome the last few days with a Raspiaudio Muse Luxe speaker, and I agree that it is still quite dodgy. I'm certain that it will get loads better over time.
Yes. You can run Ollama in a Docker container and connect to it. Personally, I'd rather use the Windows version of Ollama than the Docker version: a simple executable file that notifies you of updates. I run Ollama and connect to it both from Open WebUI running in WSL and from Home Assistant running on a NUC. I might be moving everything bit by bit to Linux on my Proxmox server to run it side by side with Piper (TTS) and Faster-Whisper (STT), but I'll pick a simple self-updating Windows executable over yet another Docker container any day.
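For those who do take the container route, here is a minimal sketch of the Docker side (the volume name is arbitrary; Home Assistant's Ollama integration then points at the host on port 11434):

```yaml
# Sketch: Ollama in Docker. After it starts, pull a model with:
#   docker exec -it ollama ollama pull llama3.2
services:
  ollama:
    container_name: ollama
    image: ollama/ollama
    ports:
      - "11434:11434"               # API that Home Assistant connects to
    volumes:
      - ollama-models:/root/.ollama # keep downloaded models across restarts
    restart: unless-stopped
volumes:
  ollama-models:
```

The named volume is the important part: without it, every container rebuild re-downloads multi-gigabyte models.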
Yeah, if you add the LLM to it, then you won't have the problem of it not understanding what you said. I set up Ollama for it, and it's running on my PC, with Home Assistant on a Pi.
Could you use it to announce alerts? I.e., if you had a fridge sensor that said the fridge temp was above 4 degrees C, could this device audibly alert you or play an alert tone?
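In principle, yes: that's a standard Home Assistant automation. A minimal sketch, assuming a hypothetical `sensor.fridge_temp` and satellite entity (both names are made up, and the exact `assist_satellite.announce` schema may vary by HA version):

```yaml
# Sketch: announce on the voice satellite when the fridge runs warm.
automation:
  - alias: "Fridge above 4 degrees"
    trigger:
      - platform: numeric_state
        entity_id: sensor.fridge_temp        # hypothetical sensor
        above: 4
        for: "00:10:00"                      # ignore brief door-open spikes
    action:
      - service: assist_satellite.announce
        target:
          entity_id: assist_satellite.voice_pe  # hypothetical entity
        data:
          message: "Heads up: the fridge is above 4 degrees."
```

The `for:` delay is the useful trick here — it keeps a normal door-opening from triggering a false alarm.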
Do you know if this can be used to provide person tracking via BLE, either out-of-the-box or via modification (software and/or hardware)? Would be a nice way to add both capabilities around the house.
@@xitee4245 Could be they already have a Windows server PC set up and it was easier to add on using VirtualBox instead of starting over installing Proxmox. That's why I ended up going that route anyway.
I've been waiting for something like this for a long time! By the way, you are always the best when it comes to reviews and everything on the smart home subject 🔥🔥🔥🔥🔥🔥
At least software can be updated over time. The hardware seems decent and can’t be fixed post shipping. I’m confident they’ll be able to iterate on what’s driving you nuts today!
Thanks! A lot of things, like the software and how it acts, can be updated over time, especially as they get more users on it and more telemetry data — seeing people say commands like "play" instead of "resume" and then adding the logic for "if it's paused, play what was playing, or resume it."
Lots of potential. This version is probably not for me, but I'm liking the enthusiasm from the community. Hoping the guys figure out a way to make this work modularly with a local LLM, so you can have your HA Green be the central HAOS box and somewhere else be the LLM engine, since I think we're entering an era where LLM hardware is going to change a lot... essentially, let us pick our own "cloud". Agree with the take, and enjoyed the comedy.
Curious if you can add more than one of these devices to your smart home? I have 10 HomePods (Gen 2) and HomePod Minis scattered throughout my house. Can I do the same with the HA Voice PE?
Hi, I'm looking for a remote control button that looks like a light switch and works with HA. I would like it to turn some smart plugs on/off. Any suggestions?
Using this to run the basic voice assistant software doesn’t seem like it’s the special part of this. What makes it unique is the ability to easily use a local or cloud-based LLM to control your home. Then you can get true natural language control and responses.
I can't help but be curious about the app eventually gaining voice abilities of its own which will link back to the home assistant you have set up. Ultimately replacing google assistant and siri at that point, lol
Funny cuz that particular Echo didn't have a display either, not even the clock display. But… I don't understand the whole starting-up-a-conversation-with-you-randomly thing. I am kind of torn. I think a smart home should be in the background just doing its thing, but it should be able to do what you want. Personally I wouldn't want a smart home randomly suggesting to me what I want… or don't want. Preferences… maybe. But "hey, it's bright, should I shut the shades?" If you wanted them shut, wouldn't you just ask it?
I think it would be helpful in some cases and intrusive in others. Definitely a case by case basis and not for everyone. I could personally use a reminder like SHS set up mentioning it's nice outside, asking to open my shades if they're closed, and recommending I head outside to touch some grass. 😉 Or perhaps asking if I want to turn off the HVAC and open some windows. Something like that.
I have this semi set up with my Echos, by calling into the API to do an announcement. Just an example of things it does:
* Alerts me when the motion sensor/camera detects a person coming up my walk or driveway.
* Alerts me when a car is pulling into the driveway.
* Tells me when it detects flashing lights outside (e.g. cops/ambulance/fire at any of my neighbors).
* Reminds me to take the trash to the curb if the motion sensors on the cans haven't moved by 10pm.
* Nags me on occasion to go to bed if it's past 1am and I've not gone to bed yet.
* Reminds me of events on my social calendar.
* Alerts me if I go to bed while one of the outside doors is still unlocked.
I would love to have it offer to pull up video for the first two. Or ask about locking the locks, and allow me to just say yes/no. As it is, it alerts me, then I have to wake it again, tell it to do something, and wait for that action to happen. Same with the trash, in case it's a holiday and pickup is delayed a day. Those are just what I can think of off the top of my head. :)
It'd still be an upgrade to a normal automation process. Instead of automatically opening the window when the temperature reaches 28 degrees in the office, suggest it; you might not want to. And combined with a running LLM, it might even suggest it when you haven't automated it. Alexa also sometimes brings up suggestions during an interaction, even if those are probably based on human-scripted interactions rather than AI.
I wonder how this compares to that local way of doing it with a raspberry pi. The only issue I had was that the speaker wasn't that great. So I was thinking if something like this works fine, I could rip open my echos and steal the speakers and maybe case and it could work the same way with great speakers.
Is the function of the rotatable ring able to be changed? I'd much rather have physical controls for the dimming of a room light. Also, how are the microphones at picking you up from a distance?
It is changeable, and I would also want it as a dimmer. Apparently improving the microphones was one of the main focuses compared to the older ESP32 devices. From what they say it will hear you across the room over music.
Very interesting. Does the new piece of kit rely on VoIP? Home Assistant broke that facility (to speak to it using an analogue phone and a Grandstream box) for some of us using HAOS back in October and hasn't yet fixed it.
I’d like to know what specs are considered “powerful hardware”, which is a very vague designation for what I might need to run it entirely locally. I have a Dell Wyse 5070 mini PC that has a pretty low base clock speed but can get some pretty decent boost speed. It’s been running HomeAssistant OS very reliably but I’m not sure how intensive the voice assistant add-on will actually be on it.
For about the same price, you can get an ESP32-S3-BOX, which can do all that and has a programmable touch screen, a button or two, and a temperature sensor. That's what I have set up right now to semi-replace my Echo Dot, and I love it. If you have a PC running HA and can add an older GPU to it (Gen5.2 cards go for under $50 these days), you can run Ollama locally to do 80% of the "smart" things you want without internet. The only thing HA/Ollama can't really pull off right now is anything involving internet search or current events (where "current" means since you last updated the model you're running). I'm running on a lower-end (low-power) PC with a Quadro M2000 and Llama 3.2, and get really good replies to basic questions 95% of the time in under a couple of seconds. Anything in the past it knows well, like "Who was president of the US in 1987?" For current events you need to either prompt it well in HA, or ask "Who is the most recent pope you know of?" (vs. "Who is the current pope?", which it will generally not answer, not knowing how accurate its last known answer is).
@Ballissle From what I've seen... radically improved mics... but the XMOS chip inside is the magic... makes it a *way* better experience... I'm a contributor to the S3-BOX community firmware, so I've managed to get a decent experience from it... but this VPE will take HA's VA to a new level. If your primary use case is a desk touch screen, though? Sure... the S3-BOX is also excellent for that.
Heya! thanks for the review. Are there any other similar devices that work well with home assistant? I am looking to get rid of google home devices and place some microphone/speaker set-ups like this for commands only not listening to music or anything.
You definitely need to upgrade your Home Assistant to something more than a Raspberry Pi! I run mine as a virtual appliance on my four-node vCenter cluster of HP DL380 G9s, so it has eight logical processors but can draw on the clock speed of the 40 processors at 3.5 GHz. But even running it on an older computer, or hell, even a headless laptop, would be loads faster on a lot of the tasks you don't even think about. I have a fairly large Home Assistant install with a metric crap-ton of devices, and from your videos you do too. I would imagine the Raspberry Pi handles little tasks like sending commands and logging a bit slower, but you've grown used to it. I'll bet you'll be amazed after putting it on a much faster system. Even if you have an AV rack or something, you can get a small 1U server with a lower-end processor; they're relatively cheap and run Home Assistant like a champ!
If you want the full local LLM (LLLM?) experience, even that lower end CPU will not do. You'd want to run on a GPU and better models require more video RAM.
You can connect to Gemini for free and it fills in all the gaps for natural language interaction with the smart home. I realize this is not fully local, and actually goes against my philosophy, generally speaking. But in this case, I think it's a worthwhile trade-off until I have the hardware for a local LLM
Honest review, honest product name ("Preview Edition"). GREAT, INCREDIBLE PRODUCT. Not for everyone, probably not for me either, not now... but I think NOW is the right time to launch a product that says this IS POSSIBLE, it is doable. Now I expect a lot of work from enthusiasts and the community; in two or three years we'll see. I'm really happy (GAFAM may not be).
Your timing is insane! I already bought tons of Lenovo Smart Clocks because my parents were really interested; since they run Android, hopefully there's a way I can transform them into Assist speakers easily 🙂
I don't have a video link, but you can set up a rule in HA that checks something like the light level, compares it outside vs. inside, checks conditionals like shades open and TV on, then prompts an Alexa routine for her to TTS you the question. You'd likely have to reply with the wake word and a command for Alexa to do the thing; I'm unsure about the last part, since I require the wake word for responses. :)
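A sketch of such a rule, using the community alexa_media_player integration's notify service for the TTS part (all entity IDs and the notify target are hypothetical, and the lux threshold is arbitrary):

```yaml
# Sketch: when it's bright out, the shades are open, and the TV is on,
# have Alexa speak a suggestion via alexa_media_player's notify service.
automation:
  - alias: "Suggest closing the shades"
    trigger:
      - platform: numeric_state
        entity_id: sensor.outdoor_illuminance   # hypothetical sensor
        above: 20000                            # arbitrary lux threshold
    condition:
      - condition: state
        entity_id: cover.living_room_shades     # hypothetical cover
        state: "open"
      - condition: state
        entity_id: media_player.living_room_tv  # hypothetical player
        state: "on"
    action:
      - service: notify.alexa_media_living_room # hypothetical notify target
        data:
          message: "It's bright in here. Want me to close the shades?"
          data:
            type: tts
```

As noted above, the reply still needs the wake word — this only handles the announcement half of the conversation.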
I’ve been working on my own (super basic) version of this thing recently after I realised that the WLED controller my Dad and I designed could work as one (since we recently added a mic and used the ESP32s3). And it totally works… except the speech to text I’m using SUCKS, so it basically never actually does what I want. But no worse than Siri these days I guess 🤷♂️
I'd argue that the hardware and software are more distinct issues than is being duly considered here. When we hear media reports that Amazon is burning hundreds of millions of dollars a year on the back end of their Echo services with professional engineers, the shock that the Nabu Casa community only inches forward recedes quite a bit. This device competes with stuff like the ESP32-S3-Box, and while they aren't giving it away at $60USD, functionally it compares favourably with those more ad-hoc devices. It's long been clear that if we want to kick free from what big tech offers in this space, we are going to have to put in the work and, yeah, we are going to need patience.
The few Amazon Echos I still have set up are terrified of this thing. You can get one here: home-assistant.io/voice-pe
They don't need to worry. This thing will be great in 2032.
Beta is before Alpha, with Delta before that and Theta being the earliest development term.
The Delta stage is different from delta updates: the Delta stage is ready for testing, but not public testing. Delta updates push only the changed code, not whole chunks of unchanged code connected to the changed lines, saving a lot of time, processing power, and data!
12:48 Please can we get the link to that video. Thanks
Off topic, but I was digging through the info on this (your) channel and noticed it has nearly 100m views, CONGRATS!
Haha the whole video led up to that ending skit! Had me rolling.
Thanks Jeff! I was just watching your video you published today. Everything you said in your video completely hit home with me between the comments you get and the unused tech all around. We should meet up at CES if you're going!
Loved it!!
@@SmartHomeSolver Sadly, I'm skipping CES this year and heading to NAB with my Dad. But at some point, we shall meet again!
It's Jeff himself!
Hi, I watched your video and you were talking about how your house is actually smart because all the light switches still function. How did you do that so the smart home can still control the lights?
Basically it comes down to this: when it comes to controlling critical stuff in your home (heating, lights, doors, ...), it already works. When it comes to less critical stuff (playing media, getting stuff from the internet, etc.), it's not there yet. BUT: this means that you can get your own personal assistant, without internet, for controlling lots of stuff in your home by voice commands without relying on Google, Amazon, or others... Thinking about buying one for use in my own office at home. Knowing the HA community, it is just a matter of time before it becomes a competitor to the Google/Alexa stuff.
@reallordy I was thinking about purchasing an HA Green; been using SmartThings for years and looking for an upgrade. What are your thoughts on this?
@@suavethreads8904 do it. HA is the best.
@@suavethreads8904 You like sandbox games? Then yes. Otherwise no.
HA is lots of fun. Customisation: ALL you want, the way you want. You can integrate EVERYTHING.
But it would take you a lot of time, and it might not be pretty in the end, so you'll need to redo the dashboard multiple times before it's pretty in your opinion 😊
And config backups are on you. You didn't do a backup and the disk failed? All on you, no one else to blame.
Much more tinkering needed, much more freedom.
Only thing I haven't connected yet is the robot vacuum, because of laziness 😂
Network Chuck has a growing series about a local LLM. He even has a custom wake word, and voice!
If only he weren't ridiculously annoying to watch and listen to.
I'm happy to see The Community developing an alternative to those commercial versions. I will likely get one once it's refined a bit more (out of Beta maybe)? Thanks for previewing what is coming!
Just ordered mine and stoked to test it out. I switched from a Pi 4 to a mini PC earlier this year, and everything is much faster!
Awesome! Oh and that's great you already upgraded to a mini PC. I need to do that.
I use an Intel NUC and it's supa fast
@SmartHomeSolver the upgrade to mini pc is very worth it. It also opens up the temptation to add heaps of random servers just because you can.
N100 minis came into my life during the pi shortage. Such great little devices.
They're 8 times more expensive than a pi....of course they're faster
You're one of the only folks I've seen dive into the multiple button presses and colors, thanks. :)
12:16 Despite the limitations it still amazes me for its size. It can do a lot for its small size, maybe the next update or upgrade will fix all those limitations. The offline service is super awesome and the ability to play music on a large speaker is great too. It's more secure to keep your business offline. I'd love to have this exact home device for its size and capabilities. 😎💯💪🏾👍🏾
Just wanted to thank you for posting a seemingly honest review of this thing. This will obviously become a much better piece of hardware as time goes on, but I can't help but think that this was released early to get support and help from the community. I'll save my money and headaches for now and keep using my HomePods until this is at a level that makes sense, especially for the missus, who argues enough with Siri as it is, lol
I agree with this take. I have the esp box so I've been playing with voice on and off and honestly...it's really still not usable. With openWakeWord I only get successful detection about half the time (and that's only because I'm using the "alexa" wakeword and not "Ok Nabu" which hasn't worked once for me sadly). It also really struggles with some phrases, so I've had to set up aliases which aren't really correct because they're easier for it to decipher. As examples, "hall light" often gets recognised as "whole light" and "desk lamp" just goes all over the place :(
I hope they can pull it together and make it usable at some point. I like the idea of being able to use voice commands to control my house, but it's just nowhere close yet, sadly.
Absolutely love the sense of humour in this video.Great job.
You should definitely try pairing it with an LLM. There are cloud and local options. It has built in support for ChatGPT (cloud) and Ollama (local). There is also HomeLLM addon that gives you a lot more flexibility in what endpoints to connect to (it supports OpenAI compatible endpoints, so TextGen WebUI, llama-cpp server and such all work). And it allows you to control prompting. You can make your own system prompt template using jinja for it.
It sends all info about your house as context to the LLM, so then the LLM brains kick in and it is able to interpret many fumbly commands, even if TTS misinterprets some words, it can still understand what you meant.
The downside is that it's an LLM so it may hallucinate. But that depends on what LLM you are running.
Running all of this on an RPi is not a good idea; STT and TTS are neural networks too, so an RPi can't realistically handle them fast enough. With a decent GPU you can get it down to 1-2 seconds between request and reply, including a local LLM.
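For anyone curious what that Jinja prompting looks like, here's a minimal sketch of a system prompt template. The `exposed_entities` variable and the `prompt:` field name are illustrative, not the exact schema of any one integration; check the docs of whichever LLM integration you use for its actual template variables.

```yaml
# Illustrative system prompt template for an HA LLM conversation agent.
# Field/variable names are assumptions; adapt to your integration's schema.
prompt: |
  You are a voice assistant for Home Assistant.
  Keep replies short and speakable; never use markdown.
  Current time: {{ now().strftime('%H:%M') }}
  Devices you may control:
  {% for entity in exposed_entities %}
  - {{ entity.name }} ({{ entity.entity_id }}): {{ entity.state }}
  {% endfor %}
```

Feeding the entity list into the prompt like this is what lets the model recover from fumbled transcriptions: "whole light" is close enough to "hall light" for the LLM to pick the right entity.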
Thank you very much for your honest and down-to-earth review. Hearing others say that the speaker quality was actually good was quite painful, and I couldn't take the rest of their points seriously after that. Looking forward to more updates on this 🎉
Ya know, a lot of the voice limitations you mention can also be sorted out via custom sentences: just point the PE to do something in particular when those sentences are said.
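For reference, custom sentences live in YAML files under `config/custom_sentences/<language>/`. A minimal sketch that maps extra phrasings onto a built-in intent (the intent name follows HA's documented list, but verify it against the current intents reference; the filename is arbitrary):

```yaml
# config/custom_sentences/en/media.yaml
# Teach Assist that "play" can mean "resume" — one of the gaps
# mentioned in the video. [The] marks an optional word.
language: "en"
intents:
  HassMediaUnpause:
    data:
      - sentences:
          - "play [the] music"
          - "keep playing"
```

After a restart, saying "play the music" resolves to the same resume action as the stock "resume" phrasing.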
I moved HA from a container to HAOS installation on a mini-PC recently, an N100 with 16GB / 500GB was about $160, so really cheap. Blast the image down to the NVME and boot, it's really fast. If I decide later to go back to containers, I can use that machine, just re-image it to Fedora or the like and go.
If they support third-party LLMs, then it would make more sense to use one that is hyper-focused on simple home automation commands. You don't really need a 70-billion-parameter LLM just to automate the home; a small 1-million-parameter LLM trained on home automation commands would suffice.
Home Assistant can already support other LLMs, including ones you can run locally, like on the Ollama engine, which can run a bunch of free LLM models on your local hardware. The challenge is picking a model that's a good match for your hardware.
Nvidia just released a Jetson mini PC for that purpose; it's about $200, claims support for 70B-parameter models, and consumes only 25 W.
On the other hand, now that you can put an LLM as fallback and if you're not against using OpenAI models, the 4o-mini model with tool calling is outstanding and just fills the gap. Plus it is not really expensive.
Still have to update HA to 2024.12 and test it myself.
@@MrElphman it has shit VRAM; it's designed for edge AI, not for running LLMs
Home Assistant, Home Assistant, HOME ASSISTANT!
So much for the summoning ritual. I'm curious how long it takes for Paul to show up.
He's basically beetle juice in smart home form
Laughing in YAML
Hey, so here's my current setup for an AI assistant in HA, and my thoughts on the product:
For STT I use Google Cloud speech-to-text.
For commands I use GPT-4o-mini.
For TTS I use an ElevenLabs V2-Fast voice.
It's quite fast; the main slowdown is ElevenLabs.
It's completely free for basic usage if you use the Gemini API instead of GPT-4o-mini.
I tried Whisper, a local LLM, and local TTS, but with my server hardware it was quite slow and not precise enough, so I traded away my privacy for ease of use.
Looking to build an AI server soon and take back control of all that.
Also, the thing presented in this video has absolutely no use. From my understanding it's just a mic and a bad speaker with a fancy LED ring and volume control.
For the same price I'd recommend buying or repurposing an old tablet, setting up the HA app, and having fun with your own Google Hub / Echo Show (with admittedly a worse mic, but that's fixable at very low cost). This thing has absolutely no hardware power and relies on other hardware for a local LLM, or on the cloud like everything else; this is NOT an entry point to AI assistants as advertised.
Do not hesitate to correct me on this or ask questions :)
Honestly, as an owner of 6 Amazon Echo Show 5 devices, the "shortcomings" of this device are exactly why I want it. I hate, hate, HATE when I want to turn a light on or off and my Echo wants to carry on a conversation with me. I want my Alexa devices to control my devices with basic commands and nothing more. There's no way to do this without the stupid thing trying to sell me crap at every turn. I want to 100% disable music streaming with Alexa and I can't. I paid quite a bit of money for these devices, and they barely do what I want while piling on a whole host of extracurricular activities I want no part of. Too bad they don't offer a higher-priced option that just controls my devices. This device will do all I need right out of the box: basic voice commands for devices like lights and fans, for when I can't use automations. I can't wait! Viva freedom!!
Watched a bunch of your videos nonstop for a few months. Just ordered a Raspberry Pi 5 8GB with the SSD add-on board; in a few days I can start Home Assistant projects!
I built one of these with a Pi Zero and a mic HAT. Fun and easy experience. Glad this is here now, as my home server is built and I was just about to order a bunch of Pi Zero 2s to start building these. I'm buying a house soon and gonna go nuts lol
Beautiful ending! Well done sir.
The TTS and STT add-ons can be run on a separate device as Docker containers. I have mine set up to run on my Synology NAS in Synology's Container Manager application. That lets me run HA on dedicated hardware (HA Yellow) but also get more horsepower for the speech add-ons.
Same with the conversation agent. You can run that to the cloud, as shown. But you can also setup ollama on a completely different system (PC), and allow HA to handle the limited agent (on/off/dim) while giving questions it can't figure out to the faster system as a fall-back. Any system with a reasonable GPU can handle pretty complex models.
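As a sketch of the container setup described above, a docker-compose file along these lines works for offloading speech to a NAS or spare PC via the Wyoming protocol. The image names and ports below match the commonly used community containers, but treat the model and voice names as examples to swap for your own:

```yaml
# docker-compose.yml — Wyoming STT/TTS services on a separate machine.
# Point HA's Wyoming integration at <host>:10300 (STT) and <host>:10200 (TTS).
services:
  whisper:
    image: rhasspy/wyoming-whisper
    command: --model tiny-int8 --language en   # example model; larger = slower but more accurate
    ports:
      - "10300:10300"
    volumes:
      - ./whisper-data:/data
    restart: unless-stopped
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium       # example voice
    ports:
      - "10200:10200"
    volumes:
      - ./piper-data:/data
    restart: unless-stopped
```

This keeps the HA box itself (Yellow, Green, Pi) doing only orchestration while the heavier neural-network work runs where the CPU/GPU actually is.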
Nice video, I’m so excited for all the great hardware coming out for voice with HA.
Most of the shortcomings can be fixed several ways - either through the intent scripts you tried or using a custom sentence trigger. But the easiest is an LLM integration, it really is great at understanding the context of your request.
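A custom sentence trigger is just an ordinary automation. A minimal sketch, assuming a hypothetical `scene.movie_night` scene exists:

```yaml
# configuration.yaml-style automation with a sentence trigger.
# The scene entity is hypothetical; substitute your own.
automation:
  - alias: "Movie time by voice"
    trigger:
      - platform: conversation
        command:
          - "movie time"
          - "(start|it's) movie night"   # alternatives in one pattern
    action:
      - service: scene.turn_on
        target:
          entity_id: scene.movie_night
      - set_conversation_response: "Enjoy the film!"
```

The `set_conversation_response` action is what makes the device answer back instead of the default acknowledgement.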
damn that's fucking sick. I'm gonna look into this because a completely offline voice assistant is insane.
Sounds like something good to “tinker” with.
I’m sure this thing is definitely going to improve rapidly. I remember when they released support for the M5 Stack Atom Echo. At first it was horrible and now it’s pretty damn good!
Exactly! With some update it will be a good alternative for the ESP32-S3-BOX or M5 Stack Atom Echo
@@chrish6373 Absolutely! The evolution of voice control and smart home technology is truly impressive. It's amazing how quickly these systems can improve. The M5 Stack Atom Echo is a great example: early versions often have their quirks, but with continuous updates and community feedback, they can become incredibly efficient and reliable.
Subscribed. Thanks for keeping moving and keeping chill.
HA has a fallback option that lets it go to an LLM (ChatGPT or a local Ollama for instance) should the built-in voice functionality be unable to grasp what you want, perhaps that could help with making voice interpretation more natural. Voice on a Raspberry Pi is kind of a fool's errand, although a Pi 5 has enough horsepower to at least keep delays down to a second or two; you really want beefier gear.
Is it possible to ask random questions and have a conversation with it?
@@FawziBreidi With the cloud LLM fallbacks (chatGPT), I think yes
I believe both Jeff Geerling and Network Chuck did some videos on how to set this up. They've also posted recent reviews of the HA Nabu Casa so they should be easy to find from there, as they mention doing exactly this. :)
@@FawziBreidi I built a conversation agent that happens to use the free version of Google Gemini, that has the personality of Marvin the Paranoid Android from Hitchhiker's Guide to the Galaxy. This is easy to do since you can specify your own prompt prefix for the AI conversation agent. Marvin is great to chat with, even in the context of turning lights on and off, which he is sure to remind you is completely beneath him and his skills. I'm endlessly amused by it; though I suspect my wife would grow tired of it very quickly. Though since you can specify different voice pipelines per voice assistant, I can have the on in my home office be infused with the Marvin (Genuine People) personality, while the others can be more boring.
@@FawziBreidi You can also do general questions locally with Ollama if you have a system with an older GPU lying around. I threw a Quadro M2000 into my HA box just this month to set up Ollama, which I'm talking with via an ESP32-S3-BOX (similar to this device). It gets most questions right, can keep context across a few questions, and replies within 2 to 5 seconds depending. It can even control exposed entities if you pick a model with tool calling (like the llama3.2 model). That's a bit more setup, but it's quite snappy.
What I'm waiting for next (or may work on over this break) is the ability to specify multiple models and enable either cascading or parallel LLMs to try to get answers, kind of like what was done with the very limited local model (on/off, etc.). If Ollama can't answer, have it reply "I need to ask the internet" and then query ChatGPT. That's the ideal mix, since >80% can be done locally, and for the things it can't do (current events, etc.) it can use the internet. Far less reliance on third parties, and less of them getting my personal query/usage data that way.
Kinda bummed it doesn't have a POE option. Maybe in the future.
It does
A gigabit USB-C PoE splitter works: 5 V / 2.4 A, IEEE 802.3af standard, 10/100/1000 Mbps.
@@clif88 That's just for power. Having true PoE would mean no need to use WiFi at all. This would be the preferred method for me as well.
they honestly wouldn't look bad in-wall either.
The assistant battles never get old 😂
Given your optimal use case, could a stop-gap video cover updated information for actionable notifications in Home Assistant? As a new HA user, I love the capabilities but am struggling with outdated info and guides. Otherwise, great video!!
Love it Love it Love it! And, yes, I'm referring to your review vid.
I agree that the hardware looks awesome, and the feature set is what I'm looking for. I currently have an Amazon Echo and/or an Alexa-enabled device (ecobees, etc) in every room. Nabu Casa exposes all HA devices to Alexa, and we use her purely for voice control and for whole home audio. So, this looks like a perfect substitution if/when needed (which might be soon depending on what the Alexa upgrade brings).
But yes, the software needs to be updated. Like you, I have faith in the community and in Nabu Casa that this will happen in time.
Loved the voice assistant battle at the end!
@SmartHomeSolver can you do a video on how to link 2 home assistant instances together. So you can see details of a vacation home or RV
So basically you're saying you need to run it through your home-hosted GPT so that it understands normal sentences; that's really not a big deal.
It may be that average users would expect it to understand more than it does out of the box, but you can still make it recognize different commands.
Give entities one or more aliases, and then it will know which one you mean when you phrase it that way.
I am interested to see how this looks in another six months, and what advances they will make in the processing. Gonna keep an eye on this one. Would be a great thing for privacy conscious people with disabilities.
The ending was the best 😂
I'd really like to see a comparison video between this and the S3 Box. What are the pros and cons of each?
I've been playing with ESPHome the last few days with a Raspiaudio Muse Luxe speaker, and I agree that it is still quite dodgy. I'm certain that it will get loads better over time.
Can you run the LLM in a Docker container locally on a more powerful machine? That would be useful, given that the Home Assistant Yellow is low-powered.
Yes. You can run Ollama in a Docker container and connect to it. Personally I'd rather use the Windows version of Ollama than the Docker version: a simple executable that notifies you of updates. I run Ollama and connect to it both from Open WebUI running in WSL and from Home Assistant running on a NUC. I might move everything bit by bit to Linux on my Proxmox server to run it side by side with Piper (TTS) and Faster-Whisper (STT), but I'll pick a simple self-updating Windows executable over yet another Docker container any day.
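For anyone who does want the Docker route, a minimal compose sketch for an Ollama server that HA's Ollama integration can point at (11434 is Ollama's default port; GPU passthrough flags are omitted and depend on your hardware):

```yaml
# docker-compose.yml — standalone Ollama server for HA to talk to.
# Configure the HA Ollama integration with http://<this-host>:11434
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_models:/root/.ollama   # persist downloaded models
    restart: unless-stopped
volumes:
  ollama_models:
```

Pull a model once with `docker exec -it <container> ollama pull llama3.2`, then select it in the integration.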
Yeah, if you add the LLM to it, you won't have the problem of it not knowing what you said. I set up Ollama for it, and it's running on my PC, with Home Assistant on a Pi.
Could you use it to announce alerts? E.g., if you had a fridge sensor that said the fridge temp was above 4 °C, could this device audibly alert you or play an alert tone?
Yes, because it also works as a normal speaker. You can play audio and text-to-speech.
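As a sketch, the fridge alert described above is an ordinary automation that speaks through the device with `tts.speak`. The sensor and entity IDs here are hypothetical; substitute the ones from your own setup:

```yaml
# Announce on the voice device when the fridge runs warm.
# sensor.fridge_temperature and media_player.voice_pe are hypothetical IDs.
automation:
  - alias: "Fridge too warm"
    trigger:
      - platform: numeric_state
        entity_id: sensor.fridge_temperature
        above: 4
        for: "00:10:00"   # ignore brief door openings
    action:
      - service: tts.speak
        target:
          entity_id: tts.piper   # or whichever TTS engine you use
        data:
          media_player_entity_id: media_player.voice_pe
          message: "Heads up: the fridge is above 4 degrees."
```

Swap the `tts.speak` action for `media_player.play_media` with a sound file if you'd rather have an alert tone than speech.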
Do you know if this can be used to provide person tracking via BLE, either out-of-the-box or via modification (software and/or hardware)? Would be a nice way to add both capabilities around the house.
I just migrated from Raspberry Pi 3 to VirtualBox instance on my server PC and it's night and day. Highly recommend
What's the reason why you are using VirtualBox on your server and not Proxmox (Linux-based OS specifically for VMs)?
@@xitee4245 Could be they already have a Windows server PC set up and it was easier to add on using VirtualBox instead of starting over installing Proxmox. That's why I ended up going that route anyway.
That ending was so priceless 😂😂
Can it trigger a routine when the timer ends at least without having to run it through IFTTT? I miss my visual light cue. :)
I've been waiting for something like this for a long time! By the way, you are always the best when it comes to reviews! and everything in smart home subject🔥🔥🔥🔥🔥🔥
What if you install Home Assistant on Jetson Nano (super) and then use a local LLM to respond back instead of sending it to ChatGPT?
At least software can be updated over time. The hardware seems decent and can’t be fixed post shipping. I’m confident they’ll be able to iterate on what’s driving you nuts today!
Actually, the hardware has an extension port underneath, so to some degree it can be extended after shipping.
Thanks! A lot of things, like the software and how it acts, can be updated over time, especially as they get more users and more telemetry data. They'll see people saying commands like "play" instead of "resume" and can then add the logic: if it's paused, play what it was playing, i.e. resume it.
Lots of potential. This version is probably not for me, but I'm liking the enthusiasm from the community.
Hoping the guys figure out a way to make this work modularly with a local LLM, so your HA Green can be the central HAOS box with the LLM engine somewhere else. I think we're entering an era where LLM hardware is going to change a lot... essentially, let us pick our own "cloud".
Agree with the take and enjoy the comedy
Curious if you can add more than one of these devices to your smart home? I have 10 HomePods (Gen 2) and HomePod Minis scattered throughout my house. Can I do the same with the HAV PE?
Pls clip the ending skit as shorts. It’s gonna print you money lol
Hi, I'm looking for a remote control button that looks like a light switch and works with HA. I would like it to turn some smart plugs on/off. Any suggestions?
Using this to run the basic voice assistant software doesn’t seem like it’s the special part of this. What makes it unique is the ability to easily use a local or cloud-based LLM to control your home. Then you can get true natural language control and responses.
If they carry on with this item, it could become something special.
This has got a lot of potential
You can do full local if you want... Or you can add chatgpt as an assistant and blow Alexa out of the water.
I got Ollama running and it can do more
That last part was funny :D
I can't help but be curious about the app eventually gaining voice abilities of its own, linking back to the Home Assistant you have set up and ultimately replacing Google Assistant and Siri at that point, lol
Funny, cuz that particular Echo didn't have a display either, not even the clock display. But... I don't understand the whole starting-a-conversation-with-you-randomly thing. I'm kind of torn. I think a smart home should be in the background just doing its thing, but it should be able to do what you want. Personally I wouldn't want a smart home randomly suggesting what I want... or don't want. Preferences, maybe. But "hey, it's bright, should I shut the shades?" If you wanted them shut, wouldn't you just ask?
I think it would be helpful in some cases and intrusive in others. Definitely a case by case basis and not for everyone. I could personally use a reminder like SHS set up mentioning it's nice outside, asking to open my shades if they're closed, and recommending I head outside to touch some grass. 😉 Or perhaps asking if I want to turn off the HVAC and open some windows. Something like that.
I have this semi setup with my echos, by calling into the API to do an announcement. Just an example of things it does:
* Alerts me when the motion sensor/camera detects a person coming up my walk or driveway.
* Alerts me when a car is pulling into the driveway.
* Tells me when it detects flashing lights outside (eg cops/ambulance/fire are at any of my neighbors)
* Reminds me to take the trash to the curb if the motion sensors on them have not moved by 10pm.
* Nags me on occasion to go to bed if it's past 1am and I've not gone to bed yet.
* Reminds of events on my social calendar.
* Alerts me if I go to bed while one of the outside doors is still unlocked.
I would love to have it offer to pull up video for the first two. Or ask about locking the locks, and allow me to just say yes/no. As it is, it alerts me, then I have to wake it again, tell it to do something, and wait for that action to happen. Same with the trash, in case it's a holiday and delayed a day. Those are just what I can think of off the top of my head. :)
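For anyone wanting to replicate something like the trash reminder above, here's a sketch using the third-party Alexa Media Player integration's notify service (as the comment describes "calling into the API to do an announcement"). The bin sensor and the exact notify service name are hypothetical; adapt to your own setup:

```yaml
# Nag at 10pm if the bins haven't moved today.
# binary_sensor.trash_bin_moved_today is a hypothetical helper/sensor;
# notify.alexa_media comes from the Alexa Media Player (HACS) integration.
automation:
  - alias: "Trash reminder at 10pm"
    trigger:
      - platform: time
        at: "22:00:00"
    condition:
      - condition: state
        entity_id: binary_sensor.trash_bin_moved_today
        state: "off"
    action:
      - service: notify.alexa_media
        data:
          message: "The trash hasn't gone out to the curb yet."
          data:
            type: announce
```

On the Voice PE itself, the same automation would just swap the notify action for a `tts.speak` call.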
It'd still be an upgrade over a normal automation. Instead of automatically opening the window when the office hits 28 degrees, it could suggest it; you might not want it open. And combined with a running LLM, it might even suggest things you haven't automated. Alexa also sometimes brings up suggestions during an interaction, even if they're probably based on human-scripted interactions rather than AI.
Would it be possible to use two of these as a wifi intercom?
I wonder how this compares to that local way of doing it with a raspberry pi. The only issue I had was that the speaker wasn't that great. So I was thinking if something like this works fine, I could rip open my echos and steal the speakers and maybe case and it could work the same way with great speakers.
I am curious about its performance on their Green Home Assistant server box.
Would the Jetson nano super be good for home assistant voice?
Is the function of the rotatable ring able to be changed? I'd much rather have physical controls for the dimming of a room light. Also, how are the microphones at picking you up from a distance?
It is changeable, and I would also want it as a dimmer. Apparently improving the microphones was one of the main focuses compared to the older ESP32 devices. From what they say it will hear you across the room over music.
What’s the blue lamp behind you
Very interesting. Does the new piece of kit rely on VoIP? Home Assistant broke that facility (to speak to it using an analogue phone and a Grandstream box) for some of us using HAOS back in October and hasn't yet fixed it.
@1:43 Hilarious. Expect some weird guy to install this as a default response in the next software edition for the HA Voice Preview edition 🤣
Was that agent cooper from twin peaks i saw? Nice!👍
How does this device compare to the FutureProofHomes Satellite1 device?
Hm, for 60.- I get an Echo Dot with a rather good speaker. Having to add a proper speaker is a hassle. Make it bigger, with a proper speaker?
I’d like to know what specs are considered “powerful hardware”, which is a very vague designation for what I might need to run it entirely locally.
I have a Dell Wyse 5070 mini PC that has a pretty low base clock speed but can get some pretty decent boost speed. It’s been running HomeAssistant OS very reliably but I’m not sure how intensive the voice assistant add-on will actually be on it.
For about the same price, you can get an ESP32-S3-BOX, which can do all that and has a programmable touch screen, a button or two, and a temperature sensor. That's what I have set up right now to semi-replace my Echo Dot, and I love it.
If you have a PC running HA, and can add an older GPU to it (Gen5.2 go for under $50 these days), you can run ollama locally to do 80% of the "smart" things you want to do now without internet. The only thing HA/ollama can't really pull off right now are things involving internet search or current events. (Where "current" is since you last updated the model you're running.)
I'm running on a lower-end (low-power) PC with a Quadro M2000 and Llama 3.2 (via Ollama), and get really good replies to basic questions 95% of the time, in under a couple of seconds. Anything in the past it knows well, like "Who was president of the US in 1987?" For current events you need to either prompt it well in HA, or ask "Who is the most recent pope you know of?" (vs. "Who is the current pope?", which it generally won't answer, not knowing how accurate its last known answer is).
Ending lol 😂
Can it run a local LLM like an ollama server? I want to have a normal conversation like a chatbot and not just control devices. Is that possible?
"hey babe" 🤣🤣🤣🤣🤣🤣🤣🤣
Ordered mine within the first minute... looking forward to it replacing my s3-box.
I was considering the s3-box. How do they compare? I would have thought the s3-box would be better as it also can have a screen
@Ballissle From what I've seen: radically improved mics, but the XMOS chip inside is the magic; it makes for a *way* better experience. I'm a contributor to the S3-BOX community firmware, so I've managed to get a decent experience from it, but this VPE will take HA's VA to a new level. If your primary use case is a desk touch screen, though? Sure, the S3-BOX is also excellent for that.
0:43 How did you do that in a video editor? That's crazy that your hand must be perfectly still and perfectly centered!!!
Look a bit closer. You can see the jump cut.
Heya! thanks for the review. Are there any other similar devices that work well with home assistant? I am looking to get rid of google home devices and place some microphone/speaker set-ups like this for commands only not listening to music or anything.
the ending soooo goood 🤣
You definitely need to upgrade your Home Assistant to something more than a Raspberry Pi! I run mine as a virtual appliance on my four-node vCenter cluster of HP DL380 G9s. It has eight logical processors, but can draw on the clock speed of 40 cores at 3.5 GHz.
But even running it on an older computer, or hell, even a headless laptop, would be loads faster on a lot of the tasks you don't even think about. I have a fairly large Home Assistant install with a metric crap-ton of devices, and from your videos you do too. I imagine little tasks like sending commands and logging run slower on the Raspberry Pi, but you've grown used to it. I'll bet you'll be amazed after putting it on a much faster system. Even if you have an AV rack or something, you can get a small 1U server with a lower-end processor; they're relatively cheap and run Home Assistant like a champ!
If you want the full local LLM (LLLM?) experience, even that lower end CPU will not do. You'd want to run on a GPU and better models require more video RAM.
1:49 the crossover I didn't know I needed
Do you get more feature support if you enable AI support?
You can connect to Gemini for free and it fills in all the gaps for natural language interaction with the smart home. I realize this is not fully local, and actually goes against my philosophy, generally speaking. But in this case, I think it's a worthwhile trade-off until I have the hardware for a local LLM
Honest review, honest product name ("Preview Edition"). GREAT, INCREDIBLE PRODUCT. Not for everyone, probably not for me either, not now... but I think NOW is the right time to launch a product that says IT IS POSSIBLE, it's doable. Now I expect a lot of work from enthusiasts and the community; in two or three years we'll see.
I'm really happy (GAFAM, maybe not).
Your timing is insane! I already bought a ton of Lenovo Smart Clocks because my parents were really interested; since they run Android, hopefully there's a way I can transform them into Assist speakers easily 🙂
That would be sweet! Since all of this is open source it should be possible.
@@SmartHomeSolver Will definitely try doing so in the future!
Hahaha, I also got a bunch of Lenovo devices for the same sort of thing. Now I just need to find the time for some tinkering . . .
Can it be shipped to a Canadian address?
And I hope you had good holidays.
Hi, how can I make Alexa ask me whether to run a certain automation (12:42)? Can you share a link to that video?
I don't have a video link, but you can set up a rule in HA that basically checks something like the light level, compares it between outside vs inside, checks conditionals like shades open and TV on, then prompt an Alexa routine for her to TTS you the question. You'd have to likely reply with the wake word and command for Alexa to do the thing. I'm unsure about the last part since I require the wake word for responses. :)
I’ve been working on my own (super basic) version of this thing recently, after I realised that the WLED controller my Dad and I designed could work as one (since we recently added a mic and used the ESP32-S3). And it totally works… except the speech-to-text I’m using SUCKS, so it basically never actually does what I want. But no worse than Siri these days, I guess 🤷‍♂️
Home assistant should fire this to the pipeline so your effort is not in vain.
Now let’s see one for Hubitat.
ReSpeaker Lite with XIAO ESP32S3?
I would like to see an integration to hubitat!
I'd argue that the hardware and software are more distinct issues than is being duly considered here. When we hear media reports that Amazon is burning hundreds of millions of dollars a year on the back end of their Echo services with professional engineers, the shock that the Nabu Casa community only inches forward recedes quite a bit. This device competes with stuff like the ESP32-S3-Box, and while they aren't giving it away at $60USD, functionally it compares favourably with those more ad-hoc devices. It's long been clear that if we want to kick free from what big tech offers in this space, we are going to have to put in the work and, yeah, we are going to need patience.
The end of the video is FIRE! Loved it.
Pre-Alpha comes before alpha….. this is why anyone with a camera shouldn’t be allowed to do reviews…..
Great review! I've been eager to see this device come out. Thanks again!
Matter smart hub is better like the Google TV Streamer... 💯