Only recently discovered your channel, and it is SOOO unique! Looking forward to more embedded+AI videos, keep it up!
I'm doing the same thing on a rpi4 with no tpu using opencv and facial detection, and I can't tell the difference. I wish these videos would be more measured with clear demonstrations. Looks choppy.
I get a laggy session on my Pi 4B 4GB trying an open source version of Alexa. Yeah, I can tell you the difference.
For facial detection and stuff like that, invest in an NPU
Back in 2017, RPI 3b. I used it to do basic machine learning on the edge. It was capable of simple stuff for even (basic) image processing.
Did I do it all on the edge? Hell no. After doing what it could I transmitted data to the backend.
But it did the easy stuff like image filters, color inversion, even some Gaussian work. Amazing to see where things have progressed.
As far as I've researched, you can't get the dual core TPU working, since the Pi 5 PCIe socket is single lane 2.0. Only one of the cores will show up. You can, though, mess with config files to get 3.0 speeds. If anyone finds the single lane limitation to be false, let me know, because that's exactly what I had bought and hoped for.
PCIe hubs exist, and are relatively cheap
It is possible to design a board that alternates between the two chips over a 1x lane. If you operated that at 3.0, you would just get 2.0 speeds for both of the chips. I think. Same way you can have a bunch of usb ports from one pcie lane. They just compete with each other.
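A rough back-of-the-envelope check of the lane-sharing point above, as a minimal sketch. The transfer rates and line encodings are the standard PCIe 2.0 and 3.0 figures; the even split between two chips is an assumption (an idealized switch with no overhead), and the helper names are made up for illustration.

```python
# Per-lane PCIe throughput after line encoding only (no packet overhead).
# (transfer rate in transfers/s, encoding efficiency)
GEN = {
    2: (5.0e9, 8 / 10),     # PCIe 2.0: 5 GT/s, 8b/10b encoding
    3: (8.0e9, 128 / 130),  # PCIe 3.0: 8 GT/s, 128b/130b encoding
}


def lane_mb_per_s(gen: int) -> float:
    """Usable MB/s for one lane of the given PCIe generation."""
    rate, efficiency = GEN[gen]
    return rate * efficiency / 8 / 1e6  # bits -> bytes -> megabytes


def shared_mb_per_s(gen: int, devices: int) -> float:
    """Assumption: the lane is split evenly between the devices."""
    return lane_mb_per_s(gen) / devices


if __name__ == "__main__":
    print(f"Gen2 x1, one chip:  {lane_mb_per_s(2):.0f} MB/s")
    print(f"Gen3 x1, one chip:  {lane_mb_per_s(3):.0f} MB/s")
    print(f"Gen3 x1, two chips: {shared_mb_per_s(3, 2):.0f} MB/s each")
```

Under those assumptions, two chips sharing one Gen 3 lane each end up with slightly less than a full Gen 2 lane, which matches the "2.0 speeds for both" guess.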
It seems impossible to use. I wish I had researched before buying one.
@@timjenkinson26 clock reference issue?
Awesome setup! I wonder how it does with LLM inference. Could you try running ollama and see if the accelerator makes any difference? Idk if the Coral has the matrix multiplication abilities.
I have a Pi 5 8GB and get around 1.4 tokens/sec with 7B LLMs on llama.cpp running 4-bit quantized models (Q4_K_M.gguf's from TheBloke on Hugging Face) on an SD card. On SSD it's ~1.5 and on Pimoroni's NVMe Base it's ~1.7. I've tried ollama a couple of times and it seems to be way slower, but that could also be from a mistake I've made. I set it up with a 16GB swap and ollama keeps putting half the model in swap.
@@geobot9k How do you go about downloading a .gguf model from HuggingFace? I cannot find a big obvious download button lol. I would like to use something like this for something like GPT4All. I just started with AI for classes so I am very new to all of this.
You can't. The Coral has 1 GB RAM, so you will need a model that is very small.
Edit: there are new models with 2 and 4 GB now
@@G.Seuros I ran across TinyLlama last night and its Q5_K_M.gguf is 785MB. Could be worth a shot
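On the .gguf download question above: on a Hugging Face model page the individual files are listed under the "Files and versions" tab, each with its own download link. Those links follow a `resolve/<revision>/<filename>` pattern, so a direct URL can be built by hand. A minimal sketch; the repo and file names in the example are only illustrative and should be checked against the actual model page.

```python
from urllib.parse import quote


def gguf_download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct download URL for one file in a Hugging Face model repo."""
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision)}/{quote(filename)}"
    )


if __name__ == "__main__":
    # Illustrative names only; verify them on the model page before downloading.
    print(gguf_download_url(
        "TheBloke/Llama-2-7B-Chat-GGUF",
        "llama-2-7b-chat.Q4_K_M.gguf",
    ))
```

The printed URL can be handed to a browser, `wget`, or `curl -L`. If the `huggingface_hub` Python package is installed, its `hf_hub_download(repo_id=..., filename=...)` function does the same job and caches the file locally.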
At CES 2024, new NPUs for edge devices were introduced. They use PCIe Gen 3 with M.2. The first one is the MemryX MX3 Edge AI Accelerator and the other is the Kinara Ara-2.
You are using the wrong preset. The Raspberry Pi 5 can only hardware-decode HEVC (H.265), so if you add more cameras it is likely your container will crash even with the TPU.
Thank you for the great video! Is there a way to adapt both nvme and coral tpu m2 by using two HATs?
No, there's only one PCIe lane
And is it possible to have an SSD and a USB adapter for an external Coral USB, all in one case?
@@galdakaMusic It's definitely worth making such a case.
Pineberry Pi now offers a HAT with 2 slots, Key E and Key M, so you can use both despite only 1 lane.
@@distiking You sure it's possible to use both slots at the same time? I'm not so sure
Nice! At that price, what about an array of accelerators hooked up to the Pi.
Maybe work with some company to put together a kit (for home security).
Person can program to their liking, but you give them a nice standard implementation.
Multiple cameras recording to multiple Pis (with storage attached) would be good security.
They'd have to find all the Pis.
Cameras need to be wireless (separated from the Pi) so you can hide the Pis away real well.
Can you run ollama with a llama2-uncensored model on it, please?
+100
I think he'd already run an ollama model on them.
No tpu.
I think the performance seemed quite good.
Look for "running llm on rpi5" video from him.
It looks like you're using a card with a single edge. There's nothing on the Pineberry site that specifically says whether it works with the dual Edge model... do you know whether it does? I'm thinking the fact that it's E-keyed suggests it would...
No way!
Just discovered your channel.
Gained a new subscriber.
I was wondering, can you install your setup in an RC car,
with the GPT controlling it autonomously?
Thanks again for your vids 🤙
More info about this chip? Is it customizable?
Cool video, have you tried the hailo chip tho?
Can a TPU assist with video editing? E.g. record multiple takes and have it cut them into 1 best version; suggest images to include; find stock footage; create transitions, effects, text, etc.
I wish the Coral TPU could accelerate Home Assistant's voice assistant. I'm really not interested in Frigate, but there does not seem to be much more I could use the TPU for in my cloudless smart home. Or do you have any more ideas?
wouldn't that make it... Raspberry PAI?
frigate is pretty cool, i set it up at home, but still have trouble detecting some events, especially animals
I've got a Jetson Nano and just got an RPi 5. Can these two be combined to work together?
Would I be able to run Mistral 7B as a casual chatbot with short responses (with 4 seconds max time to first token or so) on a desktop using the Coral AI USB accelerator, or even two?
If you could test it on your setup, I'd be very thankful. Pls respond :>
Your safest bet is to use an Nvidia graphics card with enough memory to fit the LLM that you want to use.
For a 7b parameter LLM with 4bit, you should be able to make do with 8GB of VRAM.
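The 8GB figure above can be sanity-checked with simple arithmetic: weight memory is roughly parameter count times bits per weight, divided by 8. A minimal sketch; the 1.5 GB overhead for KV cache and runtime buffers is an assumed placeholder rather than a measured value, and real 4-bit formats usually store slightly more than 4 bits per weight.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """overhead_gb is an assumed allowance for KV cache and buffers."""
    return weights_gb(params_billions, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    print(f"7B at 4-bit:  {weights_gb(7, 4):.1f} GB of weights")
    print(f"7B at 16-bit: {weights_gb(7, 16):.1f} GB of weights")
    print("7B at 4-bit fits in 8 GB:", fits_in_vram(7, 4, 8))
```

With these assumptions a 4-bit 7B model needs about 3.5 GB for weights and leaves headroom on an 8GB card, while the same model at 16-bit does not fit.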
Thanks, do you think that the rtx 3070 would have a decent time to first token?
@@_IamKnight It is a good card. I can't say how well it will work for your specific needs. But many seem to get faster answers from their locally run LLMs than they get from paying for access to ChatGPT.
Hi, what is the camera you used for your content? Is it a Canon?
Name and model?
Thanks
Is it possible to run LLMs locally on it??
Is an RPi 5 with a Coral M.2 and an SSD possible?
This is great. Can the Coral TPU on the Pi improve or allow for better license plate recognition?
ALPR would be the #1 reason I’d implement a video classifier.
Hypothetically, yes it should be faster. TPU is like a GPU but built specifically for processing tensors/AI.
Excellent instruction. I just received my m.2 tpu a+e key. Currently running frigate in an Hp800 elitedesk core i5 with Debian 12 linux mint edition. I'm not very good with linux. Can I run your script to install and setup the coral tpu? I'm going to install the tpu in place of the wifi board. Any mod I need to add to the frigate docker compose? Thanks in advance!
Is this the only use case for this device? Can the AI be used for object detection for robots using ROS?
Have you updated your Patreon download? The gist you mention in the video no longer works on the latest RPi 5 images.
I'm not in the frame
Very interesting. Could be a nice replacement for Synology Surveillance Station. My cameras are set up for movement detection. In practice this means they are recording all the time because of a spider web in front of the camera, or water, or bugs. These cameras attract all kinds of bugs. So it's pretty useless right now.
Where can you actually buy one of these for $25?
I can find no-one who has them in stock selling them at list price.
This is the content we want.
Does it work with Ollama?
Having simply stumbled across this video on my feed, I struggled to understand what the AI was doing, or what the purpose was of doing this.
Maybe it was at 1:11 and its purpose is to record when someone is on camera?
How can I use it to track an object and output to two servo motors to act as a pan-tilt? How many fps can I get? Thanks
_"you can run up to ten cameras"_
This is the literal definition of *"NOICE"*
Hi. Thanks for the video. But I have a question before I kill another running system on my Pi 5. Could you please tell me if the "sudo tee -a /boot/config.txt" works on Ubuntu Server 24.04? Because there is no config.txt in /boot; instead it is in /boot/firmware/config.txt. Also, when I tried this change before in /boot/firmware/config.txt, it just killed my system by passing "kernel=kernel8.img". I don't know why, but after reboot the LED indicated kernel not found.
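One way to avoid appending to the wrong file is to check which config.txt actually exists before editing anything. A minimal sketch that only looks and changes nothing; the two candidate paths are the ones mentioned in the comment above, and `find_config` is a hypothetical helper name. Checking /boot/firmware first is a deliberate choice, since on systems that keep the real file there, /boot/config.txt may be absent or a placeholder.

```python
from pathlib import Path
from typing import Iterable, Optional

# Candidate locations, newer layout first.
CANDIDATES = ("/boot/firmware/config.txt", "/boot/config.txt")


def find_config(candidates: Iterable[str] = CANDIDATES) -> Optional[Path]:
    """Return the first candidate that exists as a regular file, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_config()
    if found is None:
        print("No config.txt found in the usual places; not touching anything.")
    else:
        print(f"Boot config lives at: {found}")
        print("Back it up before editing, e.g. copy it to config.txt.bak")
```

Keeping a backup copy next to the original makes it possible to restore the file from another machine if the Pi stops booting after a change.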
The USB 3 ports are faster than the rpi4 thanks to the RP1 chip, they might provide more power too.
Just found your channel, and as an ML engineer this is great stuff! I've only recently been looking into integrating ML into a Raspberry Pi, so this is very useful! I don't know if you have a video out on it now, but have you tried integrating one of these optimised AIs into a Django/Flask/FastAPI framework on a Raspberry Pi, so we can interact with a locally run API from an external device? And if so, how was the performance? It would be useful for having a quick app that can find information in trained documents.
What types of models can be run with this? Can it speed up LLMs?
LLMs are bandwidth limited on parallel processors, so having the faster 3.0 PCIe connection will be a speedup over any other Pi solution. Maybe like a couple of tokens a second on 7B models, which could be usable. The 7B models are not usable like ChatGPT, though. They can do something like sentence completion prediction, maybe. This is better for speech to text or object detection.
1:42 and a raspberry pi 5 130.-
Do you think it's possible, with a little bit of code, to use a wide-angle camera in the right spot in a room to get not only a people presence sensor, but also something like zone presence like the Aqara FP2?
If I can see the whole room I can tell how many people are in each zone and I can eliminate all motion or presence sensors with one camera.
The next step will be to add facial recognition, so I can eliminate all the people tracking or room tracking sensors.
The end game will be posture or gesture recognition to get rid of voice control and give commands on the fly with simple gesture or trigger automations based on recognition of the action a person is taking at a certain area of the room.
With AI's advancements on image recognition I believe that within a short time we will only need 1 camera per room to replace all the sensors in the house (hypothetically even for door and window sensors just instruct a model that sees things open from the video stream)
Am I being too optimistic?
I was running that script, after script is done I'm running "ls /dev/apex_*" and it says: "No matches found: /dev/apex_*", I have the same hardware that you showed in the video.
Any chance you could get facial recognition working too?
This adapter works with Coral M.2 Accelerator with Dual Edge TPU??
There is a specific one from pineberry for the dual TPU version
anyone else unable to set up the driver?
ls: cannot access '/dev/apex_0': No such file or directory
same, but on the other hand I tried with USB Coral... however, shouldnt matter
@@pluronic123 I've tried so many solutions I found online, but still no luck. I think I need to try to downgrade from kernel 6.6 to 6.1 (if possible).
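For anyone debugging the missing /dev/apex_0 above, the check itself is easy to script so it can be re-run after each driver or kernel change. A minimal sketch; the default pattern is the one from the comments, and `edge_tpu_nodes` is a hypothetical helper name. As far as I know, only the PCIe/M.2 Coral is expected to create these nodes, while the USB Coral shows up as a USB device instead, so an empty result there would not by itself indicate a driver problem.

```python
import glob
from typing import List


def edge_tpu_nodes(pattern: str = "/dev/apex_*") -> List[str]:
    """Return the device nodes matching the pattern, sorted by name."""
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    nodes = edge_tpu_nodes()
    if nodes:
        print("Edge TPU device nodes:", ", ".join(nodes))
    else:
        print("No /dev/apex_* nodes found; the PCIe driver is probably not loaded.")
```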
I wonder, is this possible with ollama? I mean, most of their models are 4-bit to 8-bit int, and the Coral uses 8-bit int and can do 4 TOPS, and some models can do 8 TOPS. I have a Dell PowerEdge R630, which uses 2 Xeon E5-2683 v4 CPUs that are capable of int8 operations over 32 threads with 8.9477,5 TOPS in total. So yeah, I know exactly what to expect in performance, but my system uses 400 W and the Coral only 2 W. And given that the Coral uses only 2 PCIe lanes, I could use more than one in my system with the 40 PCIe lanes I have to spare. I know the R630 is old, but at 300€ for such a powerful machine, what can beat that value of performance and features for the price?
I don't know much about the Coral and other NPU and tensor core systems. Maybe I can use multiple Corals to increase the performance. Also, the Coral uses only PCIe 2.0 and I don't know how much data really goes over the bus. But using SAS 12G SSD drives is no problem, and with 8 drives in RAID 0 I could get up to 8 GB/s read and write speed. So if the Coral really needs both 2.0 lanes, I should easily be able to feed 8 Corals. With no losses I would get 64 TOPS with only 16 W of power. What a massive number. Maybe someone has an answer as to whether this would be possible, or whether I can use only one or none for ollama. In that case it's OK, I'll leave it as it is now, even though it uses much more power. Power usage is not really a concern for me. But given that the R630 only supports 35 W PCIe devices up to 3.0, and the limited space of 1 slot at half height, my options are limited. The Nvidia Tesla T1000 8GB GPU uses 40 W, which is 5 W over the limit, but it works, also has 10 TOPS performance, and costs about 380€ each. As for GPUs, I know I can use multiple GPUs, but for TPUs I have no idea.
super into this. like a lot.
(its pronounced "fri·guht" or "fri-git" not "fri-gate" ; though i do understand the confusion,.. its a type of warship)
Hi, I have a question. I need to install Windows 11 on a Raspberry Pi 5B, along with many other things, but first I need to install a Stable Diffusion model locally. Which hardware would be most useful for text-to-image features, can I use it like a GPU, and which hardware is best for voice conversation? Could you please tell me and, if possible, make a video on how to do it? 😊
Thanks for doing this. Do you know if Pineberry or anyone else makes a HAT that accommodates the Coral accelerator AND a NVME drive?
I happen to have a USB Coral unit lying around (thanks Google! and Tiny ML) so I will start with that, but would love to go faster! You have really inspired me to give it a try and see what sort of ML/AI performance can be achieved on the Pi 5.
Listen closely, developers, to the problems running Python. This is 1000% the second reason why you shouldn't write system tools in Python. It shifts the need to resolve dependencies onto the end user. The first reason is efficiency, especially on the Pi.
LLM inference on the TPU? That would be cool.
RPis have come a long way, great
Hi dear, tks for the video.
Do you think it would be able to run on an RPi Zero W? USB or with the HAT?
I think Coral only supports PCIe Gen 2.
Person whats needed for a diy surface to air missile, no radar, no ir no jamming it. Great for suicide drones, load the image of the person you dont like and theres no jamming it
My heart dropped as I saw him pick up the board while it's running, with his bare hands… yikes 😬
Thank you
I get 5ms inference on the Pi 4 with the USB accelerator
Raps pi 5 next level must have
pyenv got me the Python version that worked for my Coral setup.
Why don't you list what kind of frame rate you are getting? As another user mentioned, it's very choppy with a low frame rate. Not very usable in the real world.
What use case are you even talking about?
The video that he is showing there on the birdview is only used for the detection. The video that is recorded by Frigate is at a normal frame rate, like 25 or 30 fps.
Thank you for explaining this! Now I understand what he meant! @@MrPmjg
hey data slayer
Why the poor fps?
The video that he is showing there on the birdview is only used for the detection. The video that is recorded by Frigate is at a normal frame rate, like 25 or 30 fps.
Great! Thanks for that reply.
Is the 25-30 fps available for display or only for recording?
Clicked because i thought the thumbnail says - trump
your tv is pretty small
awesome)
Very noisy
the raspberry pi 5 is cooked bro its 8 year old cpu for $80 🤣🤣
this would be a great video if you were louder and not sounding like you are talking with food in your mouth
White Terminal 😢 Jesus, my eyes just can't...
Damn, people are calling you out, making videos saying you're a scammer and a fraud. They don't use your name, though; I found you by reverse search. What's your response to the allegations?
What did he do? Same as Siraj?
🎶 promo sm
Dear God, I thought the thumbnail was saying a pi 5 was Trump 2.0!
He isn't that intelligent
The RK3588 has a built-in 6 TOPS NPU