- 113
- 447 977
Ominous Industries
Joined Oct 15, 2023
Want to know more about the robot: www.ominousindustries.com
Testing LLMs on the NEW 16gb Raspberry Pi5 (Llama 11B & Qwen 14B)
Timestamps:
00:00 - Intro
01:28 - Unboxing
02:40 - Ollama Download
03:38 - Qwen 14B Test
07:13 - Ollama Glitch PSA
08:53 - Llama3.2 11B Vision
13:48 - Pi5 8GB Comparison
16:26 - Other SBC Comments
17:17 - Closing Thoughts
Experience the groundbreaking capabilities of the new 16GB Raspberry Pi 5 as we put it through its paces with large language models using Ollama. In this comprehensive test, we explore previously impossible territories for the Raspberry Pi - running sophisticated models like Qwen2.5 14B and the multimodal Llama 3.2 11B Vision.
Watch as we demonstrate real-world token generation speeds and compare performance metrics, showcasing how this RAM upgrade opens new doors for local AI processing. We directly contrast with the 8GB Pi's limitations, highlighting models that were previously out of reach but now run successfully on this new hardware.
Whether you're an AI enthusiast, a Raspberry Pi tinkerer, or just curious about the future of local LLM deployment, this video provides concrete data and practical insights into what's now possible with the 16GB Pi 5. Join us as we explore the expanding capabilities of local AI processing and what this means for the future of edge computing.
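For anyone wanting to reproduce the tests from the video, the workflow can be sketched with a few Ollama commands. This is an outline based on Ollama's standard install script and its public model tags (qwen2.5:14b, llama3.2-vision:11b); the exact tags and quantizations used on camera may differ.

```shell
# Install Ollama on Raspberry Pi OS (64-bit) using the official install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull and run the ~14B Qwen model (quantized; uses most of the 16GB of RAM)
ollama run qwen2.5:14b "Write a haiku about single-board computers."

# Pull and run the multimodal Llama 3.2 Vision model
ollama run llama3.2-vision:11b

# The --verbose flag prints token-generation speed after each response
ollama run qwen2.5:14b --verbose "Hello"
```

Note that the 14B model at a 4-bit quantization simply does not fit in the 8GB Pi's RAM, which is what makes the 16GB board interesting here.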
Views: 5,045
Videos
An OFFLINE AI Chatbot with NVIDIA Jetson Orin Nano (Setup & Tutorial)
Views: 4K · 15 hours ago
Timestamps: 00:00 - Intro 00:51 - Install 08:57 - First Test 12:42 - Fully Offline 13:59 - Technical Overview 19:32 - Chat Robot Test 23:04 - Closing Thoughts Transform your NVIDIA Jetson Orin Nano SUPER into a completely offline, network-free chatbot! In this tutorial, we’ll guide you step-by-step on how to build a fully self-contained chatbot using Silero models for TTS/STT, Ollama with Llama...
Turning The NVIDIA Jetson Orin Nano Into a MINI PC (Windows Apps!)
Views: 7K · 1 day ago
Timestamps: 00:00 - Intro 01:35 - Build Time 01:59 - Power Switch How-To 04:47 - Mini Case Fans 06:01 - GPIO Power 06:57 - Finishing Build 07:54 - First Look 09:18 - Box64 Intro 12:48 - Box64 Install 15:15 - Wine64 Install 19:55 - Windows Game Test 23:20 - Running Notepad 24:00 - Future Thoughts Transform your NVIDIA Jetson Orin Nano into a versatile mini PC! In this comprehensive guide, we'll ...
Raspberry Pi5 AI Kit COMPLETE Setup Guide & Detection Alert Tutorial
Views: 1.9K · 1 day ago
Timestamps: 00:00 - Intro 00:32 - AI Kit Hardware Install 10:08 - Pi5 OS Install 13:50 - AI Kit Setup 21:37 - AI Demo Examples 23:27 - Detection Alert Setup 29:15 - Detection Alert Script 34:45 - Alert System Demo 36:03 - Closing Thoughts In this video, we provide a complete guide to setting up and implementing an AI-powered detection alert system using the Raspberry Pi 5 and Hailo AI Kit. This...
Real-Time AI Object Detection Testing (Pi5 AI Kit vs. NVIDIA Jetson)
Views: 14K · 14 days ago
Timestamps: 00:00 - Intro 01:04 - Technical Considerations 02:50 - Pi5 Install Setup 05:22 - Pi5 Tech Demo 11:36 - Jetson Install Setup 14:06 - Jetson Tech Demo 19:45 - Observations 20:58 - Closing Thoughts In this video, we dive into the real-time object detection and classification capabilities of small AI-capable edge devices. Specifically, we explore the Raspberry Pi 5 with the Hailo AI kit...
Answering Your Questions On The NVIDIA Jetson Orin Nano Super
Views: 11K · 21 days ago
Timestamps: 00:00 - Intro 00:36 - LLM Token Speed 02:13 - Compared to a 3060 04:44 - No Pi5 Comp 05:10 - Can it run Windows 07:02 - Jetpack Install Issues 08:03 - Gaming and Daily tasks 08:41 - Case Options 10:10 - Next Steps and Closing Thoughts In this video, we address the most common questions about NVIDIA's Jetson Orin Nano Super. From LLM performance benchmarks to Windows compatibility, w...
NVIDIA Jetson Orin Nano Super COMPLETE Setup Guide & Tutorial
Views: 69K · 21 days ago
Timestamps: 00:00 - Intro 00:53 - Pre-Requisites 02:03 - Installing OS 09:08 - NVMe Install 11:24 - Pre Boot 13:40 - First Boot 18:50 - Running An LLM 26:45 - Running Open WebUI 31:57 - Running A Local AI Server 34:07 - Running Stable Diffusion 40:00 - NVMe Setup 55:14 - Closing Thoughts In this video, we provide the most comprehensive setup guide for the NVIDIA Jetson Orin Nano, covering every...
NVIDIA Jetson Orin Nano Super FIRST LOOK ($250 AI SuperComputer)
Views: 42K · 21 days ago
Timestamps: 00:00 - Intro 00:15 - Unboxing 01:25 - Design Overview 02:53 - Size Comparison 04:36 - Technical Intro 05:44 - First Look 06:42 - Performance Boost 08:10 - Stable Diffusion Demo 10:50 - Ollama Demo 12:40 - Closing Thoughts 13:16 - Next Steps In this video, we kick things off with an unboxing of the Jetson Orin Nano, followed by an in-depth overview of its design, layout, and availab...
An Open Source VIDEO LLM (Apollo Test and Install Tutorial)
Views: 1.2K · 28 days ago
Timestamps: 00:00 - Intro 01:25 - Model Overview 03:25 - Model Test 08:15 - Install Guide In this video, we explore the newly released Apollo family of models, a cutting-edge video LLM that integrates advanced video and image encoders to achieve outstanding performance-outperforming many larger models. Surprisingly, Apollo was removed just one day after its release, but thanks to the open-sourc...
ChatGPT Pro vs. Gemini 2.0 (Python Game DEV Test)
Views: 3.4K · 1 month ago
Timestamps: 00:00 - Intro 01:05 - Simple Test 03:16 - Gemini Result 04:11 - o1 Pro Result 06:24 - Complex Test 08:24 - Gemini Result 11:00 - o1 Pro Result 11:50 - Closing Thoughts In this video, we pit two cutting-edge AI models against each other: ChatGPT o1 Pro Mode from OpenAI and Gemini Flash 2.0 Experimental from Google. The challenge? To see how well these frontier models perform when tas...
Comparing Sora and Its OPEN SOURCE Rival CogVideo! (Side-by-Side Test)
Views: 753 · 1 month ago
Timestamps: 00:00 - Intro 00:40 - CogVideo Overview 01:13 - Sora Frustrations 01:44 - First Test 04:03 - Second Test 06:12 - Third Test 09:27 - Closing Thoughts 10:07 - Side By Side Comparisons In this video, we explore the newly released Sora video generation capabilities from OpenAI, diving into its features and limitations. To make things even more interesting, we simultaneously test an open...
My Custom Loop EXPLODED But I got Lucky! (Check Your Fittings)
Views: 708 · 1 month ago
Timestamps: 00:00 - Intro 00:30 - Uh Oh 01:35 - Investigating The Failure 02:29 - Quick Microcenter Trip 03:44 - Potential Cause 04:37 - Pressure Testing 05:28 - Filling Back Up 07:28 - First Test Boot 08:31 - I Was Lucky 09:00 - Closing Thoughts In this unexpected and eventful video, I take you through the aftermath and repair of a catastrophic failure in the custom loop liquid cooling system ...
Llama 3.3 70B Tested LOCALLY! (First Look & Python Game Test)
Views: 1.8K · 1 month ago
Timestamps: 00:00 - Intro 00:39 - Python Game Test 02:55 - Test Results 04:05 - Second Game Test 05:48 - Second Test Result 06:34 - Final Test and Result 09:40 - Closing Thoughts In this video, I explore the latest release from Meta, the Llama 3.3 70B model, by testing its capabilities in creating a synthwave-themed game. I showcase the model's ability to write functional code in a single shot, ...
The SIMPLEST Way To Run Local AI Agents! (AnythingLLM Agent Demo)
Views: 2.3K · 1 month ago
Timestamps: 00:00 - Intro 00:36 - Installation Guide 04:55 - Simple Agent Test 10:00 - Community Hub Agent Test 14:07 - Agent Considerations 16:59 - Making a Custom Agent 21:52 - Closing Thoughts Join us as we dive into the exciting world of AnythingLLM and its newly launched Community Hub! 🌐 In this video, we walk you through a straightforward installation tutorial before exploring the agentic...
OpenAI o1 FULL Model is out! (First Look & Python Game Test)
Views: 1.8K · 1 month ago
Timestamps: 00:00 - Intro 00:17 - Python Game Test 04:18 - Test Results 07:31 - Thoughts 09:15 - Second Game Test 12:08 - Multimodal Demo 14:21 - Closing Thoughts In this video, I dive into OpenAI's latest release, the full o1 model, rumored to be built on the innovative Strawberry architecture. To test its capabilities, I challenge the model to create a synthwave-themed game, showcasing its ad...
Run A Local LLM Across Multiple Computers! (vLLM Distributed Inference)
Views: 1.9K · 1 month ago
Qwen QwQ-32B Tested LOCALLY: An Open Source Model that THINKS
Views: 3.2K · 1 month ago
Using Local AI Agents As NPCs In A Unity Game (Qwen2.5 & Ollama)
Views: 2K · 1 month ago
Dual RTX 3060 12GB Build For Running AI Models
Views: 6K · 1 month ago
Text-to-3D Mesh Is Here! NVIDIA Llama-Mesh (Test & Install)
Views: 3.6K · 1 month ago
Run AI Simulated People Locally with Ollama! (Qwen2.5 & TinyTroupe)
Views: 2.1K · 1 month ago
Can This New AI Predict The FUTURE (TinyTroupe Test And Install)
Views: 1.4K · 1 month ago
Google's AI On Top? NEW Experimental Gemini Model Python Test
Views: 1.5K · 2 months ago
Run AI Agents Locally with Ollama! (Llama 3.2 Vision & Magentic One)
Views: 3.9K · 2 months ago
THIS New AI Agent Can Do EVERYTHING! (Magentic One Test & Install Guide)
Views: 4K · 2 months ago
Making A 3D Printed Keyboard For The Raspberry Pi Zero Laptop
Views: 762 · 2 months ago
RunwayML Gen-3 Video to Video Test: Can It Replace Pro Editing?
Views: 908 · 2 months ago
Open Sourcing My AI Social Robots - 3D Print and Build Your Own!
Views: 732 · 2 months ago
FIRST LOOK: Testing Claude Computer Use - Can It Handle Web Browsing and Real-World Exploration!?
Views: 208 · 2 months ago
Nice work getting this complete!
Thanks very much!
Excellent!
Thank you very much!
Cool, thanks
Thanks for watching!
Please do the orange pi vs
I have a video on the 8gb pi 5 vs Opi here: th-cam.com/video/OXSsrWpIm8o/w-d-xo.html
Great work !!!!
Thanks very much!
Thanks for the great work
No problem, hope it is helpful!
can we run GTA on this 😅 ?
Not well if possible haha
@OminousIndustries 😆
why
For science!
My first printer was a first generation Creality CR-10 that I got back in 2017. It is such a pain to use that I never use it anymore. Looking into getting an A1M now then maybe a p1s or if Bambu comes out with something else new in the future
I started with an Ender 3 a couple years later. These will be as close to plug-n play as you can get.
How about using it with the AI hat?
Unfortunately the current Hailo AI hat does not work for LLMs (based on what they themselves have said); I have not personally tried it.
@@OminousIndustries thanks for taking the time to comment. Appreciate you and your efforts, and this video.
@@mooninthewater3705 No problem at all, thanks for the kind words!
Great! 13 or 26 TOPS version?
Thank you! This is the 13 TOPS version.
You mentioned that the Raspberry Pi got hot; if you didn't have any cooling on it, it may have been getting throttled, which is why your initial test was slow.
Yes, it was quite hot! I am going to get an active cooler for it and will test again to see if there is a difference. I was also wondering about throttling.
nice tutorial can you give the link to the nvme you used
Thank you, here is the link: www.microcenter.com/product/661858/inland-tn320-256gb-ssd-nvme-pcie-gen-30x4-m2-2280-3d-nand-tlc-internal-solid-state-drive
@@OminousIndustries thanks
Fantastic walkthrough. I was racking my brain as the jetson-containers run $(autotag stable-diffusion-webui) would constantly blow up. Thanks for the specific image build to use. I guess not all packages are ready for JP6.1, or there's a bug with the autotag feature.
Thanks very much. Yes, that was quite frustrating to me as well until I tried the different container.
16GB on a Pi was unheard of just a few years ago
Exactly, I still have a bunch of 2gb 4B models that I am using for random things. I was very excited when I saw these were in stock and ran out to grab one haha
What do you recommend? Linux? (What version) or Windows?
I personally prefer Ubuntu, but if someone is used to Windows and does not want to have to troubleshoot a lot, it might not be a bad idea to stick with Windows haha
Are there smaller models that would make it a little quicker in its replies? Or have I missed the whole idea of the models? lol I was thinking of that TARS AI thing they are building. I'm really not interested in it moving or even the vision side of it, but I would like to have its brain. hahaha (as strange as that sounds... almost Frankenstein type talk)
Yes there are small models like llama 1b and 3b that would be much much quicker. I wanted to test the larger models for this video as the new 16gb ram variant was able to run them, something that the previous 8gb max RAM pi could not!
It is interesting to see what the llama 3.2 vision model is trained on. It is very impressive.
Yes it is, I was impressed it got that, I would love to have the HP to run the 90b one at a decent quant but I need more than 48gb vram.
Hello, I bought the Dell S3222dgm 32 inch curved VA monitor, but this MSI monitor is 40 inch. Should I return my Dell and buy this monitor?
This monitor is not curved so that would probably factor into your decision.
@@OminousIndustries It doesn't matter whether the monitor is flat or curved; image quality and a large screen are more important to me. Thanks for the reply.
Would have been nice if you had another terminal window open to monitor the temperature while running Ollama: watch -n 1 'vcgencmd measure_temp'
You're absolutely right. I will make sure to better show temps/etc in the future!
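For anyone following along, the monitoring the commenter describes can be done with the standard Raspberry Pi OS firmware tools; the commands below refresh the SoC temperature every second and report the firmware's throttling flags.

```shell
# In a second terminal: refresh the SoC temperature every second
watch -n 1 'vcgencmd measure_temp'

# A non-zero value here indicates under-voltage or thermal throttling
# (bit 2 / 0x4 = currently throttled, bit 18 / 0x40000 = throttling has occurred)
vcgencmd get_throttled
```

If get_throttled reports anything other than 0x0 during a long generation, an active cooler would likely change the benchmark numbers.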
What is the best SLM to use for basic chat? I'm using RAG extensively (using c# code to hit database) and am looking for a SLM which supports function calling
I can't definitively answer this, but I have had good luck with some of the Qwen models and function calling abilities.
Had the same progress reset issue on win 64bit for a large LLM
Good to know I wasn't going nuts haha
good video
Thanks very much!
I was expecting an egpu setup but he is not like Jeff
I do actually have the components to do an e-gpu on the pi, at some point I would like to try it as his videos showcasing that were very very cool!
@ yeah with the 16Gb that would be a first and with power consumption. I have the equipment also just busy
@@ESGamingCentral I may try it sooner than later but only have a 3060 12gb to do it with.
@ as far as I'm aware you need an AMD card; I have a 6600XT. I don't believe there are drivers for Nvidia cards on ARM RPi
@@ESGamingCentral That's very interesting and something I wasn't aware of. I don't actually have any modern AMD cards so damn haha
Just wondering how it performs with small models, will the extra ram give a boost or it the same . would like to have a help LLM for the vscode helper.., but not if it's so slow..
If you were going to use a small Llama 1b or 3b I don't believe there would be a speed difference between the 8gb and 16gb pi.
It's too bad the support isn't there for these orange pi boards. It would be really interesting to see just how big of a model you could run on the 32gb version of it. And maybe see how it stacks up against the new 16gb raspberry pi
That's the tough part about comparing to the Pi, the long term ecosystem has had the support of a lot of talented people over the years which makes them far better supported.
There's going to be a flood of mini pc's with some Linux distro targeting local llms next year. But what do we want them for.
I'm okay with having more options! haha. Use cases are a different story
Very useful video especially when you compare against similar priced SBC. PI foundation really needs to add an NPU!
Thanks very much! Yes, I would like to see the introduction of a more AI focused Pi.
mate this is perfect I am about to buy a couple of these!!!
Glad to hear, they will definitely fit well into some of your projects!
sora's is way better. but veo2 is going to rock sora
From what I have seen about Veo2 it is very very impressive.
Very cool, I am on the same page and made a Unity project where local llm is used with avatars to explain historic reconstructions in real-time 3d. I think local AI is the future. No one wants to send all conversation to OpenAI or others..
That sounds like an awesome project and something right up my alley. Yes, being able to run things locally is very important to a lot of us!
Skaty McSkateboard case.
LOL
When you're doing a matrix multiply across a 16GB vector space it will definitely be slower than an 8GB vector space. Double the time.
Yes, but if I am not mistaken, if the model was a smaller parameter model like a 1b or 3b the total system ram wouldn't make a difference in the speed as it would only be allocated across what was needed and the extra ram wouldn't come into play.
I guess it works better with llama3.2 1B and 3B, as well Phi 3 mini.
Yes, these smaller models are much better suited for lower-powered hardware.
How hot is that getting without a heatsink? Might be thermal throttling. Edit: I see you wondered that as well. Would be interesting to test with active cooling
I did touch the cpu and it was extremely hot haha. Active cooling is on the agenda ASAP.
Try run phi4
I've been meaning to try phi4 at some point.
@@OminousIndustries step by step tutorial without ollama. Only Python, only hardcore
@@MrKim-pt2vm LOL hackathon vibes
Nice videos. i cant afford anything on LLM AI hardware myself, but your videos satisfy my curiosity. Great work !
Thanks very much! Soon the hardware will be more and more accessible :)
Thank you for this super awesomely detailed walkthrough / howto! Loved it! Special ranking points for not skipping anything and above all for solving the issues live without any fancy editing.
Thanks so much, glad you found it useful!
Does nvme SSD storage make a difference? Jeff geerling got interesting LLM results
no, after the initial load it's stored in RAM
I don't believe it would make a difference in this scenario, no.
Overclock it for marginally quicker results?
Was scared to without any form of cooler haha
Build a social credit evaluator public camera system. Identify humans with low credit score and send them fiscal responsibility tutorials.
Impossible is possible
Absolutely!
Nice test. I decided to add a second m.2 to my Jetson Orin Nano Super to give 20gb of virtual ram. Given the better Architecture it should run the 11b and 13b models at a reasonable pace. Exciting times for mobile platforms and local LLMs.
I have 2 Orins ordered in UK. Do you recommend any particular m2 SSD? Is it just set up as swap? What command did you use? Thanks in advance
@@ianfoster99 Any fast SSD should be fine. I did a 256GB drive but only set a swap of 20GB. You don't want to over-allocate. Plus wear rates will be higher, so I have lots of unallocated space as regions go bad. You can set it up with the Disks utility in Ubuntu, or with Python code. ChatGPT can help with that. Good luck.
Thats a very cool setup, with the increased swap. I am interested to know your results when you test the larger llms. Definitely exciting time for local llm and SBC's
Interesting! Could you or anyone on this thread share a link to a video explaining how to add this additional virtual ram with a second m.2 on the Jetson Orin Nano Super? Thanks!
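A minimal sketch of the swap-on-NVMe setup discussed in this thread, using standard Linux tools. The mount point /mnt/nvme and the 20GB size are illustrative assumptions; adjust them to your own drive and needs.

```shell
# Create a 20GB swap file on the second NVMe drive (path is an example)
sudo fallocate -l 20G /mnt/nvme/swapfile
sudo chmod 600 /mnt/nvme/swapfile
sudo mkswap /mnt/nvme/swapfile
sudo swapon /mnt/nvme/swapfile

# Make the swap file persist across reboots
echo '/mnt/nvme/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Verify the new swap is active
swapon --show
```

As noted above, swap on flash wears the drive, and model layers paged to swap run far slower than layers in RAM, so this is a way to fit larger models rather than a way to make them fast.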
Nice video. Please test 7B or 8B model on orange pi NPU. Do multiple test like input token 0, input token 500, input token 1k, input token 4k and input token 7k. I am creating a AI notes app, so currently using Groq, but want to see that on hardware like nvidia (test in future) and self host, max how much token can i generate on 7B or 8B model.
That's a great idea, I will try to squeeze that in!
Great work, thanks a lot!
Thanks very much!
Second!
Cheers!
First!
Cheers!
this is so cool. What's the minimum config of the system required to run this model? You mention 3060s, could you please provide a link? Trying to get a minimum setup
Thanks! I think this needs a 3090 minimum, though for a minimal setup a Zotac 3060 12gb is a very good price/performance option and I have two of those in a machine
Sweet rig bro
Thanks! It's been a fun build!
Would be good to run this test again on the Orange Pi 5 Ultra (which is their most powerful model yet), having NPUs and a GPU for AI capabilities. That can be compared against the Raspberry Pi 5 + RPi AI HAT+. Both the Raspberry Pi 5 and Orange Pi 5B are under-powered to run good quality LLMs.
Unfortunately I don't see myself getting the AI hat + or new Pi 5b but that would be a good comparison indeed. The original pi AI kit was not able to do any llm work because of pcie bandwidth limitations (according to the manufacturer) so it was really only for vision tasks.