Running LLM Clusters on ALL THIS 🚀

  • Published Nov 20, 2024

Comments • 165

  •  13 days ago +104

    Come on, Alex!!! Three of the new Macs Mini together!!!

    • @danielselaru7247
      @danielselaru7247 13 days ago +12

      I'm now waiting for three mac minis with m4 pro. Will share the results here as well!

    • @akierum
      @akierum 13 days ago +1

      4x 3090s for the price of a single M4 Mac - no way, thank you

    • @Peterotica
      @Peterotica 12 days ago +4

      @@akierum how did you even come up with that number

    • @laden6675
      @laden6675 11 days ago +3

      @@Peterotica he made it up

    • @egrinant2
      @egrinant2 7 days ago

      3? What about 5!

  • @isnakolah
    @isnakolah 14 days ago +48

    Imagine this with the thunderbolt 5 120Gb/s bandwidth, so much potential

    • @Mustafa_Hamedd
      @Mustafa_Hamedd 14 days ago

      But the M4 chip has a WiFi 6E wireless card

    • @isbestlizard
      @isbestlizard 14 days ago +6

      I thought that, but apparently that's just for screens - actual DATA bandwidth is more like 80Gb/sec. And honestly I feel like 40Gb/s is just fine for compute with good locality :D

    • @isnakolah
      @isnakolah 13 days ago

      @@isbestlizard That's a bummer, but it makes sense. 120 is 30 shy of the M3 Pro's internal memory bandwidth (I know they are not the same). Imagine buying 2 M4 base models and chaining them - double of everything at $1200. Maybe in like 3-5 years that would be possible

    • @michaelthornes
      @michaelthornes 13 days ago +2

      @@isnakolah nah, 120 Gbps is only 15 GB/s, so 10%

    • @michaelthornes
      @michaelthornes 13 days ago +3

      @@isbestlizard Thunderbolt 5 transfer speeds have 2 options:
      1. 80/80 (simply twice that of TB4, 2 wires in each direction)
      2. 120/40 (3x/1x that of TB4, 3 wires in one direction, 1 in the other)
      they use the same 4 wires, just allocating them differently
      displays are obviously going to use 120/40, since you need a ton of outgoing bandwidth vs very little incoming (assuming you aren't running thunderbolt devices chained downstream)
      external ssds are probably best on 80/80 which runs pcie gen4 x4, unless we get gen5 chips and gen5 ssds so you can choose to either read or write at up to 12 GB/s with just 4 GB/s available in the other direction, as needed
      I really have no idea about GPU bidirectional bandwidth, but TB5 is at least twice as good
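The unit conversion running through these replies is easy to misread (link rates are gigabits per second, memory bandwidth is gigabytes per second), so here is a quick sketch of the arithmetic. The ~150 GB/s figure for M3 Pro memory bandwidth is an assumption used only for comparison:

```shell
# Link rates are quoted in gigabits/sec (Gbps); divide by 8 to get GB/s.
for rate in 40 80 120; do
  echo "$rate Gbps = $(( rate / 8 )) GB/s"
done

# 120 Gbps is 15 GB/s, i.e. about 10% of an assumed 150 GB/s memory bandwidth:
echo "$(( 120 / 8 * 100 / 150 ))% of 150 GB/s"
```

So even TB5's asymmetric 120 Gbps mode moves data at roughly a tenth of the chip's internal memory bandwidth, which is why the interconnect, not the GPUs, tends to be the cluster bottleneck.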

  • @kumiho1729
    @kumiho1729 13 days ago +3

    I've been waiting for someone to do this test forever! I had a feeling you'd be the first. :D
    This is one of the main reasons I felt comfortable pulling the trigger on the base-model mini. I expect to use it in some fashion even a decade (or two) from now.

  • @meh2285
    @meh2285 13 days ago +12

    You should make more videos about this, comparing how WiFi vs ethernet vs thunderbolt connections impact performance, how larger models run, and other stuff that you are in the unique position to experiment with.

  • @themarksmith
    @themarksmith 14 days ago +30

    You are killing it with the AI vids dude... 10/10!

  • @Rkcuddles
    @Rkcuddles 14 days ago +5

    I am sure this was a lot to figure out. But it seems unreasonably easy to set up. I am so impressed. Hope you refine it and keep telling us about it.

  • @christianbaer2897
    @christianbaer2897 14 days ago +25

    I am super excited for the Mac Mini Cluster we are going to see ;-)

    • @isbestlizard
      @isbestlizard 14 days ago +2

      HAH yes everyone has this idea XD

    • @meh2285
      @meh2285 13 days ago

      Considering how you can't spec out the Mac Mini with that much RAM or anything better than an M4 Pro, buying these for clusters would be highly expensive for the performance you'd be getting. A base model 32GB M1 Max Mac Studio costs around $1000 used and still outperforms the M4 Pro Mac mini. I even bought a 64GB M1 Max MacBook Pro with a broken screen for $1230 on eBay.

    • @christianbaer2897
      @christianbaer2897 13 days ago

      @@meh2285 Well... you are not thinking big enough, I'd say.
      Yes, you can get single machines that might outperform 2 or 3 Mac minis. Think 10. Think 20. Think 100.
      Now it gets interesting, because we are talking about more commercial use. No company will buy used and broken MacBooks. And the value proposition of 100 Mac Minis (base M4 Pro model) vs 75 Mac Studios (base M2 Max model) is intriguing. I am not saying I have calculated all of that to the end, but I kind of see this sort of thing happening - not only for LLMs but for all kinds of clusters. Mac Minis have been used for that in the past and will be used for that in the future ;-)

    • @beaumac
      @beaumac 13 days ago +1

      @@meh2285 the point of clustering is so that you don't have to spec up the base model, and buy multiple of them instead. I imagine two M4s in a cluster will smoke an M4 Pro

    • @meh2285
      @meh2285 13 days ago

      @@beaumac Yeah, but it's still a terrible price-to-performance tradeoff to get Mac Minis for clustering. The base model is a great value on its own, but it's not ideal for clustering at this price - it just doesn't have that powerful of a GPU. The higher-end models are an even worse value for clustering, considering you can get a new M1 Studio with 64GB of RAM on eBay for $500 less than a spec'd-out M4 mini ($2200 at base storage), with 2x more storage. Also, the second you buy a second base model Mac Mini, you could have gotten a 32GB Mac Studio for that price, which would perform better due to having more GPU compute and less latency from clustering. Unless you really need two computers for some other reason, clustering with Mac Minis is a bad idea.

  • @jutrecenti
    @jutrecenti 13 days ago +2

    2:16 environmental variable. Loved it

  • @keithdow8327
    @keithdow8327 13 days ago +10

    Thanks! I bought the NAS. When are you going to hook up the 4090 also?

    • @AZisk
      @AZisk  13 days ago +4

      thanks! 🤔 i don’t know if it can do both cuda and non-cuda in a cluster, but I’d be curious to find out

    • @rafaeldomenikos5978
      @rafaeldomenikos5978 13 days ago +2

      @@AZisk I saw a person on Reddit who had hooked up a Mac Studio with a 4090 pc through a thunderbolt bridge. It should be possible.

  • @AlmorTech
    @AlmorTech 9 days ago +1

    Oh my, you’re killing it 😮 Great job!

  • @DS-pk4eh
    @DS-pk4eh 13 days ago +2

    This is a nice POC. Another great video from Alex.
    I would definitely prefer having a 10Gb switch with everything connected to it (there are some 8-port ones for $300). More stable to actually work with, and probably less messy.
    Maybe get a mini PC with a 10Gb port and a decent amount of memory? It's a shame Apple puts such a big tax on memory and storage upgrades.
    There is also an Asustor 10Gb SSD-only NAS with 12 SSD slots (Flashstor 12 Pro).

  • @gmullBlack2
    @gmullBlack2 11 days ago

    Alex, this is an interesting setup. I would like to see more of your results when clustering these machines together to consume various LLM workloads, especially the larger models.

  • @ErikBussink
    @ErikBussink 13 days ago +8

    @Alex your Qwen 2.5 14B Instruct Q4 should run on your MacBook Pro 64GB without needing the exo cluster. Are you seeing the same performance then?

    • @lewisl9029
      @lewisl9029 13 days ago

      @@ErikBussink This is what I was wondering too. Would be more interested in seeing a larger model running on a cluster of smaller machines that can't possibly run them on their own.

    • @Techonsapevole
      @Techonsapevole 13 days ago

      I was thinking the same; I don't see the exo cluster advantage, or llama 3.1 405B running

    • @danielselaru7247
      @danielselaru7247 13 days ago +1

      I think it might even be faster because of the lack of external communication with the other computers.

  • @op87867
    @op87867 13 days ago +8

    10 base m4 Mac mini cluster here I come

    • @wenztan8662
      @wenztan8662 10 days ago

      @@op87867 cooooool

    • @Kaalkian
      @Kaalkian 8 days ago

      you want at least the base M4 Pro mini for TB5

    • @znerol1
      @znerol1 4 days ago

      @@Kaalkian but then the value drops massively already. I think those $600 base models are great - just use more of them; cheaper than upgrading RAM or CPU

  • @SirDealer
    @SirDealer 13 days ago +4

    Please compare 4x Mac mini base model with 1x 4090 :D

    • @eamoralesl
      @eamoralesl 7 days ago

      this is the one we are all waiting for

  • @agent00ameisenbar35
    @agent00ameisenbar35 11 days ago

    finally a good exo explanation. thanks alex!

  • @chriswarren6128
    @chriswarren6128 13 days ago +1

    Would you mind posting a run of that final test that works? Only difference being multiple calls across the cluster to the same model. I'd love to see how it parallelizes that type of workload and what the resulting tokens/sec ends up being.

  • @matteolulli2654
    @matteolulli2654 13 days ago +1

    Nice! Glad you gave it a try! Maybe next round Thunderbolt 5 on the new minis! ;)

  • @DenverHarris
    @DenverHarris 4 days ago

    Can you use the cluster model with multiple Macs for any program? LightWave? Final Cut Pro? Basically, do I have a supercomputer for everything? Or does exo only help you run LLMs?

  • @SonLeDang
    @SonLeDang 8 hours ago

    Thanks Alex!! Can you do an experiment with exo running across a PC and Macs? I have a PC with a 4070 and 3 MacBooks (M1, M2, M3 Pro). Also a setup where the PC is the NAS, to save us some money :D

  • @nirglazer5962
    @nirglazer5962 9 days ago +1

    How does this actually work? You're not actually sharing the compute power, right? Basically it determines which computer to send the query to, and then that computer shares the result with the one you're working on? Would combining 3 of the same computer be beneficial, or just repetitive?

  • @ГеоргиЧалъков-щ1р
    @ГеоргиЧалъков-щ1р 13 days ago +1

    I'm interested to see the tokens-per-second difference between running llama 3.1 70b on the 64GB MacBook and the tok/s on the cluster with the thunderbolt configuration. Also, why not try llama 405b so we can see how fast it is?

  • @neeleshvashist
    @neeleshvashist 13 days ago

    Hey Alex! Your videos are great!
    I’m considering getting a MacBook Pro but not sure which model would be best for my needs. I’m a data science and machine learning student, so I’ll mostly use it for coding, data analysis, and some AI projects. Since I’m still in learning mode and not yet working professionally, I’m unsure if I need the latest high-end model. Any recommendations on which model or specs would best fit my use case? Thanks in advance!

  • @abrahamortiz9812
    @abrahamortiz9812 12 days ago

    Dear Alex, I follow your channel for the language models, specifically for the MacBook Pro with Apple silicon. I congratulate you for your very precise and detailed content.
    I have a question.
    Can a Llama 3.1 70b Q5_0 model weighing 49GB damage a MacBook Pro 16 M2 Max with 64GB of RAM?
    I ran 2 models on the MacBook (Mixtral 8x7b Q4_0, 26GB, and Llama 3.1 70B Q5_0, 49GB).
    When the 26GB one was running, the response was fluid and quiet, and the memory monitor looked "good", with a certain amount free and no pressure. When I ran the 49GB one (Llama 3.1 70B Q5_0) it was not so fluid, the Mac made an internal noise synchronized with the rhythm of each word the model answered, and the memory monitor showed memory pressure.
    So far so good, just that detail. The problem came when I decided to reset the MacBook with a clean installation of the operating system: I erased the disk from utilities (as Apple instructs), exited Disk Utility, and clicked to install macOS Sonoma. The installation began, showing 3 hours of waiting, and everything started well. After about 6 minutes of installation, the screen image degraded and faded out in areas (from bottom to top) until it disappeared, with green lines and dots visible as well. All this happened in a second. It never showed an image again, only a black screen. You could only tell the MacBook was on by the keyboard lighting, and with the office lights off you could see a very faint white glow in the center of the screen. I connected a screen by HDMI but nothing was visible there either, just black.
    I can see it's the video card. Do you think memory pressure from the heavier model could have overloaded the MacBook Pro? Or do you think it was a matter of luck and has nothing to do with language models?
    I ran the models with Ollama and downloaded them from the same page.
    Thank you very much for reading,
    Greetings

  • @Fingobob
    @Fingobob 8 days ago

    Hi, this is very helpful. I am curious whether you could run LLM benchmarks on the various M4 models you have and see if an increase in GPU core count makes a difference, and if so, how much of a difference.

  • @casperes0912
    @casperes0912 13 days ago

    You do not need to restart your terminal for environment variables to take effect. You just edited your zshrc; you can run export VARIABLE=VALUE and it takes effect in that session only, or after editing your zshrc you can run "source ~/.zshrc" to reload the config immediately. You can also source other files and have multiple shell configs active, so to speak - it basically just runs the file.
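The reload trick described above can be sketched like this (MY_VAR is an illustrative name, not a variable exo actually uses):

```shell
# Takes effect immediately, but only in the current shell session
# (MY_VAR is an illustrative name):
export MY_VAR=hello

# To persist it, add the same export line to ~/.zshrc, then reload the
# config in place instead of opening a new terminal:
#   source ~/.zshrc

echo "$MY_VAR"   # prints: hello
```

`source` (or its POSIX spelling `.`) simply executes the file line by line in the current shell, which is why the variables it exports become visible without a restart.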

  • @JSiuDev
    @JSiuDev 13 days ago

    Thanks for the testing! I am very interested in this but don't have the extra hardware to try.

  • @ursjarvis
    @ursjarvis 13 days ago

    Thanks for adding MacBook Air m2 base model...😊

  • @timrobertson8242
    @timrobertson8242 13 days ago

    Alex, while I appreciate the use of the SSD-only file server, couldn't you have direct-attached it to the MacBook Pro and done file sharing over the Thunderbolt bridge, pointing your cache "to the same place"? Mac file sharing vs SMB would seem more efficient and would eliminate the WiFi-to-storage bottleneck. Just wondering if there's a measurable impact.

  • @alexeyzinoviev9090
    @alexeyzinoviev9090 14 days ago +1

    2 questions: 1. Does this bridge support jumbo frames (the default 1500 bytes seems too small)? 2. Why CIFS and not NFS? NFS seems to be about 10-20% faster on macOS
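For question 1, a minimal way to inspect and raise the MTU on macOS. The interface name `bridge0` is an assumption (run `ifconfig` to find the Thunderbolt bridge on your machine), and not every driver accepts a 9000-byte jumbo-frame MTU:

```shell
# Show the current MTU of the (assumed) Thunderbolt bridge interface:
ifconfig bridge0 | grep -o 'mtu [0-9]*'

# Try to enable jumbo frames (requires root; may be rejected by the driver):
sudo ifconfig bridge0 mtu 9000
```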

  • @ishudshutup
    @ishudshutup 13 days ago

    This is awesome, can't wait for the M4 Mac Mini LLM review! Could you consider a video about the elephant in the room: multiple 8GB GPUs clustered together to run a large model? There are millions of 8GB GPUs stuck running quantized versions of 7B models, or just underutilized.

  • @modoulaminceesay9211
    @modoulaminceesay9211 10 days ago

    Can you please show how you moved it to the SSD, like transferring llama to external storage?

  • @mk677hd
    @mk677hd 14 days ago

    Nice one Alex. What's the effective t/sec across them from your testing - say it's x for 128GB on a single device versus 64+32+16+8+8 nodes, and the model needs two or more machines to run. I think it won't hit more than 0.5x even with the thunderbolt bridge to pass stuff around.

  • @mariol8831
    @mariol8831 13 days ago

    Why is the inference performance different per machine? Are they sharing the GPU cores too, or just the VRAM? Because based on the output you are getting, the VRAM bandwidth is around 300-400GB/s

  •  13 days ago

    I remember rendering times in Final Cut & Compressor, on many machines - the same problems :)

  • @Wunnabeanbag
    @Wunnabeanbag 12 days ago

    CAN’T WAIT FOR YOUR M4 video

  • @muhammadhalimov422
    @muhammadhalimov422 12 days ago

    Alex wth, u r crazy!!!

  • @denis.gruiax
    @denis.gruiax 14 days ago

    Great video! 🔥

  • @plamengflo
    @plamengflo 13 days ago +1

    You should try the same, but with a SAN

  • @mpsii
    @mpsii 6 days ago

    If I have this kind of cluster set up, how do I access the cluster from my main machine that is not part of the cluster?

  • @FullStackDevSecOps
    @FullStackDevSecOps 10 days ago

    Could you please verify the functionality of connecting three computers to the Terramaster F8 SSD Plus by utilizing each of its USB ports?

  • @Fingobob
    @Fingobob 8 days ago

    As a follow-on to your presentation today: what if I wanted to run Llama 3.1 70B or even 405B on a distributed computing setup?

  • @edc1569
    @edc1569 13 days ago +1

    So rather than upgrade a Mac mini, you just buy more of them?

  • @softwareengineeringwithkoushik
    @softwareengineeringwithkoushik 13 days ago +1

    Waiting⏳ for m4 max

  • @loicdupond7550
    @loicdupond7550 10 days ago

    Hmm, so I guess one more question this brings up: is it better to go for one M4 Pro with 48GB of RAM, or 2 M4s with 24GB each, to run local LLMs, since it would be the same price?

  • @kamurashev
    @kamurashev 13 days ago

    Is it sharing only RAM, but not the compute resources?

  • @modoulaminceesay9211
    @modoulaminceesay9211 10 days ago

    I am having the same problem - how do you set ollama to save to the SSD?

  • @_hmh
    @_hmh 12 days ago +2

    This is impressive. If this were for a real-world use case, I'd implement these optimizations:
    - Don't use the NAS, since it introduces a single point of failure and is much slower than directly attached storage. For best performance, the internal SSDs are your best choice; storing the model on each computer is fine. This is called "shared nothing"
    - Use identical computers, both for comparable results and because my hypothesis is that slower computers slow down the whole cluster. You would need to measure it with Activity Monitor
    - Measure the network traffic. Use a network switch (or better, two together with ethernet bonding for redundancy and a speed increase) so that you can add an arbitrary number of computers to your setup
    - Measure how well your model scales out. If you have three computers and add a fourth, you would expect to get one third more tokens per second. The increase you actually get, relative to the computing power you added, defines your scale-out efficiency
    - Now you have a perfect cluster where you can remove any component without breaking the whole thing. Whichever component you remove, the rest still functions
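The scale-out efficiency defined in the measurement bullet above can be sketched with made-up numbers (30 tok/s on 3 nodes and 38 tok/s on 4 nodes are assumptions for illustration, not measurements from the video):

```shell
# Assumed measurements: 3 nodes -> 30 tok/s, 4 nodes -> 38 tok/s.
before=30; after=38

# Ideal gain from a 4th identical node is one third of the 3-node rate:
ideal_gain=$(( before / 3 ))          # 10 tok/s

# Actual gain was 8 tok/s, so scale-out efficiency is 80%:
echo "scale-out efficiency: $(( (after - before) * 100 / ideal_gain ))%"
```

Efficiency well below 100% usually points at the interconnect or at synchronization overhead rather than at the added node itself.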

  • @juehmingshi1739
    @juehmingshi1739 12 days ago

    Get Mac mini Pros; you will have a 120Gb thunderbolt connection for the cluster.

  • @MrOktony
    @MrOktony 13 days ago

    They're very short on basic documentation. Any idea how I can manually add an LLM to exo so that it appears in tinychat? Maybe you can do a video about it?

  • @RicoRojas
    @RicoRojas 2 days ago

    Alex, how did you change the default ports? mine keeps coming up on 52415 no matter what flags I give it on launch

    • @AZisk
      @AZisk  2 days ago

      that’s the new hard coded port. it used to be 8000, now it’s this

  • @meh2285
    @meh2285 14 days ago +2

    I bought two cheap m1 Max Macs with 64 gb of ram for this use case

    • @cyano5758
      @cyano5758 13 days ago

      @@meh2285 and what are your results?

  • @Haitham___ww
    @Haitham___ww 13 days ago

    Hey, which new M4 configuration should I get if I want to play around with local LLMs?

  • @sandeeppatil803
    @sandeeppatil803 13 days ago +1

    Who else is still waiting for the M4 machine review?

  • @Techonsapevole
    @Techonsapevole 13 days ago

    where is llama 3.1 405B running? 🤔

  • @soviut303
    @soviut303 13 days ago

    I'd like to see you experiment with the models that won't fit inside a desktop GPU (RTX 4090 maxes out at 24GB of VRAM). With the Mac Minis going up to 64GB of unified memory, a couple of them should be able to run most 70B models without any quantization.

  • @radudamianov
    @radudamianov 8 days ago

    Please test with various context windows (8k/32k/128k) and especially with longer prompts > 1000 tokens.

  • @juap
    @juap 13 days ago

    So, what about having a lot of Raspberry Pis connected with exo and the NAS, running a very large LLM?

  • @yenjun0204
    @yenjun0204 11 days ago

    Actually, a NAS is not required. Networking via Thunderbolt cables first, and then assigning internal or external drives or TB DAS as the LLM source, should be faster.

  • @HamedTavakoli-m9h
    @HamedTavakoli-m9h 10 days ago

    Between the M3 Max with a 30-core GPU and 36GB of RAM and the M4 Pro with 48GB of RAM, which one should I choose?

  • @peterbizik224
    @peterbizik224 9 days ago

    Thunderbolt 5, M4 with 64GB of RAM x3 - is it going to be a 192GB GPU memory cluster?

  • @dilip.rajkumar
    @dilip.rajkumar 10 days ago

    Can we run the biggest 405B Llama 3.1 model on this Apple Silicon cluster?

  • @liquathrushbane2003
    @liquathrushbane2003 13 days ago

    So does this mean that if you have enough hardware/laptops you could download and run the Llama 405b (230GB) model, and the running of it would be spread across all the nodes? (Albeit likely slowly)

  • @TechGameDev
    @TechGameDev 13 days ago

    You will soon have your MacBook Pro M4 Max, I can't wait

  • @ryanswatson
    @ryanswatson 11 days ago

    Rust compile time comparison with M4 _vs_ older M Series please.

  • @paultparker
    @paultparker 13 days ago

    Why is the ping latency an entire millisecond? Shouldn't thunderbolt be faster than that? Isn't it external PCIe?

  • @darthcryod1562
    @darthcryod1562 14 days ago

    Awesome. Does this exo tool work to cluster an x86 mini PC (fedora) + MacBooks?

  • @HadesTimer
    @HadesTimer 13 days ago

    Alex, other than testing stuff for videos, what do you need all this for? Wouldn't it be far easier and far cheaper to just use the pro version of Claude, Gemini, or ChatGPT instead rather than running all these models locally? Seems like you are spending a lot of time and money on a problem that has already been solved.

    • @passionatebeast24
      @passionatebeast24 13 days ago

      @@HadesTimer A lot of things you want to run locally, not in the cloud. There are restrictions on these online services, but locally you can do whatever you want.

    • @HadesTimer
      @HadesTimer 12 days ago

      @@passionatebeast24 Very true, but that's hardly worth the expense, especially if you are using image generation.

  • @JonCaraveo
    @JonCaraveo 10 days ago

    Question: Can nodes be different OS? 😅

  • @rickdg
    @rickdg 13 days ago

    So the endgame is to get a couple of base mac minis m4?

  • @hyposlasher
    @hyposlasher 6 days ago

    Is it possible to do with Windows laptops?

  • @beaumac
    @beaumac 13 days ago

    Could you cluster a bunch of Mac mini base models? At $600 they have to be the best bang for the buck.

  • @8888-u6n
    @8888-u6n 7 days ago

    Can you make a video with 4x the 16GB new Mac mini in a cluster, or even better 4x 64GB to make 256GB of unified memory 🙂

  • @jasonchan2899
    @jasonchan2899 13 days ago

    I am wondering if this could help run a 70B or bigger model, given that I have 2-3 Apple Silicon machines with 64GB of RAM each?

  • @bouzymaj1086
    @bouzymaj1086 14 days ago

    Can you use Spark to load the model into RAM?

  • @isbestlizard
    @isbestlizard 14 days ago +2

    Oh my god, my plan will be to get 4 Mac minis (base config) and build a 64GB, 480GB/sec cluster hahaha - those 3 thunderbolt ports make a perfect fully connected 4-node cluster :D

    • @cosmicgamingunlimited
      @cosmicgamingunlimited 13 days ago +1

      Would this be cheaper than an m4 ultra setup? I haven't done the math, but it feels like it's not.

    • @danielselaru7247
      @danielselaru7247 13 days ago

      How did you get to the 480GB/s? Aren't you limited to the Thunderbolt transfer speed?

  • @attaboyabhi
    @attaboyabhi 13 days ago

    sweet!

  • @moonknight8693
    @moonknight8693 13 days ago

    Can you test the new M4 Pro base variant (16/512)? I was looking to buy this variant

  • @xsiviso4835
    @xsiviso4835 13 days ago

    Couldn't you create an SMB share on one Mac and then point the other MacBooks to it? Over thunderbolt, loading the model should be even faster.

  • @benvicius672
    @benvicius672 13 days ago

    The maintainers of the project definitely groaned very loudly when you tried to run two different models at once😂

    • @alexcheema6270
      @alexcheema6270 13 days ago

      sure did :D
      we'll fix it tho

  • @ZAcharyIndy
    @ZAcharyIndy 11 days ago

    If you are buying a MacBook, make sure it has the larger storage

  • @Helios.vfx.
    @Helios.vfx. 13 days ago

    Off-topic question for Apple users: is it possible to have the terminal with a black background?

    • @alexexoxoxo
      @alexexoxoxo 13 days ago +2

      yes, it's possible!

    • @Helios.vfx.
      @Helios.vfx. 12 days ago

      @alexexoxoxo thanks!! I'm saving up to get a Mac, and seeing that white terminal scared the hell outta me 😂

  • @indomag8068
    @indomag8068 14 days ago

    Cool Stuff

  • @adelinopais7905
    @adelinopais7905 13 days ago

    This setup can use models with more parameters, right? For example, how much better can a 14B model be than using ChatGPT 4o in normal chat? In which examples - stories, code, etc.? If someone can help me understand this, I'd really appreciate it!

  • @mtolm
    @mtolm 13 days ago

    Awesome video, thank you

  • @vinusuhas4978
    @vinusuhas4978 12 days ago

    What about Snapdragon X Elite chips?

  • @dougall1687
    @dougall1687 13 days ago

    Whoa, did Baz Luhrmann help you edit this or were you editing in Starbucks again ;)

  • @billyliu5452
    @billyliu5452 7 days ago

    Can you try a Mac Mini M4 Pro cluster? 😂
    With Thunderbolt 5

  • @misteryu6819
    @misteryu6819 13 days ago

    Are you able to run the cluster across macOS and Windows?

  • @saurabhjadhav106
    @saurabhjadhav106 13 days ago

    Great

  • @rohandesai648
    @rohandesai648 13 days ago

    I didn't understand the point of this video. If you already have a 64GB laptop, why use the others to run LLMs? Even on a shared NAS, it will run on that 64GB one only. Why would anyone have multiple laptops lying around?

  • @Fingobob
    @Fingobob 7 days ago

    Has anybody tried this setup for LLMs? Which would do better in LLM processing, training, inference, RAG, etc. - would it run llama 3.1 (70B)?
    (2x M4 base mini with 32GB RAM each, 256GB SSD, Thunderbolt-linked and load-distributed) vs 1x M4 Pro with 64GB RAM, 512GB. This is what I want to see if you can pull off; very curious about the effectiveness of a small cluster vs an all-in-one system.

  • @TechGameDev
    @TechGameDev 13 days ago

    The anti-resonance foam is too well placed - it's more symmetrical and organized

  • @mantikhatasi
    @mantikhatasi 11 days ago

    I saw someone connect 4 minis and run an LLM

  • @ScottLahteine
    @ScottLahteine 13 days ago

    11:30 -

  • @korakys
    @korakys 13 days ago

    Linux on Snapdragon, please.

  • @naimnayak
    @naimnayak 14 days ago

    Where's the M4, Mr Alex?

  • @Youtuber-ku4nk
    @Youtuber-ku4nk 6 days ago

    What is the use case of running your own LLM?

  • @WSilby
    @WSilby 13 days ago

    If you could explain to someone a little less smart... what do higher parameter counts mean for the AI?

  • @sreeharisreelal
    @sreeharisreelal 13 days ago +2

    I have a question regarding the installation of SQL Server Management Studio (SSMS) on a Mac. Specifically, I would like to know if it is feasible to install SSMS within a Windows 11 virtual machine using Parallels Desktop, and whether I would be able to connect this installation to an SQL Server that is running on the host macOS. Are there any specific configurations or steps I should be aware of to ensure a successful connection between SSMS and the SQL Server on macOS? Thank you!