Run New Llama 3.1 on Your Computer Privately in 10 minutes

  • Published Feb 2, 2025

Comments • 305

  • @SkillLeapAI  6 months ago +10

    Check out our updated course on running private AI chatbots on your computer.
    bit.ly/skill-leap

    • @joeldowner2991  6 months ago

      try AIJoel - Multi Generator: text, code, image (create stickers, remove image backgrounds, add color to black-and-white images, image to video, logo design); music and video are in beta mode

    • @IndieGamesSpotlight  3 months ago

      So I installed Ollama and asked it, "are you cloud based or are you running locally on my computer?" It replied that it is a cloud-based AI. Why is that?

    • @kudoskudos543  1 month ago

      Can it upload an image and analyze that image in JPEG format?

  • @Alvin-i2t7o  4 months ago +3

    One new SSD and a full install later.... It works!! Docker was giving me an error which I couldn't resolve but it was all straightforward on a clean installation!

  • @marcusstone6273  6 months ago +24

    Hey bro I just want to say that your grind is on another level. So good that you can go for so many years and still create new channels and succeed. Nice transition to AI content and your views are great too. Hope you get a lot of sponsorship and affiliate deals.

    • @SkillLeapAI  6 months ago +4

      I appreciate that!

  • @mr.cannon  6 months ago +24

    PC USERS: IF YOU ARE GETTING THE WSL ERROR WHEN RUNNING DOCKER, enable virtualization in the BIOS. This process varies depending on your computer manufacturer and model. Generally, you'll need to restart your computer and enter the BIOS settings (often by pressing F2, F10, or Del during startup), then look for an option related to virtualization or VT-x and enable it. (A quick way to confirm the change from inside Windows is sketched after this thread.)

    • @joeduffy52  5 months ago

      I get this error but can't see anything in BIOS like you mention. My M/B is the Gigabyte X570 Aorus Elite.

    • @mr.cannon  5 months ago

      @@joeduffy52 Enter the BIOS in Advanced mode, go over to the Tweaker tab, go down to Advanced CPU Settings, and enable SVM mode

    • @faridgasimov1742  5 months ago

      Just returned to leave this comment
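
    For what it's worth, once virtualization is enabled you can sanity-check it from inside Windows. A minimal sketch, assuming a stock Windows 10/11 Command Prompt (output wording varies by build):

        systeminfo | findstr /C:"Virtualization Enabled In Firmware"
        wsl --status

    The first command should report "Yes" once SVM/VT-x is enabled (if Hyper-V is already active it reports that a hypervisor is detected instead); the second confirms WSL itself is installed.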

  • @b34k97  6 months ago +24

    "We need to go to an app that a lot of people have never used before.... its called 'Terminal'". OMG that line had me dying!

    • @SkillLeapAI  6 months ago +5

      I’ve made videos for 8 years and 99% of people have never used the terminal

    • @b34k97  6 months ago +4

      @@SkillLeapAI No I understand, and it makes perfect sense. Just, as someone who's used a terminal at school and at work for the past decade... the delivery of that line tickled my funny bone

    • @ZavierBanerjea  6 months ago +2

      Absolutely hilarious...

    • @danielrossy7453  5 months ago +1

      Same dude :) And then my wife came into the room and said "are you really?" HAHAHA

  • @tikkivolta2854  6 months ago +11

    The really interesting part around 07:15 is that in a few months, computing power and the size of these models will make it possible to run them on practically anything. When they get more efficient we'll have them on our phones.

    • @SantiagoAbud  6 months ago +1

      That's the future of this technology. Not that I endorse it or judge it in any way, but it's the way the development is heading.

    • @thinhngo7244  6 months ago

      OpenELM from Apple is already runnable on mobile devices, I think

    • @tikkivolta2854  5 months ago

      @@thinhngo7244 I ran Llama 3.1 8B on my laptop with Docker and Ollama. Like a breeze.

    • @h0ph1p13  1 month ago

      There are LLMs that work on phones already.

  • @muhammadasad8549  6 months ago +4

    Brilliant.
    I have been looking for this video since Meta announced 3.1.
    Hats off.

    • @muhammadasad8549  6 months ago

      @SkillLeapAI I would be really thankful if you could make a video on deploying it on a server.

  • @iskandarhussain  6 months ago +8

    Perfect ‼️ Just the video I was looking for, with an intro to how to upload files to the model ‼️ THX ‼️

  • @GmanBB  6 months ago +1

    You have great teaching skills. Thank you for making it so simple!

  • @skywalkerjedi95  6 months ago +1

    Thanks! This video was awesome and really detailed! Can’t wait to start trying this.

  • @micbab-vg2mu  6 months ago +3

    Great - I will try 70B :) Thanks for the instructions on how to do it :)

    • @burntdoug6823  6 months ago

      What are your PC specs that can handle 70B?

    • @Mewmew-y4m  5 months ago

      @@burntdoug6823
      CPU: Intel i5-2400
      RAM: 4GB
      GPU: integrated graphics
      Storage: 120GB HDD

  • @Ilan-Aviv  5 months ago

    Great video for local AI. Easy and clear explanation. Thank you!

  • @FastWReX  6 months ago +5

    No joke. I've always hated Docker. Hated everything about it. However, seeing you run the Open WebUI command and it just randomly showed up in the Docker app is making me reconsider. Holy moly! Might have to reinstall it on my Raspberry Pi 5.

  • @Hilmz  6 months ago +122

    Not private; it's a hybrid model. It caches data when offline, then when connected back you can see it sends data. Use Wireshark and it will show you it's sending data

    • @longboardfella5306  5 months ago +31

      This is an important comment because many channels are saying OpenWebUI and Llama3 is private to your machine. Is there any way to turn off the cache sending process? I would like to analyse documents that are not permitted to be publicly uploaded

    • @MihaMartini  5 months ago

      @@longboardfella5306 Ollama and Llama3 are private, but OpenWebUI might send some analytics and other data.

    • @bryanjuho  5 months ago +4

      Is this true? Any resource that supports this statement?

    • @DudethatGross  5 months ago +23

      @@longboardfella5306 run it in a VM or container that has the network adapter disconnected from the internet, or WiFi off entirely (a Docker sketch of this follows the thread)

    • @andresshamis4348  5 months ago +8

      Llama is literally private; maybe OpenWebUI isn't… However, what would the UI need to cache and send over the internet? Doesn't make sense to me
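
    For anyone who wants to test the privacy claim directly, a minimal sketch of a fully air-gapped run, assuming the official ollama/ollama image and Docker's stock --network none flag (the model has to be pulled into the volume while still online):

        # one time, online: download the model into a named volume
        docker run -d -v ollama:/root/.ollama --name ollama-online ollama/ollama
        docker exec ollama-online ollama pull llama3.1
        docker rm -f ollama-online

        # afterwards: same volume, but the container gets no network interface at all
        docker run -d --network none -v ollama:/root/.ollama --name ollama-offline ollama/ollama
        docker exec -it ollama-offline ollama run llama3.1

    With --network none the container cannot send traffic anywhere, so anything Wireshark still sees would have to come from other software on the host.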

  • @womble_1034  6 months ago +1

    subscription incoming!! great content, keep up the good work

  • @kamaboko1  2 months ago

    Awesome! Thanks for the post!!!

  • @pertsiya  6 months ago +1

    Thank you for your willingness to share with us!

  • @Fayaz-Rehman  1 month ago

    Excellent - Thanks for sharing.

  • @southcoastinventors6583  6 months ago +27

    Sounds like step one is to buy a $5,000 M3 if you want to run it locally, at least until they release smaller quantizations of Llama 70B and 8B

    • @melaronvalkorith1301  6 months ago +8

      Llama 3.1 8B runs much faster than you can read on an RTX 2060 Super. Not dirt cheap, but I built my PC for $1.4K 4 years ago; it should be cheaper now, and I built it for gaming, not AI.
      You can host it on a desktop and connect your other devices with a VPN like Tailscale (a sketch of that setup follows this thread). Don't spend extra money for less performance by going for a small-form-factor machine or laptop.

    • @CodyAvant  6 months ago +5

      I run 8B on my 2020 M1 MacBook Air (8GB RAM and 7-core GPU) and the output token rate is much faster than a normal speech cadence.

    • @mcombatti  6 months ago +6

      I have a 9-year-old Windows computer that runs 13B and 8B models at the same speed as your brand new $5000 Mac 👀 It was purchased at a Walmart for $270 in 2015 😂

    • @bassamel-ashkar4005  6 months ago +2

      Bullshit detected ​@@mcombatti

    • @mcombatti  6 months ago

      @bassamel-ashkar4005 As a machine learning engineer and master developer - over 25 years now, I must say you need to learn a bit about inference to speak. 8b can even run on edge devices like orange pi or Raspberry Pi. When models are converted to rknn for such devices, you can achieve greater speeds than base inference. Just replacing the matmul function in transformers with a more optimized function (that uses less 'compute') a model can generate 3x more tokens/ second. Pruning downstream layers and using sporadic attention, a KernelAttentionNetwork or even a KAN (not the same) in place of transformers can gain greater speeds. Until you can design an LLM architecture from scratch...shhh...
      Not to mention our novel model architecture using stock CPUs can generate >3000 tokens per second - on base mobile phones with only 4GB RAM and an 8 core processor (cost of such phone $100-400). We have demonstration after demonstration available online and social platforms like LinkedIn- where we discuss the architecture advancements with other real engineers. Things now just coming to light, we've been playing with for nearly a decade. 🤗 Nothing is new under the sun.
      A known company is currently licensing the architecture from us for their new mobile devices to be released later this fall. #NDA for collaboration. 🙏
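
    A minimal sketch of the "host on a desktop, use it from other devices" setup mentioned earlier in this thread, assuming Ollama's documented OLLAMA_HOST variable and default port 11434 (100.x.x.x stands in for the host's Tailscale or LAN IP):

        # on the hosting desktop: listen on all interfaces instead of just localhost
        OLLAMA_HOST=0.0.0.0:11434 ollama serve

        # from a phone or laptop on the same tailnet/LAN
        curl http://100.x.x.x:11434/api/generate -d '{"model": "llama3.1", "prompt": "hello"}'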

  • @kevinmiole  5 months ago

    I hope they give it access to the internet. Thank you for this

  • @naeemulhoque1777  5 months ago

    Nice, straightforward video.
    Can you make a hardware buying guide please?

  • @chintrupal4512  1 month ago

    I can see a microphone icon on the right side of the typing window; can it be used to summarize meetings?

  • @PowerON-Tech  5 months ago

    Thank you very much for this video.

  • @MrAshwin27  3 months ago

    Huge respect

  • @ArekMateusiak  4 months ago +2

    Hi, does anyone know what is needed to run the full 70B Llama 3.1 model well, so that it responds quickly?

  • @null4624  4 months ago +1

    Thanks dude, I was able to run Llama 3.1 8B on a Linux laptop with 8GB RAM and am impressed.

    • @shampaghosh1241  4 months ago

      Wow, did you do any special modifications? My device also has 8GB RAM and an Intel i3 processor; do you think I can possibly run it at a decent speed?

    • @null4624  4 months ago

      @@shampaghosh1241 No special mods, just selected the smallest model and followed the steps from this video and ran some prompts.

  • @tikkivolta2854  6 months ago +5

    I would love for you to create a tutorial on how to train these models on specific data. Any chance?

    • @SkillLeapAI  6 months ago +8

      Adding to the list

    • @nohandle8008  4 months ago

      @@SkillLeapAI awesome, thank you. I have a specific use case requiring very specific data to train the model; I would love to see how effective it can be. I'm also concerned about the data flowing back online. Can you elaborate on what is sending data back out when the machine is reconnected to the internet, as others have mentioned?

  • @vladyslavklochan4181  6 months ago +1

    Thank you for the tutorial.

  • @gaganmadhan733  4 months ago +1

    Can we store the files on cloud storage like AWS S3 and then run it or deploy it?

  • @II-qh7xn  5 months ago

    worked with issues hats off

  • @yetkindev  5 months ago

    you are the perfect man thank you

  • @billl1715  4 months ago +1

    The steps don't show how the models get loaded onto WebUI. When I'm in WebUI, there are no models.

    • @SkillLeapAI  4 months ago

      They should load automatically. I didn't do anything in WebUI to add them; they were added through the terminal in earlier steps

    • @Xiaoklunar  2 months ago

      In the top corner there is a "choose model" option

  • @wettingfairy6764  6 months ago +1

    Very clearly explained; a great beginner's guide.

  • @theblockchainlawyer4877  9 days ago

    Is there a way to install Ollama on a Windows drive not labeled C? I want to install it on a D drive. I transferred the downloaded exe file to the D drive and tried to install from there, but no luck; the install tool offers no drive-selection option.

  • @Kingkimabdu5090  4 months ago +1

    Everything seemed fine until I clicked the link in Docker. The website page opened with an error message stating, "This page isn’t working." Can anyone offer assistance?

  • @rafaeel731  3 months ago

    Thanks for the vid; a couple of confusions to share:
    12:18 how can a Llama 3.1 model not know anything about Llama 3 because of a training delay? It doesn't make a lot of sense.
    Plus, you compared the 8B on a text exchange while you gave the 70B model Python code to decipher, then you gave the 8B a text file to summarise. We can't compare execution times unless the task is identical.

  • @sumitksrivastava  2 months ago

    Why do I get a "400: 'NoneType' object has no attribute 'encode'" error any time I try to upload a document?

  • @andyli541  6 months ago +3

    Is there a way to bring this locally running Llama 3.1 onto my website? I want to share my trained AI with other people. Thanks!

    • @bilza2023  5 months ago

      There are special servers on DigitalOcean... but put simply, you install it on your server and make it available through an API.

  • @kirarakurokawa8747  7 days ago

    When I upload an image to it, all it says is "I cannot see the image"

  • @SimonFeay  3 months ago

    when I go to workspace I don't seem to have the "Documents" tab.

  • @dashingtoon  3 months ago

    When I try to run Open WebUI, Flowise opens up. I guess they are listening on the same port =/ ? Can I solve this in any way?

  • @walter3663  5 months ago

    Thanks for the great tutorial. Can you let me know the default path where chat histories are stored? Is it possible to change it?

  • @lucifergaming9491  6 months ago +1

    I use Ubuntu; my web UI doesn't show any models after a correct installation

  • @WIWUPRODUCTIONS  2 months ago

    My Docker won't show the address/ports after installing everything. What could be the problem? I've been trying to google it all day and can't find an answer

  • @terrysh7264  5 months ago

    Hi. TY for this video. I'm wondering, do I need to train the model that I install?

    • @SkillLeapAI  5 months ago +1

      No you can just use it after install

  • @zunairakhalid7358  6 months ago +1

    Can we make an API call to this local LLM from our own code?
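
    For reference, Ollama listens on a local REST API (port 11434 by default), so any language that can make an HTTP request can call the model. A minimal sketch; "stream": false returns a single JSON object instead of a token stream:

        curl http://localhost:11434/api/generate -d '{
          "model": "llama3.1",
          "prompt": "Why is the sky blue?",
          "stream": false
        }'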

  • @arnolda7150  6 months ago

    Thank you so much. Can you tell me how to access OpenAI with WiFi on or off?

  • @digigoliath  6 months ago

    I do appreciate this informative walkthrough though! TQVM!!

  • @yetkindev  5 months ago

    you have a great internet :D

  • @hydron7150  6 months ago +5

    Wanted to try 70B with a 4070 Ti 12GB, a Ryzen 5 7600X, and 64GB of 6000MHz RAM, and it is pretty slow; it takes 20 seconds to respond to a "hi" prompt 😄 (a likely culprit is sketched after this thread)

    • @DaveTheeMan-wj2nk  1 month ago

      The 7600X is a weak CPU for AI.
      It's a mere 6-core CPU lol.
      There are probably ways to make it better, maybe.
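
    One thing worth checking in a setup like this: a 70B model does not fit in 12GB of VRAM, so most layers fall back to the CPU. Recent Ollama builds can show the split while a model is loaded; a sketch (column layout may differ between versions):

        ollama ps

    If the PROCESSOR column reads something like "75%/25% CPU/GPU", the slow replies are coming from the CPU-resident layers rather than the GPU.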

  • @nosuchthing8  3 months ago

    Wait, what does docker do if the LLM is running?

  • @Kevin-fp5zo  6 months ago +3

    Hello! What kind of hardware setup do you need to run open-source LLMs? LLMs are actually quite heavy and they require a lot of GPU, RAM, and CPU power. Can you do a YouTube video about which computer specs or PC brands are optimal for running them smoothly? I love your content. Keep up the good work! :)

    • @betabishop3144  4 months ago

      They are indeed quite heavy, but in case you haven't noticed, each LLM's documentation should have a section dedicated to hardware requirements; Google has it, Meta has it, and the others probably do too. You could look them up like "Llama 3.1 hardware requirements" and the first link should take you there

  • @kick_kisu  5 months ago

    Great video. How can I run the Docker container for Open WebUI on my local Linux server, with Ollama running on my Windows PC?
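
    A minimal sketch, assuming Open WebUI's documented OLLAMA_BASE_URL setting and that Ollama on the Windows PC is set to listen on all interfaces (OLLAMA_HOST=0.0.0.0); WINDOWS_PC_IP is a placeholder:

        docker run -d -p 3000:8080 \
          -e OLLAMA_BASE_URL=http://WINDOWS_PC_IP:11434 \
          -v open-webui:/app/backend/data \
          --name open-webui ghcr.io/open-webui/open-webui:main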

  • @robwin0072  5 months ago

    Great video.
    Question: can I redirect all models downloaded (installed) in Ollama to a secondary drive inside my laptop?
    C: primary system M.2 SSD, 2TB
    D: secondary SSD, 2TB
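
    Ollama's FAQ documents an OLLAMA_MODELS environment variable for exactly this. A sketch for Windows (run in Command Prompt, then restart Ollama; the D:\ path is just an example):

        setx OLLAMA_MODELS "D:\ollama\models"

    Models already downloaded to C: have to be moved into the new folder by hand; new pulls land on D: automatically.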

  • @delldoesai  6 months ago

    Great video. Did you test how many documents you can upload at once and have it summarized?

    • @SkillLeapAI  6 months ago

      I haven’t yet, but I think it’s a good amount

  • @karthikb.s.k.4486  6 months ago

    Inference looks fast locally. What configuration of Mac laptop are you using? Please let me know. Thank you for the nice tutorial.

    • @SkillLeapAI  6 months ago

      It’s the top-of-the-line M3 Mac with 64 gigs of RAM

  • @puccaso  6 months ago

    9:20 I believe there is also a docker credentials package that works as well and doesn't require the GUI bloat.

  • @mikemaldanado6015  3 months ago

    Nice video. I don't like that you have to sign up for WebUI, and I don't like Docker for security reasons, but your first two steps helped a lot. BTW, you can upload or pass large files on the command line. Finally, Meta's Llama does not meet the traditional definition of open source, because it's not. What they did is create a new definition, their definition, and put it in the terms of service... must be nice to be able to change the definition of words willy-nilly. Also, nothing we have today meets the official comp-sci definition of AI, not by a long shot.

  • @filipskerik1477  5 months ago

    The Docker "thing" is only for embedding the model installed before to the webUI? Or its something that pushes all of the stuff to cloud and I don't need that NASA computer? Thanks

  • @mediatechtube  4 months ago

    Nice video. What are the use cases for running AI locally on your computer at home? What's the purpose when people can get a subscription? I can think of a few but I would like to know what others think. Cheers!

  • @moonduckmaximus  5 months ago

    Hey, I keep getting a notification that Llama 3.5 is ready to update from Llama 3.1. I click the notification in Windows 11 but nothing happens. How can I verify or update Llama 3.1?

  • @nessim.liamani  6 months ago

    Can we locally remove restraints on LLaMA models, including ethical safeguards?
    Thanks

  • @fangeming1  6 months ago +3

    How much VRAM is needed to run the model depends on whether the model is quantized or not. This should be explained in the video instead of giving contradictory information. (Example quantization tags follow this thread.)

    • @tikkivolta2854  6 months ago

      As much as you are correct, I am fairly certain "giving contradictory information" wasn't the intent.

    • @moonduckmaximus  6 months ago

      Hey, do you know where we can get a comprehensive explanation of what we downloaded? I can't afford his course
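
    For reference, the Ollama library publishes several quantizations of each model under different tags, which is what drives the differing VRAM numbers. A sketch (exact tag names vary; check the model's page on ollama.com):

        ollama pull llama3.1:8b                  # default tag, 4-bit quantized, ~4.7 GB
        ollama pull llama3.1:8b-instruct-q8_0    # 8-bit variant, roughly twice the memory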

  • @OutperformThemAllii  6 months ago

    Is there a way to select the hard drive during install?
    My C drive is almost full; how can I select my other drive?

    • @R_RYT  6 months ago

      same issue. Please reply here if you find a solution.

    • @ouso3335  5 months ago

      th-cam.com/video/uj1VnDPR9xo/w-d-xo.html

    • @ouso3335  5 months ago

      @@R_RYT th-cam.com/video/uj1VnDPR9xo/w-d-xo.html

  • @swarnimdubey  5 months ago

    How much total storage does it require, anyway?

  • @prajwalm.s7976  6 months ago

    Can I fine-tune the 70B model and use the Open WebUI?

  • @MoonyongKim  6 months ago

    Hi. First of all, thanks for the video. It's really useful and easy to follow step by step. I am running an M1 MacBook Air and it seems it's not good enough to run Llama 3.1, as it seems to freeze my computer. Which model would you recommend for an M1 MacBook Air?

  • @hezetlapy6190  4 days ago

    Thanks

  • @CodeCraftHub-NAS  6 months ago

    Could you put the file on your web server, then use it to be your search and/or help?

  • @thevoice6853  3 months ago

    Can you do a tutorial on how to do it on Windows? Thanks

  • @Carlzora  6 months ago

    Are you able to upload images to use with prompts?

    • @SkillLeapAI  6 months ago

      Most open models don’t have vision and if they do, they are not good. I would use ChatGPT for that

    • @longboardfella5306  5 months ago

      LLaVA models are pretty good for image analysis

  •  5 months ago

    Perfect. Thanx

  • @karlpedersen3342  5 months ago

    Is there a how-to for Windows 11?

  • @HolographicKode  4 months ago

    What hardware setup do you run this model on?

    • @SkillLeapAI  4 months ago +1

      I’m on an M3 Mac with 64 gigs of RAM. I can run the 70B model, and the small models respond almost instantly

    • @HolographicKode  4 months ago

      @@SkillLeapAI MacBook Pro? Mac Pro? How much VRAM? (complete configuration) This has to be a $5K+ setup, I suspect.

  • @HassanMohamedDahir-w4x4i  1 day ago

    "Select a model" doesn't work for me :(

  • @mohdalki7271  3 months ago

    Can I train it on my own data?

  • @nguyenphamduy3386  5 months ago

    How do I upload a *.xlsx file? When I use this app I can't upload the file. Help me

  • @dawnbunty7  6 months ago

    I have a MacBook M3 Pro with 18GB RAM and a 500GB hard disk. Will this be sufficient?

  • @aboubevwic2880  3 months ago

    How is it offline, when you have to log in to WebUI?

    • @SkillLeapAI  3 months ago +1

      You just have to create the account. You can turn off your WiFi after that

  • @NeptuneGadgetBR  6 months ago

    Hi, I couldn't run Docker on my GPU. I have an RTX 4090, which should help a lot, while on the CPU it is slow. Do you have any idea how to enable my GPU in Docker on Windows 11?

    • @fl028  5 months ago

      Use the --gpus=all option :)
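
    For context, Ollama's own Docker instructions pair the NVIDIA Container Toolkit with that flag (note the plural, --gpus):

        docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama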

  • @moonduckmaximus  6 months ago

    Thank you for your effort. Any reason why my Llama 3.1 70B would respond one letter at a time?

    • @moonduckmaximus  6 months ago

      @HitsInSandbox 4090 with 128 gigs of RAM... I don't use an antivirus

  • @pankajkhatnani2564  6 months ago

    By the way, how can it be connected to the internet to get real-time answers?

    • @SkillLeapAI  6 months ago

      It can’t do real time data from the web

  • @sabuein  6 months ago

    Thank you.

  • @SelvamuthuMR  6 months ago

    The Hugging Face Llama 3.1 model repo takes 60 GB of storage but runs very slowly for one response, while Ollama runs the same Llama 3.1 model faster and its download is only around 5 GB. What is the difference?

  • @dalatech1375  5 months ago

    Can I add an image?

  • @ndidiahiakwo7412  6 months ago

    Will the website version be capable of uploading documents anytime soon? My computer isn't powerful enough to support running the offline models.

    • @SkillLeapAI  6 months ago +1

      Not sure. I hope so

  • @jeremy4510  4 months ago

    Can you do a video like this on using Llama in Python?

  • @ankurkumarsrivastava6958  6 months ago

    I installed Llama 3.1. Now how do I remove the previously installed Llama 3?
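
    A quick sketch with Ollama's stock commands (removing llama3 leaves llama3.1 untouched, since each model is stored separately):

        ollama list        # see which models are installed
        ollama rm llama3   # delete the old one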

  • @RiftWarth  6 months ago +4

    Wow the 405B model file size is still smaller than some of the Call of Duty games. LOL

    • @SkillLeapAI  6 months ago +1

      Yeah, that’s true, but it’s basically text. If it were video, it would be a million times bigger

  • @andreac7389  6 months ago

    Hi, is this model multimodal like ChatGPT 4 Omni? I mean, can it generate code, solve mathematical problems, etc., or is it purely a linguistic model capable of easy conversation but unable to handle complex issues? In other words, my question is, do only the models hosted on the servers of OpenAI, Anthropic, or Meta have the capability to manage complex problems, or does this offline model also have that capability? Thank you.

    • @SkillLeapAI  6 months ago +2

      They have some of that capability, but online models are going to be much better. It’s very difficult to run the better models offline. The best version of Llama 3.1 is too complex to run on a computer, and OpenAI and Anthropic don’t have an open-source model that you can run locally

  • @gRosh08  6 months ago

    Crazy cool.

  • @hiteshdesai2152  6 months ago

    This is great; thanks for putting it in such a simple and understandable way. I can run locally now. Is there a way to point my Python code, or my LangChain/llama_index application code, at these local models?
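
    For reference, Ollama also exposes an OpenAI-compatible endpoint alongside its native API, so LangChain, llama_index, or the openai SDK can usually be pointed at it by overriding the base URL. A minimal sketch of the raw call:

        curl http://localhost:11434/v1/chat/completions \
          -H "Content-Type: application/json" \
          -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "hi"}]}'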

  • @frankvasquez4827  5 months ago

    If I installed the 8B model first and then I want to install the 70B, will I have both installed, or will the larger one overwrite the 8B? Can I uninstall the models, just to save some storage space 😅 (asking because I'm not too technical about it; I'm using Windows btw). Thanks in advance.

    • @SkillLeapAI  5 months ago +1

      It will install both and you can choose between them. It won’t overwrite. And yes, you can remove them. My new video covers that and some new upgrades

    • @frankvasquez4827  5 months ago

      @@SkillLeapAI Thank you, I will watch it. I managed to install them with this video!

  • @NVX_Ink  6 months ago +1

    What would be an affordable, yet ideal, desktop workstation?

  • @gRAVItation1988  6 months ago

    Great job! I have an M1 Mac with 16GB RAM. Can I run 8B?

    • @SkillLeapAI  6 months ago

      I think so

    • @tikkivolta2854  6 months ago

      @@uncannyrobot do you also train models, and would you care to elaborate? I'd be all ears

    • @tikkivolta2854  6 months ago

      @@uncannyrobot i will find one, thank you!

  • @HaraldBendschneider  6 months ago

    I downloaded the Windows file "OllamaSetup.exe" and installed Ollama. What now? After clicking on the app icon nothing happens. Is there any tutorial for Windows out there? Running the app in CMD, I cannot use the shortcuts:
    C:\Users\user\AppData\Local\Programs\Ollama>show
    The command "show" is either misspelled or could not be found.

    • @fotszyrzk79  6 months ago +2

      Hey! Open cmd once again and type "ollama run llama3.1"; you can play with it in the command window. I'm looking for an interface now, to run and play with it in a nicer way. Docker, mentioned by the OP, needs a subscription (could be 0 USD, but I don't like to subscribe).

    • @fotszyrzk79  6 months ago

      If you type just ollama it will print the available commands for you.

    • @HaraldBendschneider  6 months ago

      @@fotszyrzk79 Thank you! This worked. I can chat in the command window. But I wanted to have a UI, and I don't understand what the icon in the taskbar is for. "View logs" and "Quit Ollama" are all I can do.

    • @longboardfella5306  5 months ago

      @@HaraldBendschneider OpenWebUI gives you the Windows chat interface. I use it on Windows 11 and it works fine with Ollama and Docker. You can look at Matthew Berman's channel for simple install instructions as well. Bottom line: most tutorials assume Mac, but it's not hard to get it all working on Windows; you just have to hunt a bit more for workarounds

  • @extremelylucky999  6 months ago

    Would like to learn to do Llama + Groq + iPhone shortcuts to run llama.

  • @KrugeJu  6 months ago

    Still confused on GitHub... it didn't take me there, so I don't have a clue as to where you're at there

  • @christerjohanzzon  6 months ago +1

    So, you don't need a fancy GPU to run Llama locally? It does say that you need an Nvidia GPU...but you're running a Mac? Please elaborate.

    • @SkillLeapAI  6 months ago +1

      I have the built in Apple GPU. These are my specs. Chipset Model: Apple M3 Max
      Type: GPU
      Bus: Built-In
      Total Number of Cores: 40

    • @SkillLeapAI  6 months ago

      The 8B model should run on a variety of GPUs

    • @christerjohanzzon  6 months ago

      @@SkillLeapAI Ah, I see. Thanks for explaining. :)

    • @longboardfella5306  5 months ago

      I believe modern Macs use a unified memory model, which combines and distributes GPU and CPU memory as needed. PCs don't do this, so they need a specific amount of VRAM on dedicated Nvidia GPUs to run models. I have an RTX 8000, which is 24GB of VRAM; it runs all 8B models fine but completely chokes on 70B models regardless of quantisation. For Macs it's all about your total memory available and having enough modern GPU cores to do the processing, as I understand it

  • @ancour  4 months ago

    How do you add knowledge to this, like GPT searching the whole internet and adding intelligence?