Brillibits
How To Finetune Mixtral-8x7B On Consumer Hardware
In today's video, I discuss the new state-of-the-art model released by Mistral AI called Mixtral. It is an 8x7B mixture-of-experts (MoE) model that outperforms Llama 2 70B while being significantly faster. It only activates two of the eight expert models at a time, so roughly 13 billion parameters are used in the forward pass for each token.
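The routing step described above can be sketched in a few lines. This is an illustrative toy, not Mixtral's actual implementation, and the router logit values are made up:

```python
import math

def top2_route(router_logits, num_experts=8):
    """Pick the top-2 experts for one token and renormalize their
    scores with a softmax over just those two, Mixtral-style.
    `router_logits` holds one (hypothetical) score per expert."""
    assert len(router_logits) == num_experts
    # Indices of the two highest-scoring experts.
    top2 = sorted(range(num_experts), key=lambda i: router_logits[i], reverse=True)[:2]
    # Softmax over only the selected experts' logits.
    exps = [math.exp(router_logits[i]) for i in top2]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top2, exps)]

# Experts 3 and 5 score highest here, so only those two would run.
print(top2_route([0.1, -0.3, 0.0, 2.0, 0.2, 1.5, -1.0, 0.4]))
```

Every other expert's weights sit idle for that token, which is why the model can be much faster than a dense model of the same total size.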
I go over the details of the model and how to fine-tune it on custom datasets to unleash its full power. I provide step-by-step instructions on using the Finetune_LLMs repo and an instruct dataset to create an instruct model. I also discuss the hardware requirements, including roughly 48GB of VRAM in total (two RTX 3090s or RTX 4090s) and at least 32GB of system RAM.
I explain how to create the training data from the Dolly 15K dataset and the prompt format the instruct model uses. Additionally, I provide a walkthrough of the fine-tuning process using the Finetune_LLMs software, highlighting the important flags and options.
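As a rough sketch of the dataset step, here is one way a Dolly 15K record could be flattened into a single training string. The `### Instruction:`/`### Response:` template below is an assumption for illustration; the Finetune_LLMs repo defines the exact format it expects:

```python
def format_dolly_example(example):
    """Turn one Dolly 15K record (instruction / context / response)
    into a single instruct-style training string. Template is a
    hypothetical example, not the repo's canonical format."""
    text = f"### Instruction:\n{example['instruction']}\n"
    if example.get("context"):
        # Only some Dolly records carry a context passage.
        text += f"\n### Context:\n{example['context']}\n"
    text += f"\n### Response:\n{example['response']}"
    return text

sample = {
    "instruction": "Name the planet closest to the sun.",
    "context": "",
    "response": "Mercury is the closest planet to the sun.",
}
print(format_dolly_example(sample))
```

The key idea is that every record becomes one flat string, so the same causal-LM training loop used for plain text works unchanged on the instruct data.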
I discuss the performance characteristics of the fine-tuned model and demonstrate how to get results from it with Text Generation Inference. I also give some thoughts on the future of mixture-of-experts models and the potential to enhance the model by selecting more experts at a time.
If you're interested in fine-tuning the Mixtral model and gaining insights from custom datasets, this video provides a comprehensive guide. Don't forget to like the video, subscribe to the channel, and join the Discord community for further discussions. Stay brilliant!
github.com/mallorbc/Finetune_LLMs
github.com/mallorbc/llama_dataset_formats
docs.docker.com/engine/install/ubuntu/
docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
huggingface.co/Brillibits/Instruct_Mixtral-8x7B-v0.1_Dolly15K
#MistralAI #MixtralModel #FineTuning #MOEModel #CustomDatasets
#GPT3 #GPT4 #GPT #Llama #ai
00:00 - Intro
00:32 - Model Overview
02:52 - Software And Hardware Requirements
07:29 - Creating Instruct Dataset
11:53 - Setting Up Finetuning Software
13:55 - Finetune Program And Flags
17:28 - Finetuning
19:49 - Testing Finished Model
21:10 - My Thoughts
22:13 - Outro
Views: 2,300

Videos

Fine-Tuning Llama 2 70B on Consumer Hardware(QLora): A Step-by-Step Guide
Views 6K · 10 months ago
In this video, I take you through a detailed tutorial on the recent update to the FineTune LLMs repo. This tutorial covers the process of fine-tuning Llama 70B on consumer-grade hardware. Specifically, I highlight the vital role of recent innovations like QLora and FlashAttention 2 in enabling such fine-tuning. I guide you through the nitty-gritty of conda environment setup, pip install from a ...
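As a back-of-the-envelope check on why QLoRA matters here: quantizing the weights to 4 bits shrinks a 70B model from well over 100 GB to roughly what two 24 GB consumer cards can hold. This is weight-only math and ignores activations, KV cache, and the LoRA adapters' optimizer state:

```python
# Rough weight-only VRAM math for a 70B-parameter model.
params = 70e9

fp16_gb = params * 2 / 1e9    # 16-bit weights: 2 bytes each
int4_gb = params * 0.5 / 1e9  # 4-bit (QLoRA) weights: half a byte each

print(f"fp16 weights:  ~{fp16_gb:.0f} GB")   # far beyond any consumer GPU
print(f"4-bit weights: ~{int4_gb:.0f} GB")   # in reach of 2 x 24 GB cards
```

Real-world headroom is tighter than these numbers suggest, which is why the video also leans on tricks like FlashAttention 2.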
The 4 Essential Dataset Types for LLMs: A Deep Dive
Views 570 · 10 months ago
In this informative video, I delve deep into the complexities of different types of dataset formats that can be utilized when fine-tuning LLMs, using Llama 2 as an example. Throughout the video, I discuss four primary dataset types, namely the pre-training format, simple format, instruct format, and chat format, explaining their specific usefulness in model training. I also share insights on th...
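The four dataset types can be illustrated with one toy record each. The field names and prompt templates below are assumptions for the sketch, not a spec:

```python
# 1. Pre-training format: raw unlabeled text.
pretraining = "The mitochondria is the powerhouse of the cell. It produces ATP..."

# 2. Simple format: a prompt paired with its completion.
simple = {"prompt": "The capital of France is", "completion": " Paris."}

# 3. Instruct format: instruction, optional input, and a response.
instruct = (
    "### Instruction:\nSummarize the article.\n\n"
    "### Input:\n<article text>\n\n"
    "### Response:\n<summary>"
)

# 4. Chat format: a list of role-tagged conversation turns.
chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help?"},
]

print(instruct)
```

Each step down this list adds structure: pre-training teaches raw language, the simple and instruct formats teach task-following, and the chat format teaches multi-turn behavior.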
OpenAI Whisper Mic Update: Pip Installable and Improved Functionality
Views 7K · 1 year ago
In this video, we dive into an exciting update to the Whisper Mic repo! Previously a simple project, it now boasts enhanced functionality and easy installation using pip. Whether you prefer using it as a CLI tool or integrating it into other projects, this update has got you covered. Watch as we explore the code, commands, and demonstrate the performance of this powerful tool. We also discuss p...
Best Commercial Friendly Open Source LLMs: Falcon-40B And MPT-7B
Views 643 · 1 year ago
In this video, we discuss the best large language models available, considering their performance, additional capabilities, and licensing for commercial applications. We take a look at the OpenLM leaderboard on Hugging Face and highlight the top model, Falcon 40B Instruct, which recently changed its license to the Apache license, making it more commercially friendly. We also introduce MPT, a mo...
StableLM: The New Best Open Source Base Models For GPT Apps!
Views 1.4K · 1 year ago
Stability AI recently released 3B and 7B versions of what they are calling StableLM. If the early metrics are anything to go by, these models will be the best to build from for your generative AI applications. StableLM trains on more data than the LLaMA models, has the largest open-source context window at 4096 tokens, and is under a permissive license! Github: github.com/Stability-AI/StableLM Discord: di...
How To Fine-tune The Llama 1 Models(GPT3 Alternative)
Views 31K · 1 year ago
The LLaMA models have impressive performance despite their relatively smaller size, with the 13B model being even better than GPT3! In this video, I go over how you can make the models even more powerful, by finetuning them on your own dataset! Github: github.com/mallorbc/Finetune_GPTNEO_GPTJ6B Model Link: huggingface.co/decapoda-research/llama-7b-hf LLaMA PR: github.com/huggingface/transformer...
Fine-tuning GPTJ(GPT3) With Docker And WandB Update
Views 4.3K · 1 year ago
In this video, I go over some updates to the fine-tuning repo. Including docker support, wandb support, and other features. Repo: github.com/mallorbc/Finetune_GPTNEO_GPTJ6B Discord: discord.gg/F7pjXfVJwZ #gpt3 #chatgpt #gptj #docker #machinelearning #nlp #wandb #ai Timestamps 00:00 - Intro 00:45 - README Requirements 02:30 - Cloning Repo 03:15 - Looking At Docker Files 06:20 - Running Docker Im...
AI Voice Commands With Brillibot API
Views 702 · 1 year ago
Having computers able to understand what we want from them through verbal commands opens up many possible applications to make our lives easier. Unfortunately, it's often hard to properly leverage the latest methods to accomplish that task. That's where Brillibot comes in! Brillibot is a simple-to-use voice command API that takes only a few minutes to set up. #AI #machinelearning #speechtotext ...
Rebranding Channel To Brillibits
Views 360 · 1 year ago
Today I am announcing the rebranding of the channel from just "Blake" to Brillibits. Through this change, I hope it becomes easier to find the channel in searches. website: brillibits.com Discord: discord.gg/F7pjXfVJwZ Donations(if you want): Ethereum: 4CE913643909Fa3168297cC2857C0aDdAB389Ad8 Monero: 45mqN96o5JZZhwVTZHHiQJeyhp3WndiPC44hdrAmWeDGeCmaC1c45gTGh5eDUtEhx3JDGbsAnsD3VBXKdiorUhusUydLG22...
OpenAI Whisper: Speech To Text With Microphone Demo(Update In Description)
Views 30K · 1 year ago
OpenAI has released an amazing speech-to-text model called Whisper. It is by far the best model released for this task. In this video, I go over the background of the model and how to run it with a mic. #ai #speechtotext #machinelearning #openai #whisper #python Updated Video: th-cam.com/video/S58MGCU7Wgg/w-d-xo.html OpenAI blog post: openai.com/blog/whispe...
Stable Diffusion Tutorial: GUI, Better Results, Easy Setup, text2image and image2image
Views 25K · 1 year ago
Stable Diffusion was recently released, revolutionizing the open source AI community. This video goes over how to set it up using Docker. If you already have Docker, there are only a few steps, and it will always work thanks to the nature of Docker. #AI #dalle2 #StableDiffusion #machinelearning #python #aiart Github: github.com/mallorbc/stable-diffusion-klms-gui model download: huggingface.co/CompVis Disc...
Installing Nvidia-Docker On Windows 10/11
Views 22K · 1 year ago
In this video, I go over how to install WSL2 and Nvidia-Docker on the latest versions of Windows 10 and 11. I will reference this video many times in the future, as there are many cases where the best course of action for a model is to use docker. #docker #nvidia #machinelearning #ai Discord: discord.gg/F7pjXfVJwZ Donations(if you want): Ethereum: 4CE913643909Fa3168297cC2857C0aDdAB389Ad8 Monero...
CogVideo Tutorial: DALL-E For Text To Video Generation
Views 2.3K · 2 years ago
DALL-E is well known for its image generation given a sentence. Well, the next logical step is video generation given a sentence, and that is what CogVideo aims to do. This is an exciting early step in this area of research. #ai #aiart #machinelearning #dalle2 #stablediffusion Github: github.com/THUDM/CogVideo Discord: discord.gg/F7pjXfVJwZ Donations(if you want): Ethereum: 4CE913643909Fa316829...
Running GPT-NeoX-20B With Hugging Face
Views 4.9K · 2 years ago
GPT-NeoX-20B has been added to Hugging Face! But how does one run this super-large model when you need 40GB of VRAM? This video goes over the code used to load and split these large language models over multiple devices in order to run them. Github: github.com/mallorbc/GPTNeoX20B_HuggingFace Discord: discord.gg/F7pjXfVJwZ Donations(if you want): Ethereum: 4CE913643909Fa3168297cC2857C0aDdAB389Ad...
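The core idea of splitting a model over devices can be sketched as a layer-to-GPU assignment. This naive even split is a simplification; Hugging Face's `device_map="auto"` does something smarter, weighting layers by their actual memory footprint:

```python
def naive_device_map(num_layers, num_gpus):
    """Evenly assign transformer blocks to GPUs so each card holds a
    contiguous slice of the model (a toy version of device_map='auto')."""
    per_gpu = -(-num_layers // num_gpus)  # ceiling division
    return {layer: layer // per_gpu for layer in range(num_layers)}

# GPT-NeoX-20B has 44 transformer layers; split them over two GPUs:
mapping = naive_device_map(44, 2)
print(mapping[0], mapping[21], mapping[22], mapping[43])  # 0 0 1 1
```

During a forward pass, activations get shipped from one GPU to the next at the slice boundary, so inference works even though no single card holds the whole model.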
Resume Chatbot
Views 699 · 2 years ago
How To Run DALL-E Mini/Mega On Your Own PC(Windows)
Views 12K · 2 years ago
How To Run DALL-E Mini/Mega On Your Own PC
Views 48K · 2 years ago
How To Run GPT-NeoX-20B(GPT3)
Views 18K · 2 years ago
AI Drone Footage With YOLOv5
Views 1K · 2 years ago
GPT-NeoX-20B Model Announcement And Quick Comparison With GPTJ
Views 2.3K · 2 years ago
Splitting GPT-J(And Other NLP Models) Over Multiple GPUs
Views 2.6K · 2 years ago
HuggingFace GPT-J: Usage and Fine-tuning(Update in description)
Views 20K · 2 years ago
Solana's BIG Problem: Why Solana Went Offline
Views 583 · 2 years ago
GPT-J(GPT 3) Few Shot Learning: Teaching The Model With Few Examples
Views 3.8K · 2 years ago
GPT 3(GPT-J-6B) Fake News Generator
Views 2.3K · 2 years ago
Fine-tuning GPT-J-6B(GPT 3)(Update in description)
Views 15K · 2 years ago
GPT-J-6B(GPT 3): How to Download And Use(Update in description)
Views 29K · 3 years ago
Creating A Custom Dataset For GPT Neo And GPT-J-6B(GPT3)
Views 18K · 3 years ago
How Does Bitcoin Mining Work? Explained With Python Code
Views 640 · 3 years ago
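The proof-of-work idea from that video can be sketched in a few lines of Python. This toy version looks for leading hex zeros rather than comparing against Bitcoin's real 256-bit difficulty target, but the mechanic is the same:

```python
import hashlib

def mine(block_data, difficulty_zeros=4):
    """Toy proof-of-work: find a nonce whose double-SHA-256 of the
    block data starts with `difficulty_zeros` hex zeros."""
    prefix = "0" * difficulty_zeros
    nonce = 0
    while True:
        payload = f"{block_data}{nonce}".encode()
        # Bitcoin hashes the block header twice with SHA-256.
        digest = hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("block with some transactions", difficulty_zeros=4)
print(f"nonce={nonce} hash={digest}")
```

Each extra required zero multiplies the expected work by 16, while verifying a found nonce takes just one hash, which is the asymmetry mining relies on.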

Comments

  • @tihunvolkov9288 · 9 days ago

    Damn, everything looks so simple for you; nothing of mine works like that...

    • @tihunvolkov9288 · 9 days ago

      Inconsistency detected by ld.so: ../sysdeps/x86_64/dl-machine.h: 534: elf_machine_rela_relative: Assertion `ELFW(R_TYPE) (reloc->r_info) == R_X86_64_RELATIVE' failed!

  • @shravanhegde2237 · 10 days ago

    heyy..i am working on something similar using whisperx but i am alsomaking a sample html file for start and stop recording and to print the transcription into the text area..but i have run into a problem..so basically this html sends around 10-15 seconds chunk to my python code as a .webm file which is internally converted to wave file which is then used by whisperx to transcribe but whisperx isnt transcribing..like in a 16 second audio it just print "like it". Is there anyway i could make it do it accurately and is there anyway i could use the loop snippet in my script tag from your github, thank youu

  • @lloydbailey1613 · 25 days ago

    I'm a teacher of philosophy, psychology, and political science at a highschool. I want to introduce this perspective to my students for discussion, but I'm so illiterate I don't know where to even begin. What do I do from the second I sit down at my computer?

  • @PlanetaryPoetsMultimediaGroup · 2 months ago

    thankyou mate 👍

  • @mattaylor5817 · 3 months ago

    i have been mucking about for days trying to do one simple thing with so many different approaches and it just worked out with one simple command!! whisper_mic --model small --loop --dictate thanks , good work

  • @enovsky2392 · 3 months ago

    i can't download the Models

  • @ivorvrhovec6984 · 3 months ago

    DOkerAnd

  • @inbox0-AI · 4 months ago

    I can't get this to run with your exact commands and weird file formats. It keeps throwing an error, "response template not set" Which is odd because there is no variable for response template. I tuned off complition_complete it ran the fine tune. Edit: I'm an idiot on the saving checkpoints but the completion complete part still wasn't working for me.

  • @rdm7770 · 4 months ago

    Going to play with this today! Thanks so much for posting this and your great technical work \m/

  • @makadi86 · 5 months ago

    does not work probably with python 3.12 the code some time works fine and in other time it hang with no result even after the timeout is passed for listening

  • @userdetails1 · 5 months ago

    "I will assume you already have a docker installed" a what?

  • @phenixis · 5 months ago

    Hi, this video was made years ago so I'm not expecting an answer ... but I get this error at 5:58: exec /opt/nvidia/nvidia_entrypoint.sh: exec format error. Can you help me? ^^

  • @hongodarongo · 5 months ago

    I recommend watching the whole video first(stupid me didn't). I don't know if this order makes any logical sense but. - run the command to check version first to see if WSL 2 is already installed - try the bash nvidia smi command and see if you get the result like in the video - if not then you might need to install wsl

  • @lewing-alt1551 · 5 months ago

    but the link isn't up in the corner right now :'(

  • @scottm2998 · 5 months ago

    This simply doesn't work anymore. At least not on WSL. A repo link in the DockerFile is broken. When you go to run the image, there are so many failures.

  • @OnimesShow · 5 months ago

    Hi, thanks for the video. Are there any differences in requirements if I only need to interrogate the model? For example through a chat? I do not need to fine tune it

  • @dipanshuhaldar4239 · 6 months ago

    Do you have a tutorial for fine tuning GPT Neo on custom data set ?

  • @1lyf · 6 months ago

    Hi Blake, Thank you very much for the video. Could you please upload a tutorial on text-generation-inference? and in your previous LLM finetuning you were using Deepspeed and finetuning the whole model could you please advise if the same can be done on Mixtral 8x7B?

  • @robertfontaine3650 · 6 months ago

    2-3090's and nvlink seem like the lowest entry to llama 2 70b. 2 used 3090's are about the price of a single 4090. Still too expensive for my wallet but at least something I can dream about.

  • @romanbolgar · 6 months ago

    Thanks, but this is all complicated. When will there finally be a one-click install?

  • @GaneshKrishnan · 7 months ago

    can you also please add the commands in the description of your video so its easier to copy paste?

  • @chris_zhp · 7 months ago

    Hi, I was trying to replicate the LLaMA 2 70B fine-tuning with 2 4090s, even with the --split_model flag, the model is loaded to one GPU only before OOM. I tried with the 7B model, which is loaded to one GPU then replicate to the other. It seems it's running in data parallel not model parallel. Is nvlink required for it to work correctly?

    • @Brillibits · 7 months ago

      You have to run it in Qlora mode as well

    • @chris_zhp · 7 months ago

      I ran it in Qlora mode but its only loading to GPU 0. It kept loading till around shard 10/15 and OOM for GPU 0 while GPU1 was idle. I checked it and found the device map from Accelerator only include GPU 0, where I tried to configure accelerate with "accelerate config" setting 2 GPUs but still not working.@@Brillibits

    • @Brillibits · 7 months ago

      Are you running it in docker? Can your system see both GPUs?@@chris_zhp Edit: I took a look at the code and believe they may be a bug. I will push a fix now. Let me know if it works. Go to Github and open an issue if its not

    • @chris_zhp · 7 months ago

      @@Brillibits Ty!!!! It works properly now. Thanks for sharing this amazing video <3

    • @cul8terworld · 7 months ago

      @@chris_zhp thanks for pointing out the issue and thanks for watching!

  • @_TYKER_ · 7 months ago

    Great work! Some questions from my side: How do I turn of the loop, when I am calling whisper_mic from another program? And how can I speed up the mic selection?

  • @Artholos · 7 months ago

    This is great, but when I run the listen_loop, it's nowhere near as fast as it is in the video. What needs to be changed to get that fresh quick real-time transcription?

  • @user-qm4to2ie1x · 8 months ago

    Could someone also use this code and not only transcribe the audio but also diarize it? For example with 2 users speaking the one after the other.

  • @SomethingRandomChannel · 8 months ago

    Mate, this is absolutely fantastic. Thank you so much for your efforts on this repo. I was struggling to find a better way to do this coming off the normal speech_recognition library technique everyone uses but didn't want to send all my audio to google. Which would sometimes just not respond at all. I'm so glad I found your project, great stuff man love your work!

  • @zeuspasquale42 · 8 months ago

    Hello, the content you make is impressive, I always wanted to try these text generation models and I have a question, I am about to buy a PC and I was thinking of buying an 8GB RX 580 (unfortunately I only have the budget for that in my country) and I was thinking if I could buy two of those I could do fine-tuning to the 1.3b version plus 32 GB of RAM, anyway thank you very much for your content it is very informative

    • @Brillibits · 8 months ago

      Thanks for watching. You are gonna wanna follow the new video on Llama 70B but for these smaller models. You are also gonna want a nvidia GPU, even if its something like a used GTX 1080. I reccomend rtx 2060 12GB if possible.

  • @RonnyTeles · 9 months ago

    Hello! Thanks for the video. I hear you tell on the video but I need a confirmation, because I have to buy and here it ts vert expensive: Onde single card RTX 3090 can handle the model GPT-J-6B?

    • @Brillibits · 9 months ago

      Yes. A 3090 can handle GPT 6B. It can run models that are even twice as large.

  • @yongtao9433 · 9 months ago

    cool! how about 8 v100 16g. 128g vram in total.

    • @Brillibits · 9 months ago

      You can split the mode over multiple GPUs like I did. I suspect that this would work.

  • @TV-ch4ql · 9 months ago

    What is the training time of this model in 3090?

    • @Brillibits · 9 months ago

      Depends on MANY things. Dataset size, content, etc. I did this model in like a day.

    • @TV-ch4ql · 9 months ago

      @@Brillibits I used the Dolly15k data set just like your video. I have one RTX A6000(48G) on my server. The estimated time so far is 694 days. After 7 days, the learning progress is 0%. Are there any expected issues? Please advise. thank you. 0%| | 817/737000 [20:07:57<16242:16:41, 79.43s/it] The execution command is as follows. python trl_finetune.py --block_size 1024 --eval_steps 368 --save_steps 368 -tf instruct_train_42.csv -vf instruct_validation_42.csv -m meta-llama/Llama-2-70b-hf --split_model -b 1 --log_steps 368 -lr 2e-4 -e 1000 --gradient_accumulation_steps 16 --pad_token_id=18610 --use_int4

  • @kyledinh8369 · 9 months ago

    very well explained!

    • @Brillibits · 9 months ago

      Glad it was helpful!

  • @nz69 · 9 months ago

    faster-whisper implement?

    • @Brillibits · 9 months ago

      Definitely a good idea. Not currently supported. May add in the future. Definitely open to PRs

    • @sirlotas · 8 months ago

      That would make this just brilliant mate!

  • @yakupakpnar4017 · 9 months ago

    Can somebody please help? When i try to generate a image it says at the end: ImportError: DLL load failed: The specified module could not be found.

  • @itzslyr643 · 9 months ago

    I see that after finetuning the model i get a .json and .bin adapter file, how would I run my model using these? Can I use them with llama cpp or how do I go about using the finetuning. I guess my goal is a chat that uses my finetuned model.

  • @threepe0 · 10 months ago

    It'd be great to allow for a udp stream as an alternative to a device. Wondering if it'd be feasible to replace the timeout with a sort of voice activity detection timeout. These two additions would make this a fantastic component for a home automation system.

  • @iforels · 10 months ago

    Thanks for an interesting concept! Did you try to improve math reasoning for 70B llama2? I have a small cluster with 8 GPUs NVIDIA A100 40G and try to find a dataset to improve the base model

  • @Brillibits · 10 months ago

    Discord: discord.gg/F7pjXfVJwZ

  • @tiagocosta2689 · 10 months ago

    amazing video!

  • @tiagocosta2689 · 10 months ago

    great video!

  • @gr8frendforu · 10 months ago

    Will this work on 2 tesla v100 64GB?

    • @Brillibits · 10 months ago

      It should. Once the GPUs get old enough, int4 does not work as well at reducing memory requirements. My m40 for example does not reduce the model memory usage as much. That is the only way it wouldnt work.

  • @unveil7762 · 10 months ago

    Nice i ll try that in Touchdesigner i have a chat gpt reader and i was looking for something to send my voice to chat chpt. This will work amazing

  • @mild3510 · 10 months ago

    Can i run it by 4090 ??

    • @Brillibits · 10 months ago

      You need more VRAM than a single 4090 can provide. If you have two of them, yes it can work.

    • @ahmetttt10 · 9 months ago

      @@Brillibits Fine-tuning with 2 GPUs took how long? How long do you think it would take to fine-tune a text dataset with 100k rows?

    • @plainpixels · 9 months ago

      You can run it on a 4090 if you covert it to exllama2 format and load it with exllama2.

  • @woongda · 10 months ago

    you need 2 x 3090 or 48GB VRAM to finetune 70B model ? So for 13B model, I should be able to do the same with 1 3090 card? I hope to have more hardware requirement details so that I can determine with this procedure is useful to me. Thanks!

    • @Brillibits · 10 months ago

      13B is definitely possible with a single 3090

    • @Aristocle · 10 months ago

      @@Brillibits then, do you need a vram with about x2 ##B parameters?

  • @leesanghun4950 · 10 months ago

    Hello, I just found this repo today and I want to use it translator for korean... how to set up this language to others?

  • @nair889 · 11 months ago

    what's that terminal? just i cant use theese commands

    • @Brillibits · 10 months ago

      Highly recommend you look at the update video. THis one is very old.

  • @boasorte6808 · 11 months ago

    I'd like to do some testing and research with this, but my machine doesn't have a GPU, I have approx 30 RAM and a multi-core CPU. I don't want to pay a virtual machine in the cloud for research, it will be wasted money, I prefer to understand how it works on my local machine and later when I understand how to do it, pay to train it. What are you suggesting? I think I've read that you can train without a GPU, but I don't know where to start.

    • @Brillibits · 10 months ago

      You can't train models this large without a GPU

  • @aeharrison1able · 11 months ago

    nice. how would i stop the loop in my assistant?

    • @Brillibits · 10 months ago

      ctrl c

  • @adamryason5509 · 11 months ago

    This video is just installing WSL and Docker. To use NVIDIA gpus you have to take additional steps including the NVIDIA Container Toolkit

    • @wireghost897 · 5 months ago

      LMAO, exactly.

    • @dicksonchng8189 · 5 months ago

      This is why i miss the youtube dislike button

    • @RoyAAD · 1 month ago

      Is there a vid out there for the toolkit?