How To Run GPT-NeoX-20B (GPT-3)

  • Published Sep 7, 2024

Comments • 55

  • @aamirmirza2806
    @aamirmirza2806 2 years ago +1

    Awesome, I managed to get it working. Maybe next time please consider a video on fine-tuning, few-shot learning, or training on new text.

    • @Brillibits
      @Brillibits 2 years ago +5

      Fine-tuning will require the cloud. I may make a video on those topics in the future.

  • @logosking2848
    @logosking2848 1 year ago

    This is a really informative video. I'm buying a couple of 3090s soon to try this.

    • @Brillibits
      @Brillibits 1 year ago

      Definitely use the Huggingface version. I have a video on that.

  • @alx8439
    @alx8439 1 year ago +2

    This is probably where AMD's integrated graphics should shine, especially the most recent RDNA 2 and 3, because you can allocate your RAM to be used as VRAM.

    • @drkmnml9850
      @drkmnml9850 1 year ago

      Is it possible to run it on integrated graphics?

    • @alx8439
      @alx8439 1 year ago +1

      @@drkmnml9850 With ROCm, definitely.

    • @Brillibits
      @Brillibits 1 year ago

      That would be interesting. It would definitely be slower, but it might make using larger models easier.

  • @OnimesShow
    @OnimesShow 7 months ago

    Hi, thanks for the video. Are there any differences in requirements if I only need to query the model, for example through a chat? I do not need to fine-tune it.

  • @francomaro7228
    @francomaro7228 1 year ago

    I have realized that, at least on Windows, you can assign extra paging memory from the SSD, and this memory is also shared with the video card, so it is not really necessary to have two 3090s; that would be overkill. Windows 11 comes with 16 GB extra by default that it takes from the hard drive.

    • @Brillibits
      @Brillibits 1 year ago

      Thanks for watching and thanks for sharing!

    • @drkmnml9850
      @drkmnml9850 1 year ago +1

      How have you configured that?

  • @romanbolgar
    @romanbolgar 8 months ago

    Thanks, but this is all complicated. When will a one-click install appear?

  • @nathanmoyer7297
    @nathanmoyer7297 1 year ago

    Can NVIDIA ReBAR work for the VRAM? I've got a 64 GB RAM system and I'm looking into getting a 3050 or something with ReBAR to get this working.

    • @Brillibits
      @Brillibits 1 year ago

      Not sure. Either way, I recommend using the HuggingFace implementation now.

  • @abubakarsaeed358
    @abubakarsaeed358 2 years ago

    When you were copying the git link and cloning the repo, where did you paste it? Is that the command line or Anaconda?

  • @yasharthsingh805
    @yasharthsingh805 1 year ago +1

    Does this model train on GPU (VRAM) or CPU RAM while fine-tuning?

    • @Brillibits
      @Brillibits 1 year ago +2

      This is not a video on fine-tuning (though a video on that will likely be coming soon). When fine-tuning, we use both.

  • @neuron8186
    @neuron8186 2 years ago

    Can you tell me how to connect GPUs in order to train?

  • @Elintasokas
    @Elintasokas 1 year ago

    Looking at this today, with the likes of GPT-4 existing, 20B is in no way massive anymore, just a year later. Arguably it wasn't massive even a year ago.

    • @Brillibits
      @Brillibits 1 year ago

      Kinda standard size now lol

  • @philippecourtemanche3278
    @philippecourtemanche3278 2 years ago

    Currently have a shopping cart with two 3090s in it and I'm just staring at the page. I would very much like to run an API service using something like this.

    • @Brillibits
      @Brillibits 2 years ago +1

      If you have a use case, it's a fairly reasonable investment. The 4000 series may be coming out soon though.

    • @philippecourtemanche3278
      @philippecourtemanche3278 2 years ago

      @@Brillibits Yeah, I wonder what the premium will be for the 4000 series.

    • @timothynewton7500
      @timothynewton7500 2 years ago +1

      Just get some refurbished NVIDIA Tesla K80s. Very affordable.

    • @danielw7290
      @danielw7290 1 year ago

      @@timothynewton7500 I already have a 3090, will the NVIDIA K80 work with it?

    • @francomaro7228
      @francomaro7228 1 year ago

      Allocate disk cache with Windows 11.

  • @user-tf8vh8uw9f
    @user-tf8vh8uw9f 2 years ago +2

    Hi Blake, thanks for the video. I tested the model using the host setup (initially I had some problems too, but creating a conda environment with Python 3.8 resolved these). There is a weird thing, however.
    Let's say I set top_p to 1, temperature to 0.7, and give it a prompt like "Hello, my name is". No matter how many times I run generate on it, it keeps putting out the exact same text, in this case "Hello, my name is Mabel, I'm a new member of the community...". If I change the temperature, the output changes too, but only once. After that, it keeps repeating again. I tried this in "interactive" mode as well, but the results are equally predictable after a restart. I'm no expert, but it seems like it's doing some sort of greedy search instead of actual random sampling, making the output deterministic. Any suggestion would be much appreciated!

    • @Brillibits
      @Brillibits 2 years ago

      That is odd. What did you set top_k to? I will look into this later myself.

    • @Enju-Aihara
      @Enju-Aihara 2 years ago +1

      don't use it like a chatbot
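A likely cause of the behavior described in this thread is that generation is running greedy decoding (always picking the highest-probability next token), in which case temperature and top_p have no effect and the output is fully deterministic. A minimal pure-Python sketch of the difference, using hypothetical toy logits rather than the actual GPT-NeoX code:

```python
import math
import random

def sample_next(logits, temperature=1.0, greedy=False):
    """Pick a token index from raw next-token logits.

    greedy=True always returns the argmax, which makes output fully
    deterministic -- the behavior described in the comment above.
    With greedy=False, the index is drawn at random, weighted by
    softmax(logits / temperature), so repeated runs can differ.
    """
    if greedy:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max before exp for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return random.choices(range(len(logits)), weights=weights)[0]

logits = [2.0, 1.0, 0.5]                     # toy next-token scores
print(sample_next(logits, greedy=True))      # always index 0 (the argmax)
print(sample_next(logits, temperature=0.7))  # varies from run to run
```

With the HuggingFace port of the model, the analogous check is to make sure sampling is actually enabled, e.g. passing `do_sample=True` to `generate()`; greedy decoding is the default there, and temperature/top_p are ignored without it.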

  • @drkmnml9850
    @drkmnml9850 1 year ago

    Which WSL do you use, with which terminal and which Linux?

    • @Brillibits
      @Brillibits 1 year ago

      I am using Ubuntu 20.04. I am not using WSL most of the time, including here.

  • @mikemansour4634
    @mikemansour4634 1 year ago

    Can this work on an M1 Max Apple silicon chip?

    • @Brillibits
      @Brillibits 1 year ago

      There is limited support for PyTorch on Apple silicon. If you have the memory, try running it using the HuggingFace model. I have a separate video going over that.

    • @ETHIOTECHJ
      @ETHIOTECHJ 1 year ago

      😂

  • @arthurmelo2490
    @arthurmelo2490 1 year ago

    Hello sir, is it possible to run it on CPU?

  • @Seedley
    @Seedley 1 year ago

    Could it work on two 3080s?

    • @Brillibits
      @Brillibits 1 year ago

      If you use HuggingFace int8 and have the 12 GB models, perhaps.

  • @alexisvillegas1953
    @alexisvillegas1953 1 year ago

    What is your console?

    • @Brillibits
      @Brillibits 1 year ago

      PowerShell in this video.

    • @alexisvillegas1953
      @alexisvillegas1953 1 year ago

      @@Brillibits W11? In W10 I don't see PowerShell like that.

    • @Brillibits
      @Brillibits 1 year ago

      @@alexisvillegas1953 It's PowerShell, but through the newish Windows Terminal. I believe you can install it through the MS Store.

    • @alexisvillegas1953
      @alexisvillegas1953 1 year ago

      @@Brillibits I found it, thanks.

  • @Seedley
    @Seedley 1 year ago

    Could you run it on Colab?

    • @Brillibits
      @Brillibits 1 year ago

      It may be possible to do this now. It will need a GPU with at least 24 GB of VRAM though, and to use int8 through HuggingFace.
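The 24 GB figure follows from simple weight-size arithmetic. A back-of-envelope sketch, assuming a nominal 20 billion parameters (the real checkpoint is slightly larger, and activations plus the KV cache add overhead on top of the weights):

```python
PARAMS = 20_000_000_000  # nominal parameter count of GPT-NeoX-20B

def weight_gib(params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return params * bytes_per_param / 1024**3

fp16 = weight_gib(PARAMS, 2)  # 16-bit floats: ~37 GiB, too big for one 24 GB card
int8 = weight_gib(PARAMS, 1)  # 8-bit quantized: ~19 GiB, fits a single 24 GB GPU
print(f"fp16 weights: {fp16:.1f} GiB, int8 weights: {int8:.1f} GiB")
```

This is why the replies in this thread point to the int8 route: halving the bytes per parameter brings the weights alone under a single 24 GB card's VRAM, whereas fp16 needs two cards.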

  • @goncharovblog
    @goncharovblog 2 years ago

    Is this better than OpenAI GPT-3?

    • @Brillibits
      @Brillibits 2 years ago

      GPT-3 comes in many sizes. It's not better than the largest one, but it's close to the performance of comparable sizes.

    • @goncharovblog
      @goncharovblog 2 years ago

      @@Brillibits What sizes does OpenAI GPT-3 come in?

    • @Brillibits
      @Brillibits 2 years ago

      @@goncharovblog The largest one is 175 billion parameters.

    • @albertstarfield
      @albertstarfield 2 years ago

      Some sources say, based on benchmarks, that GPT-NeoX is really close to OpenAI's Davinci model, but OpenAI's GPT-3 Davinci is still slightly better.

    • @Elintasokas
      @Elintasokas 1 year ago

      @@albertstarfield GPT-NeoX is complete garbage and a light year away from even vanilla GPT-3 Davinci. You will easily see this if you use both back to back.

  • @danielcaoili6890
    @danielcaoili6890 2 years ago

    You got the answer finally:
    'The "meaning of life" is to make God happy and to be a good person.'