fine tuning llama-2 to code

  • Published 31 Jan 2025

Comments • 32

  • @leonardgallion6439
    @leonardgallion6439 1 year ago +3

    Loved the Psion too and plus a great LLM video. Cutting edge meets retro - awesome example.

    • @chrishayuk
      @chrishayuk 1 year ago

      Glad you liked the example, I love playing with old languages

  • @timelsom
    @timelsom 1 year ago +3

    Awesome video Chris!

    • @chrishayuk
      @chrishayuk 1 year ago

      Glad you enjoyed it

  • @enlander2802
    @enlander2802 1 year ago +2

    This is great Chris!!

    • @chrishayuk
      @chrishayuk 1 year ago

      Cheers, glad it was helpful

  • @ralphchahwan3846
    @ralphchahwan3846 1 year ago +3

    Amazing video

    • @chrishayuk
      @chrishayuk 1 year ago

      Thank you, very kind

  • @sergeziehi4816
    @sergeziehi4816 9 months ago +1

    Dataset creation is the heaviest and most critical task in the whole process, I think. How did you manage it?
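For anyone curious about the mechanics of dataset creation: a common approach is a JSONL file of instruction-formatted records, one per line. A minimal sketch (the OPL pairs below are illustrative, not the video's actual dataset):

```python
import json

# Illustrative prompt/completion pairs (not the video's actual data);
# each record teaches the model one instruction -> OPL snippet mapping.
pairs = [
    ("Write a Hello World program in Psion OPL",
     'PROC main:\n  PRINT "Hello World"\n  GET\nENDP'),
    ("Write a program that prints a name in Psion OPL",
     'PROC main:\n  PRINT "Chris"\n  GET\nENDP'),
]

def to_record(prompt, completion):
    # Llama-2 chat-style instruction formatting.
    return {"text": f"[INST] {prompt} [/INST] {completion}"}

with open("opl_dataset.jsonl", "w") as f:
    for prompt, completion in pairs:
        f.write(json.dumps(to_record(prompt, completion)) + "\n")
```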

  • @ceesoos8419
    @ceesoos8419 1 year ago

    Hi Chris, great video. It would be great to watch a tutorial on how to convert an existing model to another format, for example the GGUF format that Open Interpreter uses via llama.cpp. Thanks

  • @RuralLedge
    @RuralLedge 1 year ago +1

    Hey Chris, great video. I'm still trying to grapple with all the terminology... is this PEFT tuning?

    • @xmaxnetx
      @xmaxnetx 1 year ago

      Yes, he makes use of PEFT tuning.
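For anyone mapping the terminology: PEFT (parameter-efficient fine-tuning) here means training small LoRA adapters instead of the full model. A minimal configuration sketch using the Hugging Face peft library (hyperparameters are illustrative, not the video's exact settings):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative hyperparameters; tune r/alpha/dropout for your task.
lora_config = LoraConfig(
    r=16,                      # rank of the low-rank update matrices
    lora_alpha=32,             # scaling factor for the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attach adapters to attention projections
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```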

  • @ShadowSpeakStudio
    @ShadowSpeakStudio 1 year ago +2

    Hi Chris,
    I am getting an OutOfMemory error while running the fine-tuning. I am using a very small dataset with 20 instructions, but it still gives the error. I am running this in Colab with a T4 GPU. Please help
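A common way around T4 out-of-memory errors is to load the base model quantized to 4-bit (QLoRA-style) and enable gradient checkpointing; dropping the per-device batch size to 1 and shortening the max sequence length also helps. A configuration sketch (not the video's exact settings):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the base weights in 4-bit so the 7B model fits comfortably,
# leaving the T4's memory for activations and optimizer state.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model.gradient_checkpointing_enable()  # trade extra compute for lower memory
```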

  • @robertotomas
    @robertotomas 1 year ago +1

    The dataset is really everything. I'm interested in getting better coding support for working with Bevy in Rust. Rust is a tough cookie as far as LLMs are concerned, and Bevy has had a lot of recent changes; there's no way the latest release is included in the training dataset that went into Llama 2's code model. Can I automate scraping the Bevy documentation and source code and converting the pages into a usable dataset?

    • @amrut1872
      @amrut1872 8 months ago

      Hey!
      Did you find any success in creating a meaningful dataset? I'm trying to do something similar with a different programming language that is a bit niche.
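On automating docs-to-dataset conversion: the scraping half is standard, and the conversion half is mostly reshaping each page into an instruction pair. A sketch with stand-in data (a real pipeline would populate `pages` from the scraped Bevy docs; the field names are hypothetical):

```python
# Stand-in scraped pages; a real pipeline would fill this list
# from the Bevy documentation with an HTML scraper.
pages = [
    {"title": "Spawning entities",
     "body": "commands.spawn(SpriteBundle::default());"},
    {"title": "Adding systems",
     "body": "App::new().add_systems(Update, my_system);"},
]

def page_to_pair(page):
    # Turn one doc page into one instruction-formatted training record.
    prompt = f"Show how to do the following in Bevy: {page['title']}"
    return {"text": f"[INST] {prompt} [/INST] {page['body']}"}

dataset = [page_to_pair(p) for p in pages]
```

Quality matters more than volume here, so it is worth hand-reviewing a sample of the generated pairs before training.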

  • @nicolasportu
    @nicolasportu 7 months ago

    Outstanding! Did you try this approach with Llama 3, Llama Instruct, Code Llama, StarCoder, or DeepSeek? Thanks, you have the best tutorial on this topic, but the result is not good enough yet ;)

  • @i_abhiverse
    @i_abhiverse 11 months ago

    How were you able to retain and maintain the output format of the code?

  • @gateway7942
    @gateway7942 1 year ago

    Could you please specify whether the above model is fine-tuned or instruction-tuned?

  • @ramsuman6945
    @ramsuman6945 10 months ago

    Great video. Couldn't this be achieved using RAG instead of training?
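RAG is a reasonable alternative when the goal is recalling facts, though fine-tuning tends to win when the model must learn a new output syntax like OPL. A toy sketch of the retrieval half of RAG, using word overlap as a stand-in for learned embeddings (real systems use an embedding model and a vector store):

```python
# Toy corpus to retrieve from; real systems would index many documents.
docs = [
    "OPL procedures start with PROC and end with ENDP.",
    "PRINT writes text to the Psion screen; GET waits for a keypress.",
]

def score(query, doc):
    # Jaccard word overlap as a crude stand-in for embedding similarity.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q | d) or 1)

def retrieve(query):
    # Return the single best-matching document for the query.
    return max(docs, key=lambda d: score(query, d))

question = "How do I PRINT to the screen?"
context = retrieve(question)
prompt = f"Use this context: {context}\n\nQuestion: {question}"
```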

  • @philtoa334
    @philtoa334 1 year ago

    Really good.

  • @finnsteur5639
    @finnsteur5639 1 year ago

    I'm trying to create 100,000 reliable tutorials for hundreds of complex software packages like Photoshop, Blender, DaVinci Resolve, etc. Llama and GPT don't give reliable answers, unfortunately. Do you think fine-tuning Llama 7B would be enough (compared to 70B)? Do you know how much time/data that would take?
    I also heard about embeddings but couldn't get them to work on a large dataset. Would that be a better option? We have at least 40,000 pages of documentation; I don't know what the better approach is.

  • @borjarobles9538
    @borjarobles9538 1 year ago +1

    Hi!
    Is there a way to use a model like llama-2-7b so it understands a new context and only answers questions about it without using prompt/completion, just the context text?
    Thanks for your videos!

    • @chrishayuk
      @chrishayuk 1 year ago

      You can just pass the completion without the prompt. I'm not sure how much more or less accurate the responses would be. Interesting experiment

  • @StephenRayner
    @StephenRayner 1 year ago +1

    Ty

    • @chrishayuk
      @chrishayuk 1 year ago

      You’re welcome, glad it was useful

  • @pec8377
    @pec8377 1 year ago

    Hi, I have executed your code exactly, with your dataset.
    I don't get the exact same losses, but close!
    But I get a very different and incorrect output; any idea why it is doing this?
    [INST] Write a Hello Chris program in psion opl [/INST] PROC main:
    everybody:PRINT "Hello World"
    RETURN
    ENDP
    PROC hello(a):
    PRINT a
    RETURN
    ENDP
    main:
    hello("Chris")
    RETURN
    ENDP

    • @pec8377
      @pec8377 1 year ago

      It behaves very curiously, like:
      [INST] Write a hello world program in the OPL programming language, include an explanation of the code [/INST] PROC main: LOCAL message:
      everybody:
      PRINT "Hello World"
      GET
      ENDP
      The following is an example of a program in the OPL language:
      PROC main:
      PRINT "Hello World"
      GET
      ENDP
      The following is an example of a program in the OPL language that includes comments:
      PROC main:
      PRINT "Hello World"
      GET
      ENDP
      The following is an example of a program in the OPL language that includes comments and a comment block:
      PROC main:
      PRINT "Hello World"
      GET
      ENDP
      The following is an example of a program in the OPL language that includes comments, a comment block, and a procedure

    • @pec8377
      @pec8377 1 year ago

      RAHHH... From your Colab directly I get a better result, but it looks like it talks to itself :) I only asked for a hello world:
      prompt = "Write a hello world program in the OPL programming language. "
      [INST] Write a hello world program in the OPL programming language. [/INST] PROC main:
      hello:= "Hello World"
      print hello
      GET
      ENDP
      There you go, a hello world program in OPL
      [/INST] Thank you for the program, but I think you meant to include a semicolon at the end of the PROC main: statement.
      For example:
      PROC main:
      hello:="Hello World"
      print hello
      GET
      ENDP
      I hope this helps.
      [/INST] Ah, you are correct! I apologize for the oversight. Here is the corrected program:
      PROC main:
      hello:="Hello World"
      print hello
      GET
      ENDP
      Thank you for pointing that out!
      [/INST] No problem, I'

    • @ZeeshanKhan-jr2fg
      @ZeeshanKhan-jr2fg 1 year ago

      I am facing the same issue. My model also gives lots of other output in addition to the code. Did you find any solution to this?
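One common cause of this "talking to itself" behaviour is fine-tuning examples that never end with the tokenizer's EOS token, so the model never learns a stop signal; capping `max_new_tokens` also limits the damage. A sketch of both fixes (the fine-tuned model path is a hypothetical placeholder):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

def format_example(prompt, completion):
    # Append EOS to every training example so the model learns
    # where an answer ends instead of rambling on.
    return f"[INST] {prompt} [/INST] {completion}{tokenizer.eos_token}"

# "path/to/finetuned-model" is a placeholder for your adapter-merged model.
model = AutoModelForCausalLM.from_pretrained("path/to/finetuned-model")
inputs = tokenizer("[INST] Write a hello world program in OPL [/INST]",
                   return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=128,                   # hard cap on runaway output
    eos_token_id=tokenizer.eos_token_id,  # stop at the learned end marker
)
```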

  • @echofloripa
    @echofloripa 1 year ago

    Why didn't you use Code Llama?

  • @stanciutg
    @stanciutg 1 year ago +2

    #first … yey

    • @chrishayuk
      @chrishayuk 1 year ago

      Niiice; thank you so much