Mixtral 8x7B: Overview and Fine-Tuning
- Published on Jul 27, 2024
- Mixture of Experts (MoE) models replace the feedforward neural networks within transformers with two core components:
1) A sparse MoE layer.
2) A router network to select which experts will process which token!
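To make the two components concrete, here is a minimal pure-Python sketch of a sparse MoE layer with a top-2 router for a single token. It is illustrative only, not Mixtral's actual implementation: the helper names (`make_expert`, `route`, `moe_layer`), the tiny dimensions, the random weights, and the ReLU feedforward experts are all assumptions made for the example (Mixtral's experts are larger gated feedforward blocks).

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_expert(rng, d_model, d_hidden):
    """Hypothetical expert: a small two-layer feedforward net with ReLU."""
    W1 = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]
    W2 = [[rng.gauss(0, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]

    def expert(x):
        hidden = [max(0.0, v) for v in matvec(W1, x)]  # ReLU as a stand-in
        return matvec(W2, hidden)

    return expert


def route(router_W, x, top_k=2):
    """Router: score every expert, keep the top_k, softmax over the kept scores."""
    logits = matvec(router_W, x)
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    weights = softmax([logits[i] for i in top])
    return top, weights


def moe_layer(x, router_W, experts, top_k=2):
    """Sparse MoE layer: only the selected experts run for this token."""
    chosen, weights = route(router_W, x, top_k)
    out = [0.0] * len(x)
    for idx, w in zip(chosen, weights):
        y = experts[idx](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen, weights


# Toy setup: 8 experts, 2 active per token (the sizes are made up).
rng = random.Random(0)
D_MODEL, D_HIDDEN, N_EXPERTS = 4, 8, 8
experts = [make_expert(rng, D_MODEL, D_HIDDEN) for _ in range(N_EXPERTS)]
router_W = [[rng.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
out, chosen, weights = moe_layer(token, router_W, experts)
print("experts used:", chosen, "mixing weights:", [round(w, 3) for w in weights])
```

The point of the sketch is the sparsity: all 8 experts exist as parameters, but each token only pays the compute cost of the 2 experts the router picks, and their outputs are blended using the router's normalized scores.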
In this session, we explore:
- Mistral AI and their meteoric rise to a $415M Series A just 6 months after a $112M seed round!
- Their current product line that includes embedding models, an API model endpoint, and their flagship open-source models
- The architecture of Mixtral 8x7B, where it stands relative to other models on the Open LLM Leaderboard, and how it differs from a classic transformer architecture
- How to run inference using Mixtral and how to instruct-tune the model using Mosaic Instruct V3!
- Science & Technology
Awesome
That's absolutely fantastic thanks a lot.
(We do get the point across with or without the shouting btw! 😅)
😭😭... one day our microphone game will be so dialed in people will miss the yells @truehighs7845. Thanks for enjoying the journey with us for now!
Excellent video.
Google Colab Notebook: colab.research.google.com/drive/1bjXE8_n9P20ON5yinQ5zREkYLrP761pP?usp=sharing
Slides: www.canva.com/design/DAF25_C6esU/VLaRe5rFCGKucELnp-dRbg/edit?DAF25_C6esU&
I found you through the Prompt Engineering YouTube channel! Can you PLEASE make a video on natural language to SQL query generation? I'm confused about whether I should do fine-tuning or maybe some custom RAG. I would basically like the model to be aware of my schema, so I don't have to keep sending the schema in each prompt, and simply have it respond with the correct SQL query. Would love your insight on this, and on whether I can use Mixtral 8x7B for this task! Another side note: why not create a community on Discord or something so people can connect with you? Subbed, brother. Thanks for the great videos.
Hey!
We actually do have a community Discord: discord.gg/RzhvYvAwzA
We talked about NL-SQL in a previous event - check out the link here: th-cam.com/video/3J83aygkbX0/w-d-xo.html