- 245
- 692 401
Prompt Engineer
India
Joined 15 Apr 2023
Join the AI, AGI, ASI Revolution !
Stay updated with all the latest advancements in the world of Artificial Intelligence. This channel is your ultimate destination for the most recent developments and insights, all at your fingertips.
Join us to keep a vigilant eye on the cutting-edge of technology and be part of the AI revolution!
Topics: Latest Trends on AI, LLMs, OpenAI, Anthropic, Google, Coding, Python, AutoGen, MemGPT, AutoGPT, API Integration, RunPoDs, Devika, Salad GPU, Microsoft Azure and much more.
671 Billion Parameters, One Model: DeepSeek-V3 Deep Dive
Welcome to an in-depth exploration of DeepSeek-V3, the groundbreaking Mixture-of-Experts (MoE) language model featuring an impressive 671 billion parameters, with 37 billion activated per token! Combining innovative architectures like Multi-head Latent Attention (MLA) and an auxiliary-loss-free strategy for load balancing, DeepSeek-V3 redefines efficiency and performance. Whether you're interested in its robust pre-training on 14.8 trillion tokens or its state-of-the-art benchmarks in math, code, and multilingual tasks, this video unpacks it all for you.
Don't forget to like, comment, and subscribe to stay updated with cutting-edge AI techniques!
Links:
X posts: x.com/karpathy/status/1872362712958906460
Blog Post: github.com/deepseek-ai/DeepSeek-V3
Chat: chat.deepseek.com
API: platform.deepseek.com
Hugging Face: huggingface.co/deepseek-ai/DeepSeek-V3-Base
------------------------------------------------
Learn More:
Try Out Cloud GPUs on Novita AI (Affiliate Link): fas.st/t/EvuzAkeX
-------------------------------------------------
CHANNEL LINKS:
🕵️♀️ Join my Patreon for keeping up with the updates: www.patreon.com/PromptEngineer975
☕ Buy me a coffee: ko-fi.com/promptengineer
📞 Get on a Call with me at $50 Calendly: calendly.com/prompt-engineer48/call
💀 GitHub Profile: github.com/PromptEngineer48
🔖 Twitter Profile: prompt48
Other videos that you would love:
th-cam.com/video/DurejOD5FTk/w-d-xo.html
th-cam.com/video/WNYV8rk6wJw/w-d-xo.html
th-cam.com/video/IZfgbOgeXOA/w-d-xo.html
th-cam.com/video/88jbPOmBOaU/w-d-xo.html
th-cam.com/video/9UrWEUIiZ5c/w-d-xo.html
th-cam.com/video/lhQ8ixnYO2Y/w-d-xo.html
th-cam.com/video/QTv3DQ1tY6I/w-d-xo.html
th-cam.com/video/gcMdzGrDLlw/w-d-xo.html
th-cam.com/video/GKr5URJvNDQ/w-d-xo.html
#DeepSeekV3, #AIModel, #ArtificialIntelligence, #MachineLearning, #OpenSourceAI, #AIRevolution, #671BParameters, #DeepLearning, #NextGenAI, #TechInnovation, #AIExplained, #TechBreakthrough, #FutureOfAI, #MLExperts, #AIArchitecture, #AIResearch, #TechReview, #AITrends, #MachineIntelligence, #AIForEveryone
Timeline:
0:00 - Intro
13:36 - Solving AIME Problems
Views: 612
Videos
Dynamic Quantization with Unsloth: Shrinking a 20GB Model to 5GB Without Accuracy Loss!
1.2K views, 21 days ago
In this video, I dive into the fascinating world of dynamic quantization using Unsloth and show how we can reduce a 20GB language model to just 5GB without sacrificing performance! 🚀 Discover the challenges of quantizing models with approaches like 4-bit quantization and learn why selectively choosing layers based on error plots is key to success. I'll walk you through how Unsloth's dynamic qua...
Unlocking the Power of Ollama’s Structured JSON Output
1.9K views, 21 days ago
In this video, we dive into Ollama’s incredible feature for structured JSON output. We'll explore multiple examples of how to utilize this functionality effectively, showcasing its potential for modern applications. In the early days of working with language models (LLMs), free-flowing outputs were often sufficient. However, with the evolving demands of application development, we now require m...
How to Set Up Ollama for Seamless Function Calls with this Crazy Update #ollama
2K views, a month ago
In this video, we will explore the function-calling capabilities of the latest version of Ollama using local large language models. We'll take a close look at how the Ollama team has streamlined the process of writing function calls, making it incredibly easy to get started. I'll walk you through setting everything up on my local system, demonstrating the simplicity and efficiency of these new ...
Breaking Barriers: LLAMA-Mesh and the Future of 3D Content Creation
1.1K views, a month ago
In this video, we dive deep into LLAMA-Mesh, a groundbreaking approach that extends the capabilities of large language models to the realm of 3D mesh generation. Learn how LLAMA-Mesh leverages language models to: Generate 3D meshes directly from textual prompts. Integrate conversational abilities with 3D content creation. Bridge the gap between text and 3D modalities for interactive design work...
Can AI Predict Your Customer's Reactions? TinyTroupe Demo
1.3K views, a month ago
In this video we are going to be testing out TinyTroupe. TinyTroupe is an experimental Python library that allows the simulation of people with specific personalities, interests, and goals. This allows us to investigate a wide range of convincing interactions and consumer types, with highly customizable personas, under conditions of our choosing. The focus is thus on understanding human behavi...
Why RAG Systems are About to Get a Whole Lot Better!
657 views, a month ago
Explore how M3DocRAG, a cutting-edge multi-modal retrieval system, revolutionizes multi-page, multi-document understanding! We'll break down its innovative approach compared to traditional text-based RAG models, dive into the embedding and visual models that power it, and analyze its new test benchmark, M3DocVQA. Witness three powerful examples showcasing M3DocRAG’s ability to integrate visual ...
Master Qwen2.5 Coder Artifacts like a PRO with Ollama and Open Web UI!
8K views, a month ago
In this video, we will explore the Qwen2.5-Coder-32B-Instruct model from Alibaba. Not only will we delve into its features, but we will also demonstrate how to use Ollama and Open Web UI to get this model up and running. Additionally, we will cover the "artifacts" feature, which is inspired by Anthropic. To set everything up, we will be using cloud GPUs through Novita AI. We are also going to s...
Why This Open-Source Code Model Is a Game-Changer!
4.9K views, a month ago
In this video, we'll be exploring OpenCoder, an open-source cookbook for top-tier code language models. OpenCoder is a cutting-edge code model that surpasses Qwen in performance, including on MMLU and other key benchmarks. We'll demonstrate how to use this model on cloud GPUs, and you can also run it on your own system using Ollama. If you're unfamiliar with Ollama, this video will guide you th...
How to run Llama Vision on Cloud GPUs using Ollama #ollama
1K views, a month ago
In this video, we dive into the world of cutting-edge AI by testing out the powerful Llama 3.2 models, both the 11B and 90B versions, on a cloud GPU provided by Novita AI. These multimodal architectures, including the Vision collection, are known for their instruction-tuned image reasoning capabilities. We'll explore the performance, outputs, and real-world applications of these advanced models,...
The new Stable Diffusion 3.5 Large is AMAZING | Busy Person's Guide & Setup on Cloud GPUs
740 views, a month ago
In this video, we’ll be testing Stable Diffusion 3.5, specifically the Large, Large Turbo, and Medium versions. Stable Diffusion is an incredible tool, and we're going to run it on a cloud GPU hosted by Novita AI. I’ll walk you through everything from setting up Hugging Face tokens to downloading models directly from Hugging Face. We’ll be using a library called Diffusers, which will handl...
Revolutionary Free AI Image Editor is a Game Changer!
815 views, a month ago
Discover OmniGen - a revolutionary AI model that generates images from both text and images, no plugins needed. We'll show you how to set it up, test it on an RTX 4090, and create stunning visuals with its simple interface on Gradio. We will go through the installation step by step and address the common pitfalls in getting this done. Perfect for developers and artists looking to explore t...
Chatting with My AI Girlfriend on Telegram! | Meet Katie the AI Bot 🤖💕 [a to z code setup]
2.5K views, 2 months ago
In today’s video, we’re diving into the world of AI companionship with Katie, the AI girlfriend bot on Telegram! 💬 Katie is a virtual personality powered by LLaMA 3 and Novita’s image generation capabilities. She’s designed to be engaging, friendly, and fun, bringing life-like conversation and even photorealistic images to the chat experience. 🔧 How It Works: Katie is developed with a Python ba...
Generate Videos Automatically using LLMs for your Social Media Posts
341 views, 2 months ago
📄 YouTube Video Description: Are you looking to generate stunning videos from text in seconds? Meet the Auto Video Generator, an AI-driven tool that transforms news articles and search queries into captivating video content! 🚀 Features: • 🌐 Web Scraping: Fetches real-time news articles on any topic • ✍️ AI Summarization: Condenses text into short, engaging summaries • 🖼️ Image Generation: Auto-cr...
STOP Wasting Time with Inefficient AI Tools, Switch to Anthropic API Today!
647 views, 2 months ago
In this video, we will explore the basics of using Anthropic's APIs. We'll cover how to get started, review messaging formats, examine the various models available from Anthropic, and look at the parameters you can adjust. We’ll also dive into the streaming object, how to use it, and explore Anthropic's impressive vision capabilities. This series of videos will prepare you for advanced tasks, s...
Huggingface opens doors for Ollama with this new Integration
2.6K views, 2 months ago
This AI can Create Music Perfectly Synced to Videos ! #MuVi
848 views, 2 months ago
The Future of Multimodal AI | Open-Source Mixture-of-Experts Model #aria
589 views, 2 months ago
New Mistral Models are too Good: Ministral 3B and 8B | Quality Testing on Virtual GPUs
659 views, 2 months ago
All in One LLM Hosting ⚡Solution free up your Time | Deploy your Apps easily
424 views, 2 months ago
How to Get your LLMs to OBEY | Easiest Fine-tuning Interface for Total Control over your LLMs
792 views, 2 months ago
OpenAI's SWARM is the Ultimate Multi-agent Framework | Run using Local LLMs or OpenAI API Keys
2.8K views, 2 months ago
Smart AI Flight Recommendation Systems | Full Stack Code
425 views, 2 months ago
The AI Framework That Thinks and Acts Like a Human | Agent S
2.2K views, 2 months ago
Palmyra Tool Calling Ability EXPOSED! Better than OpenAI
457 views, 2 months ago
🚀Revolutionary NotebookLM : Found an Open source Alternative 💓
2.1K views, 2 months ago
AI wins the Nobel Prize in Physics 2024
200 views, 2 months ago
Stop Paying for Web Crawlers (Use this Instead)
3.2K views, 2 months ago
The Weird Connection Between Reward Models and Better Decision Making
297 views, 2 months ago
Blazingly FAST Image Generation using FLUX 1.1 (Pro)
513 views, 2 months ago
...for example... :)
how does it do in ARC and frontier math?
Drinking game ideas: every time he says ollama :D
Nice one
Is this API key free, or does it cost money?
Can it work on Google Colab?
95% is zero, because each component will be 95%, therefore the error stacks up quickly to be unusable. Imagine a programming language that only gave 95% accuracy. It would be unusable.
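The compounding argument in the comment above can be made concrete with a little arithmetic. This is a rough sketch that assumes every component succeeds independently with the same accuracy, which real pipelines may not satisfy; the function name `chain_success_rate` is made up for illustration. Under that assumption, ten chained steps at 95% each succeed end to end only about 60% of the time.

```python
def chain_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumption (not from the comment): steps fail independently and all
    share the same per-step accuracy.
    """
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # Show how quickly a 95% per-step accuracy erodes as steps are chained.
    for n in (1, 5, 10, 20, 50):
        rate = chain_success_rate(0.95, n)
        print(f"{n:>2} steps at 95% each -> {rate:.1%} end to end")
```

Whether that makes a 95% component "unusable" depends on how many such components are chained and whether errors can be caught and retried between steps.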
is it possible to connect it with dify?
How to fine-tune with .md files as the dataset? Most software documentation comes as markdown files. How can they be used to fine-tune models?
My question is how to save the merged f16 model directly as safetensors, because it is being saved as binary .bin as the native format?
Can you alter the AI to give it a name?
Any option to run this on Ollama? That is literally the cheapest! Thanks again for the tutorial! When you're referencing the "original" directory, do all of the referenced directories need to be in root or in the original directory? The YAML says root.
Congratulations brother, hoping to see you in the first place.
btw ottodev might've performed better
i finally got to watch the whole vid. perhaps it'd have fared better with the parts that didn't copy? maybe that led to some confusion. also, a bit strange how the emoji confetti physics looked different between the qwen artifacts one and the "local" one... any idea why? maybe a temperature issue?
I hope they can and will implement this in Ollama ASAP. :-)
Hmm
It's easy to import any of these models after shrinking them though at least, definitely something you can script without much hassle.
@@chronicallychill9979 So at the end these are regular gguf models that ollama can load?
great examination of this, I was wanting to see how this worked. So llama is not only the real open ai, but they are also seemingly actively trying to make it easy for people to use and modify it. I should probably look into llama more.
Hi, thanks for the video. Btw, I am facing an issue with the last part: I am not able to access the Salad endpoint and am getting a 403 Forbidden error. Please let me know if you have anything on this.
Please try again. It's been some time since I used Salad.
If I run the FastAPI file using uvicorn, will it run on my local host or the machine's local host?
machine local host
But given that OpenAI, Claude, Llama, or any LLM can do this already, why do we need ollama for it?
Local LLM.. now we can do that with local llms
Brilliant video. Examples with simple explanations thank you
Welcome
Dang, what a great video
Glad you enjoyed it!
How can I give the output of the function back to the LLM, so that my final answer comes from the LLM instead of what the function returns?
This is the video for specific RP dataset creation that I have been needing. Please do a deep dive. Thanks.
I am working on a follow-up video that goes deeper into RP dataset creation!
Did not work. I get errors when I execute the Python ingest.
If I prompt "What is sky color ?", it answers with "Calling function: subtract_two_numbers". It seems that llama3.2 is forced to use a tool even when using a tool makes no sense?
I want to see a demo where the llm realises it doesn't have a tool it needs, clones a tools github project, adds the tool, commits it, and then uses it.
Understood.. cool. Create tools on the fly. Nice..
If I enter “hey, how's it going?” as a prompt, what result do I get?
Oh. Your question has to use one of the functions mentioned. Updates coming soon.
Thanks for the clip and explanation. I see the function sometimes takes a number and sometimes a string. 1. How about concatenating "Prompt" + "Engineer" = "Prompt Engineer"? 2. Add Five + Two = 7? The place where you wrote the char-to-int conversion is kind of odd.
Yes. Sometimes the model outputs the arguments as ints, sometimes as strings. To address the issue, I convert the strings to numbers. As of now, concatenating "Prompt" and "Engineer" won't work. Your questions should be based only on the functions that you have provided.
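The string-to-number conversion described in this reply could look roughly like the sketch below. It is not the code from the video: the `dispatch` and `coerce_int` helpers, the `AVAILABLE_FUNCTIONS` table, and `add_two_numbers` are assumptions made for illustration (only `subtract_two_numbers` is named in the thread), and the tool call is simulated with a plain dict rather than a real model response.

```python
from typing import Any, Callable, Dict


def add_two_numbers(a: int, b: int) -> int:
    """Hypothetical tool: add two integers."""
    return a + b


def subtract_two_numbers(a: int, b: int) -> int:
    """Tool named in the thread: subtract b from a."""
    return a - b


# Hypothetical registry mapping tool names to Python callables.
AVAILABLE_FUNCTIONS: Dict[str, Callable[..., int]] = {
    "add_two_numbers": add_two_numbers,
    "subtract_two_numbers": subtract_two_numbers,
}


def coerce_int(value: Any) -> int:
    """Accept an int or a numeric string such as "5"; reject anything else."""
    if isinstance(value, bool):
        raise TypeError("booleans are not accepted as numbers")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        # int() raises ValueError for words like "Five" or decimals like "5.0".
        return int(value.strip())
    raise TypeError(f"unsupported argument type: {type(value).__name__}")


def dispatch(name: str, arguments: Dict[str, Any]) -> int:
    """Look up a tool by name and call it with arguments coerced to int."""
    func = AVAILABLE_FUNCTIONS.get(name)
    if func is None:
        raise KeyError(f"no such tool: {name}")
    return func(**{key: coerce_int(val) for key, val in arguments.items()})


if __name__ == "__main__":
    # The model may emit numbers as strings, as ints, or as a mix of both.
    print(dispatch("add_two_numbers", {"a": "5", "b": 2}))
    print(dispatch("subtract_two_numbers", {"a": "10", "b": "3"}))
```

With this kind of coercion, a spelled-out argument such as "Five" raises a `ValueError`, which is consistent with the reply that such prompts are not handled yet.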
❤awesome thanks for sharing
Welcome.
Things are advancing so fast... Now I've started using Windsurf, and wow, it feels like I'm a pro already. It's kind of scary because I don’t want to become overly reliant on it since I'm still new. I stumbled upon your channel almost (or over) a year ago, where you taught me the basics of chatbots. Thanks, man. Without you, I wouldn't be where I am today.
Glad to hear. Nice work!
investing in 3090 24gb a few years back was a good choice :D Local 32B baby
Wow. nice
To be real, these small models are faster and better for many use cases.
indeed
hey there! i actually want to make a chatbot for my college project. I want to fine-tune the LLM with my own dataset and then deploy the project with a UI, but due to low computational power I am facing a lot of problems in the deployment stage. could you possibly help me here? big thanks
reduce to 4 bit.. no other option. u can reduce to 4 bit, make a gguf, and use it using ollama. fastest and best option.
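As a back-of-the-envelope check on why dropping to 4-bit helps on limited hardware, the sketch below estimates weight storage from parameter count and bits per weight. It is an approximation under stated assumptions: it ignores the KV cache, activations, and quantization metadata such as scales, and the function name `weight_memory_gb` and the 7B example size are illustrative rather than taken from the thread. For a 7-billion-parameter model it gives about 14 GB at 16-bit and about 3.5 GB at 4-bit.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for model weights in gigabytes (1 GB = 1e9 bytes).

    Assumption: every weight uses the same number of bits, and nothing
    else (KV cache, activations, quantization scales) is counted.
    """
    if num_params < 0 or bits_per_weight <= 0:
        raise ValueError("num_params must be >= 0 and bits_per_weight > 0")
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 7e9  # illustrative 7B-parameter model
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: ~{weight_memory_gb(params, bits):.1f} GB")
```

Real GGUF files will differ somewhat from these figures, since common quantization formats keep some tensors at higher precision and store extra per-block data.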
Thank you, very helpful. How to deploy NVIDIA's AI models as an API using Flowise AI?
Thank you very much! The only guide I was able to use to create a bot. You are the best!!!
You are welcome! Keep creating awesome things!
Thank you for sharing the tutorial! I found it a bit cumbersome that Ollama requires additional installations for a graphical interface. However, the Page Assist you introduced seems much simpler since it only requires installing a Chrome extension to use. That said, when using a locally downloaded Ollama model through the browser, does it still count as running the LLM locally? Can it operate offline to avoid privacy exposure? Thank you!
Welcome 🤗 Yes. It counts as local run. So no data exposure
I would like to suggest that you make your voice a little bit softer using any AI tool; it will definitely help increase views and subscribers.
Noted, will try to work on that in future videos.
Nah! The voice is Ok! Much better than those robotic AI voices out there.
Thanks
Thanks.
This was nice
Thanks
Hi I am connecting from another computer. I have my Ollama in AWS cloud. How do I make it where I can train it like what you did here?
there was no training, just ingest and spit.. if you have ollama in AWS cloud, you need to somehow use that via api calls.
forget conda and just use uv, much lighter and faster and installs in seconds via pip install uv
Since you mentioned this twice, the next video will be on uv instead of conda. as a respect for you.. Thanks for watching.
Funny accent 🤣😂 Speak english please
🤗🤗😄😄
Check out my recent videos, u will get dramatic changes.
Does it work with local models?
Not tried yet. But I will try and let u know
answer is in the repo's discussions
Can 32B model run on RTX3090 config? It's really cheaper
I have tested this out. Yes you can run.
@@AlexanderAk In most cases you can run it even if you don't have a GPU at all. My models run on a Xeon and it works better than people expect. I tried Llama 32B; it's a bit slow on my cheap processor, and honestly I found no reason to use that model for my personal coding tasks. The difference between the 7B and 32B models in code output was not that big in my tests; Qwen Coder does the task well in both cases 👍
That’s awesome! Thanks mate
Thanks
in your next video, can you show how to connect this to a RAG, a Customer DB, Appointment setting, etc. Basically something more than just talking to the Ai.
Noted.
Can we expect greater latency given multimodal?
Yes, we can expect that, but given the fact that the embeddings will be formed beforehand, it should reduce the latency. And btw, if it's accurate, I would soften on the latency side as well. Even GPT-4o has higher latency while thinking step by step.
Did you get your RAG wireframe set up for big company use?
Depends on the use case. But right now, I'm setting up a RAG pipeline for an answering machine on the company's SQL database, where the company has about 300 SQL tables. That's what I'm working on.
@@PromptEngineer48 I've similar request, let us know once you have the solution. I'm having trouble with multiple tables using langchain
Cool. That will be amazing..
i'd recommend using uv instead of conda: pip install uv - uv pip install ... - reason: uv resolves module conflicts and has a bunch of other benefits, plus runs async. it can do venvs, init and manages a project's packages, ... it's my #1 python modules management tool replaced conda and pip itself. runs in jupyter also !pip install uv !uv pip install ...
Luv this. Will compare and evaluate
Will try this as well, thanks for the tip!
Thanks.
@@PromptEngineer48 ty - why not do a deep dive and make a vid about it, there aren't many (recent ones) on yt - it's quite powerful and can even manage the modules in your code
hi there. even without installing the artifacts (v2) it shows the execution part...maybe the devs already implemented it inside the core.. ?
Execution inside the open webui?
@@PromptEngineer48 sry for my bad english ... I mean... it does that thing you claim artifacts do, that white 'playground' window on the right side that shows the code in action... you described the process of installing a 'function' to achieve that... and when I installed the open-webui two days ago, without knowing anything about it , just clean and clear installation via pip > it already had this feature... So I decided to install the function artifacts_v2 also - and - yeah... nothing changed... :D
Oh. Sounds great. Mine was not doing that!!. Cool.