AI Bites
United Kingdom
Joined 30 Jun 2020
AI Bites helps you understand AI concepts and research papers by providing clear and concise explanations. More recently we have started explaining AI tools and gotten more hands-on.
Please subscribe & let the learning begin!
Run DeepSeek r1 locally on a laptop | 3 ways without coding
The DeepSeek r1 model has created a stir in the AI community. The model is currently free to use, but we never know whether it will become a paid service. The good news is that the model is open-source, so we can download and run it locally (even on a laptop).
So, in this video I am sharing 3 ways I found to run DeepSeek, or for that matter any open-source LLM, locally!
Hope it's useful!
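The three options in the video are no-code tools, but if you also want to script against a locally served model, here is a minimal sketch using the Ollama Python client, assuming Ollama is one of the routes you end up using and that the deepseek-r1:8b tag has been pulled (the model tag and prompt are illustrative):

```python
# Minimal sketch: querying a locally running DeepSeek r1 model through the
# Ollama Python client. Assumes Ollama is installed and the model was pulled
# beforehand with: ollama pull deepseek-r1:8b
import ollama

response = ollama.chat(
    model="deepseek-r1:8b",  # illustrative tag; smaller/larger distills also exist
    messages=[{"role": "user", "content": "Explain chain-of-thought reasoning in one paragraph."}],
)
print(response.message.content)
```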
⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️
0:00 - Intro
0:31 - First way/option
4:06 - Second way/option
6:34 - Third way/option
9:41 - Outro
AI BITES KEY LINKS
Website: www.ai-bites.net
YouTube: www.youtube.com/@AIBites
Twitter: ai_bites
Patreon: www.patreon.com/ai_bites
Github: github.com/ai-bites
Views: 125
Videos
DeepSeek Janus Pro 7b - Unified Vision and generation in one model (paper explained)
368 views · 9 hours ago
Janus Pro from DeepSeek - Unified vision and generation in one model. DeepSeek, the company that stunned the world with its R1 model, has recently released a multimodal model. It falls under the category of unified multimodal models, where a single model is used both to understand images and to generate them from prompts. In this video, let's go through the Janus Pro model from DeepSeek and understand...
DeepSeek r1 vs ChatGPT - A brutally honest review | Full Review
1.5K views · 12 hours ago
DeepSeek r1 vs ChatGPT - A brutally honest review. DeepSeek has stirred the industry in the last couple of weeks. It has been particularly compared against OpenAI's o1 model. To help people choose between r1 and ChatGPT, in this video we compare the r1 and o1 models side by side. We explore math reasoning as well as logical and visual reasoning. We have placed particular emphasis on math problems so...
DeepSeek R1 Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (paper explained)
3.6K views · 16 hours ago
DeepSeek R1 Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (paper explained). DeepSeek R1 is the latest model from DeepSeek. It is the first work to show that directly training with reinforcement learning is sufficient: we don't need the Supervised Fine-Tuning (SFT) step typically used when training LLMs. In this video, we read the paper and understand the model archit...
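As a rough illustration of the group-relative reward idea behind the paper's RL recipe (GRPO), the sketch below shows how advantages can be computed within a group of sampled answers. It is a simplification: the full objective also uses a clipped policy ratio and a KL penalty, and the reward values here are made up.

```python
# Simplified sketch of GRPO-style group-relative advantages: sample a group of
# answers per prompt, score each with a rule-based reward, and normalise each
# reward against the group's mean and standard deviation.
import statistics

def group_relative_advantages(rewards, eps=1e-6):
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: four sampled answers for one prompt, rewarded 1.0 if correct else 0.0
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))  # ≈ [1.0, -1.0, -1.0, 1.0]
```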
CrewAI - Crash Course | Tools in CrewAI | Part - 4
363 views · 14 days ago
CrewAI - Crash Course | Tools in CrewAI | Part - 4. CrewAI is getting well-established as one of the go-to Python frameworks for developing and orchestrating AI agents. In the previous videos in the CrewAI Crash Course series, we looked into Crew, Flow and Knowledge. This video is dedicated to CrewAI Tools. PREVIOUS VIDEOS IN THE CREW AI SERIES CrewAI Part - 1: th-cam.com/video/jFTlvw0N_JM/w-d-xo.ht...
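As a minimal sketch of the idea, handing prebuilt tools from crewai_tools to an agent might look like the following. This assumes crewai and crewai-tools are installed and a SERPER_API_KEY is set for the search tool; the role, goal, and backstory strings are illustrative.

```python
# Minimal sketch: give a CrewAI agent prebuilt tools from crewai_tools.
# Requires: pip install crewai crewai-tools, plus SERPER_API_KEY in the
# environment for SerperDevTool and an LLM key (e.g. OPENAI_API_KEY) to run.
from crewai import Agent
from crewai_tools import SerperDevTool, ScrapeWebsiteTool

researcher = Agent(
    role="AI news researcher",
    goal="Find and summarise the latest open-source LLM releases",
    backstory="You track new model releases for a weekly digest.",
    tools=[SerperDevTool(), ScrapeWebsiteTool()],  # tools the agent may call
)
```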
Transformers^2 - Self-Adaptive LLMs | SVD Fine-tuning | End of LoRA fine tuning? | (paper explained)
6K views · 14 days ago
Transformers^2 - Self-Adaptive LLMs | SVD Fine-tuning. We have come a long way with fine-tuning LLMs. Low-Rank Adaptation (LoRA) has been established as the go-to method for fine-tuning LLMs. We also have QLoRA, which has become the standard approach for inference on a compute budget. But none of these methods adapt the LLM weights. This paper proposes a self-adaptive approach t...
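The core reparameterization can be sketched in a few lines of PyTorch: decompose a frozen weight matrix with SVD once, then train only a vector that scales its singular values. This is a simplified illustration of the idea rather than the authors' code; the matrix size and usage are made up.

```python
# Rough sketch of singular-value fine-tuning (SVF): for a frozen weight
# W = U S V^T, learn only a per-singular-value scale z and use
# W' = U diag(S * z) V^T in place of W.
import torch

W = torch.randn(256, 256)                      # stand-in for a frozen pretrained weight
U, S, Vh = torch.linalg.svd(W, full_matrices=False)

z = torch.nn.Parameter(torch.ones_like(S))     # the only trainable parameters

def adapted_weight():
    return U @ torch.diag(S * z) @ Vh

x = torch.randn(8, 256)
out = x @ adapted_weight().T                   # use the adapted weight at inference
print(out.shape)                               # torch.Size([8, 256])
```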
crewai crash course - Part 3 | knowledge
420 views · 21 days ago
This is the third video in the crash course series on CrewAI. While the first 2 videos talk about CrewAI and flow, this one is all about knowledge. Knowledge is crucial to ground the agents and to avoid hallucinations. Please stay tuned for the rest of the videos in this series. Previous Videos: Part 1 - th-cam.com/video/jFTlvw0N_JM/w-d-xo.htmlsi=y64oj70KIJb7PeuV Part 2 - th-cam.com/video/1QNG4...
CrewAI Crash Course | Meeting Assistant development with CrewAI Flow | Part 2
529 views · 21 days ago
CrewAI is getting well-established as one of the go-to Python frameworks for developing and orchestrating AI agents. In the first part of the video series on CrewAI we looked at agents and tasks. We developed an end-to-end crew with 3 agents collaborating to develop a snake game. In this video, let's move on to the next idea: flow. Flow enables different agents and crews within the crew to co...
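As a rough sketch of the flow idea (one step starts the flow, another listens for its output), something like the snippet below. The decorator-based API shown follows my reading of CrewAI's Flow module, so treat the exact import path and method names as assumptions and check the current docs; the step contents are illustrative.

```python
# Sketch of a CrewAI Flow: @start marks the entry step, @listen chains a step
# to the previous one's output. Import path assumed from the CrewAI docs.
from crewai.flow.flow import Flow, listen, start

class MeetingFlow(Flow):
    @start()
    def collect_notes(self):
        # In a real flow this might call a crew or an LLM.
        return "raw meeting notes"

    @listen(collect_notes)
    def summarise(self, notes):
        return f"Summary of: {notes}"

print(MeetingFlow().kickoff())
```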
CrewAI - Crash Course | End-to-end Game Development with Crew AI | Part - 1
1.5K views · a month ago
CrewAI is getting well-established as one of the go-to Python frameworks for developing and orchestrating AI agents. So what does it take to do a simple end-to-end agentic project with CrewAI? What are the components and building blocks of CrewAI? This video strives to answer these questions. ⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️ 0:00 - What are agents? 1:04 - Why agents? 1:52 - Python Frameworks for...
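For a rough sense of those building blocks, here is a minimal sketch of agents, tasks, and a crew wired together. It is not the exact code from the video; the prompts are illustrative, it uses two agents rather than three, and it assumes an LLM key (e.g. OPENAI_API_KEY) is configured.

```python
# Minimal sketch of the CrewAI building blocks: Agent, Task, Crew.
from crewai import Agent, Task, Crew

designer = Agent(
    role="Game designer",
    goal="Specify the rules of a simple snake game",
    backstory="You write clear, minimal game specs.",
)
coder = Agent(
    role="Python developer",
    goal="Implement the game spec as runnable Python code",
    backstory="You write clean, dependency-light code.",
)

design_task = Task(
    description="Write a short spec for a terminal snake game.",
    expected_output="A bullet-point spec.",
    agent=designer,
)
code_task = Task(
    description="Implement the spec from the previous task in Python.",
    expected_output="A single Python file.",
    agent=coder,
)

crew = Crew(agents=[designer, coder], tasks=[design_task, code_task])
print(crew.kickoff())
```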
Gemini 2.0 from Google - Emerging star in coding? | Full Review - Part 2
395 views · a month ago
Google has just released its latest and most powerful multi-modal model to date. How does it do compared to other models like Claude 3.5 Sonnet or GPT-4o? In the previous video (Part 1), we looked at the Chat UI and the features introduced in Gemini 2.0 in AI Studio. In this video, let's test its coding and reasoning capabilities with several examples as close as possible to real-wo...
Gemini 2.0 from Google - Is it jack of all trades? | Full Review (Part 1)
588 views · a month ago
Google has just released its latest and most powerful multi-modal model to date. How does it do compared to other models like Claude 3.5 Sonnet or GPT-4o? Is it just a jack of all trades, but master of none? Or has it nailed the multi-modal department? In this video I demo its multi-modal capabilities in several example real-world scenarios. This is part 1 of a 2-part video series. Please l...
How good is Claude + MCP at replacing full-stack developers? Lets test!
1.9K views · a month ago
Claude 3.5 Sonnet is already the most powerful model for code generation. Adding Model Context Protocol (MCP) tools can make it even more sophisticated. In this video, I walk you through the steps needed to equip Claude with MCP. Once it works, I test Claude by giving it a single long prompt with all the steps needed to write code, commit, push, branch, and merge it in Git. Hope it's useful! A...
Structured output from Ollama | Local LLM + VLM | Quick Hands-on
966 views · a month ago
Ollama recently announced that it supports structured outputs. This means that we can get a serializable response straight away without post-processing it. It's a handy little feature of the open-source tool. In this video, let's look into the setup needed for structured outputs along with a couple of examples, all running on a MacBook laptop! OLLAMA KEY LINKS Ollama Announcement...
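A minimal sketch of the setup, following Ollama's structured-output announcement: pass a JSON schema via the format parameter and validate the reply with pydantic. It assumes Ollama is running locally with a model such as llama3.1 pulled; the schema and prompt are illustrative.

```python
# Structured output from a local Ollama model: constrain the reply to a JSON
# schema and parse it back into a pydantic model.
from ollama import chat
from pydantic import BaseModel

class Country(BaseModel):
    name: str
    capital: str
    languages: list[str]

response = chat(
    model="llama3.1",  # illustrative; any pulled model that supports it
    messages=[{"role": "user", "content": "Tell me about Canada."}],
    format=Country.model_json_schema(),  # the schema the reply must follow
)
country = Country.model_validate_json(response.message.content)
print(country)
```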
The new Llama 3.3 vs GPT-4o - full review | Is free Llama 3.3 sufficient?
5K views · a month ago
Meta AI has just released the Llama 3.3 70B parameter model. It seems to be on par with Meta's flagship Llama 3.1 405B parameter model. But how does it compare with the GPT-4o model? Is it worth paying for the OpenAI model, or is open source sufficient? In fact, is the open-source model better? Let's find out in this video. ⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️ 0:00 - Intro 2:11 - Bench...
RAG - Vector DBs for RAG | Indexing and Similarity in Vector DBs
523 views · a month ago
In a Retrieval-Augmented Generation (RAG) pipeline, the last step of pre-processing is the vector DB. Whenever we want to reuse the embeddings, we are better off storing them in a persistent store rather than embedding the data every single time the user queries the system. This is where vector DBs come into play. In this video let's dive into vector DBs: why, what, and the different steps in buil...
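To make the "embed once, store persistently, query many times" point concrete, here is a minimal sketch using Chroma as one example of a local vector DB (the video may use a different store); the path, collection name, and documents are illustrative.

```python
# Minimal sketch of persisting embeddings in a local vector DB (Chroma) and
# querying by similarity at request time.
import chromadb

client = chromadb.PersistentClient(path="./rag_store")      # on-disk persistent store
collection = client.get_or_create_collection("docs")

# Index once: Chroma embeds the documents with its default embedding function.
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "Vector DBs store embeddings for similarity search.",
        "RAG retrieves relevant chunks before generation.",
    ],
)

# Query at request time: nearest neighbours by embedding similarity.
results = collection.query(query_texts=["How does RAG find relevant text?"], n_results=1)
print(results["documents"])
```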
RAG - Embeddings for RAG | BERT and SBERT | Sentence Transformers
1.4K views · 2 months ago
Mixture of Transformers for Multi-modal foundation models (paper explained)
766 views · 2 months ago
AI App to talk to your laptops locally (Local Alexa) - hands-on
281 views · 2 months ago
LightRAG - A simple and fast RAG that beats GraphRAG? (paper explained)
4.5K views · 2 months ago
How I generate unlimited AI images for free!
531 views · 3 months ago
bitnet.cpp from Microsoft: Run LLMs locally on CPU! (hands-on)
1.3K views · 3 months ago
The new claude 3.5 sonnet - computer use, benchmark and more
177 views · 3 months ago
Introduction to PDF Parsing, challenges and methods (RAG Series)
473 views · 3 months ago
Swarm from Open AI - routines, handoffs and agents explained with code
734 views · 3 months ago
Meta Movie Gen Research Paper explained
506 views · 3 months ago
Contextual Information Retrieval for improving your RAG pipeline (from Anthropic)
1.5K views · 3 months ago
Qwen2.5 coder - Combines code generation with reasoning to build coding agents!
1.2K views · 4 months ago
Qwen2.5 Math - world's leading open-source Math model?
645 views · 4 months ago
Qwen 2.5 - The Small Language Model? (a quick look)
1K views · 4 months ago
o1 preview from OpenAI is all about reasoning - A comprehensive look
396 views · 4 months ago
The explanation was awesome and to the point. Really liked it.
Thank you 🙂
A better explanation than even the leading sources.
yeah this isn't o1
updated now!
Bro, what did you update? It's the same thing. It has to think and produce thinking tokens; yours replies directly. Google how to use o1.
The video title says o1, but I see that you are using ChatGPT 4o or 4o mini most of the time. I don’t think it’s an accurate comparison.
I do apologise, guys. What happened is that I started with o1, switched to ChatGPT at some point, and then forgot to switch back. Nice catch though 😉 I will update the title accordingly. Thanks 👍
updated now!
great video
Thank you 🙂
great video
Thanks!
Fascinating review. I glanced at the paper, particularly at GRPO and GAE. GRPO looks a lot like Fuzzy Logic with nodes or attention heads adapted to experience (e.g. "relative" via K-means group clustering). Looking more deeply at GAE (Generalized Advantage Estimation), it is for an adaptive control system. I would not be surprised if the origin of the deep-learning usage of theta is the angle of a pendulum.
The overlapping membership functions used in Fuzzy Logic are very similar to KL.
Don't have much experience with fuzzy logic. But I like your perspective 🙂
Great explanation
Thanks 👍
I used to think that ChatGPT was the only best model, but then I realised most of the Llama models are on par with GPT for free. I am a YouTube scriptwriter and I realised both are almost the same, and if I fine-tune an 8B Llama model on my computer with my scripts, it can outperform GPT-4o at scriptwriting. A pretty crazy realisation today.
Great. So do you use these models for your scripts?
I have been using it for the last two weeks; I have built business-related products. You can prompt in the window directly, which is faster but not deep enough. However, if you want high-level, realistic, domain-specific content, first click the 'DeepThink (R1)' icon so that it turns blue, then start prompting. That is the research stage; it is a little slow but worth it for amazing results in whatever niche you are in.
Thank you for your explanation !
My pleasure 😊
thanks a lot.
Most welcome!
Thank you always for your high-quality paper analyses. Do you have any plans to create videos on Deepseek v3 or r1 papers? They are incredibly good (and hot) models but I personally think that these, especially GRPO and the reward system, are not quite scalable to achieve o3 or higher level. I would love to hear your opinion.
Thanks for your feedback. Irrespective of how good the DeepSeek models are, I think RL will be the next big thing. Now that DeepSeek has shown it's possible, biggies like Google and Meta might explore RL to the fullest. 🙂
Hi, your videos are extremely helpful in understanding CrewAI and how it can be implemented. Can you provide a link to the GitHub repo of the sample projects you have created (game and assistant), so that I can view the code and understand better how it works? Thank you!
Hey, thanks. I need to commit them to GitHub. Shall I get back to you next week, or do you want them urgently? 😊
Good job! Are you going to walk through Part 5?
Yes, it's in the making ;-) It will be published soon!
It doesn't appear much different than prefix tuning, except prefixes are inserted dynamically based on the class of an input.
That is a pretty big difference
Love this series man. More CrewAI after it’s over please!
Thanks! encouraging :-)
soooo test time training essentially
Kind of. It's the logical progression. Test-time training feels more like a feature, whereas this is more like a framework, imo.
I would rather call it inference in two stages: a first inference to choose which scale vector to use, then the actual inference :-)
thanks for your inputs
Dude, that feels too good to be true...
Yes, evidently things are progressing fast in AI :-)
I’m going to make a transformer that accepts as input all current transformers in sequence, to predict the next transformer in the sequence.
Universal Turing Machine
AI Research in a Nutshell
Pretty sure that would collapse the wave function and disappear into another dimension. Be careful.
hah.. fantastic idea and nice side project :-)
he he...
13:56 Good references and video!
Thank you
Interesting video! However, I was a bit disappointed with the final result. Using Pygame could have made it more visually appealing. That said, it was still a great video. Thanks for putting it together!
Hey, yes, I am totally with you on this. It's basically future-proofing. CrewAI is just providing the framework to make the agents communicate and work seamlessly. When we have LLMs and multi-modal models that are super capable, we will then see astounding results! Thanks for the positive feedback.
Neat explanation
Glad you think so!
Reviews of LightRAG are polarised.
Great presentation of a superior model.
Are these techniques "Post-training quantization" or "Quantization-aware training" ?
Post training
Hi, can you make a video that explains EfficientNet-Lite?
At some point, yup 👍
I like how he leads us to the papers.
Thank you 🙂
Can I create a dataset only with questions and answers? Without context?
Yes, if we fine-tune on such a dataset, it will be good at question answering.
Thank
my pleasure :-)
You helped me complete a whole unit in one video. Keep posting wonderful videos like this :))
Glad to hear that
What's the difference from an API? Why did they create MCP when you can just use an API?
From what I understand, MCP can keep the context and memory of the session, so it's more aware, whereas APIs are requested individually.
I guess it's just simpler to use in your LLM app.
Thank you for your input. Hope that answers it :-)
great explanation, thank you!
glad you like it :-)
How much VRAM is required to run the Qwen 2.5 Coder 7B version?
It gave you feedback on the Sudoku game with 4o; it said invalid move.
Yes, but I tried valid moves and tried like 4-5 times, still with no luck. I edited those out while making the video :-)
4o is free to use and not $200 per month 😂
OK, I would say $20 per month if you want to use it extensively 😊
Error occurred: Error running graph: Error building Component File: It looks like we're missing some important information: session_id (unique conversation identifier). Please ensure that your message includes all the required fields. I am constantly facing this error.
How can I fit the Gemma 27B model for fine-tuning on my free Colab GPU (T4, 15 GB memory)? Is there any way? Please explain.
Why padding = right? Shouldn't it be left, since this is next-token generation, where the left side of the sequence requires padding?
Thank you for your well-intentioned and sincere explanation. It's great to hear advice from someone who has a good grasp of the subject.
Very encouraging to keep going 👍
Thanks very much. But the GitHub link is broken.
Can we use LightRAG to pass the context to a fine-tuned LLM?
I believe so, why not? The LLM is just at the heart of your RAG pipeline. LightRAG is one of the better ways to build that pipeline, similar to GraphRAG. So I don't see a problem with it.
Terrible explanation, and the background music makes it all worse.
Sorry to know. It's one of the older videos. We have hopefully improved ever since 😊
Excellent explanation
Cheers
which screen recorder do you use ?
OBS
thank you friend
glad it was useful! 🙂
It's Llama 3 8B. What does that "100B" at the end of the model name mean? Llama is either 8B or 100B! What does it mean?
So 100B stands for billions of parameters; the more the parameters, the better the model is supposed to be. 8b, 4b, or 2b stands for the bits used in quantization. We use quantization to reduce the model size so that it can run locally on our laptops or CPU desktops.
If you have an M1 or later chip, then llama.cpp will use Metal and the GPU, right?
Yes, from their github page github.com/ggerganov/llama.cpp I gather that they support Metal. So, it should in turn be leveraging the GPU on the Mac.
What databases are used in light rag? Do you use both a vector and graph db?
Though it uses graph structures for indexing, I believe that in terms of storing the embeddings it's just like any other vector DB.
Doesn't it rely on file-based storage instead of a vector DB?
Nicely explained! Keep the good work going!! 🤗
Thank you 🙂
Are you interested in more theory, or in hands-on implementation-style videos? Your input will be very valuable 👍
@@AIBites I'm interested in more videos on concept understanding as the implementations are easily available
@@AIBites Yes, we all want that.