AI Bites
Run DeepSeek r1 locally on a laptop | 3 ways without coding
The DeepSeek r1 model has created a stir in the AI community. The model is currently free to use, though we never know whether it will become a paid service. The good news is that the model is open-source, so we can download and run it locally (even on a laptop).
So, in this video I am sharing three ways I found to run DeepSeek, or for that matter any open-source LLM, LOCALLY!
Hope it's useful!
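The video itself is about no-code options, but for anyone who wants to script such a local setup, here is a minimal sketch of calling a locally running Ollama server, one popular way to serve open-source models. The model tag `deepseek-r1:7b`, the port, and the payload shape are assumptions based on Ollama's public API, not details taken from this video:

```python
import json
import urllib.request

# Assumed Ollama defaults: a local server on port 11434, with a distilled
# DeepSeek r1 model pulled beforehand via `ollama pull deepseek-r1:7b`.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, model="deepseek-r1:7b"):
    """Assemble the JSON body Ollama's /api/generate endpoint expects."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt):
    """POST the prompt to the local server and return the model's reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Building the request needs no server; only ask() does.
print(build_payload("Why is the sky blue?")["model"])  # deepseek-r1:7b
```

Calling `ask()` requires Ollama to be installed and the model pulled first; `build_payload` alone has no such dependency.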
⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️
0:00 - Intro
0:31 - First way/option
4:06 - Second way/option
6:34 - Third way/option
9:41 - Outro
AI BITES KEY LINKS
Website: www.ai-bites.net
YouTube: www.youtube.com/@AIBites
Twitter: ai_bites
Patreon: www.patreon.com/ai_bites
Github: github.com/ai-bites
Views: 125

Videos

DeepSeek Janus Pro 7b - Unified Vision and generation in one model (paper explained)
368 views, 9 hours ago
Janus Pro from DeepSeek - Unified vision and generation in one model. DeepSeek, the company that stunned the world with its R1 model, has recently released a multimodal model. It falls under the category of unified multimodal models, where a single model is used both to understand images and to generate them from prompts. In this video, let's go through the Janus Pro model from DeepSeek and understand...
DeepSeek r1 vs ChatGPT - A brutally honest review | Full Review
1.5K views, 12 hours ago
DeepSeek r1 vs ChatGPT - A brutally honest review. DeepSeek has stirred the industry in the last couple of weeks. It has been compared particularly against OpenAI's o1 model. To help people choose between r1 and ChatGPT, in this video we compare the r1 and o1 models side by side. We explore math, logical, and visual reasoning, with particular emphasis on math problems so...
DeepSeek R1 Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (paper explained)
3.6K views, 16 hours ago
DeepSeek R1 Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (paper explained). DeepSeek R1 is the latest model from DeepSeek. It is the first work to show that training directly with Reinforcement Learning is sufficient; we don't need the Supervised Fine-Tuning (SFT) step typically followed while training LLMs. In this video, we read the paper and understand the model archit...
CrewAI - Crash Course | Tools in CrewAI | Part - 4
363 views, 14 days ago
CrewAI - Crash Course | Tools in CrewAI | Part - 4 Crew AI is getting well-established as one of the go-to Python frameworks for developing and orchestrating AI agents. In the previous videos in the CrewAI Crash Course series, we looked into Crew, Flow and Knowledge. This video is dedicated to CrewAI Tools. PREVIOUS VIDEO IN CREW AI Series CrewAI Part - 1: th-cam.com/video/jFTlvw0N_JM/w-d-xo.ht...
Transformers^2 - Self-Adaptive LLMs | SVD Fine-tuning | End of LoRA fine tuning? | (paper explained)
6K views, 14 days ago
Transformers^2 - Self-Adaptive LLMs | SVD Fine-tuning We have come a long way with fine-tuning LLMs. Low-Rank Adaptation, or LoRA, has been established as the go-to method for fine-tuning LLMs. We also have QLoRA, which has become the established approach for inference on a compute budget. But none of these methods adapt the LLM weights. This paper proposes a self-adaptive approach t...
crewai crash course - Part 3 | knowledge
420 views, 21 days ago
This is the third video in the crash course series on CrewAI. While the first 2 videos talk about CrewAI and flow, this one is all about knowledge. Knowledge is crucial to ground the agents and to avoid hallucinations. Please stay tuned for the rest of the videos in this series. Previous Videos: Part 1 - th-cam.com/video/jFTlvw0N_JM/w-d-xo.htmlsi=y64oj70KIJb7PeuV Part 2 - th-cam.com/video/1QNG4...
CrewAI Crash Course | Meeting Assistant development with CrewAI Flow | Part 2
529 views, 21 days ago
Crew AI is getting well-established as one of the go-to Python frameworks for developing and orchestrating AI agents. In the first part of the video series on Crew AI, we covered agents and tasks. We developed an end-to-end crew with 3 agents collaborating to develop a snake game. In this video, let's move on to the next idea, Flow. Flow enables different agents and crews within the crew to co...
CrewAI - Crash Course | End-to-end Game Development with Crew AI | Part - 1
1.5K views, 1 month ago
Crew AI is getting well established as one of the go-to Python frameworks for developing and orchestrating AI agents. So what does it take to do a simple end-to-end Agentic project with Crew AI? What are the components and building blocks of Crew AI? This video strives to answer these questions. ⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️ 0:00 - What are agents? 1:04 - Why agents? 1:52 - Python Frameworks for...
Gemini 2.0 from Google - Emerging star in coding? | Full Review - Part 2
395 views, 1 month ago
Google has just released its latest and most powerful multi-modal model to date. How does it do compared to other models like Claude 3.5 Sonnet or GPT-4o? In the previous video (Part 1), we looked at the Chat UI and the features introduced in Gemini 2.0 in AI Studio. In this video, let's test its coding and reasoning capabilities with several examples as close as possible to real-wo...
Gemini 2.0 from Google - Is it jack of all trades? | Full Review (Part 1)
588 views, 1 month ago
Google has just released its latest and most powerful multi-modal model to date. How does it do compared to other models like Claude 3.5 Sonnet or GPT-4o? Is it just a jack of all trades, but master of none? Or has it nailed the multi-modal department? In this video I demo its multi-modal capabilities in several example real-world scenarios. This is part 1 of a 2-part video series. Please l...
How good is Claude + MCP at replacing full-stack developers? Let's test!
1.9K views, 1 month ago
Claude 3.5 Sonnet is already the most powerful model for code generation. Adding Model Context Protocol (MCP) tools can make it even more sophisticated. In this video, I walk you through the steps needed to equip Claude with MCP. Once it works, I test Claude by giving it a single long prompt with all the steps needed to write code, commit, push, branch, and merge it in Git. Hope it's useful! A...
Structured output from Ollama | Local LLM + VLM | Quick Hands-on
966 views, 1 month ago
Ollama recently announced that it supports structured outputs. This means we can get a serializable response straight away, without post-processing the response. It's a nice, handy little feature of the open-source tool. In this video, let's look into the setup needed for structured outputs, along with a couple of examples, all running on a MacBook laptop! OLLAMA KEY LINKS Ollama Announcement...
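As an illustration of the flow this description sketches: the `format` field carrying a JSON schema follows Ollama's structured-outputs announcement, while the model name and the sample response string below are made up for demonstration.

```python
import json

# JSON schema the model's output must conform to.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "capital": {"type": "string"},
    },
    "required": ["name", "capital"],
}

# Request body you would POST to a local Ollama server's /api/chat
# endpoint; the "format" field is what switches on structured outputs.
payload = {
    "model": "llama3.2",  # any local model tag; an assumption here
    "messages": [{"role": "user", "content": "Tell me about Canada."}],
    "format": schema,
    "stream": False,
}

# With structured outputs the returned message content is already valid
# JSON, so a single json.loads yields a serializable object with no
# post-processing. The string below is a fabricated sample response.
sample_content = '{"name": "Canada", "capital": "Ottawa"}'
country = json.loads(sample_content)
print(country["capital"])  # Ottawa
```

Actually sending `payload` to a running server is left out here so the snippet stays self-contained.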
The new Llama 3.3 vs GPT-4o - full review | Is free Llama 3.3 sufficient?
5K views, 1 month ago
Meta AI has just released the Llama 3.3 70B parameter model. The model seems to be on par with its flagship Llama 3.1 405B parameter model. But how does it compare with the GPT-4o model? Is it worth paying for the OpenAI model, or is open source sufficient? In fact, is the open-source model better? Let's find out in this video. ⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️ 0:00 - Intro 2:11 - Bench...
RAG - Vector DBs for RAG | Indexing and Similarity in Vector DBs
523 views, 1 month ago
In a Retrieval Augmented Generation (RAG) pipeline, the last stage of pre-processing is the vector DB. Whenever we want to reuse the embeddings, we are better off storing them in a persistent store rather than embedding the data every single time the user queries the system. This is where vector DBs come into play. In this video, let's dive into vector DBs: why, what, and the different steps in buil...
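The reuse argument above boils down to: embed once, store, then rank the stored vectors against each query. A toy sketch of that ranking step, using plain cosine similarity over a hypothetical in-memory "store" (real vector DBs add persistence and approximate-nearest-neighbour indexing on top):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Embeddings computed once and kept around, instead of re-embedding
# the documents on every user query. Values are made up.
store = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.1, 0.9, 0.2],
}

query = [1.0, 0.0, 0.0]  # embedding of the user's query
best = max(store, key=lambda k: cosine(store[k], query))
print(best)  # doc1
```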
RAG - Embeddings for RAG | BERT and SBERT | Sentence Transformers
1.4K views, 2 months ago
Mixture of Transformers for Multi-modal foundation models (paper explained)
766 views, 2 months ago
AI App to talk to your laptops locally (Local Alexa) - hands-on
281 views, 2 months ago
LightRAG - A simple and fast RAG that beats GraphRAG? (paper explained)
4.5K views, 2 months ago
How I generate unlimited AI images for free!
531 views, 3 months ago
bitnet.cpp from Microsoft: Run LLMs locally on CPU! (hands-on)
1.3K views, 3 months ago
The new claude 3.5 sonnet - computer use, benchmark and more
177 views, 3 months ago
Introduction to PDF Parsing, challenges and methods (RAG Series)
473 views, 3 months ago
Swarm from Open AI - routines, handoffs and agents explained with code
734 views, 3 months ago
Meta Movie Gen Research Paper explained
506 views, 3 months ago
Contextual Information Retrieval for improving your RAG pipeline (from Anthropic)
1.5K views, 3 months ago
Qwen2.5 coder - Combines code generation with reasoning to build coding agents!
1.2K views, 4 months ago
Qwen2.5 Math - world's leading open-source Math model?
645 views, 4 months ago
Qwen 2.5 - The Small Language Model? (a quick look)
1K views, 4 months ago
o1 preview from OpenAI is all about reasoning - A comprehensive look
396 views, 4 months ago

Comments

  • @KoukuntlaPranav-d1h
    @KoukuntlaPranav-d1h 11 hours ago

    The explanation was awesome and to the point, really liked it

    • @AIBites
      @AIBites 11 hours ago

      Thank you 🙂

  • @PoleesuSrinivasCh
    @PoleesuSrinivasCh 3 days ago

    A better explanation than all other sources.

  • @dennisbarzanoff9025
    @dennisbarzanoff9025 5 days ago

    yeah this isn't o1

    • @AIBites
      @AIBites 4 days ago

      updated now!

    • @dennisbarzanoff9025
      @dennisbarzanoff9025 4 days ago

      Bro, what did you update? It's the same thing: it has to think and produce thinking tokens. Yours replies directly. Google how to use o1

  • @gopinathmerugumala
    @gopinathmerugumala 5 days ago

    The video title says o1, but I see that you are using ChatGPT 4o or 4o mini most of the time. I don't think it's an accurate comparison.

    • @AIBites
      @AIBites 4 days ago

      Do apologise, guys. So what happened is, I started with o1 and switched to ChatGPT at some point, then forgot to switch back. Nice catch though 😉 I will update the title accordingly. Thanks 👍

    • @AIBites
      @AIBites 4 days ago

      updated now!

  • @crowdpicker2735
    @crowdpicker2735 5 days ago

    great video

    • @AIBites
      @AIBites 5 days ago

      Thank you 🙂

  • @mukeshreddy7909
    @mukeshreddy7909 6 days ago

    great video

    • @AIBites
      @AIBites 5 days ago

      Thanks!

  • @SapienSpace
    @SapienSpace 7 days ago

    Fascinating review. I glanced at the paper, particularly at GRPO and GAE. GRPO looks a lot like Fuzzy Logic with nodes or attention heads adapted to experience (e.g. "relative" via K-means group clustering). Looking more deeply at GAE (Generalized Advantage Estimation), it is for an adaptive control system. I would not be surprised if the origin of the deep learning usage of theta is the angle of a pendulum.

    • @SapienSpace
      @SapienSpace 7 days ago

      Overlapping membership functions used in Fuzzy Logic are very similar to KL.

    • @AIBites
      @AIBites 5 days ago

      Don't have much experience with fuzzy logic. But I like your perspective 🙂

  • @KhurramXahiL-py5dq
    @KhurramXahiL-py5dq 7 days ago

    Great explanation

    • @AIBites
      @AIBites 5 days ago

      Thanks 👍

  • @educatics_shorts
    @educatics_shorts 7 days ago

    I used to think that ChatGPT was the only top model, but then I realised most of the Llama models are on par with GPT for free. I am a YouTube scriptwriter, and I realised both are almost the same; if I fine-tune an 8B Llama model on my computer with my scripts, it can outperform GPT-4o at scriptwriting. A pretty crazy realisation today.

    • @AIBites
      @AIBites 5 days ago

      Great. So do you use these models for your scripts?

  • @kassugebresellasie803
    @kassugebresellasie803 7 days ago

    I have been using it for the last two weeks; I have built business-related products. The trick to getting the best out of it: prompting in the window directly is faster but not deep enough. However, if you want high-level, realistic, domain-specific content, first click the 'DeepThink (R1)' icon so it turns blue, then start prompting. That is the research stage, a little bit slow but worth it to get amazing results in whatever you niche in.

  • @francesclopez6192
    @francesclopez6192 7 days ago

    Thank you for your explanation!

    • @AIBites
      @AIBites 5 days ago

      My pleasure 😊

  • @amortalbeing
    @amortalbeing 7 days ago

    thanks a lot.

    • @AIBites
      @AIBites 5 days ago

      Most welcome!

  • @dd-pb3tx
    @dd-pb3tx 8 days ago

    Thank you always for your high-quality paper analyses. Do you have any plans to create videos on Deepseek v3 or r1 papers? They are incredibly good (and hot) models but I personally think that these, especially GRPO and the reward system, are not quite scalable to achieve o3 or higher level. I would love to hear your opinion.

    • @AIBites
      @AIBites 4 days ago

      Thanks for your feedback. Irrespective of how good the DeepSeek models are, I think RL will be the next big thing. Now that DeepSeek has shown it's possible, biggies like Google and Meta might explore RL to the fullest. 🙂

  • @RNCS_SanathKumarN
    @RNCS_SanathKumarN 13 days ago

    Hi... your videos are extremely helpful in understanding CrewAI and how it can be implemented. Can you provide a link to the GitHub repo of the sample projects you created (game and assistant), so that I can view the code and understand better how it works? Thank you!

    • @AIBites
      @AIBites 5 days ago

      Hey, thanks. I need to commit them to GitHub. Shall I get back next week, or would you want them urgently? 😊

  • @kassugebresellasie803
    @kassugebresellasie803 14 days ago

    Good job! Are you gonna walk through Part 5?

    • @AIBites
      @AIBites 14 days ago

      yes, in the making ;-) will be published soon!

  • @pensiveintrovert4318
    @pensiveintrovert4318 17 days ago

    It doesn't appear much different from prefix tuning, except the prefixes are inserted dynamically based on the class of an input.

    • @kellymoses8566
      @kellymoses8566 13 days ago

      That is a pretty big difference

  • @proterotype
    @proterotype 17 days ago

    Love this series man. More CrewAI after it’s over please!

    • @AIBites
      @AIBites 14 days ago

      Thanks! encouraging :-)

  • @augmentos
    @augmentos 19 days ago

    soooo test time training essentially

    • @clarencejones4717
      @clarencejones4717 18 days ago

      Kind of. It's the logical progression. Test-time training feels more like a feature, whereas this is like a framework imo.

    • @AIBites
      @AIBites 18 days ago

      I would rather call it inference in two ways: first, inference to choose which scale vector to use; next, inference to do the actual inference :-)

    • @AIBites
      @AIBites 18 days ago

      thanks for your inputs

  • @zhaoboxu833
    @zhaoboxu833 19 days ago

    Dude, that feels too good to be true...

    • @AIBites
      @AIBites 18 days ago

      yes, evidence that things are progressing fast in AI :-)

  • @jeremyhofmann7034
    @jeremyhofmann7034 19 days ago

    I’m going to make a transformer that accepts as input all current transformers in sequence, to predict the next transformer in the sequence.

    • @aaaaaaaaaaaaaaaaa5763
      @aaaaaaaaaaaaaaaaa5763 19 days ago

      Universal Turing Machine

    • @shariqfaraz
      @shariqfaraz 18 days ago

      AI Research in a Nutshell

    • @cheekybastard99
      @cheekybastard99 18 days ago

      Pretty sure that would collapse the wave function and disappear into another dimension. Be careful.

    • @AIBites
      @AIBites 18 days ago

      hah.. fantastic idea and nice side project :-)

    • @AIBites
      @AIBites 18 days ago

      he he...

  • @SphereofTime
    @SphereofTime 1 month ago

    13:56 Good references and video!

    • @AIBites
      @AIBites 25 days ago

      Thank you

  • @gopinathmerugumala
    @gopinathmerugumala 1 month ago

    Interesting video! However, I was a bit disappointed with the final result. Using Pygame could have made it more visually appealing. That said, it was still a great video. Thanks for putting it together!

    • @AIBites
      @AIBites 26 days ago

      hey yes, I am totally with you on this. It's basically future-proofing. CrewAI just provides the framework to make the agents communicate and work seamlessly. When we have LLMs and multi-modal models that are super capable, we will see astounding results! Thanks for the positive feedback.

  • @chadyonfire7878
    @chadyonfire7878 1 month ago

    Neat explanation

    • @AIBites
      @AIBites 25 days ago

      Glad you think so!

  • @rogerroan7583
    @rogerroan7583 1 month ago

    LightRAG reviews are polarized

  • @garywashington9391
    @garywashington9391 1 month ago

    Great presentation of a superior model.

  • @greenptgt4258
    @greenptgt4258 1 month ago

    Are these techniques "Post-training quantization" or "Quantization-aware training" ?

    • @AIBites
      @AIBites 25 days ago

      Post-training

  • @nuralifahsalsabila9057
    @nuralifahsalsabila9057 1 month ago

    Hi, can you make a video that explains EfficientNet-Lite?

    • @AIBites
      @AIBites 25 days ago

      At some point, yup 👍

  • @superfreiheit1
    @superfreiheit1 1 month ago

    I like how he leads us to the papers.

    • @AIBites
      @AIBites 25 days ago

      Thank you 🙂

  • @superfreiheit1
    @superfreiheit1 1 month ago

    Can I create a dataset only with questions and answers? Without context?

    • @AIBites
      @AIBites 25 days ago

      Yes, if we fine-tune on such a dataset, it will be good at question answering

  • @pabloescobar2738
    @pabloescobar2738 1 month ago

    Thank

    • @AIBites
      @AIBites 26 days ago

      my pleasure :-)

  • @Luffynami143
    @Luffynami143 1 month ago

    You helped me complete a whole unit in one video, keep posting wonderful videos like this :))

    • @AIBites
      @AIBites 25 days ago

      Glad to hear that

  • @TheSopk
    @TheSopk 1 month ago

    What's the difference from an API? Why did they create MCP when you can use an API?

    • @saber8387
      @saber8387 1 month ago

      From what I understand, MCP can have the context and the memory of the sessions, so it's more aware, whereas APIs are individually requested.

    • @heesongkoh
      @heesongkoh 1 month ago

      I guess it's just simpler to use in your LLM app.

    • @AIBites
      @AIBites 26 days ago

      thank you for your inputs. Hope that answers it :-)

  • @ISLAInstruments
    @ISLAInstruments 1 month ago

    great explanation, thank you!

    • @AIBites
      @AIBites 26 days ago

      glad you like it :-)

  • @amit4rou
    @amit4rou 1 month ago

    How much VRAM is required to run the Qwen 2.5 Coder 7B version?

  • @sharon8811
    @sharon8811 1 month ago

    It gave you feedback on the Sudoku game; 4o said invalid move

    • @AIBites
      @AIBites 26 days ago

      yes, but I tried valid moves, and tried like 4-5 times, still with no luck. I edited those out while making the video :-)

  • @lupusreginabeta3318
    @lupusreginabeta3318 1 month ago

    4o is free to use and not $200 pm 😂

    • @AIBites
      @AIBites 1 month ago

      OK, I would say $20 pm if you want to use it extensively 😊

  • @archijogi7021
    @archijogi7021 2 months ago

    Error occurred: Error running graph: Error building Component File: It looks like we're missing some important information: session_id (unique conversation identifier). Please ensure that your message includes all the required fields. I'm constantly facing this error.

  • @pratikvyas-g2r
    @pratikvyas-g2r 2 months ago

    How can I fit the Gemma 27B model for fine-tuning on my free Colab GPU (T4, 15 GB memory)? Is there any way? Please explain.

  • @pratikvyas-g2r
    @pratikvyas-g2r 2 months ago

    Why padding = right? Shouldn't it be left, since this is next-token generation, where the left side of the sequence requires padding?

  • @scnak
    @scnak 2 months ago

    Thank you for your well-intentioned and sincere explanation. It's great to hear advice from someone who has a good grasp of the subject.

    • @AIBites
      @AIBites 2 months ago

      Very encouraging to keep going 👍

  • @РичардЖиулиевич
    @РичардЖиулиевич 2 months ago

    Thanks very much. But the GitHub link is broken

  • @karthikreddy9504
    @karthikreddy9504 2 months ago

    Can we use LightRAG to pass the context to a fine-tuned LLM?

    • @AIBites
      @AIBites 26 days ago

      I believe so, why not? The LLM is just at the heart of your RAG pipeline. LightRAG is one of the better ways to build that pipeline, similar to GraphRAG. So I don't see a problem with it.

  • @davide0965
    @davide0965 2 months ago

    Terrible explanation, the background music makes it all worse

    • @AIBites
      @AIBites 25 days ago

      Sorry to hear that. It's one of the older videos. We have hopefully improved since 😊

  • @azmyin
    @azmyin 2 months ago

    Excellent explanation

    • @AIBites
      @AIBites 25 days ago

      Cheers

  • @SetuAI
    @SetuAI 2 months ago

    Which screen recorder do you use?

    • @AIBites
      @AIBites 2 months ago

      OBS

  • @Advokat7V
    @Advokat7V 2 months ago

    thank you friend

    • @AIBites
      @AIBites 2 months ago

      glad it was useful! 🙂

  • @AK-ox3mv
    @AK-ox3mv 2 months ago

    It's Llama 3 8B. What does that "100B" at the end of the model name mean? Llama is either 8B or 100B! What does it mean?

    • @AIBites
      @AIBites 2 months ago

      So 100B stands for billions of parameters; the more params, the better the model is supposed to be. 8b, 4b or 2b stands for the bits used in quantization. We use quantization to reduce the model size so it can run locally on our laptops or CPU desktops.
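The parameters-versus-quantization-bits distinction in the reply above can be made concrete with a back-of-the-envelope weight-memory estimate (a rough sketch only; it ignores activations, KV cache, and runtime overhead):

```python
def approx_weight_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB: parameter count x bits per weight / 8."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# An 8B-parameter model in 16-bit floats needs ~16 GB, too big for most
# laptops; the same model quantized to 4 bits per weight fits in ~4 GB.
print(approx_weight_gb(8, 16))  # 16.0
print(approx_weight_gb(8, 4))   # 4.0
```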

  • @rafaeel731
    @rafaeel731 2 months ago

    If you have an M1+ chip, then llama.cpp will use Metal and the GPU, right?

    • @AIBites
      @AIBites 2 months ago

      Yes, from their GitHub page github.com/ggerganov/llama.cpp I gather that they support Metal. So it should in turn be leveraging the GPU on the Mac.

  • @AIWhale3
    @AIWhale3 2 months ago

    What databases are used in LightRAG? Do you use both a vector and a graph DB?

    • @AIBites
      @AIBites 2 months ago

      Though it uses graph structures for indexing, I believe that in terms of storing the embeddings, it's just any other vector DB.

    • @dramarama359
      @dramarama359 2 days ago

      Doesn't it rely on file-based storage instead of a vector DB?

  • @prathameshdinkar2966
    @prathameshdinkar2966 2 months ago

    Nicely explained! Keep the good work going!! 🤗

    • @AIBites
      @AIBites 2 months ago

      Thank you 🙂

    • @AIBites
      @AIBites 2 months ago

      Are you interested in more theory or hands-on implementation style videos? Your input will be very valuable 👍

    • @prathameshdinkar2966
      @prathameshdinkar2966 2 months ago

      @@AIBites I'm interested in more videos on concept understanding as the implementations are easily available

    • @iamrubel
      @iamrubel 2 months ago

      @@AIBites Yes. We all want