Data Science Basics
  • 225
  • 947 552
Mastering Document Parsing with LlamaParse from LlamaIndex: Complete Guide
In this video, I will walk you through document parsing using LlamaParse from LlamaIndex. LlamaParse allows you to securely parse complex documents such as PDFs, PowerPoints, Word documents and spreadsheets into structured data using state-of-the-art AI.
LlamaParse is available as a standalone REST API, a Python package, a TypeScript SDK, and a web UI.
First, I will walk you through the UI and then implement the same thing in Python code. Let’s dive into it.
00:00 Introduction to Document Parsing
01:01 Exploring LlamaParse Documentation and Features
02:12 Understanding Document Parsing Limitations
06:55 Hands-On with LlamaParse UI
09:56 Parsing Techniques and Modes
16:13 Advanced Parsing Instructions and Examples
31:36 Formatting Output with Parsing
32:27 Extracting Information from Excel Files
33:06 Image Parsing Demonstration
33:45 Parsing Audio Files
34:56 Exploring Output Modes and Limitations
35:36 Multimodal Parsing and Vendor Models
39:19 Setting Up the Code Environment
43:11 Running Parsing Examples in Code
46:18 Advanced Parsing Instructions
48:26 Using Auto Mode for Parsing
49:22 Extracting Specific Page Information
51:28 JSON Output and Audio Parsing in Code
52:53 Multimodal Parsing in Code
53:29 RAG Example and Embeddings
59:15 Recap and Conclusion
Link ⛓️‍💥
www.llamaindex.ai/blog/introducing-llamaparse-premium
docs.cloud.llamaindex.ai/llamaparse/getting_started
www.llamaindex.ai/
cloud.llamaindex.ai/
github.com/sudarshan-koirala/youtube-stuffs
docs.astral.sh/uv/
console.groq.com/login
------------------------------------------------------------------------------------------
☕ Buy me a Coffee: ko-fi.com/datasciencebasics
✌️Patreon: www.patreon.com/datasciencebasics
------------------------------------------------------------------------------------------
🤝 Connect with me:
📺 YouTube: www.youtube.com/@datasciencebasics?sub_confirmation=1
👔 LinkedIn: www.linkedin.com/in/sudarshan-koirala/
🐦 Twitter: mesudarshan
🔉Medium: medium.com/@sudarshan-koirala
💼 Consulting: topmate.io/sudarshan_koirala
#llamaparse #llamaindex #documentparsing #datasciencebasics
Views: 482

Videos

Building Your First AI Agents With Phidata & models from Groq | Beginners Guide
5K views · 21 hours ago
In this video, I will show how you can create a simple agent, multi-agent using Phidata. We start with the basics of setting up an AI project in a virtual environment, proceed with creating individual agents such as a web search agent using DuckDuckGo and a finance agent utilising Yahoo Finance. We then demonstrate how to combine these agents into a multi-agent system and run everything from th...
Docling from IBM | Open Source Library To Make Documents AI Ready | LlamaIndex
878 views · 1 day ago
Dive into the capabilities of IBM's open source AI tool, Docling, designed for efficient document parsing and exporting. This video explores how Docling works, its easy-to-use interface, and its ability to handle various document types including PDFs, DOCX, PowerPoints, and more. The video covers setting up the environment, basic and advanced features, and integrating Docling with LlamaIndex fo...
Get Started With Github Copilot Free in Visual Studio Code 🔥
918 views · 14 days ago
In this video, I will explore how to set up and use GitHub Copilot in VS Code effectively. Learn about the announcements made on December 18th regarding GitHub Copilot's free plan, how to configure it, and various commands you can run. We also cover privacy settings, creating projects from scratch or existing ones, generating commit messages, and using Copilot Edit for multi-file editing. Perfe...
Extremely Fast Python Package Manager | written in Rust 🚀
653 views · 21 days ago
In this video, we explore UV, a versatile and ultra-fast tool for managing Python projects and packages. Learn how to install UV, initialise projects, manage dependencies, and utilize various useful commands. The video highlights UV's speed and efficiency in handling multiple Python versions, creating virtual environments, and running scripts. Discover why UV is a powerful alternative to other ...
All You Need To Know About Amazon Bedrock
372 views · 21 days ago
In this video, I will cover the highlights of AWS re:Invent 2024 and take a detailed look into the updates and features of Amazon Bedrock. From exploring the Bedrock console UI, configurations, and newly added models in the Bedrock marketplace to advanced functionalities like prompt routers, model routing, and watermark detection, we guide you through all the essential aspects of Bedrock. Additi...
Top 5 Essential Resources for Learning Generative AI
336 views · a month ago
In this video, I'll guide you through five essential resources for anyone interested in learning about AI, particularly generative AI. We start with Hugging Face, a platform for collaborating on machine learning models and applications. Next is DeepLearning.AI, founded by Andrew Ng, offering a variety of AI courses and practical applications. The third resource is a site that evaluates AI model...
aisuite: Unified Interface for Multiple Generative AI Providers
298 views · a month ago
In this video, we dive into aisuite, an exciting new package from Andrew Ng and his team that provides a simple, unified interface to interact with multiple generative AI models, including OpenAI, LLaMA, and others. We explore its features, demonstrate installation and implementation steps, and highlight how it allows developers to switch and compare responses from different large language mode...
Mastering Prompt Engineering with LangSmith's Prompt Canvas
536 views · a month ago
In this video, we dive into LangSmith's Prompt Canvas, an innovative tool for developing and optimising AI prompts. The video explores the user interface and features of Prompt Canvas as a simplified and efficient prompt creation experience inspired by OpenAI's canvas UX. The host demonstrates how to use the tool, provides walkthroughs of various functionalities like editing prompts, utilising ...
Exploring Open Canvas: The Open Source Alternative to ChatGPT Canvas
2.1K views · 2 months ago
In this video, we will delve into Open Canvas from LangChain, an open-source alternative to ChatGPT Canvas. We explore its key features, including built-in memory, the ability to start from existing documents, and comprehensive UX for writing and coding. The video also provides a step-by-step guide on how to use Open Canvas both online and locally. Additionally, we discuss different functionali...
Maximize Your Efficiency: Exploring Canvas in ChatGPT for Writing and Coding
371 views · 2 months ago
In this video, I will explain the new 'Canvas' feature introduced by OpenAI in the paid version of ChatGPT. The tutorial covers using Canvas for both writing and coding, highlighting its interactive and dynamic functionalities. Learn how to open sections within Canvas, utilize quick shortcuts, perform advanced edits, and leverage the history feature. The coding section shows practical examples ...
ChatGPT Search & Alternatives
200 views · 2 months ago
In this video, I will dig into the newly enhanced web search functionality introduced by OpenAI in ChatGPT on October 31st, 2024. This updated feature, now available to Plus and Teams users and rolling out to free users soon, allows for internet searches directly within the chat interface, with results sourced from various partnered providers. We'll explore the interface, discuss its capabi...
Run GGUF models from Hugging Face Hub on Ollama and OpenWebUI
2.7K views · 2 months ago
Discover how to run large language models locally on your computer using Hugging Face and Ollama in this comprehensive tutorial. Learn to navigate through an extensive collection of over a million models available on Hugging Face and easily run them with a single command. The video offers a step-by-step guide to downloading models into Ollama and managing them via Open Web UI 00:00 Introduction...
Prompt Generator From OpenAI | ANYONE Can Write Prompts With This New Feature
1.5K views · 2 months ago
In this video, I will introduce a hidden feature in the OpenAI Playground that allows users to generate system instructions automatically. The tutorial explains how developers and newcomers can access and utilize this feature to enhance their applications with AI-generated prompts. The feature is currently available in free beta, providing an opportunity to test and refine system instructions ...
Super Easy Way To Parse Documents | LlamaParse Premium 🔥
1.9K views · 3 months ago
In this video, we dive into LlamaParse Premium from LlamaIndex, which offers robust document parsing capabilities. We start by reviewing a blog post on the new Premium features and proceed to showcase the LlamaCloud UI for practical demonstrations. The video covers how LlamaParse can parse complex documents, including diagrams and equations, and provides examples of using LlamaParse via both the UI a...
AI/BI Dashboards | Databricks New AI Powered Visualization Tool
1.3K views · 3 months ago
DATABRICKS AI/BI GENIE | No Code Interface For Your Data | Text TO SQL
765 views · 3 months ago
Exploring Databricks Notebook: New Features and Functionalities Overview
607 views · 3 months ago
Use Llava In GroqCloud & OpenWebUI
1K views · 4 months ago
Open WebUI: Local ChatGPT Alternative | For Complete Begineers | Full Tutorial
17K views · 4 months ago
Extract Table Info From SCANNED PDF & Summarise It Using Llama3.1 via Ollama | LangChain
2.8K views · 4 months ago
Claude 3.5 Sonnet Artifacts
612 views · 4 months ago
Installing and Using LangGraph Studio | First Agent IDE
3.5K views · 4 months ago
Introduction to LangGraph: Building and Enhancing LLM Agents
1.5K views · 4 months ago
Implementing Guardrails in Amazon Bedrock: A Step-by-Step Guide
770 views · 5 months ago
Agents For Amazon Bedrock | NO CODE
592 views · 5 months ago
Llama 3.1 | The Best LLM is now Open Source | TRY Locally & Online
989 views · 5 months ago
Tools Available Now In HuggingChat 🔥
742 views · 7 months ago
Chat With Documents | Fully Managed RAG on Amazon Bedrock | NO-CODE
3.2K views · 7 months ago
Getting Started With Amazon Bedrock | Simple ChatUI with Chainlit and LangChain
2.2K views · 7 months ago

Comments

  • @i1y4sakkus
    @i1y4sakkus 1 day ago

    Does a Phidata fine-tuned python agent app run on Vercel with a Nextjs app? I mean, is it compatible with JS on any deployment service?

    • @datasciencebasics
      @datasciencebasics 17 hours ago

      It should be possible. While Phidata is primarily designed for Python-based AI applications, it’s not directly compatible with Next.js or Vercel’s JavaScript environment. However, you can still leverage Phidata in a Next.js application deployed on Vercel by creating a separate Python backend that uses Phidata and integrating it with your Next.js frontend.
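      The separate-backend idea in the reply above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not Phidata's API: `run_agent` is a hypothetical placeholder for wherever the agent would be invoked, and the JSON shape (`prompt` in, `reply` out), host and port are assumptions. A Next.js frontend would call the endpoint with an ordinary HTTP POST.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_agent(prompt: str) -> str:
    """Hypothetical placeholder: a real backend would call the Python agent here."""
    return f"echo: {prompt}"


def handle_chat(body: bytes) -> tuple[int, dict]:
    """Turn a raw JSON request body into (HTTP status, JSON-serialisable reply)."""
    try:
        payload = json.loads(body or b"{}")
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "invalid JSON"}
    prompt = payload.get("prompt") if isinstance(payload, dict) else None
    if not isinstance(prompt, str) or not prompt.strip():
        return 400, {"error": "missing 'prompt'"}
    return 200, {"reply": run_agent(prompt)}


class ChatHandler(BaseHTTPRequestHandler):
    """Accepts POST requests with a JSON body and answers with JSON."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, reply = handle_chat(self.rfile.read(length))
        data = json.dumps(reply).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Start the backend; blocks until interrupted."""
    HTTPServer((host, port), ChatHandler).serve_forever()
```

      Calling `serve()` starts the backend locally. In a real deployment the Python service would be hosted separately from the Vercel frontend, with CORS and authentication added as needed.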

  • @Sepatu182
    @Sepatu182 1 day ago

    Can these AI commands be limiting?

  • @deepak5074
    @deepak5074 2 days ago

    Great sir, Keep continue this series

  • @aurimasc5333
    @aurimasc5333 3 days ago

    Awesome video, thanks!

  • @devByDash
    @devByDash 3 days ago

    Hey, this video was great. I just wanna know a few things from your experience. 1. How is Google's embedding model compared to Fast_Embedding or OpenAI? 2. Pinecone store vs Qdrant, any difference? I suppose not. 3. Your chunk size is 2000 with 100 overlap. How is this doing so far, or have you lowered it in production to an 80-20 ratio? 4. And please make a video on how to do context-aware chunking and how to use metadata and all to efficiently use a vector DB.
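    For readers unsure what "chunk size 2000 with 100 overlap" means in the question above, here is a minimal character-based sketch. It is an assumption-laden simplification and not the splitter used in the video: `chunk_text` is a hypothetical helper that slices by characters only, whereas library splitters usually also respect separators such as paragraphs and sentences.

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 100) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Each chunk starts (chunk_size - overlap) characters after the previous one,
    so consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

    With these defaults, a 5000-character document yields three chunks starting at offsets 0, 1900 and 3800, and the last 100 characters of each chunk reappear at the start of the next.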

  • @rajnd
    @rajnd 4 days ago

    1:05:34 ... Thanks Bro... Not just your technical and AI skills are awesome.... you also rock as a presenter... and a tutor.... Keep up the good work... Wishing you a very very Happy New Year 2025... Rgds

    • @datasciencebasics
      @datasciencebasics 3 days ago

      You are welcome. Thanks for the feedback, always feels great to hear someone learning from my videos. Happy new yr to u 2 !!

  • @aayushironside3506
    @aayushironside3506 4 days ago

    Dude, please give us something new. You did 99% of what Krish Naik did in a previous tutorial.

  • @deepak5074
    @deepak5074 4 days ago

    thanks sir make more such videos learn a lot

  • @madeshj8764
    @madeshj8764 4 days ago

    Can I deploy this llm via flutter??

  • @PravamayaDas
    @PravamayaDas 5 days ago

    Please do more detailed videos in MLFlow. I am interested to watch your videos on Databricks

  • @PritiSurange
    @PritiSurange 6 days ago

    Can we use this for my organization (client)? We have to parse PDFs for a client in Databricks cloud.

  • @nijalshakya6770
    @nijalshakya6770 6 days ago

    Brother, I really love and appreciate your content. Please make videos in Nepali too; it would be very helpful for underprivileged people like us. Thank you.

  • @RohitSharma-uw2eh
    @RohitSharma-uw2eh 6 days ago

    Why is a Hindi audio track not available 😢

  • @BusinessViewer
    @BusinessViewer 7 days ago

    Sir, great tutorial. I first got an "unauthorised API" error, so I used import os and os.env to input the API key. Now I’m getting: "Failed to call a function. Please adjust your print. See ‘failed generation’ for more details." If you can please help me out, that would be great. Thanks

    • @datasciencebasics
      @datasciencebasics 7 days ago

      Hei, from just the info you provided, I am not sure what is going wrong, but if it is related to environment variables, there are different ways to read them. Here is one: import os, then hardcode the variable with os.environ['MY_VARIABLE'] = 'my_value'

    • @BusinessViewer
      @BusinessViewer 7 days ago

      @datasciencebasics must be a network error, it’s working now. I saw other comments calling you a copycat. I don't support reposting, but I have to say you added genuine value to it, and others are reposting the subject too (since it’s in the official documentation). I want to let you know that your content was better than the “original” in that it helped me understand it more. Thank you, and I started using uv - made my life easier

    • @BusinessViewer
      @BusinessViewer 7 days ago

      Actually there’s still an error. If I put stream = False it shows max result = 1 but no capital’s name but if I put stream = True it shows connection failed Edit: changed groq model, it’s working for now

    • @datasciencebasics
      @datasciencebasics 7 days ago

      Glad that its working now and you got something new to learn out of it. Thanks !
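      The thread above turns on reading an API key from environment variables. A small sketch of a safer pattern than hardcoding the key follows. `get_api_key` is a hypothetical helper, and the default name `GROQ_API_KEY` is an assumption based on the Groq models mentioned in the thread; substitute whatever variable your provider expects.

```python
import os


def get_api_key(name: str = "GROQ_API_KEY") -> str:
    """Return the named environment variable, failing loudly if it is missing.

    The default variable name is an assumption; pass the one your provider uses.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; export it in your shell "
            "or load it from a .env file before running the script."
        )
    return value
```

      Failing early with a clear message makes an "unauthorised" error easier to tell apart from a model or network problem.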

  • @GustavoMontanha
    @GustavoMontanha 8 days ago

    What a beautiful video 😊

  • @deepak5074
    @deepak5074 8 days ago

    Which browser are you using arc?

  • @NarutoUzumaki-u4q3n
    @NarutoUzumaki-u4q3n 8 days ago

    Make some unique tutorials We already have this in Krish Naik Tutorials

  • @SolidBuildersInc
    @SolidBuildersInc 8 days ago

    Thank you so much for such an eloquent insight for multi-agents. Appreciate the simplicity and instructional approach for an effective delivery of a challenging subject.

  • @THEINDIANSAVIOR
    @THEINDIANSAVIOR 9 days ago

    Bro try to make more videos , its really interesting .

  • @alteredalley
    @alteredalley 9 days ago

    Helpful video! Thank you

    • @datasciencebasics
      @datasciencebasics 9 days ago

      You are welcome, glad that the video is helpful!!

  • @saltygamer8435
    @saltygamer8435 9 days ago

    bro what browser is that, it looks sick

  • @shaiknaveed78
    @shaiknaveed78 11 days ago

    you are amazing, your explanations are so so easy to understand, keep up the good work Thank you!!!!

    • @datasciencebasics
      @datasciencebasics 11 days ago

      You are welcome. Glad that it was helpful!!

    • @michelecarbonella2189
      @michelecarbonella2189 8 days ago

      Thanks for this video. It would be interesting and useful to show how to handle multiple uploaded PDFs, not just a single one. 🙏🏻

  • @ApoorvaG-g5d
    @ApoorvaG-g5d 12 days ago

    Could you please tell me how to fetch the job ID and send it in a Teams webhook notification when a job fails?

  • @ApoorvaG-g5d
    @ApoorvaG-g5d 12 days ago

    Hi Sudarshan, could you please make a video on fetching the job ID and sending a notification for job failure to a Teams webhook?

    • @datasciencebasics
      @datasciencebasics 12 days ago

      Hei, the simple way is to provide the email of the teams channel in the Email section and select the appropriate options, check this video ! Day 22: Databricks Workflows | 30 Days of Databricks th-cam.com/video/lJrzgQfH1tc/w-d-xo.html

  • @sainaidu680
    @sainaidu680 13 days ago

    Great effort 🎉. But we are expecting an end-to-end project with ETL, using AWS as the cloud.

  • @userrjlyj5760g
    @userrjlyj5760g 14 days ago

    Not sure what's the point of all this sophistication if you are going to present it while using GPT-4o-mini!!! Your video title mentions "Open Source Alternative to ChatGPT"; what's the 'open source' aspect of what you're sharing here!!!? Just the UI?? So what!? Simple HTML/JS can do the whole trick!!! You literally demonstrated an open-source app while using an API connected to GPT-4o-mini!!! Why didn't you demonstrate the whole thing with some LLM models??

    • @datasciencebasics
      @datasciencebasics 13 days ago

      Hei, I showed you with one of the model there but you can choose different models other than openai’s model. Also, as I mentioned in the video there are some additional features which are not in ChatGPT canvas. Thanks for the feedback!!

  • @varunlobo
    @varunlobo 15 days ago

    Thanks!

    • @datasciencebasics
      @datasciencebasics 15 days ago

      You are welcome, thanks for the support!!

  • @kenchang3456
    @kenchang3456 15 days ago

    This series is terrific. A real treasure. Thanks,

    • @datasciencebasics
      @datasciencebasics 15 days ago

      You are welcome. This kind of comment is what motivates me to create more content :) It feels great to know someone somewhere is finding it helpful 🙂

  • @kenchang3456
    @kenchang3456 15 days ago

    Very excellent video. Thank you for sharing your experience.

    • @datasciencebasics
      @datasciencebasics 15 days ago

      You are welcome, glad that it was helpful!!

  • @fealgu100
    @fealgu100 16 days ago

    Great video. I deployed this very same RAG app, but noticed it did not display well 😁 on a mobile phone. Any thoughts? Beginner over here.

  • @zapy422
    @zapy422 16 days ago

    can you ask about a dataframe?

  • @lifeasprerna
    @lifeasprerna 22 days ago

    Very informative! Thanks for sharing

    • @datasciencebasics
      @datasciencebasics 22 days ago

      You are welcome, glad it was informative!!

  • @SiddharthPant
    @SiddharthPant 22 days ago

    This looks great. It takes a lot of the right things from cargo, npm, and composer ecosystems.

  • @budsayalaohapensaeng6869
    @budsayalaohapensaeng6869 22 days ago

    I need to know about this step: can I install it into a conda environment?

    • @datasciencebasics
      @datasciencebasics 18 days ago

      you should be able to do it, give a try!

  • @Master_of_Chess_Shorts
    @Master_of_Chess_Shorts 23 days ago

    Thanks! It simplifies a lot of different commands for setting up a new project. I made a script for my environment but I'll consider uv for package maintenance.

  • @Anoniem-o4y
    @Anoniem-o4y 24 days ago

    But if i game now with one of those cloud gpus on, will it increase my fps.

    • @datasciencebasics
      @datasciencebasics 18 days ago

      give a try if it helps, haven’t tried myself for games.

  • @arunkrishna1036
    @arunkrishna1036 24 days ago

    Thanks sir. I have a question, is it possible to create a tool using GenAi which can ingest pdf documents of varying structure and then extract specific data dynamically and pass it as a context into LLM?

    • @datasciencebasics
      @datasciencebasics 24 days ago

      Hello, you can check LlamaParse from LlamaIndex. Here is one of the videos I created: th-cam.com/video/S_F4RUhKaV4/w-d-xo.htmlsi=hjbSttAqOn85_wUV

    • @arunkrishna1036
      @arunkrishna1036 24 days ago

      @ Thank you 🤩

  • @limitless1692
    @limitless1692 27 days ago

    You explained something really important here. First, that the Llama2 model can do vector embedding for RAG. And you showed that nomic-embed-text can do the same RAG embedding, but nomic-embed-text is a lot better and definitely a lot faster. I mean, the speed difference is insane!! Thank you for showing that.

  • @ashwarya26
    @ashwarya26 29 days ago

    Please create one detailed video on End to End machine Learning experiment with mlflow. This whole databricks playlist is superb :)

  • @stanTrX
    @stanTrX 29 days ago

    Thanks, all i ask is perfect table extraction with all the formatting and accuracy. what s my best bet?

    • @datasciencebasics
      @datasciencebasics 24 days ago

      You are welcome, you can give LlamaParse a try th-cam.com/video/S_F4RUhKaV4/w-d-xo.htmlsi=XHE98g6xAuh0u8jb

  • @mansigandhi4483
    @mansigandhi4483 a month ago

    I am using the Community Edition of Databricks and the Repo isn't available. Any workaround?

    • @datasciencebasics
      @datasciencebasics 29 days ago

      in community edition, its not possible atm !!

  • @naufalnazaruddin3605
    @naufalnazaruddin3605 a month ago

    thank u

  • @naufalnazaruddin3605
    @naufalnazaruddin3605 a month ago

    thank you

  • @naren06938
    @naren06938 a month ago

    Can we put all the commands into one shell script file, script.sh, and run ./script.sh, like in Linux? How can we run them all at once here?

  • @naren06938
    @naren06938 a month ago

    You really touch on each small, keen observation. Superb... I expected much more complex ETL pipeline projects, but did not find them in your playlists... please make projects

  • @HareeshkarRavi
    @HareeshkarRavi a month ago

    Thanks for the detailed explanation as a beginner it's so much helpful

  • @HM-gm1kn
    @HM-gm1kn a month ago

    buddy this video could have been 10 seconds. Jeeez

    • @datasciencebasics
      @datasciencebasics a month ago

      in that case, there is command in the thumbnail itself buddy. Its always good to explain stuffs so a complete beginner can also understand!

  • @rand1461
    @rand1461 a month ago

    Keep going I will watch it many thanks

  • @AK-millum
    @AK-millum a month ago

    How to save the retriever and load data from it later?

  • @adanpalma4026
    @adanpalma4026 a month ago

    Thanks. Just one question. You have some videos showing different techniques for RAG. What is the most stable and robust way to build a very good RAG with a high success rate? I am seeing that there are too many videos out there, and all say that their techniques are the best for good RAG.