Man, this is how tutorials need to be.
A real use case, which is something different from the examples already used in the documentation.
This is so much better than all the other YT channels that simply do what's already in the docs and add no value at all.
This is a level of quality and class... I wish I'd known this a year ago...
Love this video-thanks Brandon for explaining the topic so clearly. Liked, subscribed and joined your Skool community.
Very good job 👏
It would be nice to extract the contact information of the venues and email them automatically with a predefined email template for a photography business seeking new clients. 😊
Enjoyed this! Thanks for the source code too Brandon!
Was searching for this for a week, thanks man.
It's the css_selector that's typically the problematic part. I was kind of hoping this crawler would use AI to somehow magically deal with randomized class names, etc. Edit: I know there are things like scrapy shell, but it's a bit tedious.
Yeah, that's the main problem to solve! If the LLM could automatically pick the CSS selector from the prompt, it would be very easy. Otherwise we can just use Puppeteer and other crawling libraries; I don't see much difference.
Why use an LLM when the information is perfectly structured on the site? It's probably tagged in the CSS, so you can assign location, phone, etc. directly to the database. Don't get why you'd use an LLM.
@@jrosmail The site's source can change, and then your scraping doesn't work anymore.
Thank you! Guys, he gives you the first step on the road; use your brain to make it more advanced, especially with so many free advanced AIs.
Wish you all the best.
Thank you for all the support 😁
Thank you very much.
You are an excellent teacher with a fabulous talent for clarity. Nice job!
Thank you so much! This made my day 😁
Believe me, you are one of the best teachers, making complex things look like nothing. Thanks, brother @bhancock_ai
This very video has earned you my subscription, dude. Good job, I'm subscribed now.
loved the explanation. You deserve a sub
I really appreciate the detailed steps in this tutorial. Would love to see a comparison between these methods and HasData's approach to scraping complex web data. Any suggestions?
This is great! Thanks for sharing, mate.
Thanks! Was just about to get into Scrapy.
amazing thanks
Absolute Gold!
Amazing content as usual!
Nice marketing for Crawl4AI
Expensive way of doing scraping
care to share a cheaper way please?
@@deephousemorocco4802 I believe Scrapy will do this faster and cheaper, but you need to play a bit more with selectors and data cleaning.
And slow
@@deephousemorocco4802 Just classical HTML DOM parsing, using CSS selectors or XPath. The field of scraping wasn't invented with LLMs.
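For reference, the classical no-LLM approach this reply describes can be sketched with nothing but Python's standard library. The HTML snippet and the `venue-name` class are invented for illustration, not taken from the site in the video:

```python
from html.parser import HTMLParser

class ClassTextExtractor(HTMLParser):
    """Collect the text of every tag carrying a given CSS class."""
    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.results = []
        self._depth = 0  # > 0 while inside a matching element

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        if self.target_class in classes:
            self._depth += 1
            self.results.append("")       # start a new match
        elif self._depth:
            self._depth += 1              # nested tag inside a match

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.results[-1] += data      # accumulate text of current match

html = '<div><h2 class="venue-name">Rosewood Barn</h2><span class="venue-name">Oak Hall</span></div>'
parser = ClassTextExtractor("venue-name")
parser.feed(html)
print(parser.results)  # ['Rosewood Barn', 'Oak Hall']
```

In practice you'd reach for BeautifulSoup or lxml for messy real-world HTML, but the point stands: when class names are stable, no model call is needed.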
@@deephousemorocco4802 Use something like Scrapewell - just make API calls for the pages and parse with Python.
Is this also able to scrape the images? If so, which lines do we need to edit?
Is there any way to automatically pass that CSS selector to make it more dynamic?!
Way outside of my wheelhouse! But learning...
Very nice and clear explanation of the AI crawler. Where can I get the code to study it a little bit more?
It's in the first link in the description 😁
Excellent video, what software do you use to record your video tutorials?
Just what I need!
Is the code from the video generic code that can serve as a multi-purpose tool for any kind of scraping? Or is it specific to the use case shown in this video?
You can easily adjust the code to work with different websites.
If I were you, I'd throw the code into Cursor and ask it to make tweaks based on the next website you want to scrape.
Why use a neural network for such simple data parsing? This can be done with Python without any problems. Show how to work with pages that have hidden forms, AJAX loading on click, and so on.
Why do I really need to use DeepSeek here? Isn't it overkill? I mean, the webpage is pretty well structured. One can still use Python standard libraries to extract the same information, right? No need for a powerful processing machine, high computation cost, etc., right?
It's a silly example because it's the 'perfect' site to scrape: too easy. In fact, this site is used in other YouTube web scraping clickbait videos. Who hires a programmer to scrape wedding sites anyway?
Crawl4AI is great. I installed it, but I've never been able to get it fully working with the API and custom parameters. Did you try it with an API?
I've never tried the API either.
I know a ton of developers also like Browserbase, so I recommend trying that one too if you're not a fan of Crawl4AI.
Excellent presentation, but you need to watch the tutorial and read and understand the docs for GroqCloud, Crawl4AI, and Phidata.
Good tutorial. QQ:
Can this not be done with just Python Selenium? What are we gaining by using DeepSeek & Crawl4AI?
looks interesting
I think it is useful for scraping the images for products in my webshop, right?
This looks great for websites that are directories, but what about a typical company website? Will this technique be effective at scraping all the pages of a website, looking for company data like locations, phone numbers, etc., and contact info for all partners of a law firm? Currently I use Perplexity and n8n to do this.
I can't do anything in DeepSeek; it always gives me this error: "The server is busy. Please try again later."
Try getting an OpenRouter key. That's how I had to work around that issue.
Good, please recommend more.
How does it behave with the robots.txt file that prevents scraping on many websites?
Why just do things, when you can go off and do them?
Can you please advise if what you show in the video violates T&Cs of the websites that are being scraped? As far as I know any automated data scraping usually violates T&Cs. Websites retain the right to ban your IP, or take you to court.
Nice one!! I also tried making it uncensored on my channel regarding the Chinese content.
But y photobro no just go listing site himself? Y he need haz excel sheet ?
Nicely structured tutorial btw thanks 🙏🏽
Can I use it to scrape Twitter?
It's weird to use AI and then be required to specify a model. AI is intelligent, so why not just tell the AI: extract all relevant data and put it into a suitable format? If you have to specify the format, then you could also take the extra step and extract the data with some XPath statements. That would save the cost of using AI. I guess the technology is not that mature yet.
The technology is mature enough to scrape dynamically, but I didn't get why he specified CSS selectors 🥲
Schemas, XML, XPath, sitemaps: rudimentary concepts can propel the industry. At times, the hype of AI drowns out the simple approaches. I agree with you.
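In that spirit, even Python's standard library handles simple XPath over well-formed markup, no LLM required. The listing fragment and tag names below are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed listing fragment
doc = ET.fromstring("""
<listings>
  <venue><name>Rosewood Barn</name><city>Austin</city></venue>
  <venue><name>Oak Hall</name><city>Dallas</city></venue>
</listings>
""")

# ElementTree supports a limited XPath subset: paths, //, [@attr], [tag]
names = [v.findtext("name") for v in doc.findall(".//venue")]
print(names)  # ['Rosewood Barn', 'Oak Hall']
```

Real-world HTML is rarely well-formed XML, so in practice you'd use lxml or BeautifulSoup for the parsing step, but the extraction logic looks the same.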
To sum it up, true intelligence requires understanding of context. It's when AI can understand the context of an instruction that it can act on it. The truth is it is that smart, but it needs you to systematically give it a prompt with enough context to execute and give you the right result or answer.
@@KenDores-oy9mc "systematically" 100% this. The capability and technology is already there. We (humans), just have to fine tune and harness that power.
BrowserUse does exactly this. It’s smart enough to understand the page by using vision AI capabilities and decide the next best action
Is it just me, or does LLM scraping feel like the most inefficient way to do such a task?
Hi! Can I scrape WhatsApp contacts with status? I mean, I need the details of my contacts tab in order to delete those who haven't added me as a contact. Thank you.
How much do you charge for the scraping?😊
If you need help, please contact me.
Can you scrape data from Google?
Is it possible to scrape LinkedIn profile data?
Wow amazing
OK, if I have to type the logic for the info-container extraction myself,
why are we using AI 😂
Or am I thinking ahead of time 😅
No! Great question!
When I started creating this tutorial, I was focused on running DeepSeek-r1 locally. However, my computer wasn't the strongest, so it really struggled with the long context window.
I could have dropped this when I moved over from the local model to the Groq model.
Only 10:33 in, but this sounds so cool! My primary scraping goal is to scrape the LDS website and copy all talks from conferences and posts from the Ensign into three folders: "presidents", "quorum of the 12", and a default folder. I want to do this by checking name & date, then checking a list of when they joined the quorum or became president, and sorting them appropriately, where "quorum" includes the presidents and the default includes them all (yes, up to 3 copies). So if Russell Nelson gave a talk when he was only in the Seventy, it'd go to default only, despite his current position.
LGO! Do you know if it works with proxies and authenticated users ?
Let me know if you found that.
Found this: "Enables dynamic proxy rotation to avoid IP bans and enhance security during web crawling."
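Whatever that feature does internally, the generic pattern behind proxy rotation is simple: cycle through a pool and route each request through the next proxy. A standard-library-only sketch; the proxy addresses are placeholders, not real endpoints:

```python
import itertools
import urllib.request

# Hypothetical proxy pool; real ones would come from a proxy provider
proxy_pool = itertools.cycle([
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
])

def opener_with_next_proxy():
    """Return a urllib opener routed through the next proxy in the pool."""
    proxy = next(proxy_pool)
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler), proxy

opener, used = opener_with_next_proxy()
print(used)  # http://10.0.0.1:8080
```

Each call to `opener_with_next_proxy()` hands back an opener bound to the next address, so `opener.open(url)` spreads successive requests across the pool; whether Crawl4AI implements it this way is an open question.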
I have an issue when running main.py: it always shows "Invalid API Key," but I double-checked that my API key is correct.
That is an error; maybe you missed something. The computer is always right, and it will show an error if something goes wrong. You can judge what's wrong: maybe 1. you, 2. the method, 3. the tools.
I get the same. It works fine if I use my local Ollama model, and the API key also works fine with a curl request. I'm currently investigating this, as not everyone is affected. What are you running, Windows or Mac/Linux?
code repo??
I can't see why I'd need AI for that. This can be done easily with the usual tools, and I'd guess it would even be 10 times the speed. Where is the benefit?
This seems complex to get the data
I use LM Studio instead of Groq
is it paid?
@ it’s free?
@ thanks buddy
22:06 Why are the prices just a $ sign?
It literally has dollar signs on the listings. You could probably have it go into each one and grab the starting price and anything else you specify instead.
Wow! I look forward to another useful and easy to follow AI project
Excellent video. I would appreciate it if you could activate the new YouTube option for automatic AI audio translation in your videos so that I can listen to them in Spanish. Thank you.
That early return statement is funny 🤣
Can Crawl4ai scrape Reddit?
This is a nice example, but I guess you don't really know how photographers and venues operate. Most wedding venues don't actually have a house photographer; instead, the person getting married hires a photographer, who then just goes to the chosen wedding venue. So I don't know how valuable it is, in your example, to give a photographer a list of wedding venue leads.
Just a demo, your use case may be different.
You gained a new sub and I also joined your Skool community. Amazing content! Thank you.
:/ I can't save your video to a playlist. Turn off the "made for children/kids" option or whatever; it prevents playlisting, that's all it does! Ty
Hey! All of my videos are set to “not made for children” so I’m not sure why it’s giving you that issue 🤔
Wish this could scrape LinkedIn!
all this and no phone number and email????? like bro...
Crazy shit. Right? 💯😂😂😂
@ clickbait bullshit video
AutoKeybo runs DeepSeek.
Great video per usual!
Amazing! You are a rock star!
Groq is not free
It is actually free for limited usage!
Instead of CSV you said SCV... Too much StarCraft! lol
😂😂😂 I need to stop watching Clem dominate everyone else in SC2! Lol
Overkill. Ask AI to build a custom system leveraging free tools and plug the code into a VM. You'd be surprised.
Sorry, but you talk too much. You should cut to the chase and start coding immediately. I don't need the preliminary details; I'll get it as you code.
Yes
I just came to unsubscribe. I do this for all finger pointing videos. Follow the follower!
There are already a million scraping tools out there that don't need a dozen interconnected services... like, seriously.
enjoy your 20 years in jail
😂
Hey aiwithbrandon, really nice video! I was wondering if I could help you with higher-quality editing on your videos, with good pricing & turnaround time, and also make highly engaging thumbnails to help your videos reach a wider audience! Lmk what you think?
You're handsome, bro. I think I'm in love.