Python Web Scraping - Should I use Selenium, Beautiful Soup or Scrapy? [2020]

  • Published on 17 Oct 2024

Comments • 115

  • @aabergkvist
    @aabergkvist 4 years ago +72

    Beautiful Soup
    + User Friendly
    + Easy to Learn & Master
    - Requires Dependencies
    - Inefficient
    Scrapy
    + Efficient
    + Portability
    - Not User Friendly
    Selenium
    + Versatile
    + Works well with Javascript
    - Not Meant to be a Web Scraper
    - Inefficient

    • @sebastianfors4491
      @sebastianfors4491 2 years ago +3

      I hope you scraped this from the video because that looks like an awful lot of work to type out...

    • @aabergkvist
      @aabergkvist 2 years ago +2

      @sebastianfors4491 actually, I was stuck on a commute iirc and wrote it down to "fortify" the learnings from the video :)

  • @KiteHQ
    @KiteHQ 4 years ago

    If you liked this video, join the Kite Developer Community on Facebook for access to more resources + support from fellow Python developers. Time to level up! facebook.com/groups/505658083720291

  • @homeheart1276
    @homeheart1276 2 years ago +2

    Well done, Sir. You just made it into my "0. Top Resources" Bookmark folder...the competition to get in there is insane and your roommates are very few and far between. It's not what you did in this video per se, it is HOW you did it. Concise, clear, to the point, and not made artificially long to improve your YouTube revenue. *Make sure* you are advertising to entrepreneurs and I.T. professionals; we have little time (or patience). Thanks again! Well done.

  • @guillaumehoareau1161
    @guillaumehoareau1161 3 years ago +6

    Wow, the quality of the video and the editing is outstanding.

  • @kotarouriderblack6118
    @kotarouriderblack6118 1 year ago +2

    For my use case, Selenium is perfect because I hate dealing with pesky buttons on dynamic webpages.

  • @adamblomfield7914
    @adamblomfield7914 3 years ago +1

    Love this quick video summary. Content is perfect and I got exactly what I came for. Some tiny constructive feedback on the delivery would be to speak about them in the same order throughout.
    0:39 - Selenium, BS, Scrapy
    main content - BS, Selenium, Scrapy (best order in my opinion)
    summary (4:34) - BS, Scrapy, Selenium
    Keep up the great work!

  • @KiteHQ
    @KiteHQ 4 years ago +5

    Let us know what topic we should cover next!

    • @nghiepcrypto7034
      @nghiepcrypto7034 4 years ago +2

      Pandas and NumPy, working with Excel, please!
      I know there are a lot of videos covering these, but I believe that you can do it better. Thanks!

  • @saurabhbhambry
    @saurabhbhambry 4 years ago +8

    Great video! Love how it's concise and to the point. Quick question, can Scrapy be used for scraping sites that use JavaScript for dynamic loading too? Or is Selenium the only choice for such a scenario?

    • @abc.2924
      @abc.2924 3 years ago

      It can, if you combine it with Splash and run it using Docker.
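
For reference, the Scrapy + Splash combination mentioned in this reply is mostly a matter of project settings. The fragment below is a sketch of a `settings.py`, based on the scrapy-splash README as commonly documented. The Splash address is an assumption (a container started with `docker run -p 8050:8050 scrapinghub/splash`), and the middleware names and priorities should be checked against the version you install.

```python
# settings.py fragment (sketch): route Scrapy requests through a Splash
# rendering service. Assumes Splash is listening on localhost:8050.
SPLASH_URL = "http://localhost:8050"

# Downloader middlewares that hand requests to Splash and keep cookies in sync.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# Spider middleware that avoids storing duplicate Splash arguments.
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Duplicate filter that takes Splash arguments into account.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
```

With settings along these lines, a spider would issue `scrapy_splash.SplashRequest(url, callback, args={"wait": 0.5})` instead of a plain `scrapy.Request` for pages that need JavaScript rendering.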

  • @araza554
    @araza554 4 years ago +11

    Hey, did you use Adobe After Effects or some other tool at the start of the video, where you were laying out the agenda?

  • @KapilSharma-co8xq
    @KapilSharma-co8xq 4 years ago +2

    Which elements can Scrapy fetch? Beautiful Soup, for example, can extract from HTML and XML.
    I have switched to Beautiful Soup.

  • @airmanfair
    @airmanfair 4 years ago +1

    I actually downloaded Kite as per your suggestion and am using it now with JupyterLab. It's pretty neat!

  • @stanlukash33
    @stanlukash33 4 years ago +3

    Appreciate this video man. Lots of stuff clarified.

    • @gaurav2979
      @gaurav2979 4 years ago

      What did you choose? What was clarified for you?

  • @aydencraig9542
    @aydencraig9542 3 years ago

    Thank you for the video! I'm going to check out your web scraping tutorials now!

  • @alihusham1560
    @alihusham1560 4 years ago +3

    What do you use for your video animations and graphics?

  • @jackbird5839
    @jackbird5839 3 years ago +3

    Awesome tutorial, thank you for your video. It is very clear and easy. Also, as a newbie in Shopify eCommerce, I am using "e-scraper" to scrape Shopify stores and all product data from my supplier sites and other sources. It helps me a lot. Maybe it helps somebody else too.
    Thank you for your input!!!

    • @willjohn6807
      @willjohn6807 3 years ago +1

      Thank you Jack, ESCRAPER helped me a lot. Plus now I know the pros and cons of the three Python web scraping frameworks. Thank you Kite.

    • @vskiy26
      @vskiy26 3 years ago +1

      Jack, eScraper is an awesome solution! Thank you.

  • @daviyokogawa4237
    @daviyokogawa4237 4 years ago

    Your video was so easy to understand and helped me a lot to know which way to go.

    • @KiteHQ
      @KiteHQ 4 years ago +1

      Glad we could help, Davi! :)

  • @AmitTiwari-wf1xj
    @AmitTiwari-wf1xj 4 years ago +5

    Mark my words! If you keep putting out videos with such great content, then you will reach a million subs in a few years. By the way, I subscribed.

    • @gaurav2979
      @gaurav2979 4 years ago

      Do you have any ideas on scraping?

  • @zone66
    @zone66 2 years ago +2

    So if I want to scrape a large number of webpages while also running JavaScript, I would need to go with Selenium, even though Scrapy would crawl much faster (out of the box). It would be great to have some kind of tutorial for using Scrapy together with Selenium. I think those two should get along somehow. I guess the only problem is that Scrapy is single-threaded and Selenium would block when it's called multiple times in this single-threaded environment, or something like that.

    • @janpost8598
      @janpost8598 1 year ago

      Was wondering that as well. I wouldn't mind a steeper learning curve as long as it is both efficient and handles JavaScript.

  • @sozno4222
    @sozno4222 3 years ago

    Great job on this video. I love how precise it is.

  • @ambarishkapil8004
    @ambarishkapil8004 4 years ago +5

    Hi, firstly I want to congratulate you on your new YouTube channel and hope that it will be as successful as your product. You are putting out great content, and the dev community really appreciates the hard work. As a future video idea, I would like to suggest "Design Patterns". This would cater to Python enthusiasts at both ends of the spectrum. Thanks, Cheers!

  • @user-qv7rw7dq1d
    @user-qv7rw7dq1d 4 years ago +5

    I hate to be that guy, but Beautiful Soup is not a framework. It's a package/library designed for basic scraping, but it's not a framework. In fact, you could, in theory, use BS along with Scrapy as the engine. In comparison, you wouldn't be able to use both Flask and Django together, for example, because they are on the same level (frameworks).
    Comparing BS to Scrapy is like comparing Jinja to Django. It doesn't really make sense... even though they both sort of accomplish similar tasks.
    It kind of feels like you pieced this video together quickly and are giving slightly misleading info.

    • @gaurav2979
      @gaurav2979 4 years ago

      What is your advice for someone who does Selenium automation with Python and is new to the task and domain of web scraping?

    • @user-qv7rw7dq1d
      @user-qv7rw7dq1d 4 years ago +3

      @gaurav2979 Don't focus on tutorials after the first few weeks. Try to build something as quickly as possible. That's what makes you better.

  • @luishenriquedavilapossatti5308
    @luishenriquedavilapossatti5308 3 years ago

    Hi!
    In my application I need to open a web page, fill in a form and then click a button, then get some data that will be loaded in the page. What would you recommend?
    Thanks in advance

  • @ibarix
    @ibarix 3 years ago

    OK, I want to scrape football teams' forms from a JS website; the amount of data is not big. Should I go with Selenium, then? Thanks.

  • @SaurabhGupta-ns8gx
    @SaurabhGupta-ns8gx 2 years ago

    Which is the best for web scraping?

  • @bah0n1
    @bah0n1 4 years ago +1

    Is it possible to add a Python Selenium script to the backend of our website? If not, why, and if so, how? I found websites that offer auto likes/auto followers/auto reactions; do they use some kind of Selenium script?

  • @bluesdog88
    @bluesdog88 4 years ago +5

    Great tutorial, thanks for the insight, you saved me a lot of reading ;)

  • @Noname304y2u2
    @Noname304y2u2 1 year ago

    Thank you! That was very helpful!

  • @nikhildeshpande1247
    @nikhildeshpande1247 2 years ago

    I was extracting text from a particular website and it is giving a response [500] error. Does anyone know what it is?

  • @detaineddeveloper
    @detaineddeveloper 4 years ago +4

    Thank you for making this video! I'm glad I watched this first before starting to build a scraper.

    • @gaurav2979
      @gaurav2979 4 years ago

      What did you choose? Scrapy or Selenium?

    • @gaurav2979
      @gaurav2979 4 years ago

      As Beautiful bla bla seems like it's for kids.

  • @Saywhatohno
    @Saywhatohno 3 years ago

    Great video!!! Can you log in to a website like you can with Selenium? Because with Selenium you can pass in your user ID and password, log in to Salesforce for example, and then scrape accordingly. Do any of the other dependencies or Python libraries provide that feature? Btw, this is also the reason I like using Selenium.

  • @allcool27gaming
    @allcool27gaming 3 years ago +19

    This dude looks like his birthday is on May 2nd

    • @24mem0
      @24mem0 3 years ago

      yoooo facts

    • @s.predator536
      @s.predator536 1 year ago

      Why did you say that? 🤔
      I am curious to know because my birthday is on May 2nd.

    • @stephenmandelbaum2027
      @stephenmandelbaum2027 14 days ago

      Lmao, I don't get it, but funny 😅

  • @BookOfMorman
    @BookOfMorman 4 years ago +2

    Great content! Thanks! Quick note, maybe center yourself higher in the frame for the camera. Most people have only a little room from the top of the frame to the top of their head. When you center your head in frame as you did, it kinda just makes you look short. Like you could be 7 feet tall but that centering makes you look like a hobbit!
    Anyway, keep up the great work!

  • @Tracks777
    @Tracks777 4 years ago +24

    nice content

  • @yashhhhraj
    @yashhhhraj 4 years ago

    Can we use the Python requests library with Scrapy to make POST requests to an API? I'm done with the web scraper but stuck on the API.

  • @artabra1019
    @artabra1019 4 years ago

    Which is better, Python scraping or JS scraping?

  • @anilpanwar8710
    @anilpanwar8710 4 years ago

    @Kite, I want to scrape 1 million records from a website, and it requires some JavaScript, such as click events. I know Scrapy and Selenium, so please tell me which I should use: Scrapy or Selenium?

  • @idromano
    @idromano 2 years ago

    This video is so well done!

  • @Captinofthemudslayer
    @Captinofthemudslayer 3 years ago

    The content bs4 returns is different from the page I'm viewing, for example @t. Any ideas?

  • @prabaharanp2825
    @prabaharanp2825 4 years ago

    If inspect element is not allowed for a page, how could we scrape it?

  • @imaginzationworld
    @imaginzationworld 2 years ago

    Do you charge to create a web scraper?

  • @nicodemus399
    @nicodemus399 4 years ago

    Hi, have you tried Scrapy Splash for JS pages?

  • @pythonenthusiast9292
    @pythonenthusiast9292 4 years ago

    How do we know if the website is using JavaScript or HTML/XML to load contents?

    • @willy7968
      @willy7968 1 year ago

      Disable JavaScript in the browser.
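
The same check can be approximated in code: fetch the page without running any JavaScript and see whether the text you want is already in the markup. The sketch below uses only the standard library's `html.parser`; the function name `appears_in_static_html` and the sample markup are made up for illustration, and a simple substring test like this is a rough heuristic rather than a definitive detector.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def appears_in_static_html(raw_html, expected_text):
    """Return True if expected_text is in the markup as served, before any JS runs."""
    collector = _TextCollector()
    collector.feed(raw_html)
    collector.close()
    # Collapse whitespace so line breaks in the markup do not hide a match.
    visible = " ".join(" ".join(collector.chunks).split())
    return expected_text in visible
```

If the text you see in the browser is missing from the raw response, the page is probably filling it in with JavaScript, which points toward Selenium or a rendering service rather than a plain HTML parser.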

  • @willkingsley8454
    @willkingsley8454 4 years ago +1

    Aside from the learning curve, would Scrapy be the best option?

    • @ahmadaminfarooq8495
      @ahmadaminfarooq8495 4 years ago

      Yes, but the only downside is that it doesn't allow JS rendering.

    • @fabianrodriguez1226
      @fabianrodriguez1226 4 years ago +1

      @ahmadaminfarooq8495 You could use Scrapy and Splash to render JS.

    • @shashwatpuri6496
      @shashwatpuri6496 4 years ago +2

      It definitely is. Scrapy is powerful, and its sole purpose is scraping and handling huge amounts of data.
      Moreover, middlewares and pipelines allow you to clean the data and store it in a database like MongoDB or SQLite!
      And to scrape JavaScript websites, there's good support for Scrapy-Splash integration via Docker!

  • @dishydez
    @dishydez 3 years ago

    This is great! Thanks a lot. By the way, could you do a guide on Helium? It's a wrapper for Selenium but easier to use, though I can't get it to work for some reason. Would appreciate a guide video/series.

  • @draytond
    @draytond 4 years ago +6

    Summary: Use Scrapy if your data set will be large, else use BeautifulSoup

    • @NathanKwadade
      @NathanKwadade 4 years ago +1

      Thanks 🙏 😊

    • @NathanKwadade
      @NathanKwadade 4 years ago

      I used BeautifulSoup and made me a nice broth of data which I converted to CSV file format. Thx

    • @matthiasoberleitner5942
      @matthiasoberleitner5942 3 years ago

      I just generally use Scrapy. As soon as you know how to set it up, it's not really a hassle even for small things. If anything's reactive and too hard to fetch, then I use Selenium in my Scrapy framework for those things I need it for.

    • @informativecontent4778
      @informativecontent4778 3 years ago

      Lolx, Selenium is better

  • @Buhassan5656
    @Buhassan5656 4 years ago +1

    I want to scrape amazon.com (for monitor arms) and extract the price and shipping weight of each item, so the page of each item has to be opened. So what framework do you think suits this situation?

    • @PropertyTak
      @PropertyTak 4 years ago

      Yes, I can do it, paid. I'm a freelancer.

    • @fabianrodriguez1226
      @fabianrodriguez1226 4 years ago

      Scrapy would be the best approach although you should use proxies and agents

    • @SWIFTzTrigger
      @SWIFTzTrigger 4 years ago

      @fabianrodriguez1226 can you explain what you mean by use proxies and agents?
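
One plausible reading of "proxies and agents" is: send requests through a proxy server, so they do not all come from your own IP address, and set a User-Agent header, so the request identifies itself like a browser rather than a default script client. The sketch below shows both ideas with the standard library's `urllib.request`; the proxy address and User-Agent string are placeholder assumptions, and no network call is made until `opener.open` is actually invoked.

```python
import urllib.request

# Placeholder values for illustration only; substitute your own.
PROXY = "http://127.0.0.1:8080"
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) ExampleScraper/0.1"


def build_opener_and_request(url, proxy=PROXY, user_agent=USER_AGENT):
    """Return an opener that routes through `proxy` and a request carrying `user_agent`."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return opener, request


opener, request = build_opener_and_request("https://example.com/")
# opener.open(request, timeout=10) would perform the fetch through the proxy.
```

In Scrapy itself, the equivalent knobs are the `USER_AGENT` setting and the `proxy` key in a request's `meta`, which the built-in HTTP proxy middleware reads.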

  • @paulowiz
    @paulowiz 4 years ago +1

    I would like a video that explains more about Scrapy, because there is little information on the internet.

  • @jasp402
    @jasp402 3 years ago

    How can I enter a page that has reCAPTCHA v2?

  • @AlessandroBottoni
    @AlessandroBottoni 3 years ago

    Excellent video! Kudos!

  • @mikaelmonjour_programming
    @mikaelmonjour_programming 4 years ago +2

    asyncio & aiohttp + parser of choice 😘
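
The pattern this comment points at is: fetch many pages concurrently with `asyncio`, then hand each response to whatever parser you prefer. The sketch below shows only the concurrency part with the standard library. `fake_fetch` is a stand-in for a real aiohttp call (it never touches the network), and `fetch_all` and `max_concurrency` are illustrative names, not part of any library.

```python
import asyncio


async def fetch_all(urls, fetch, max_concurrency=5):
    """Run `fetch(url)` for every URL, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return url, await fetch(url)

    pairs = await asyncio.gather(*(bounded(u) for u in urls))
    return dict(pairs)


async def fake_fetch(url):
    # Stand-in for an HTTP client call; yields control once, then returns markup.
    await asyncio.sleep(0)
    return f"<html><title>{url}</title></html>"
```

Calling `asyncio.run(fetch_all(urls, fake_fetch))` returns a dict mapping each URL to its body; swapping `fake_fetch` for a coroutine that uses a real HTTP client is the only change needed to fetch live pages.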

  • @GlennMascarenhas
    @GlennMascarenhas 4 years ago

    One could use Helium over Selenium. Helium is built on Selenium but much easier in terms of function calls

  • @anonymosranger4759
    @anonymosranger4759 4 years ago +6

    Amazing Content, New Sub Here!

  • @odhypradhana6556
    @odhypradhana6556 4 years ago +1

    I came looking for copper and found gold
    btw great video as always, the edits are really cool~!

  • @pankajjoshi8292
    @pankajjoshi8292 4 years ago

    Sir, what about ParseHub? Is ParseHub free?

  • @TropicalDev
    @TropicalDev 4 years ago

    Aight bet I’m downloading Kite

  • @aronpaul7544
    @aronpaul7544 4 years ago +1

    When are you going to post new videos on this playlist? It's been a while 😒

  • @elyasmoshirpanahi7184
    @elyasmoshirpanahi7184 4 years ago +1

    nice video really informative

    • @KiteHQ
      @KiteHQ 4 years ago

      Glad you found it useful!

  •  4 years ago

    Why do you say bs4 needs dependencies and the others do not? To begin with, scrapy needs twisted, you even mention it yourself!

  • @АлександрКотов-ц7л
    @АлександрКотов-ц7л 4 years ago +2

    Hello, your videos are very useful.

  • @juanroman7130
    @juanroman7130 2 years ago

    I just want to save the image URL so I can hyperlink to it.
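
For a request like this, none of the three tools is strictly required. As a minimal sketch, the standard library's `html.parser` can pull the `src` of every `<img>` tag and `urljoin` can turn relative paths into full links; the class and function names here are illustrative, and the example assumes the images are present in the static HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkCollector(HTMLParser):
    """Collects absolute URLs from the src attribute of <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.urls.append(urljoin(self.base_url, src))


def image_urls(html, base_url):
    """Return the absolute image URLs found in `html`, in document order."""
    collector = ImageLinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.urls
```

The returned list can then be written to a file or wrapped in `<a href="...">` tags, depending on where the links are needed.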

  • @brandflouride3867
    @brandflouride3867 3 years ago

    nice animations bro

  • @notsure6834
    @notsure6834 4 years ago

    What is scrappy?

  • @anttonalamettala5367
    @anttonalamettala5367 2 years ago

    Nice content, thank you

  • @Диссидент-г8с
    @Диссидент-г8с 4 years ago +2

    Very interesting video)) I like it =)

  • @kestonsmith1354
    @kestonsmith1354 3 years ago

    I hate BeautifulSoup because it never works for me

  • @fallenboy1947
    @fallenboy1947 4 years ago

    I wanted to install and test Kite.
    I became so sad when I found out it needs AVX instructions in order to install.

  • @alixaprodev
    @alixaprodev 4 years ago

    Nice work 👍

  • @darktealglasses
    @darktealglasses 3 years ago

    Please tune the video's audio up

  • @king-star9860
    @king-star9860 4 years ago

    Why don't you use Kite?

  • @djosearth3618
    @djosearth3618 2 years ago

    thx

  • @ajosifoski
    @ajosifoski 4 years ago +2

    I would add requests and lxml in addition

    • @ajosifoski
      @ajosifoski 4 years ago

      Btw, now that I know Kite, I can't live without it. Python without Kite is an empty shell, but together they are two pearls!

  • @GunjanShrimali
    @GunjanShrimali 4 years ago +1

    Good

  • @expat2010
    @expat2010 4 years ago

    What's not to like about this video?

  • @dchaba154
    @dchaba154 3 years ago

    tldr
    Simple project? BeautifulSoup
    Complex? Scrapy
    Selenium? Better not use it 😁

  •  4 years ago

    I like! Keep it up! Would you like to be YouTube friends? :)

  • @TheRedTeam
    @TheRedTeam 3 years ago

    Dude sounds nervous

  • @crabbyfish3691
    @crabbyfish3691 3 years ago

    Just use pyautogui lmao