Python Scrapy Tutorial - 9 - Extracting data w/ CSS Selectors

  • Published on 19 Oct 2024
  • In this video we scrape quotes from a website and select the elements to scrape using CSS selectors. We also look at a tool called Selector Gadget that will make your life much easier!
    Summary
    1) Using CSS selectors
    Using scrapy shell
    response.css('title')
    response.css('title').extract()
    response.css('title::text').extract()
    response.css('title::text')[0].extract()
    response.css('title::text').extract_first()
    2) Selector gadget on quotes website
    3) Selector gadget on amazon
    Next video - Extracting data using XPATH
    • Python Scrapy Tutorial...
    Full playlist - • Python Web Scraping & ...
    Subscribe - / @buildwithpython
    Website - www.buildwithpython.com
    Instagram - / buildwithpython
    #python
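
The summary above contrasts extract(), which returns every match as a list, with extract_first(), which returns a single value. The sketch below is not Scrapy: it is a standard-library stand-in (html.parser) that mimics that list-versus-first-item distinction for `title::text`. The HTML string and the `TitleText` class are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser


class TitleText(HTMLParser):
    """Collect the text found inside every <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.texts.append(data)


# Hypothetical page content, not fetched from any real site.
html = "<html><head><title>Quotes to Scrape</title></head><body></body></html>"

parser = TitleText()
parser.feed(html)
parser.close()

# Comparable to .extract(): always a list, possibly empty.
all_matches = parser.texts
# Comparable to .extract_first(): one value, or None when nothing matched.
first_match = all_matches[0] if all_matches else None

print(all_matches)
print(first_match)
```

Note that indexing with [0] on an empty result raises IndexError, while the "first or None" form does not, which is why the guarded version is used here.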

Comments • 115

  • @Fmorrell100
    @Fmorrell100 3 years ago +7

    Really great tutorial. He goes through it step by step in order, so you have a clear understanding. That helps a lot

  • @onksgk
    @onksgk 4 years ago +6

    You just made my day! For the last 2-3 days I have been trying to learn web scraping, but the videos on other channels are complicated. Today I watched your first 3 videos and then I got it. You are going to kill it🔥🔥, and now you suggest that tool... it became 💎💎. Thank you. You have one more subscriber.

  • @devang1956
    @devang1956 4 years ago +3

    You just gave me a breather with the Chrome Extension. Amazing video series! Keep up the good work. You earned a subscribe :)

  • @Nirmal_rai
    @Nirmal_rai 10 months ago

    Bro, you are really so underrated. You teach so well that I, a mechanical engineer, am doing this like it's nothing. Keep up the hard work. I love your videos and your teaching style.

  • @MarsLanding91
    @MarsLanding91 4 years ago

    Wow. I was going to wait until the last video to comment but I had to do it now. THANK YOU for these videos! They are SUPER helpful.

  • @jerinfrancis4509
    @jerinfrancis4509 5 years ago

    oh my god ! how am I gonna pay you back. YOU JUST MADE MY DAY. Speechless. The chrome extension is damn good bro. Thank you so much for this particular video !!!!

  • @danielhoyos6788
    @danielhoyos6788 3 years ago +3

    Why am I getting an empty list when scraping Amazon?

  • @asifmohammed1270
    @asifmohammed1270 4 years ago +1

    I wish I found this video much earlier. Just saved a lot of time and effort.

  • @PMDJBMS
    @PMDJBMS 5 years ago

    I'm really finding your work helpful for a research project I'm on in the UK. A big thank you for your excellent videos

  • @debruppaul8239
    @debruppaul8239 1 year ago

    Bhiya, I don't know how to thank you. Great job and thanks a lot, you just made selecting a piece of cake. Thanks again.

  • @echezonaazubike8054
    @echezonaazubike8054 4 years ago

    I have subscribed. You nailed it, bro. I am Nigerian and we love Indians.

  • @linusjohansson3164
    @linusjohansson3164 3 years ago +1

    How do you get the last command back in PyCharm? The Up key does not work for me here. I have to write response... etc. all over again, which is annoying.

  • @5caioc
    @5caioc 3 years ago +1

    Incredible series!!!! Thanks a lot!! The extension you recommended is extremely helpful

  • @sgerodes
    @sgerodes 3 years ago

    Your tutorial is pure magic. Thank you very much!

  • @rhn122
    @rhn122 4 years ago

    Once again, great tutorial! Clear and straightforward!

  • @NirmalSilwal
    @NirmalSilwal 3 years ago +2

    Your explanations are amazing, very engaging and interesting stuff.

  • @zeeshanshani8896
    @zeeshanshani8896 3 months ago

    Hey, great tutorial bhai! What I get from it is that by using the shell command in the terminal we can dynamically scrape data, like we do with Python requests and Beautiful Soup.
    Thanks for uploading them.

  • @RobertRoman
    @RobertRoman 4 years ago +1

    This video is Gold! I'm excited to learn web scraping now :D

  • @nathanheath3756
    @nathanheath3756 5 years ago +1

    Subscribed! Very helpful information ! definitely keep these videos coming!

  • @LV7agent
    @LV7agent 2 years ago

    A really good hands-on tutorial, thanks a lot.

  • @simasj1
    @simasj1 5 years ago

    Nice! The video is so clear, I think you should consider a career as a lecturer! You have a gift for explaining complicated things very simply.

    • @simasj1
      @simasj1 5 years ago

      P.S. www.buildwithpython.com does not work - it says "The account for this site no longer active.
      This content is not currently available."

    • @buildwithpython
      @buildwithpython 5 years ago

      Yeah it's not up. I didn't know people were even checking it out!

  • @anupamasingh1239
    @anupamasingh1239 2 years ago

    Hey, I'm getting a 404 error while scraping the Amazon page you gave. I tried finding a solution but was not able to fix it. Can you please help me out with this?

  • @babuji010
    @babuji010 5 years ago +1

    Nicely explained 👍. Thanks.
    I have a question. It looks like the "response" object under "Available Scrapy objects:" is what response.css relies on. Is that right?
    There is no "response" object in the list for the web link I am trying to work on. Any suggestions or ideas, please?

  • @web-dev-zargo
    @web-dev-zargo 11 months ago

    OMG! IT IS WONDERFUL!

  • @167tejaswini
    @167tejaswini 5 years ago +3

    I have tried the selectors below, but nothing is returned:
    >>> response.css(".a-color-base.a-text-normal").extract()
    []
    >>> response.css(".a-text-normal::text").extract()
    []
    >>> response.css("a-text-normal").extract()
    []

    • @buildwithpython
      @buildwithpython 5 years ago

      Did you try it on the example website I gave?

    • @ReasonToKeepGoing
      @ReasonToKeepGoing 5 years ago

      Solved the issue in two different ways,
      response.css(".a-color-base.a-text-normal::text").getall()
      and
      response.css(".a-color-base.a-text-normal::text").extract()

    • @mihirthakur917
      @mihirthakur917 5 years ago

      Facing the same. It worked with Quotes to Scrape but not with Amazon.
      I tried it with Flipkart and it worked.

    • @buildwithpython
      @buildwithpython 5 years ago

      @mihirthakur917 Hey, I have a separate video for Amazon in the same playlist

    • @CSSuccessGamer
      @CSSuccessGamer 4 years ago

      amazon must have found this video and decided to block scrapers...

  • @VigneshSahoo
    @VigneshSahoo 4 years ago

    Selector gadget is awesome. Thanks mate.

  • @morganv3732
    @morganv3732 5 years ago +2

    Pure Gold. Thank you!

  • @sgerodes
    @sgerodes 3 years ago

    The selector tool is magic

  • @jamezz2181
    @jamezz2181 4 years ago

    How do you remove blank space like \n and spaces when the output just has a bunch of them?
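
One way to handle the whitespace question above is plain Python string handling on the extracted list, independent of Scrapy. This is a minimal sketch; the `raw` list is a hypothetical stand-in for what a `::text` selector might return.

```python
# Hypothetical strings, shaped like the output of a ::text selector.
raw = [
    "\n        ",
    "  The world as we have created it  ",
    "\n",
    "Albert   Einstein\n",
]

# " ".join(s.split()) trims both ends and collapses inner runs of whitespace
# (spaces, tabs, newlines) to a single space.
# The `if s.strip()` filter drops entries that contain only whitespace.
cleaned = [" ".join(s.split()) for s in raw if s.strip()]

print(cleaned)
```

If only the leading and trailing whitespace should go and inner spacing must be kept, `s.strip()` alone is enough.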

  • @whayAl
    @whayAl 3 years ago

    many thanks for all your teachings

  • @jiyarahman2673
    @jiyarahman2673 4 years ago +1

    Hi, I am following the code as you show it, but I am getting an empty list from response.css. I got an empty value in the previous video too. Can you explain why?

    • @andrejohnv
      @andrejohnv 1 year ago

      Did you get it now? I'm getting empty list lol.

    • @naveenkumardongre
      @naveenkumardongre 1 year ago

      Same, I am getting an empty list too.

  • @umerimran3833
    @umerimran3833 2 years ago

    Brother you're outstanding

  • @truverol8205
    @truverol8205 2 years ago

    wow your tutorial is so great! good job

  • @cstech2364
    @cstech2364 2 years ago

    Thank You So Much Sir 👍👍

  • @web_devs
    @web_devs 4 years ago +2

    Can't scrape Amazon... it returns an empty list
    >>> response.css(".acs-product-block__product-title .a-truncate-cut::text").extract()
    []
    any help..?

  • @imaduddinsheikh3546
    @imaduddinsheikh3546 3 years ago +1

    Thank you so much for your Scrapy tutorials! However at 10:36, I tried running scrapy shell command on the Amazon website, and the response came back with a 503 code. How do I fix this? And, what's the issue behind it? I am running Windows 10.

    • @imaduddinsheikh3546
      @imaduddinsheikh3546 3 years ago +1

      Never mind, I fixed the issue. I reduced the concurrent requests in the settings.py file to 1 (I also added a user agent for the latest version of the Chrome browser in the same file).

    • @ItsBen27
      @ItsBen27 3 years ago

      @imaduddinsheikh3546 THANK YOU!!! Your comment saved me from a lifetime of searching for the fix!
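
The fix described in this thread (one concurrent request plus a browser user agent) might look like the following fragment of a project's settings.py. `CONCURRENT_REQUESTS` and `USER_AGENT` are standard Scrapy settings; the user agent string itself is a placeholder assumption, and the commenter's exact values are not given in the thread.

```python
# settings.py (fragment). Values are illustrative assumptions.

# Send one request at a time instead of Scrapy's default of 16.
CONCURRENT_REQUESTS = 1

# Identify as a desktop browser. Placeholder string: substitute the one
# your own browser reports.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)
```

Whether a given site then answers with 200 instead of 503 depends on the site, so this is something to try rather than a guaranteed fix.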

  • @ubaidmanzoorwani7491
    @ubaidmanzoorwani7491 5 years ago

    I am trying to scrape data from YouTube but it returns an empty list every time. Please tell me what to do.

  • @da_ta
    @da_ta 4 years ago

    very exceptional excellent work thanks for doing this

  • @hasnainahmed6706
    @hasnainahmed6706 4 months ago

    It gives an empty list on my PC at 11:12. Please help me out.

  • @ThallaSampathKumar
    @ThallaSampathKumar 1 year ago

    Yes, it is fine, but it does not work for all websites; for some it returns an empty list.

  • @Abdullahkbc
    @Abdullahkbc 3 years ago

    this extension is perfect. thank u so much.

  • @harshadmanglani1309
    @harshadmanglani1309 5 years ago

    The series is great, although there's something wrong with the quotestoscrape website, it gives me a twisted internet error, works for every other website though. Thanks.

  • @md.mahabuburrahman8544
    @md.mahabuburrahman8544 3 years ago

    Wow great tutorial

  • @brendenandrews6965
    @brendenandrews6965 4 years ago

    How do I access previous commands in the shell? Usually in the terminal I can recall the previous command with the Up key, but in the shell I am not able to do the same as shown in the video. Can anyone help me with this?

  • @bhavyajain2034
    @bhavyajain2034 4 years ago

    Sir, while running the scrapy shell command, the terminal raises ValueError: invalid hostname: 'http

  • @DarkScizor
    @DarkScizor 4 years ago

    Hi there, I had a question. I wanted to parse the alt text off of an img. How would I go about this? I appreciate any help you can give!
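
For the alt-text question above: Scrapy's CSS selectors support an `::attr(name)` pseudo-element, so something along the lines of response.css('img::attr(alt)').extract() is the usual route. The sketch below is a standard-library stand-in (html.parser, not Scrapy) that performs the same extraction on a hypothetical HTML snippet; `AltCollector` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collect the alt attribute of every <img> that has one."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt is not None:
                self.alts.append(alt)


# Hypothetical markup: two images with alt text, one without.
html = (
    '<div><img src="a.png" alt="Red shoes">'
    '<img src="b.png">'
    '<img src="c.png" alt="Blue hat"/></div>'
)

collector = AltCollector()
collector.feed(html)
collector.close()

print(collector.alts)
```

Images without an alt attribute are skipped rather than reported as empty strings, which keeps the result list aligned with what is actually on the page.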

  • @amitjamwal1985
    @amitjamwal1985 5 years ago

    Very helpful videos. thanks a lot :)

  • @alisiraydemir
    @alisiraydemir 2 years ago

    Just want to say Thank you!

  • @gurjeetkaur3626
    @gurjeetkaur3626 3 years ago

    Helped a great deal... but after the first half of the video the view is not clear.

  • @amitkumar-yu6yz
    @amitkumar-yu6yz 4 years ago

    Great video, man. Thank you very, very much.

  • @TranLamYoutube
    @TranLamYoutube 1 year ago

    Amazon's source may have changed. I can't crawl the data; the elements are rendered by a script and are not in the HTML that is returned.

  • @souilahmaher7188
    @souilahmaher7188 4 years ago

    You're a great instructor!

  • @emm5138
    @emm5138 5 years ago

    Great video! Thanks a lot!

  • @bhavyajain2034
    @bhavyajain2034 4 years ago

    will appreciate your help

  • @shaikhanuman8012
    @shaikhanuman8012 4 years ago +3

    Bro, I am getting a 503 error code. How can I fix it? Please tell me, brother.

    • @teo-medesi
      @teo-medesi 4 years ago

      Go out back, find the biggest stick you can find, keep hitting your pc until it works. I hope this helped!

    • @shaikhanuman8012
      @shaikhanuman8012 4 years ago +1

      @teo-medesi I tried, brother, and because of that I bought a new PC (😄 I fixed the error)

    • @shaikhanuman8012
      @shaikhanuman8012 4 years ago +1

      @teo-medesi Thanks, brother, for providing valuable knowledge

    • @teo-medesi
      @teo-medesi 4 years ago +1

      @shaikhanuman8012 Any time!

    • @shaikhanuman8012
      @shaikhanuman8012 4 years ago

      @teo-medesi tq sir

  • @giotsas
    @giotsas 5 years ago

    Great video, just a small correction. At 09:00 you mention that [1] is the first index of the list of authors. It's the second.
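
The indexing point in this comment can be checked with a plain Python list, since extract() returns one. A minimal sketch; the author names are hypothetical sample data.

```python
# Hypothetical author list, shaped like the output of .extract().
authors = ["Albert Einstein", "J.K. Rowling", "Jane Austen"]

first = authors[0]   # index 0 is the first item
second = authors[1]  # index 1 is the second item

print(first)
print(second)
```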

  • @aakanshasingh9680
    @aakanshasingh9680 4 years ago

    scrapy crawl quotes. -> not returning anything. Nothing is displayed on the terminal

    • @aakanshasingh9680
      @aakanshasingh9680 4 years ago

      Basically, the function parse is not getting executed. Anything else written outside parse but inside the class is getting executed.

  • @Imperialcodex1
    @Imperialcodex1 1 month ago

    Thanks man. 💪

  • @redfeather22sa
    @redfeather22sa 3 years ago

    You're very good!!!!

  • @jss2754
    @jss2754 5 years ago

    I have a question about this video. As you know, Scrapy has two ways to select elements: CSS and XPath. I wonder why you are using CSS in your video.

    • @buildwithpython
      @buildwithpython 5 years ago

      In the next video I use xpath. I just like CSS selectors

    • @jss2754
      @jss2754 5 years ago

      @buildwithpython Thank you for the comment!!

  • @Jgs8115
    @Jgs8115 5 years ago

    Saved my neck thanks man

  • @hau_hau_happu_singh
    @hau_hau_happu_singh 4 years ago +1

    I keep getting an empty array after using Selector Gadget.

    • @danielamorariu6722
      @danielamorariu6722 4 years ago

      You're probably getting a 503 error, which means the service is unavailable. I solved this by specifying a user agent in settings.py and disabling cookies, also in settings.py. The user agent can be Mozilla/5.0 etc. (check the explanations here: www.scrapehero.com/how-to-fake-and-rotate-user-agents-using-python-3/)

  • @CSSuccessGamer
    @CSSuccessGamer 4 years ago +1

    10:34 I'm getting a 503 error in the terminal for Amazon

  • @Pandazaar
    @Pandazaar 5 years ago

    hey I was trying to follow along this video and I think you can no longer use response.css, because it was removed I guess,
    the error I get is: AttributeError: 'function' object has no attribute 'css'

    • @buildwithpython
      @buildwithpython 5 years ago

      Nope, it's not removed. I don't think your Scrapy is installed properly.

    • @Pandazaar
      @Pandazaar 5 years ago

      @buildwithpython Oh, I did cd quotetutorial before opening the shell, my bad

    • @babuji010
      @babuji010 5 years ago

      @Pandazaar Hey, I am getting the same error. Can you explain what went wrong? And the solution, please. Thanks

    • @Pandazaar
      @Pandazaar 5 years ago +1

      @babuji010 Just type "cd .." and then open the shell

  • @as-px2mv
    @as-px2mv 3 years ago

    thanks a lot!

  • @Coney_island23
    @Coney_island23 1 year ago

    great!

  • @jabiraziz1219
    @jabiraziz1219 1 year ago

    Just Wowww

  • @anrm6
    @anrm6 3 years ago

    You are god

  • @doubled9645
    @doubled9645 4 years ago

    thx bro

  • @_thoneeer3220
    @_thoneeer3220 4 years ago

    wow

  • @Bihari_Chaman
    @Bihari_Chaman 2 years ago

    It is not a list, it's an array

    • @Yuri-xx2gi
      @Yuri-xx2gi 2 years ago

      It's a list. This is not C; in Python that's what they're called.

  • @SquaredbyX
    @SquaredbyX 4 years ago

    Should be called a CSS de-selector

  • @vigneshsivasubramanian9193
    @vigneshsivasubramanian9193 4 years ago

    Is it only me, or do the headphones shake and tremble every time he presses his keys?
    I think they are afraid of him.
    Please don't overuse it, and give some rest to both you and your keyboard.

  • @juann9880
    @juann9880 4 years ago +1

    What if the response is 403? I can't extract anything.