Basic Webscraper : Get info from the web with Python

  • Published 21 Jul 2024
  • A beginner's tutorial for learning to scrape websites with Python.
    Test website: toscrape.com/
    -------------------------------------
    twitter / jhnwr
    code editor code.visualstudio.com/
    WSL2 (linux on windows) docs.microsoft.com/en-us/wind...
    -------------------------------------
    Disclaimer: These are affiliate links and as an Amazon Associate I earn from qualifying purchases
    mouse amzn.to/2SH1ssK
    27" monitor amzn.to/2GAH4r9
    24" monitor (vertical) amzn.to/3jIFamt
    dual monitor arm amzn.to/3lyFS6s
    microphone amzn.to/36TbaAW
    mic arm amzn.to/33NJI5v
    audio interface amzn.to/2FlnfU0
    keyboard amzn.to/2SKrjQA
    lights amzn.to/2GN7INg
    webcam amzn.to/2SJHopS
    camera amzn.to/3iVIJol
    gfx card amzn.to/2SKYraW
    ssd amzn.to/3lAjMAy
  • Science & Technology

Comments • 27

  • @JohnWatsonRooney
    @JohnWatsonRooney  4 years ago +5

    Questions about basic web scraping always pop up so I wanted to answer them and help out with this video.

  • @merttarm848
    @merttarm848 4 days ago +1

    Thanks for the video, an amazing introduction to web scraping

  • @daveys
    @daveys 8 months ago +1

    Worked all the way through this. Great tutorial, many thanks!!

  • @jimmyporter8941
    @jimmyporter8941 2 years ago +3

    A great pragmatic intro to web scraping. Thanks!

  • @faisalrkhawaja
    @faisalrkhawaja 6 months ago

    Hi John. I have two challenges in my scraping project: 1) the products must first have a search term entered (e.g. the product name or category, etc.), 2) the results are spread over multiple pages (which your video did cover), but the results I need are divided over several tabs.

  • @Neil4Speed
    @Neil4Speed 4 years ago +4

    Hi John, great tutorial as always. The only addition I would recommend is showing how to take it past the finish line and export to a CSV

    • @JohnWatsonRooney
      @JohnWatsonRooney  4 years ago +1

      Yes of course, I have since covered this in my later videos! Thanks!

  • @powerquotes9492
    @powerquotes9492 2 years ago +1

    John, your channel is amazing! Exactly what I was looking for. I'm gonna study all your videos and cancel my Udemy course too, as you have better content and for free.

  • @sasuwayne
    @sasuwayne 4 years ago +1

    Thanks a lot John! This video made things clear.

  • @aksontv
    @aksontv 4 years ago +1

    Sir, I have a question: for the title you used find_all, but for the price just find. Please clarify this point. Thanks
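
    A short sketch of the distinction, using invented markup in the style of the test site: find_all returns a list of every matching tag, while find returns only the first match (or None), so find_all collects all the book cards while find pulls a single element out of each card.

```python
from bs4 import BeautifulSoup

# Invented stand-in for a books.toscrape.com product listing.
html = """
<article class="product_pod"><h3><a>Book One</a></h3><p class="price_color">£10.00</p></article>
<article class="product_pod"><h3><a>Book Two</a></h3><p class="price_color">£12.00</p></article>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list of every matching tag...
books = soup.find_all("article", class_="product_pod")
print(len(books))  # 2

# ...while find returns only the first match (or None if nothing matches).
first_price = soup.find("p", class_="price_color")
print(first_price.text)  # £10.00
```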

  • @paloma6350
    @paloma6350 1 year ago +1

    Super useful video, thanks John! New subscriber here

  • @spearchew
    @spearchew 2 years ago +1

    A* tutorial - and besides the scraping, useful for learning about python more generally. For instance, before today I probably would have created three or four empty lists and appended to each of them individually... rather than simply appending a dictionary, which is much cleaner!
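
    The pattern this comment describes can be sketched like this (book_list and the field names are placeholders, not taken from the video):

```python
# Instead of keeping parallel lists for titles and prices, append one
# dictionary per scraped item so each record stays together.
book_list = []
for title, price in [("Book One", 10.0), ("Book Two", 12.0)]:
    book_list.append({"title": title, "price": price})

print(book_list[0])  # {'title': 'Book One', 'price': 10.0}
```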

  • @ammaralzhrani6329
    @ammaralzhrani6329 3 years ago

    How do I save the data I have scraped, organize it, and transfer it to a CSV file?

  • @faisalrkhawaja
    @faisalrkhawaja 6 months ago

    Hi John. Total noob here. This is the first of your videos I have watched. It is super cool! Question: Target sites will often change the product list (adding or removing), and I may want to keep the data updated on my end at same time. Is there a way to inspect the landing page to see how many pages need to be scraped, and put that into the code as a reference to the last page number, as opposed to a hardcoded number?
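
    One possible answer, assuming the pager works like the one on books.toscrape.com, which shows "Page 1 of 50" in an <li class="current"> element: read the total from that element instead of hardcoding it. The snippet uses a hardcoded HTML sample; on a real run this would come from the fetched landing page.

```python
from bs4 import BeautifulSoup

# Pager markup modelled on books.toscrape.com.
pager_html = '<li class="current">Page 1 of 50</li>'
soup = BeautifulSoup(pager_html, "html.parser")

# "Page 1 of 50" -> take the last word and convert it to an int.
last_page = int(soup.find("li", class_="current").text.split()[-1])
print(last_page)  # 50

# The scraping loop can then use range(1, last_page + 1)
# instead of a hardcoded upper bound.
```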

  • @dwiatmokopurbosakti1193
    @dwiatmokopurbosakti1193 3 years ago

    How do I fix this warning: GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
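
    That warning is BeautifulSoup saying it had to guess which parser to use; it goes away once you name one explicitly:

```python
from bs4 import BeautifulSoup

# Passing the parser name as the second argument silences GuessedAtParserWarning.
soup = BeautifulSoup("<p>hello</p>", "html.parser")  # not BeautifulSoup("<p>hello</p>")
print(soup.p.text)  # hello
```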

  • @xilllllix
    @xilllllix 2 years ago +2

    i learned in 19 mins here what i learned in a 9-hour $89.99 udemy course lol!

  • @venkateshgolla8005
    @venkateshgolla8005 4 years ago +1

    Excellent, your explanation is awesome. If possible, could you please make another video covering how to get data after clicking buttons on a web page (like radio buttons, list boxes, and buttons)?

    • @JohnWatsonRooney
      @JohnWatsonRooney  4 years ago +2

      Hi! Thanks for the comment. In my other video I do this using browser automation - How I use SELENIUM to AUTOMATE the Web with PYTHON. Pt1 th-cam.com/video/pUUhvJvs-R4/w-d-xo.html - around the 10 min mark. I will have more webscraping videos coming up too.

  • @mellyndaputri6697
    @mellyndaputri6697 2 years ago

    How can I scrape the book rating?
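
    On the test site the rating is not text but a CSS class (e.g. <p class="star-rating Three">), so one way is to read the class list and map the rating word to a number. The snippet below uses a hardcoded sample tag rather than a live page.

```python
from bs4 import BeautifulSoup

# Sample tag copied from the books.toscrape.com markup style.
html = '<p class="star-rating Three"></p>'
soup = BeautifulSoup(html, "html.parser")

# The class list comes back as e.g. ['star-rating', 'Three'];
# the word that isn't 'star-rating' is the rating itself.
classes = soup.find("p", class_="star-rating")["class"]
rating_word = next(c for c in classes if c != "star-rating")
ratings = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}
print(ratings[rating_word])  # 3
```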

  • @shivasuresh5957
    @shivasuresh5957 2 years ago +1

    Awesome video John. Thanks! I will now try to learn how to add the data to a .CSV. Would I be on the right path by using a Pandas data frame to do this?

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago

      It's definitely worth knowing how to do it with the CSV module, but yes, use pandas - I use it all the time
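
      For reference, a minimal sketch of the csv-module route mentioned here; book_list stands in for the list of dictionaries built while scraping, and the filename is arbitrary.

```python
import csv

# book_list is a placeholder for the scraped list of dictionaries.
book_list = [{"title": "Book One", "price": 10.0},
             {"title": "Book Two", "price": 12.0}]

with open("books.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "price"])
    writer.writeheader()        # writes the header row: title,price
    writer.writerows(book_list)  # one row per dictionary
```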

    • @powerquotes9492
      @powerquotes9492 2 years ago

      Just add these lines to save everything to .CSV. It worked for me:
      import pandas as pd
      table = pd.DataFrame(book_list)
      table.to_csv('name_your_file.csv')

  • @carlashfield6388
    @carlashfield6388 3 years ago +2

    Hi mate, here:
    for x in range(1,50):
    url = f'books.toscrape.com/catalogue/page-{x}.html'
    I'm getting: line 7
    url = f'books.toscrape.com/catalogue/page- {x}.html'
    ^
    IndentationError: expected an indented block
    What am I doing wrong?
    Thanks :)

    • @JohnWatsonRooney
      @JohnWatsonRooney  3 years ago

      Looks like you missed indenting the lines of code after the “for x in range” part. It needs to be indented to work (4 spaces or tab)
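
      For completeness, the corrected loop from the comment above (the body indented four spaces; the URL is only built here, not fetched):

```python
# The line after the for-statement must be indented, or Python raises
# "IndentationError: expected an indented block".
for x in range(1, 50):
    url = f'books.toscrape.com/catalogue/page-{x}.html'

print(url)  # books.toscrape.com/catalogue/page-49.html
```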