Web Scraping Using Scrapy Tutorial For Beginners: Learn Scrapy From Scratch

  • Published on 3 Dec 2024

Comments • 79

  • @keshavchauhan9389 5 years ago +42

    Man, is Scrapy that hard to learn?
    I have gone through many tutorials and the Scrapy documentation, and I can't understand anything once parse starts.
    Nobody explains what does what.
    This is by far the best tutorial I've found, but the parse method is still not completely explained.
    I got lost at the point where he puts a period before the selector in the dictionary.

  • @noo-sho8500 5 years ago +2

    Best tutorial on YT

  • @jiangxu3895 4 years ago

    Hello, I just went through half of this tutorial and I want to give you a thumbs up before I finish the other half. Excellent tutorial!!

  • @japsimransingh9933 4 years ago

    One of the best tutorials for web scraping

  • @faiz8117 3 years ago

    Awesome video, Ahmed. I got to know about Scrapy yesterday and I did a lot of surfing. This is the best material I have found. You've got to make more of these, mate.

  • @kamaleshpramanik7645 3 years ago

    Thank you very much. The info is provided beautifully.

  • @hayathbasha4519 3 years ago

    Hi,
    Please advise me on how to improve / speed up the Scrapy process
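
A few settings are commonly tuned for crawl speed. The fragment below is a hedged sketch of a project's `settings.py`: the setting names are real Scrapy settings, but the values are arbitrary assumptions and should be chosen with the target site's limits in mind.

```python
# settings.py (fragment): values are illustrative assumptions, not recommendations.

# Maximum number of requests Scrapy performs in parallel overall (default: 16).
CONCURRENT_REQUESTS = 32

# Maximum parallel requests to any single domain (default: 8).
CONCURRENT_REQUESTS_PER_DOMAIN = 16

# Minimum seconds to wait between consecutive requests to the same domain (default: 0).
DOWNLOAD_DELAY = 0

# AutoThrottle adapts the delay to server latency. It trades speed for politeness,
# so it is switched off here only on the assumption that the site can take the load.
AUTOTHROTTLE_ENABLED = False

# Cache responses on disk so repeated development runs skip the network.
HTTPCACHE_ENABLED = True
```

Raising concurrency helps only while the network is the bottleneck; slow item pipelines or heavy parsing need profiling instead.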

  • @ujjwalchetan4907 3 years ago

    Great explanation techniques. Thank you sir👌🙏❤

  • @tech-letters 4 years ago +6

    What a nice video! So far I only knew BeautifulSoup. I think I have to dig into Scrapy a bit more :-)

  • @wgs3leed 3 years ago

    Thank you very much for your work. I really appreciate it. You have explained it in a way that is easy to grasp. Please keep making videos

  • @Briansilasluke 3 years ago

    It works fine for me, I don't understand what people are complaining about

  • @Saywhatohno 3 years ago

    Great video!!! Can you use Scrapy to reach a specific webpage and print that page? And then essentially save it in a specific directory?

  • @venkateshkota09 4 years ago

    Hi, which Python 3 versions support Scrapy on Windows 10? Please help me.

  • @mariamasood1761 3 years ago

    I'm using Jupyter Notebook launched through Anaconda. At 18:07, when I run 'scrapy crawl jokes -o data.json' I get 'Spider not found: jokes'. Please help!!

    • @humancode3402 3 years ago

      When using Jupyter notebooks you need to execute your spider programmatically (docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script)

  • @soufianeelhyani8856 5 years ago +1

    Thank you!

  • @Skaxarrat 5 years ago +1

    So much information in less than 25 minutes. Thanks.

  • @elyeshamad1440 4 years ago

    thank you so much for this video

  • @taimoor722 4 years ago +5

    That item import is giving me an import error:
    ImportError: attempted relative import with no known parent package

  • @samanzahra8265 5 years ago +2

    Which terminal are you using for writing the code in this video?

  • @giparekh 3 years ago

    Is there any alternative to Scrapy for C#? I would like to use web scraping to get data from finance websites such as cmegroup, finance.yahoo, bloomberg. I need to get past consent, cookie, and "I am not a robot" forms. I am using an MVC web application.

  • @tariqiqbal4798 4 years ago

    I need to scrape data that is rendered in a table using JavaScript. Is Scrapy a good framework for that? I tried XPath and CSS selectors and the result was an empty list.

  • @richlysakowski1415 3 years ago

    Getting an error message because ItemLoaders has changed.
    'lib\site-packages\scrapy\loader\__init__.py", line 6, in <module>
    import itemloaders
    ModuleNotFoundError: No module named 'itemloaders'
    Where is the corrected code to get the final example to run?

  • @gauravkaushal7341 3 years ago +1

    Getting an error at import scrappy in VS Code. Please tell me what to do.

  • @neinu_1418 3 years ago

    Where do I type the commands at 6:32?

  • @asdfasdfasdfasdf6409 5 years ago

    awesome tutorial mate thanks a lot

  • @SonyPSPOfficial 4 years ago

    Thanks for your tutorial. I've started experimenting with scraping and it was a good way to start. In addition, how can you add multiple fields to scrape? For example, the product name and product price into a single scraping script?

  • @gowthamtech4109 6 years ago +1

    Hey, I need some help with an industrial problem when using Scrapy and Splash.

  • @shadow_qa 4 years ago

    What VSCode theme are you using?

  • @ekagaurangadas 4 years ago

    Hi, got your course on Udemy, just started and liking it. Not related, but what do you use to edit the videos you uploaded in Udemy?

  • @fikri.abdoul 4 years ago

    useful, i like your tutorial. subbed!

  • @haseebahmed2039 5 years ago +2

    Do you have a course on Udemy?

  • @emmettlawlor3548 4 years ago

    Great video, but it should put more emphasis on the directory: your crawler/spider goes into the "spiders" folder :)

  • @kaabee390 4 years ago

    Where do you type the command lines at 5:15?

  • @strontium123 4 years ago

    The very best tutorial for scrapy I have come across. Thank you so much!

  • @pranavkhatri9564 5 years ago

    what did you say to use in linux ? scripts by??

  • @ManishThakur-zv8jb 4 years ago +1

    from Demo_project.items import JokeItem
    ModuleNotFoundError: No module named 'Demo_project'
    Has anyone solved this problem?

  • @buihuuuchoang7699 3 years ago

    I've run into some trouble. When the spider runs, it cannot read all the data; it just duplicates one div class. Can you please help me?

  • @slaxblake 4 years ago

    If I want to get all parameters of a website, how could I do that?

  • @damodharn4573 5 years ago

    super

  • @ovidiurudi 5 years ago

    Hi, very good video, but do you have any idea why you get one row with data and one row empty? I have the same issue. Thanks

    • @bradliu1891 5 years ago +1

      I have the same issue as well. It is OK if you delete those empty rows from the .csv file.

    • @ovidiurudi 5 years ago

      @RADIX it's an automatic process and I cannot do it manually. I fixed it by exporting to JSON instead of CSV.

    • @joejustin007 5 years ago

      I believe it is a setting with the CSV file.
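
The thread does not settle on a cause. As a stopgap that needs no manual editing, the blank rows can be filtered out after the export. Here is a minimal sketch using only the standard library; the `drop_blank_rows` name and the file paths are assumptions.

```python
import csv


def drop_blank_rows(src_path, dst_path):
    """Copy a CSV file, skipping rows that are empty or whitespace-only.

    Returns the number of rows written.
    """
    kept = 0
    # newline="" lets the csv module handle line endings itself on every platform.
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # A blank line is read as an empty list; also skip rows of empty cells.
            if any(cell.strip() for cell in row):
                writer.writerow(row)
                kept += 1
    return kept
```

A call such as `drop_blank_rows("data.csv", "data_clean.csv")` can run as a step after the crawl, so the cleanup stays automatic.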

  • @kostasklimantakis6252 5 years ago

    Awesome video!! But can anyone explain yield (line 12 - 16:09)?

    • @ekagaurangadas 5 years ago +2

      'yield' is the generator keyword. It is like return, but instead of ending the function it hands back a value and pauses; the function picks up again inside the loop when the next value is requested. On the other side of the call, the caller receives a generator that can be consumed with a for loop or converted to a list.
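
A small plain-Python illustration of that behaviour, as a sketch. The `squares` function is a made-up example and has nothing to do with the video's spider.

```python
def squares(limit):
    """Yield the squares of 1..limit, one at a time."""
    for n in range(1, limit + 1):
        # Execution pauses here and resumes when the caller asks for the next value.
        yield n * n


gen = squares(4)        # nothing runs yet; this only creates the generator
first = next(gen)       # runs the function up to the first yield
remaining = list(gen)   # resumes the loop and collects the rest
```

In a Scrapy `parse` method the framework plays the role of the caller: it pulls each yielded item or request out of the generator as it goes.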

  • @tanmaydeshpande2409 5 years ago

    I want to scrape the date from news pages. Can you suggest how I can do it?

    • @humancode3402 5 years ago +1

      Hi Tanmay,
      This really depends on whether the website uses JavaScript to render the content. If it doesn't, you can follow the same approach; if it does, you can use either Splash or Selenium.

    • @tanmaydeshpande2409 5 years ago

      @humancode3402 I am actually creating a dataset for time series analysis, so I am scraping the headlines and the date. I am able to scrape the headlines but not the date. I am writing a function for this, so that passing it URLs prints out the data. Mostly I am working with HTML pages. Can you suggest anything?

  • @krishna337 3 years ago

    When I run the exact same code, my data.csv has records but the HTML tags are still present and not removed. I'm not sure if remove_tags is working correctly for me. Can anyone help?
    First record in data.csv file for reference:
    "
    A child asked his father, ""How were people born?"" So his father said, ""Adam and Eve made babies, then their babies became adults and made babies, and so on."" The child then went to his mother, asked her the same question and she told him, ""We were monkeys then we evolved to become like we are now."" The child ran back to his father and said, ""You lied to me!"" His father replied, ""No, your mom was talking about her side of the family."" "

  • @indrajitakuli8060 4 years ago

    How do I get data from a hidden link like:
    Link
    Please reply. I'm facing a problem.
    Thanks...

    • @louiswallice3746 4 years ago

      response.xpath('//a[@class="some-class"]/@href')

    • @indrajitakuli8060 4 years ago

      @louiswallice3746 I'm talking about a blank href value. Please look at the a tag

    • @sadambloch 4 years ago

      This is the main URL. Find it in the code; it will be somewhere near the start.

  • @rasmusm.9159 6 years ago +1

    I'm looking for someone who can make a few of these scripts. It looks like you don't have PMs enabled - if you are up for it please send me a message and I'll explain the work.

    • @isaacchikutukutu7558 6 years ago +1

      Hi Rasmus, have you found someone for the task yet?
      I'm available if you are still looking.

  • @aaronbaron6468 4 years ago

    I did the exact thing but my data json file is empty

    • @alexeykozlovsky5056 3 years ago

      Same. Could you solve it?

    • @alexeykozlovsky5056 3 years ago

      If you are still interested in this problem, I managed to solve it and maybe my solution will be useful for you. You probably missed a ' somewhere, e.g. you wrote "//div[@class='name]" instead of "//div[@class='name']". I fixed it and everything's fine now :)

  • @akromajones3385 5 years ago +2

    This didn't work for me; I received joke-text: null

    • @akromajones3385 5 years ago +1

      Got it in the end but explanation was poor and not very in depth.

    • @pranavkhatri9564 5 years ago

      @akromajones3385 Hey, I got an empty JSON file. Any solutions?

    • @xcaliburkahn9534 3 years ago

      @pranavkhatri9564 Open it in VS Code and write 'hello world!'
      Voila

  • @im4485 3 years ago

    I need to find a video for beginners

  • @ABC-xn3td 5 years ago +3

    Great tutorial, sir, but unfortunately not suitable for beginners. More for intermediate learners.

  • @ragafeb6681 5 years ago +1

    It's a good video...but it's not for beginners...

  • @salsabilchedly939 3 years ago

    How can I scrape likes from Facebook pages and also scrape the profiles that like posts? Who can help me?

  • @kurzackd 5 years ago +2

    unknown command: crawl
    edit: fixed that. Now, KeyError: Spider not found! :D

    • @ohrimenko.taras1qt 5 years ago

      Just use the Scrapy docs. Your problems are elementary =)
      (scrapy.readthedocs.io/en/latest/topics/commands.html#genspider)

  • @mohammadrezashariat1442 3 years ago

    Sooo much useless information in the video. Get straight to the point and keep it brief

  • @beelzebub3920 4 years ago

    really disappointed no indian accent

  • @informativecontent4778 4 years ago

    this isn't for beginners