Man, is Scrapy that hard to learn?
I have gone through many tutorials and the Scrapy documentation, and I can't understand anything once parse starts.
Nobody explains what does what.
This is by far the best tutorial I've found, but the parse method is still not completely explained.
I got lost at the point where he puts a period before the selector in the dictionary.
Best tutorial on YT
Hello, I just went through half of this tutorial and I want to give you a thumbs up before I finish the other half. Excellent tutorial!!
One of the best tutorials for web scraping
Awesome video Ahmed. I got to know about Scrapy yesterday and I did a lot of surfing. This is the best material I have found. You've got to make more of these, mate.
Thank you very much. The info is provided beautifully.
Hi,
Please advise me on how to improve / speed up the Scrapy process
Great explanation techniques. Thank you sir👌🙏❤
What a nice video! So far I only knew BeautifulSoup. I think I have to dig into Scrapy a bit more :-)
Thank you very much for your work. I really appreciate it. You have explained it in a way that is easy to grasp. Please keep making videos
It works fine for me, I don't understand what people are complaining about
Great video!!! Can you use Scrapy to reach a specific webpage and print that page? And then essentially save it in a specific directory?
Hi, which Python 3 versions support Scrapy on Windows 10? Please help me..
I'm using Jupyter Notebook launched through Anaconda. At 18:07, when I run 'scrapy crawl jokes -o data.json' I get - 'Spider not found: jokes'. Please help!!
When using Jupyter notebooks you need to execute your spider programmatically (docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script)
Thank you!
So much information in less than 25 minutes. Thanks.
thank you so much for this video
That item import is giving me an import error:
ImportError: attempted relative import with no known parent package
Which terminal are you using for writing the code in this video?
VS Code
Are there any alternatives to Scrapy for C#? I would like to use web scraping to get data from finance websites such as cmegroup, finance.yahoo, bloomberg. I need to avoid consent prompts, cookie banners and "I am not a robot" forms. I am using an MVC web application.
I need to scrape data that is rendered in a table using JavaScript. Is Scrapy a good framework to do that? I tried XPath and CSS selectors and the result was a blank list.
Getting an error message because ItemLoaders has changed.
'lib\site-packages\scrapy\loader\__init__.py", line 6, in
import itemloaders
ModuleNotFoundError: No module named 'itemloaders'
Where is the corrected code to get the final example to run?
Getting an error at import scrappy in VS Code, please tell me what to do??
Where do I type the commands at 6:32?
awesome tutorial mate thanks a lot
Thanks for your tutorial. I've started experimenting with scraping and it was a good way to start. In addition, how can you add multiple fields to scrape? For example, the product name and product price into a single scraping script?
Hey, I need some help with an industrial problem when using Scrapy and Splash.
What VSCode theme are you using?
Hi, got your course on Udemy, just started and liking it. Not related, but what do you use to edit the videos you uploaded in Udemy?
useful, i like your tutorial. subbed!
Do you have a course on Udemy?
Great video, but it should put more emphasis on the directory structure: your crawler/spider goes into the "spiders" folder :)
Where do you type the command lines at 5:15?
In scrapy shell
The very best tutorial for scrapy I have come across. Thank you so much!
what did you say to use in linux ? scripts by??
from Demo_project.items import JokeItem
ModuleNotFoundError: No module named 'Demo_project'
Has anyone solved this problem?
I've got some trouble. When the spider runs, it cannot read all the data, it just duplicates one div class. Can you please help me?
What if I want to get all the parameters of a website? How could I do that?
super
Hi, very good video, but do you have any idea why you get one row with data and then one row empty? Because I have the same issue. Thanks
I have the same issue as well. It is OK to delete those empty rows from the .csv file.
@RADIX it's an automatic process and I cannot do it manually. I fixed it by exporting to JSON instead of CSV.
I believe it is a setting with the CSV file.
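If you need to stay with CSV and can't delete the empty rows by hand, a small post-processing script can drop them automatically. This is only a sketch using Python's standard csv module, not something shown in the video; the function name drop_empty_rows and the sample data are made up for illustration, and it cleans the file after the fact rather than fixing whatever produces the blank rows.

```python
import csv
import io


def drop_empty_rows(src, dst):
    """Copy CSV rows from src to dst, skipping rows that have no non-blank cell."""
    writer = csv.writer(dst, lineterminator="\n")
    kept = 0
    for row in csv.reader(src):
        # csv.reader yields [] for a blank line; rows of only empty cells are skipped too
        if any(cell.strip() for cell in row):
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical sample that mimics the "one row with data, one row empty" output
sample = "joke_text\r\n\r\nfirst joke\r\n\r\nsecond joke\r\n\r\n"
cleaned = io.StringIO()
kept_rows = drop_empty_rows(io.StringIO(sample, newline=""), cleaned)
print(cleaned.getvalue())
```

With real files, open both the input and the output with newline="" (as the csv module documentation recommends) and pass the file objects to the same function.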
Awesome video!! But can anyone explain yield (line 12 - 16:09)?
'yield' is the keyword that turns a function into a generator. It is like return, but instead of ending the function it hands back a value and the function picks up again from that point when the next value is requested, so the loop keeps going. On the other side of the call, the caller receives a generator that can be consumed with a for loop or converted to a list.
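A plain-Python sketch of that difference, without Scrapy involved. The function names and the sample jokes are hypothetical; the yielding function only imitates the shape of a parse method that produces one dictionary per item.

```python
def first_joke_with_return(jokes):
    for text in jokes:
        # return ends the function on the first pass through the loop
        return {"joke_text": text}


def all_jokes_with_yield(jokes):
    for text in jokes:
        # yield hands back one item, then resumes here when the next one is requested
        yield {"joke_text": text}


sample_jokes = ["joke one", "joke two", "joke three"]

first = first_joke_with_return(sample_jokes)
generator = all_jokes_with_yield(sample_jokes)

# The generator can be consumed with a for loop or converted to a list
items = list(generator)

print(first)
print(items)
```

The version with return only ever produces the first joke, while the version with yield produces all three, one at a time, as the caller asks for them.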
I want to scrape the date from news pages. Can you suggest how I can do it?
Hi Tanamy,
This really depends on whether the website uses JavaScript to render the content or not. If it doesn't use JavaScript, you can follow the same approach. If it does, you can use either Splash or Selenium.
@@humancode3402 I am actually creating a dataset for TIME SERIES ANALYSIS, so I am scraping the headlines and dates. I am able to scrape the headlines but not the dates. I am writing a function for this, so that when I pass in URLs it prints out the data. Mostly I am working with plain HTML pages. Can you suggest anything?
When I run the exact same code, my data.csv has records but the HTML tags are still present and not removed. I'm not sure if remove_tags is working correctly for me. Can anyone help?
First record in data.csv file for reference:
"
A child asked his father, ""How were people born?"" So his father said, ""Adam and Eve made babies, then their babies became adults and made babies, and so on."" The child then went to his mother, asked her the same question and she told him, ""We were monkeys then we evolved to become like we are now."" The child ran back to his father and said, ""You lied to me!"" His father replied, ""No, your mom was talking about her side of the family."" "
How to get data from hidden link like:
Link
Please reply. I'm facing problem.
Thanks...
response.xpath('//a[@class="some-class"]/@href')
@@louiswallice3746 I'm talking about a blank href value. Please look at the a tag
This is a main URL. Find it in the code, it will be somewhere near the start
I'm looking for someone who can make a few of these scripts. It looks like you don't have PMs enabled - if you are up for it please send me a message and I'll explain the work.
Hi Rasmus, have you found the person for the task yet ?
I'm available if you still do.
I did the exact same thing but my data JSON file is empty
Same. Could you solve it?
If you are still interested in this problem, I managed to solve it and maybe my solution will be useful for you. You probably missed a ' somewhere, e.g. you wrote "//div[@class='name]" instead of "//div[@class='name']". I fixed it and everything's fine now :)
This didn't work for me, I received joke-text : null
Got it in the end but explanation was poor and not very in depth.
@@akromajones3385 hey, I got an empty JSON file, any solutions?
@@pranavkhatri9564 open it in vs code and write ‘hello world!’
Voila
I need to find a video for beginners
Great tutorial, Sir, but unfortunately not suitable for beginners. More for intermediate.
It's a good video...but it's not for beginners...
How can I scrape likes from Facebook pages and also scrape the profiles of people who like posts!! Who can help me!!!!
unknown command: crawl
edit: fixed that. Now, KeyError: Spider not found! :D
Just use the Scrapy docs. Your problems are elementary =)
(scrapy.readthedocs.io/en/latest/topics/commands.html#genspider)
sooooooooooo much useless information in the video, be direct and brief
really disappointed no indian accent
this isn't for beginners