Excellent one. I might start using Go now
It’s worth a look I think!
Class! Thanks for this vid! The parser in Go looks clearly faster than in Python, and if you also use goroutines the speed difference will probably be even greater. Although Python can use multi-process parsing too.
Now I need to search for python vs go benchmarks for scraping... Hehe
Did you find any current benchmarks? I searched and only found outdated data, and in those Go was a lot faster than most languages.
@@kiraGuitarr I didn't find any clear winners. Go is easier to use when multithreading; in cases where Python is slower it compensates by having faster production times etc. My understanding is that every language is a tool, and as with any tool there is no "best" one, just the one most convenient for the job.
Great video. We use Go in the software engineering team at my bank. Still early days but it's getting traction. John, have you used Rust yet?
Oh cool. I am still new to Go so I am trying to build some things with it. It just seems to make sense to me. I haven't tried Rust yet, no - I found Go more appealing when I was looking for a second language.
@@JohnWatsonRooney I think Go is the easier of the two to learn, but Rust seems to have a larger community online. Nevertheless, a great language to pick up; it will be interesting to see where these two languages are in 5-10 years' time.
What's the text editor you're using?
Great video! I was wondering if you could make a Golang scraper which first logs in to a web page and afterwards scrapes it!
Yes - the framework I’m using here (colly) can handle logging in. I’ve not tried it but there’s an example in their documentation!
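For anyone curious, here's a rough sketch of how that login flow could look with colly's Post method - the URL and form field names below are just placeholders and would need to match the real login form:

```go
package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	c := colly.NewCollector()

	// Submit the login form first. The URL and field names are
	// placeholders - check the actual form on the site you target.
	err := c.Post("https://example.com/login", map[string]string{
		"username": "myuser",
		"password": "mypassword",
	})
	if err != nil {
		fmt.Println("login failed:", err)
		return
	}

	// The collector keeps the session cookies from the POST,
	// so later visits with the same collector are authenticated.
	c.OnHTML("div.account", func(e *colly.HTMLElement) {
		fmt.Println(e.Text)
	})
	c.Visit("https://example.com/account")
}
```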
Can you help with setting up Neovim like yours, especially that popup window for running a process?
I believe it is the 'toggleterm' neovim plugin
Thanks for the video! Could you please show us how we can scrape data from multiple websites with a single Go script? [the selectors will be different for each website]
Yes, you can create separate "collectors" for each website, each with their own parsing functions. It could get a bit messy, but I would then split them up into separate files.
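Roughly what that could look like - two collectors, each with its own target site and its own (placeholder) selectors:

```go
package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	// Collector for the first site - selector and URL are placeholders.
	siteA := colly.NewCollector()
	siteA.OnHTML("div.product h2", func(e *colly.HTMLElement) {
		fmt.Println("site A:", e.Text)
	})

	// Collector for the second site, with a completely different selector.
	siteB := colly.NewCollector()
	siteB.OnHTML("li.item span.name", func(e *colly.HTMLElement) {
		fmt.Println("site B:", e.Text)
	})

	siteA.Visit("https://example-a.com/products")
	siteB.Visit("https://example-b.com/catalogue")
}
```

In a bigger project the two OnHTML callbacks would live in their own files, as suggested above.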
@@JohnWatsonRooney ok. I will try. Thank you ❤️
Very good - I went from 40s to 8s on my site. Very impressive. Non-async to me would be too slow. One general issue I have is how to merge information for the same product when that info is spread across different sections of the page. On a single page, how do you effectively match up what belongs to what? I know one way would be to find the common parent class and use one h *colly.HTMLElement block, but maybe there is another way with the structs?
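For what it's worth, a small sketch of the common-parent approach this comment describes, using colly's ChildText/ChildAttr to fill one struct per product card - the selectors and URL here are made up and would need to match the real page:

```go
package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

// product groups the fields that are spread across sub-elements of one card.
type product struct {
	Name  string
	Price string
	Link  string
}

func main() {
	c := colly.NewCollector()

	// Select the common parent once, then pull each piece out of it,
	// so the fields for one product always stay together.
	c.OnHTML("div.product-card", func(h *colly.HTMLElement) {
		p := product{
			Name:  h.ChildText("h2.title"),
			Price: h.ChildText("span.price"),
			Link:  h.ChildAttr("a", "href"),
		}
		fmt.Printf("%+v\n", p)
	})

	c.Visit("https://example.com/products")
}
```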
Hi John, loving your videos. Just came across your channel yesterday. I've been scraping for about 20 years using a mixture of php/curl then python using string manipulation, so it's great to see how to do it more efficiently! Do you have a video rounding up what ide and tools you use? Nothing jumped out. Currently using visual studio but there may be something better?
He is good this guy. A reply from his other vid: "...it’s neovim - I swapped to it full time about 2 months ago after moving between it and PyCharm whilst I was learning the key bindings etc. I really like it"
I was using VSCode - really liked it, then started using PyCharm - works really well on python projects, just uses a ton of RAM and CPU. I still think VSCode is tops if you work on other projects too.
@@DietervanderWesthuizen I absolutely loved PyCharm but it took forever to boot up & I never really liked vs code so right when I was on the edge of going back to Atom(the 1st editor I've ever liked), I decided to give Vim a try... After a good amount of research I decided to just go with vanilla Vim & I loved it! Takes a little more effort to setup than a pre-configured version like neovim, but I preferred the safety of vanilla over all the pre-installed packages that come with the other "flavors"...
@@Reese_414 Thanks, will have a look. I use Vim and Nano on Linux but I'm struggling to get the hang of all the shortcuts in Vim. Will have a look again.
So cool 😎😎
Can you show us how you do AJAX API scraping with Golang?
Does anything have to be installed/set-up in advance?
Install Go, which is simple (see their main website), and run one "go get" command to install colly. That's it!
Which IDE do you use? Regards
This is Neovim, using a slightly tweaked "basic ide" config by chris@machine, similar to TJ's basic nvim configs. I think it's great.
@@JohnWatsonRooney thanks
and you could try Rust for web scraping as well
I second this
Sir, please help me, I need some help. Can you make a video on how to log in to a website through Python?
import requests
from bs4 import BeautifulSoup
Help me sir
🥇First
Hello John, Great video! Is Colly faster than asyncio + aioHttp?
Thanks! I haven't directly compared them, but I would think not especially, no - but I think it is much easier to add async to colly than to write with aiohttp. I am going to do some testing!
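As a rough illustration of how little it takes to switch colly to async - the URLs below are placeholders:

```go
package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	// Turning on async in colly is a one-line option plus a Wait() at the end.
	c := colly.NewCollector(colly.Async(true))

	// Cap concurrent requests so the target site isn't hammered.
	c.Limit(&colly.LimitRule{DomainGlob: "*", Parallelism: 4})

	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		fmt.Println(e.Attr("href"))
	})

	// With async enabled, each Visit returns immediately and the
	// requests run in parallel.
	for _, url := range []string{"https://example.com/page/1", "https://example.com/page/2"} {
		c.Visit(url)
	}

	c.Wait() // block until all queued requests finish
}
```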
@@JohnWatsonRooney Amazing! That would be an interesting video... 👀
@John Watson Rooney
Would be great to see a video comparing the speed of an async Go web scraper and an async Python web scraper. I am starting a large web scraping project and would consider writing it in Go if it gave a performance advantage over Python.