The Ultimate Introduction to Web Scraping and Browser Automation

  • Published on 22 Jan 2025

Comments • 203

  • @RealSoundNow
    @RealSoundNow 7 years ago +10

    This was one of the most masterful tutorials I've seen on the web. Truly well done.

  • @SogMosee
    @SogMosee 8 years ago +50

    This was incredible. My mind is blown.

    • @glowingexit
      @glowingexit 7 years ago +3

      I know dude. That was like the sickest performance ever. I didn't even know that this was possible. Thank you, Jordan. You like so encouraged me to learn web scraping wow.

  • @greatgoats
    @greatgoats 6 years ago

    You made it relevant, relatable and instead of consuming and consuming endless hours of videos I actually started. Thank you.

  • @icvalkyrie
    @icvalkyrie 5 years ago +64

    "It's 2016"
    huh, weird hearing it from today

    • @glinskiadam
      @glinskiadam 3 years ago

      yes

  • @fares57
    @fares57 8 years ago +88

    That's a guy you don't wanna mess with!

    • @doug4328
      @doug4328 6 years ago +3

      One guy google doesn't wanna fuck with

  • @0xbitbybit
    @0xbitbybit 7 years ago

    Nicely done. It was actually a pleasure to watch someone coding at a non-glacial pace and getting shit done that is actually useful. Awesome vid!

  • @BeXSive
    @BeXSive 8 years ago +3

    I subscribed after the AWS Rails overview. Looking forward to going through the other videos you have. Keep it up, there's tons I'd love to learn.

  • @shashanko
    @shashanko 7 years ago

    I hardly ever comment on YouTube videos... but this got me hooked, and this bloke got me to type this: "Good on ya, mate! Fair dinkum"

  • @seohelpers5013
    @seohelpers5013 6 years ago +3

    I really like the way you teach scraping and web automation. Loved it!!!

  • @mickelodiansurname9578
    @mickelodiansurname9578 7 years ago

    well presented friend, and you clearly not only know yer stuff, which is important, but you damn sure know how to explain it simply with no nonsense. Subscribed

  • @tommychen3708
    @tommychen3708 4 years ago

    One of the best tutorials I've ever seen

  • @BruchinKr8t
    @BruchinKr8t 7 years ago +3

    Wow, I was searching to find out how to compare food prices across different shopping store sites. This is a lot of info to digest.

  • @abdulsalamfarhat4999
    @abdulsalamfarhat4999 6 years ago

    I'm facing this issue at 4:48:
    Errno::ECONNREFUSED: Failed to open TCP connection to 127.0.0.1:9515 (Connection refused - connect(2) for "127.0.0.1" port 9515)
    from /home/**/.rbenv/versions/2.5.1/lib/ruby/2.5.0/net/http.rb:939:in `rescue in block in connect'

  • @10tmccarthy
    @10tmccarthy 8 years ago +11

    This video blew my mind, thank you so much man!!

  • @fiorettivideos
    @fiorettivideos 8 years ago +1

    Awesome video, man. For the last week I've been struggling to scrape data from Wikipedia tables (damn Wikipedia), and your video gave me new ideas for approaching my problem!

  • @norcaljohnny
    @norcaljohnny 7 years ago

    I cannot remember a YT video that kept me interested for 28 min! Great tut, Jordan! :D

  • @lawlietnick
    @lawlietnick 7 years ago +1

    You're the man. Love how you explain all the stuff.

  • @poshpeanut6029
    @poshpeanut6029 8 years ago +17

    Hi Jordan.
    If I want to learn coding and scraping what courses and books would you recommend?

  • @Shaquelkothari
    @Shaquelkothari 5 years ago

    Thank you for this video. You have no idea how much you have opened my mind, and this will be used on my project. Thank you, and I honestly hope your life gets better and better.

  • @hellmutmatheus2626
    @hellmutmatheus2626 7 years ago +42

    Sir, where is your patreon?

  • @YogeshDesai1293
    @YogeshDesai1293 7 years ago

    Thank you. I am not a Ruby developer but got the required knowledge out of this video. It's a great video Dude.

  • @amulpatel
    @amulpatel 8 years ago

    fucking incredible... please keep doing these and keep them under 30mins

  • @AceOfSpade888
    @AceOfSpade888 7 years ago +5

    Excellent intro! I'm really impressed!!! Your choice of Ruby, which I haven't worked with much previously, proves to me I'm missing out on some SERIOUS tools in my toolbox! THANKS!!!

  • @harshal13
    @harshal13 7 years ago

    Excellent video on web scraping! Regardless of the language you want to use for web scraping, you should watch this video to see what is possible, and you can try similar things in Python, JavaScript, etc. That said, after seeing this video I am now very much encouraged to learn Ruby.

  • @prakerr9155
    @prakerr9155 7 years ago

    you are awesome!! so fluent in coding and clear at explaining.

  • @MrPaglynn
    @MrPaglynn 7 years ago

    It's insane how clever this is

  •  5 years ago

    This is fun if I don't need to deal with a lot. Actually in the office when we are angry we curse each other like "I hope you do web scraping and regular expressions for the rest of your life!"

  • @leeroyescu
    @leeroyescu 6 years ago +1

    18:02 Is there an equivalent trick for React apps?

  • @Privacy-LOST
    @Privacy-LOST 7 years ago

    If my tech college teachers were like this guy I wouldn't have skipped classes and learned to code by myself.

  • @porroapp
    @porroapp 8 years ago

    Very cool intro to scraping. Your content is awesome!

  • @i.mahdihosseini
    @i.mahdihosseini 7 years ago

    Best web scraping video ever! Thanks!

  • @Hexarmin
    @Hexarmin 6 years ago

    Hey, thanks! Could you please tell me about your setup? In the beginning, what are you typing into? Is that a program, a terminal, or something else? I'm still new, trying to get into it for my journalism.

  • @akashdeshbhratar6704
    @akashdeshbhratar6704 6 years ago

    Bro I am waiting for your further videos. You are insane really. 🍾

  • @nemnoton
    @nemnoton 8 years ago

    Wow, advanced stuff, very good!

  • @ouza1430
    @ouza1430 6 years ago

    I cannot run the Rails console. Can you help? I keep getting messages that Fixnum is deprecated and Bignum is deprecated.

  • @usgrant631
    @usgrant631 8 years ago +1

    Anyone else get an issue with browser.html breaking their console? So that every subsequent command returns "(END) (pry) output error: Interrupt"?

    • @DarkSolidity
      @DarkSolidity 3 years ago

      Did you ever figure this problem out? It’s the wall I’m currently hitting.

  • @TheTylerScott14
    @TheTylerScott14 8 years ago

    So for the Phantom browser, do you need to call browser.close when it's done scraping to free up memory? Or is that a non-issue?

    • @Krisler12
      @Krisler12 8 years ago +1

      Smart websites can detect PhantomJS. The headers aren't exactly like Chrome's.
      One good example is flashscore.com. How do you deal with that? How would you scrape it? Thanks!

  • @JordanShackelford
    @JordanShackelford 7 years ago

    I'm using Selenium with Python and the Chrome WebDriver. Whenever I load the page I'm trying to scrape/automate, I get a popup by the address bar telling me the website wants to store files on my device and asking whether I want to allow it. Anybody know how to handle that?

  • @geo2465
    @geo2465 8 years ago

    Hi! Thanks for the videos. I was trying to scrape Craigslist, but they blocked my IP address and now I can't search their site. Is there any way to avoid this issue?

  • @remusomega
    @remusomega 7 years ago

    How is scraping and then inserting a JavaScript API a threat? Whatever gets executed would only be seen on my client side. What's the use case for this type of attack?

  • @verky56
    @verky56 5 years ago

    How do they find out that you accessed sensitive data? Do they track this type of activity?

  • @Sorazal23
    @Sorazal23 7 years ago

    Amazing value dude, he's a talented tutor too

  • @bogdanpop4512
    @bogdanpop4512 8 years ago

    Cool video. Had no idea you could use it to your advantage like that. Keep it up!

  • @tooshlong
    @tooshlong 5 years ago

    Lol this was all gobbledygook. Yet addicted to watching it all. I still don't know why. Sigh. Back to googling, "scraping for thickos".
    Gonna bookmark this for when I understand more cos mind blown.

  • @chainer22
    @chainer22 7 years ago +1

    Thank you very much for this video, Jordan! What are the options for storing the data that's scraped? Heroku? MongoDB?

  • @meegz149
    @meegz149 7 years ago +12

    You Mr. roboted the fuck out of the NBA webpage.

  • @ZennerBear
    @ZennerBear 7 years ago

    How did you exit at 6:31?

    • @ZennerBear
      @ZennerBear 7 years ago +1

      Thanks!

    • @DarkSolidity
      @DarkSolidity 3 years ago

      @ZennerBear I'm trying to figure that out also. Did you get it figured out?

  • @jandynotaloca
    @jandynotaloca 3 years ago

    Is there anything like watir for C++?

  • @7mattallmighty
    @7mattallmighty 7 years ago

    What's the programming software that he is using? I installed Rails with Ruby and it doesn't look the same.

  • @gambit6431
    @gambit6431 6 years ago

    Installed everything and somehow got lost... tmux gave me problems with the install. Any pointers? I installed all the gems and was trying to get to the tmux window you are in and "farting sound".

  • @lexclock
    @lexclock 7 years ago

    Excuse me, how can I use those Ruby functions with a web page? With JS, you would put them in the script section. How can I link the HTML with Ruby? Can I? Thanks :)

  • @norcaljohnny
    @norcaljohnny 7 years ago

    Going to look now but... do you have something on proxy to get m3u8?

  • @warrenhall1750
    @warrenhall1750 7 years ago

    Fun to watch a super-coder in action. Quite a programmer/analyst.

    • @TheDJLiquify
      @TheDJLiquify 7 years ago

      Yeah, check out his Ethereum videos. Pretty fun.

  • @hendroyohanes4295
    @hendroyohanes4295 6 years ago

    I did my web scraping project with PHP cURL and it's pretty easy to work with. But these Ruby libraries look promising!

  • @FrankenPC
    @FrankenPC 8 years ago

    So, let's say I've got the array I want from a table. How do I write it out to a local CSV file for, say, pickup by a warehouse? Windows for me.

  • @gastonpeila5271
    @gastonpeila5271 7 years ago

    Thanks so much!! i was looking for something like this for some weeks...

  • @cgongoram
    @cgongoram 7 years ago +1

    Hello man, please tell me how you get control of the Ruby console back after storing the Nokogiri parse result in the doc variable. I can't quit viewing the Nokogiri variable's contents... Thanks, you did a great job with this tutorial.

    • @DarkSolidity
      @DarkSolidity 3 years ago

      I'm having this problem also. did you get it figured out?

  • @apx622
    @apx622 7 years ago

    Thanks for your tutorials!
    Question: Say you've written a web scraper (like in one of your videos) that runs great from CLI - how would you then deploy that so it runs 24/7 in the cloud?
    And what would the tech stack / tools/libraries on that look like?

  • @sparshbijawat200
    @sparshbijawat200 8 years ago

    Which is the "NAnana" in your video?
    And what level of HTML should one know to do web scraping?

  • @phantasmlab9850
    @phantasmlab9850 8 years ago

    Good video. Don't stop making Ruby videos.

  • @Luftbubblan
    @Luftbubblan 7 years ago

    Cool stuff dude :)
    Damn, i would never take you for the skillset you have.

  • @carrier_pigeon214
    @carrier_pigeon214 8 years ago

    Absolutely great video.

  • @cgtull
    @cgtull 7 years ago

    I'm unable to get Ruby on Rails working correctly on Windows. I'm behind a proxy and have my http_proxy variable set.
    Running bundle install gives:
    The dependency byebug (>= 0) will be unused by any of the platforms Bundler is installing for. Bundler is installing for x86-mingw32 but the dependency is only for ruby. To add those platforms to the bundle, run `bundle lock --add-platform ruby`.
    Fetching source index from rubygems.org/
    Retrying fetcher due to error (2/4): Bundler::HTTPError Could not fetch specs from rubygems.org/

    • @zjevander9739
      @zjevander9739 7 years ago

      After adding all the gems to your Gemfile, run Bundler with:
      bundle install --full-index
      (note the two "-" (hyphens) before the word "full")

  • @ezscootrr
    @ezscootrr 7 years ago

    I still have a lot to learn. This is very helpful. Thanks, dude.

  • @TodorImreorov
    @TodorImreorov 8 years ago

    Cool tutorial! I did a similar thing for site crawling and data input automation. But I did it in a very lame way: using AutoHotkey with the COM object, running IE as the browser. Needless to say, I wanted to get away from both and do it with less overhead, and your solution seems to be it. But I don't know Rails very well, or where to even start. It would be nice if there was a similar tutorial, but using Python.

  • @ThePandaGuitar
    @ThePandaGuitar 7 years ago

    This is gold. Thanks a lot!

  • @tomrafter9432
    @tomrafter9432 7 years ago

    I'm getting => # instead of url'=data' title='data'
    Also, when I use browser.goto("google.com") I get Errno::ECONNREFUSED: Failed to open TCP connection
    Any ideas?

    • @moshohet
      @moshohet 7 years ago

      Hi, I got the connection refused error too. Have you managed to solve it?

    • @moshohet
      @moshohet 7 years ago

      I found the solution. You, like me, must have downloaded and installed chromedriver v2.9. I found a post on Stack Overflow from someone who had the same problem as we did. The solution he found was to delete chromedriver v2.9 and instead download and use chromedriver v2.24. I did that, and it works like a charm so far. You also need to remember to add the path where you placed chromedriver.exe to your system's environment variables. Happy scraping :)

    • @edenweb370
      @edenweb370 7 years ago

      Thanks for that info. Do you have a link to instructions on how to remove chromedriver from /usr/local/bin/? I have no idea how to do it.

    • @edenweb370
      @edenweb370 7 years ago

      +Moses Shohet
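
A minimal Ruby sketch of the PATH check implied by the chromedriver fix in this thread. It uses only the standard library. The helper name `find_on_path` is hypothetical, and the sketch assumes the driver binary is named `chromedriver` (Windows suffixes are taken from `PATHEXT` when that variable is set). It only confirms that the executable is discoverable, not that its version matches the installed Chrome.

```ruby
# Hypothetical helper: return the full path of the first executable called
# `name` found in the PATH-style string `path`, or nil if there is none.
def find_on_path(name, path = ENV.fetch("PATH", ""))
  # On Windows, PATHEXT lists executable suffixes (e.g. ".EXE"); elsewhere it is usually unset.
  suffixes = ENV.fetch("PATHEXT", "").split(";") + [""]
  path.split(File::PATH_SEPARATOR).each do |dir|
    suffixes.each do |suffix|
      candidate = File.join(dir, name + suffix)
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

# Assumption: the driver is installed under the name "chromedriver".
location = find_on_path("chromedriver")
puts(location ? "chromedriver found at #{location}" : "chromedriver is not on PATH")
```

If the helper returns nil, the directory holding the driver probably still needs to be added to PATH, which is the last step the comment above describes.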

  • @AnsikMahapatra
    @AnsikMahapatra 7 years ago

    Thanks a ton! This tutorial helped me a lot. Cheers!

  • @br1900s
    @br1900s 7 years ago

    This was great. Do some of these websites track scrapers and try to stop them?

  • @rahatsaquib4190
    @rahatsaquib4190 7 years ago

    How can you scrape a website that has DDoS protection in place?

  • @kvy55
    @kvy55 6 years ago

    thanks that was very useful information you made my life easy

  • @mattotoole4327
    @mattotoole4327 8 years ago

    Great tutorial. Good job.

  • @daweiliu6452
    @daweiliu6452 7 years ago

    amazing! very cool and useful

  • @milanshrestha7802
    @milanshrestha7802 7 years ago

    Which language is best for web scraping?

  • @j.l.9922
    @j.l.9922 8 years ago

    How old are you, and how did you learn Ruby and JavaScript in depth, if you did? I am stuck with beginner courses on these languages. How far you can go with them depends heavily on your ability to master them, and I haven't yet.

    • @samm4885
      @samm4885 8 years ago +1

      Hi, I am 15 and I can tell you that Ruby was useless. Do not waste your time like I did. Learn JavaScript in depth and learn functions and algorithms, then I suggest looking into C#.

    • @j.l.9922
      @j.l.9922 8 years ago

      Sam M thanks! I just ordered 3 JavaScript books from Amazon. You're awesome, I can tell you have a lot of potential. Many hedge funds would probably like to have great programmers like you on their team! I have an MIB and some professional background, and I read everywhere how guys like you are in demand! Keep going, man!

    • @ilyadol
      @ilyadol 8 years ago

      Node.js is going to be huge.

  • @touchesoftwares9604
    @touchesoftwares9604 6 years ago

    Wow, this seems a lot easier than Python.
    I am using this for email building and a local business extractor.

  • @nastastic
    @nastastic 8 years ago

    Hi Jordan, love the vid by the way. I'm a bit stuck, if anyone could help: every time I run browser = Watir::Browser.new(:chrome) I get "NameError: uninitialized constant Watir
    from (pry):2:in `__pry__'". Any idea what I'm doing wrong? Am I missing part of the Ruby config? Is the Watir gem or the driver not installed properly? Thanks

    • @nastastic
      @nastastic 8 years ago

      Update: Could not find gem 'Watir' in any of the gem sources listed in your Gemfile or available on this machine. I followed the steps by adding gem 'watir' to the Gemfile. Still can't fix this :(:(

  • @fares57
    @fares57 7 years ago

    btw, What is this terminal?

  • @JM-fp3gf
    @JM-fp3gf 7 years ago +2

    Good video, I SMASHED that like button

  • @insanemarcwhite
    @insanemarcwhite 7 years ago

    What console font is that?

  • @funguy29
    @funguy29 7 years ago

    scholar and gentleman...

  • @sushantbhandari7948
    @sushantbhandari7948 4 years ago

    This guy is a fkin legend.

  • @Stl71
    @Stl71 5 years ago

    Nice tutorial. I'm not familiar with Ruby, any advice for Java?

    • @lagz89
      @lagz89 5 years ago

      Do it in Ruby and make a web service consumed by Java.

  • @ayrtondumas6968
    @ayrtondumas6968 8 years ago

    Is Ruby the best language for web scraping?

    • @le3ronjam3s43
      @le3ronjam3s43 8 years ago

      Do you know of any weaknesses of Python compared to Ruby?

  • @solvm1652
    @solvm1652 8 years ago +1

    Killer tut! Sooo good.

  • @ifudiscusswithmeurprobably7273
    @ifudiscusswithmeurprobably7273 7 years ago

    +1 Speed good. Content good. Explanation good. I like your style. I would still have liked more information about deployment. Could I build a GUI to wrap that code and configure it to run on, say, Tuesdays from 2 PM to 3 PM?

  • @VinodChandaliya
    @VinodChandaliya 8 years ago

    It was an amazing video on web scraping. Just one question:
    is it possible to gather data from e-commerce sites through web scraping?

  • @abiodunolorungbemi5532
    @abiodunolorungbemi5532 2 years ago

    Thanks for sharing the knowledge. Please, can you teach me how to scrape soccer odds from oddsportal?

  • @Schimshamity
    @Schimshamity 6 years ago

    This was dope. Thanks.

  • @engelshentenawy
    @engelshentenawy 7 years ago

    This is actually awesome (I need to know some more Ruby, though)

  • @devvvvvvvvvvvv
    @devvvvvvvvvvvv 8 years ago

    Interesting. So is this like headless chrome or is that something different?

  • @ryanheffel
    @ryanheffel 6 years ago

    This was sincerely incredibly helpful. I'm also a huge NBA fan and want to do a site for scouting. Is email the best way to contact you for business?

  • @jaigohil4963
    @jaigohil4963 6 years ago

    Bro you are awesome!

  • @Hexarmin
    @Hexarmin 6 years ago

    May I ask which program you are writing all of this in?

    • @Hexarmin
      @Hexarmin 6 years ago

      Never mind, just figured it out. Writing in Rails... link to Atom... Watched the video three times. Like I said: "I'm still new."

  • @alessandrodimich4914
    @alessandrodimich4914 7 years ago

    Great video!

  • @jakwire
    @jakwire 6 years ago

    you make this look so easy -- #badass

  • @N0__Name__
    @N0__Name__ 7 years ago

    Guys, I'm gonna start with this video, so can someone tell me what IDE he is using?

  • @galustbayburcyan1083
    @galustbayburcyan1083 8 years ago

    It's very cool. Thank you for the video.
    Is it possible to scrape the odds (coefficients) from the Bet365 mobile version? (The standard website, bet365.com, is a Flash site.) I tried to do that following your example, but it did not work.

  • @soufianta8374
    @soufianta8374 3 years ago

    Hoo. Ruby is underrated!! A lot of strong libraries seem to be there!! OK, question: is Ruby still an excellent starting point for learning web development and coding in general? It's a high-level language like Python, but Python seems to be far ahead of Ruby in popularity!! Thanks

  • @sebastianpalma5051
    @sebastianpalma5051 8 years ago

    Have you tried Capybara for scraping with Ruby? Btw, really good video.