I spent hours trying to find nested divs in a website I needed to scrape today and I didn't know you could just copy the xpath from the inspector. This video is a godsend lol.
XPath will work but consider it a last resort. It is fragile and will likely break if the order of tags on the page changes.
@@LifePointeChurch1616 A unique ID is best if you have one on the page.
I had a webscraper project for the covid quarantine and this is the first tutorial out of 5 that was helpful! Thanks a bunch!
Glad you found it useful!
Hey John, Whenever I have to do anything with web scraping or interacting with web browsers I always watch your videos.
Keep up the good work!
Great video. We need another video on how to avoid bot detection with selenium
This tutorial saved me in ways you cannot comprehend.
Thanks I’m glad I could help!
Nice, simple, and helpful. Thanks for keeping the video so easy to learn! 🤗
Great content, love your vids, can't wait for more. I am starting Python and some of your videos have helped me. Keep up the good work!
Glad I could help! More videos to come!
Great videos John.
Sir, can you make more videos about web scraping using Selenium? I like the way you teach. I love your channel, from the Philippines. Simple, clear, and short.
Great video. I am going to use it to get bond prices. There are so many relationships to monitor that the only possible way to do it is using a Python program. Thanks so much.
Came here for something like implicitly_wait, wasn't disappointed! Thanks for the video.
Thanks for the videos, Mr. John.
My pleasure!
Note: AttributeError: 'WebDriver' object has no attribute 'find_element_by_xpath'
Reason: driver.find_element_by_xpath() has been removed from Selenium.
Instead, we can now use: driver.find_element("xpath", '//*[@id="username"]')
Hope it helps.
thanks!
Thanks Buddy
How about send_keys? I can't find that one either.
Simple and useful. Thanks.
didn't know it could be so easy! Now I have to give this a try :)
This stuff is really gold !! Thanks for this video
Thanks John it really helped me
00:57 - why install it for a separate user account only? Sorry for the n00b question but I'm guessing this is a security concern? Are you wanting to ensure that admin accounts cannot run it?
Thanks for your useful video
Thank you John, great work, again
Hello, sorry if this is a dumb question, but what do I do if the script opens the webpage but then immediately closes it, along with the browser?
Thanks man. Very useful. Hopeful to get a scraping gig.
Very well explained, thank you.
Thank you so much for this helpful content.
Good job. You've just got a new subscriber!
Brilliant! Thanks so much
I really enjoy learning skills from your tutorials. May I share your video and take notes on it in my Medium article? I will credit where it came from.
Of course, please do!
thanks for the video!!
Not sure if you ever use Selenium on a Linux machine (RPi in my case), but I've found... it's needlessly difficult to get started. After a bunch of research, I found what might not be the best answer, best practices, etc... but it seems to work.
First you run [sudo apt install chromium-chromedriver] instead of installing a regular chromedriver, since finding a suitable chromedriver for Chromium seems to be a tedious process from questionable sources. Then, a Stack Overflow answer mentioned reconfiguring your webdriver options to set your Chrome path to actually point at your Chromium browser. I like it because it's hacky, and so am I.
Perhaps worth noting: it was a low-ranked Stack Overflow answer, but it was relatively understandable to a layperson such as myself. I wondered if you knew a better way, or any reasons I should not use this method. Or possibly this is helpful information to you? In any case, I wanted to pass it along.
I did once wrestle with it to get it to work well on a Linux server, which involved creating fake displays and everything. I copied so many commands that I had no idea what they really were... but it did work. I don't use Selenium much now, but if I have to, I go headless and use Helium (this works on WSL2). The only downside is it won't run in non-headless mode, so there's no browser popping up to see what's going on. Check it out if you haven't already: github.com/mherrmann/selenium-python-helium
@@JohnWatsonRooney I've started using Helium since you did the tutorial a while back. Kind of a magic module, though I haven't been able to make executables out of .py files that run Helium, so I'll sometimes go back to Selenium if needed.
In terms of running selenium on Linux, in my case I'm just using the pi as a day-to-day machine so I don't have to worry about virtual displays, but I did run into that chromedriver issue from my comment above, so I figured I'd mention it. I'm taking a Data Science course, and someone was setting up a scraper with Selenium, which led me down this rabbit hole.
big fan bro
How do we automate when there's a captcha before login? Please guide.
Hi John, just wanted to know: is there any way to scrape hidden div tags/elements using Playwright, BeautifulSoup, etc.?
Thanks
Thank you so much sir
I wonder if I can use this to scrape videos from a site that locks when you open the developer tools?
Like, on the site if you open them it will hit a breakpoint and then navigate away.
Super useful
Keep it up! :)
thanks!
When I load the URL and the Chrome browser appears, how do I deal with the prompt asking me to sign in to an account?
Thank you!
Just subscribed, and wondering where I can find the new or updated content. Thanks!
Haven't used Selenium yet. Can it be used to interact with programs at work, to make the job easier by automating some tasks?
It can control a web browser so as long as your work programs are online then yes it absolutely can
@@JohnWatsonRooney No, they're local to the business. Maybe I can ask permission from the company's software developers.
Brother, I used the same code but it doesn't put the credentials there. There is no error either. Any solutions?
So how do I keep the browser from closing right after it pops up? I need a few more commands for that.
Hello, I have a question for you. I sent a request to a login web page, but the server responded with a different page. I believe the server may be limiting too many concurrent connections.
Question: How can we solve this problem? I would like to ask for ideas, sir.
How would you ask for input to accommodate OTPs while logging in to a URL? The OTP is different every time, so it can't be hard-coded. :-(
Do you know how to automate multiple-choice questions using Selenium and Python, where the questions change every time the course is opened?
Hey, the //*[@id etc. is underlined and says "expected expression".
Which IDE is this done in? Can we do it in VS Code?
Man thank you very much
Great ! Thanks
My handsome John, before I watched this, I was trying to install the Chrome webdriver using books. It took me 2 hours on Google to find the default path. What can I say? I should have watched your video sooner.
Can I crawl videos from the web? Is there any tutorial video?
Does anyone know if there is a way to use Selenium on an already existing tab (so I don't have to sign in on the window it opens)?
After executing the first line of code, Chrome closes immediately. Any idea on how to keep it open?
I know I am late, as I have just come across this video. You can add the following code before opening the URL. This code configures the Chrome options for the webdriver:
options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(options=options)
Maybe you can continue the Selenium series with some other videos? :)
Sure I’ll look into doing more
@@JohnWatsonRooney thank you!
I was able to run the entire code except the last print function. Can someone help?
erv2 $ sudo apt install -y chromium-chromedriver
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package chromium-chromedriver
How do I get this on my RPi 5 (ARM, Raspbian 64-bit)?
So far, I was able to use this tutorial to get through part way of a job interview coding challenge.
The challenge:
- using selenium,
- log into a website,
- where occasionally visiting the website results in an error,
- and grab the contents of a table,
- which only appears after a loading element,
- which also occasionally results in an error,
- and output the data into a CSV.
So far, I have been able to log into the website just by using this Pt. 1. However, I am only now noticing the errors they have planted in the challenge, and I am unable to use XPath for the table since they have given multiple elements duplicate ids (the same id value) just to screw with me. On top of that, I am currently logging the loading data even though I have implicitly_wait(20), so I am going to have to look into that. Hopefully Pt. 2 gets me closer to getting hired! Cheers
Glad it's helped you! Depending on the site, you might find that after logging in you can call driver.page_source and parse the data with BS4 - this could be quicker and easier.
@@JohnWatsonRooney Thanks a lot! I was able to get the job done with Selenium and pandas. I had to punt on fault tolerance, sadly. Do you think you'll ever make content regarding error handling with web scraping?
@@jonathanhammond5563 Oh, that's great. Yes, I am planning on doing a Data Cleaning and Error Handling video, but it's such a wide subject it's taking some time to get the ideas together.
@@JohnWatsonRooney That is amazing! I will definitely be learning from that one. Not sure if this helps, but my employer stumped me with three or four errors:
- Sometimes the page failed to go directly to the login. I wasn't sure how to refresh the page and kept getting an infinite loop.
- Sometimes the error (basically a div) would appear immediately after logging in, before the "loading" sequence. This again seemed like a refresh might have been the solution.
- Occasionally, after the "loading" sequence finished, an error would also appear there. All three appeared to be reload related, but I wasn't sure how to fix them.
- Finally, an "error" of sorts was that multiple elements had the same id. This was also by design, simply to be a pain in the neck.
Maybe some of those ideas will help! Either way, you've been a big help, and I will tell many people about your channel now that I know about it and its amazing quality.
Have a great week
@@jonathanhammond5563 thanks really appreciate the suggestions!
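All three "reload related" failures described above fit one pattern: try, catch, recover (refresh), try again. A language-level sketch of that retry loop - `retry` and its arguments are illustrative names, not Selenium API:

```python
import time

def retry(action, attempts=3, delay=1.0, on_fail=None):
    """Call action(); on an exception, optionally recover and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            if on_fail is not None:
                on_fail()  # recovery step, e.g. refreshing the page
            time.sleep(delay)

# With Selenium this might wrap each flaky step (not run here):
#   retry(lambda: do_login(driver), on_fail=driver.refresh)
```

Bounding the attempts is what prevents the infinite refresh loop mentioned above: after the last failure the exception propagates instead of looping forever.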
Can't install it from the command line, and you didn't put a link to the site.
This is for Linux - how do we do it on Windows?
I used Selenium to click through a page. I need to use my Chrome cache and cookies, but Chrome runs as test software. Please guide me.
Selenium is a nightmare to get started with. No solution is working for me.
It’s not very beginner friendly although very powerful. Check out my videos on helium and playwright - very good alternatives!
How do you actually start scraping after login with Selenium - is that possible? And where is part 2 of this video?
I'm looking for some login-scraping content, but I'm having a hard time with this link: th-cam.com/video/cV21EOf5bbA/w-d-xo.html
because the login/security request I found in the network tab is not working. =/
Once you log in, you can either scrape elements with Selenium or get the entire page source with driver.page_source, because sometimes it isn't visible if you use requests.
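Once you have that driver.page_source string, any HTML parser can take over. A minimal sketch using only the stdlib html.parser (BS4 is the more common choice; the sample HTML below is invented for illustration):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect every href attribute from <a> tags in an HTML string."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href")

# In a real script this string would come from driver.page_source.
page_source = '<html><body><a href="/secure">Secure Area</a></body></html>'

parser = LinkCollector()
parser.feed(page_source)
print(parser.links)  # -> ['/secure']
```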
*.find_element_by_xpath()* is no longer supported in the latest version. You need to use *.find_element("xpath", ' ')* instead.
So it would look like *driver.find_element("xpath", '//*[@id="username"]').send_keys('tomsmith')*
Thank you for this very useful video!