I made this video when I had 4 Subs. Last week rolled over 10k, thank you all!
75k+ now haha
@@hunterbshindi2310 yeah can't quite believe it!
@@JohnWatsonRooney 91.5k Keep it up brother😁
Love how you kept this video short and concise. I already spent 3 hours on tutorials scraping with requests and bs4, all to discover I need to scrape with Selenium for my particular site anyway.
Quick sharing of my story before I say a huge, enormous THANK YOU 😊 🙏 I started a new job a few months ago and I had to do web scraping and stuff, which was so new and terrifying to me. Thanks to your videos I managed to go for it and do tasks which used to be completely beyond my understanding. So, John, THANK YOU A MILLION TIMES for your efforts and the work put into this channel.
Thank you! I’m glad I was able to help, and good luck with your new job!
I am a total beginner in coding and I find your videos very helpful, thank you
This is by far the best tutorial on Selenium, and a cool tip on pandas!! Thanks @John
Hey John, I hardly ever comment on videos, but I wanted to let you know that this was exactly what I was looking for! And only 11 minutes? Great job, and thank you!
Thank you very much!
Seriously, this is a fabulous video. You explained it very well. Please do a full series on web scraping. 😊
Homeboy! With this simple video you helped me sort out almost every question I had after hours of useless content! 10 out of 10! You just gained a new sub.
Thank you John for this, extremely helpful stuff. You explain everything so well, it makes me very excited to practice along. Also please consider recording another video showing a more complicated use case of browser automation using Selenium. Cheers !!
Agreed, this is the best, clear, practical and to the point tutorial out there. Thanks so much!
What a short and quick explanation of scraping with Selenium. You rock! Found this by searching, and just hit the subscribe button. Thank you!
Thanks, glad you liked it!
Man, I was so happy when I found your channel. Very clear explanations - this is a true gold mine for me!!! Thank you!
Thanks!
Great content to learn web scraping. Not discovered by many yet, will be a huge hit
thank you
Wonderful! Thanks man!
I especially liked how you started from each video 'catalog' and then iterated inside for each video.
In the future, would love to see more complicated examples, for example how to go to a window, collect data, close the window, and things like this...
Thanks for the feedback! I agree a more advanced video would be a great idea
Super helpful! Allowed me to finish my project for work! Thank you!
Great Video as always John, appreciate you taking the time
Thank you John - exactly what I needed and well explained in this short format. Excellent!
Great explanation of each point. Sir, please make a full series on web scraping including some advanced stuff, thanks!
Rookie mistake: when creating the CSV file from the pandas dataframe, I accidentally put .py instead of .csv and the results ended up overwriting the entire script since it had the same name hahahaha. Luckily I was able to Command-Z it to undo.
Undo is an essential part of my toolkit!
Great content as always. Short, precise, and clear.
Thank you so much John. Appreciate a lot 🙏🙏🙏💜
Your videos are so easy to follow and learn from, thank you a lot for them. I followed along with the Selenium and Scrapy series and that was just the jump start I needed to start scraping everything I needed. Thanks again!
The best selenium tutorial I've seen online
Can I make a private enquiry?
that period before the xpath is such a good tip!
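For anyone wondering what that period does, a minimal before/after, assuming `video` is one element from the find_elements loop in the video:

# without the dot, the XPath searches the whole document, so every
# iteration returns the first matching title on the page:
title = video.find_element_by_xpath('//*[@id="video-title"]').text
# with the dot, the search is anchored to `video` itself, so each
# iteration returns that video's own title:
title = video.find_element_by_xpath('.//*[@id="video-title"]').text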
You gave us a very clear idea of how we can use Selenium.
Thank you brother
So easy to understand, BEST selenium video !
Awesome video! One note: the find_elements_by_* methods are deprecated in Selenium. The new method is find_element(), with an additional argument specifying the locator strategy (e.g. 'xpath', 'class name', etc.). Additionally, I can't seem to scrape more than 30 elements with this method, is there a reason why?
Facing the same issue
same
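For anyone updating the tutorial code, a rough sketch of the newer Selenium 4 call style (the selectors are the ones from the video and may have changed since; the 30-element cap is most likely the lazy-loading issue covered by the scrolling discussion further down):

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.youtube.com/c/JohnWatsonRooney/videos')  # example channel URL
# old: driver.find_elements_by_class_name('style-scope ytd-grid-renderer')
# new: one find method, with the locator strategy as the first argument
videos = driver.find_elements(By.CSS_SELECTOR, '.style-scope.ytd-grid-renderer')
for video in videos:
    print(video.find_element(By.XPATH, './/*[@id="video-title"]').text)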
Hi bro, I'm getting an empty dataframe while scraping Amazon reviews using this code. How can I resolve this?
You're amazing, I spent 3 hours trying to figure out web scraping but you helped me solve the issue with just a short video... I'M SUPER THANKFUL
Damn! From 4 to 2.44k subscribers. Really good work !
this type of content is very rare sir...thank you sir
This video made me love Python even more ....subscribed !
Thanks John for sharing these videos and managing your time to do them, these are really helpful. I'm looking for all the Selenium videos you created, but found just 2 of them. I hope you help us again and create some new ones about Selenium at a higher level... thanks again!
Sir, your videos are amazing. They really helped me clear many, many doubts in scraping. Thank you so much. May God bless you!
Thank you!
@@JohnWatsonRooney Sir, if you can make a video on how to scrape a web page with infinite scroll as we move down, it'll be really helpful!
Thank you again for not making Selenium intimidating.
Good videos on Selenium.
Question - did you make that additional vid about doing headless Selenium on a Linux server? You mentioned that would be part 3 of the Selenium vids. Cheers!
I did say that! That one actually never got made - I am working on something else though that will do the same thing.
Mine is exactly the same, and I get no results, nothing, 0. I even did import time and added some delay so the page would open, because it is a heavy data page, and it did not work. Seems like this just won't work for me.
Thank you for sharing this video! I'm able to start my own scraping project from a website that has grid results like in your example!
You only had 4 subscribers? Nice to see your progression
Making some progress!
simple, effective, direct.
amazing job.
This is really helpful, thank you so much.
Good Work.
Perfect level for a beginner. Easy to follow and understand. Learned a lot, thanks.
Thank you so much, sir. That pandas dataframe technique is really helpful. I will share this video with my friends.
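For reference, the pandas step is roughly this, assuming the rows are collected as a list of dicts the way the video builds them:

import pandas as pd

# vid_list would be built up inside the scraping loop, e.g.
vid_list = [{'title': 'Example video', 'views': '1K views', 'posted': '1 day ago'}]
df = pd.DataFrame(vid_list)
df.to_csv('videos.csv', index=False)  # be careful to use .csv, not .py!
print(df.head())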
Hi, I am getting this error: AttributeError: 'list' object has no attribute 'find_element_by_xpath'
How do I solve this?
Really clear. However, "find" in my code gets only the visible elements, but if I scroll I can retrieve more elements... How can I scroll to the limit to get as many elements as possible? Thanks!
Went through this today. Well explained again. Using VSCode. When the browser (Firefox) opens there is a Google account dialog box, which caused the script to time out before the YouTube page loaded.
To overcome this I did import time and added time.sleep(5) after driver.get(url).
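The fixed sleep works; an explicit wait is a more robust variant. A sketch assuming the #video-title id from the video:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get('https://www.youtube.com/c/JohnWatsonRooney/videos')  # example URL
# wait up to 10 seconds for the first title, instead of a fixed sleep
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, 'video-title'))
)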
That's exaaaaaactly what I was looking for. Thank you mate!
Hello John! Thank you for all these videos. I've been working on some web scraping projects and have literally been on your channel all day! I will recommend that all my friends subscribe!
Quick question, for a website like glassdoor...what would you use for the XPATH for the company title and position? I can't seem to figure it out.
Great, straight to the point tutorial.
When I include the "." in front of the xpath I get the following error: Message: no such element: Unable to locate element. When I remove the dot I only get the information from the first video.
Do you know why this might be?
Thanks
Getting the same error when I try to scrape other sites. Any idea why?
I thought this would be the solution to my problem but it does not seem so. I am having trouble with Zillow and indeed and neither selenium nor Beautifulsoup works for me.
Brother u deserve 1 million subs
Great tutorial. Can you also make a video about following the links and extracting data from inside the link?
How long would this work for typically? Do websites change their divs often enough that would break this script?
Most established websites don’t change that often so usually good to go for a while, it’s just something to be aware of!
How would you go about grabbing the number of comments under each video? Would you need to click on one video and somehow grab the HTML tag for comments and pass it through a loop? I've tried doing something similar with Beautiful Soup and I'm running into a wall each time I attempt it.
When I set the variables in the for loop (e.g. title = video.find_element_by_xpath('.//*[@id="video-title"]').text) it only returns the first video. If I run the for loop and just say print(video.find_element_by_xpath('.//*[@id="video-title"]').text) I get all the videos. What's messing up when I set them to the variable title? I copied yours word for word.
I’m having the same issue, did you manage to find a fix?
Love this! ❤ Please make more videos on the latest Python modules.
This was so helpful man, THANK YOU. I didn't need the " . " in front, maybe they have changed that since this was posted. I did have to remove " /span " to stop a printout of my selected item, printing for each option though. ...Basically it printed the same name for each name in the list with " /span" still attached at end... I watched maybe 6 different videos, Thanks again! UPDATE: I had to put a break after the for loop to stop the repeated print out. Still learning 🙌
Hey thanks! I’m glad you got it sorted
Could you help me out? I copied the script exactly like in the video, but I only get one result. However, if I write print(video.text) in the loop, all the information and more comes out. I tried to delete the "span" you mentioned but it doesn't work for me :( - beginner as well here...
@@MrHi114 are you able to share the section of the code?
@@cyber_chrisp Sure, the loop is:

videos = driver.find_elements(By.CLASS_NAME, 'style-scope ytd-grid-renderer')
for video in videos:
    title = video.find_element(By.XPATH, '//*[@id="video-title"]')
    views = video.find_element(By.XPATH, '//*[@id="metadata-line"]/span[1]')
    when = video.find_element(By.XPATH, '//*[@id="metadata-line"]/span[2]')
    print(title.text, views.text, when.text)

If I run this loop, it gives me only the result of one video. But if I replace the print(title...) with print(video.text), it will give me all the raw data.
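If this is the same issue discussed elsewhere in the thread, those XPaths are missing the leading dot, so every lookup searches the whole page and returns the first video each time. A sketch of the corrected loop, assuming the selectors above still match the page:

videos = driver.find_elements(By.CLASS_NAME, 'style-scope ytd-grid-renderer')
for video in videos:
    # the leading '.' scopes each search to this particular video element
    title = video.find_element(By.XPATH, './/*[@id="video-title"]')
    views = video.find_element(By.XPATH, './/*[@id="metadata-line"]/span[1]')
    when = video.find_element(By.XPATH, './/*[@id="metadata-line"]/span[2]')
    print(title.text, views.text, when.text)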
Hi John, I am attempting to use this to scrape prices and titles of Target products, but when I attempt to do so, I get the following error message:
Traceback (most recent call last):
File "", line 2, in
title = games.find_elements_by_xpath('.//*[@id="mainContainer"]/div[4]/div[2]/div/div[2]/div[3]/div[2]/ul/li[1]/div/div[2]/div/div/div/div[1]/div[1]').text
AttributeError: 'list' object has no attribute 'find_elements_by_xpath'
Could you please point me in the right direction?
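That traceback usually means `games` came from find_elements (plural), which returns a list, and a list has no find methods or .text. A sketch of the two usual fixes (the long Target XPath is replaced with a placeholder):

# option 1: index a single element out of the list
title = games[0].find_element_by_xpath('.//div').text  # placeholder path
# option 2: loop over the list
for game in games:
    print(game.find_element_by_xpath('.//div').text)  # placeholder path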
Good Sir! What if I also wanted to extract the "href" from the video title, how can I do that?
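A minimal sketch for that, assuming `video` is one element from the loop and the title element is the anchor carrying the link:

title_el = video.find_element_by_xpath('.//*[@id="video-title"]')
print(title_el.text)                   # the visible title
print(title_el.get_attribute('href'))  # the URL the title links to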
Are there any significant differences in performance in finding elements between xpath, class, id, etc.?
Thanks for all of your videos!
Hello, I just found your channel and subscribed. On channels that have modal pop-ups for GDPR consent, etc., is there any way to use requests or do I have to use Selenium? When I use requests, the function never returns.
If the site is actively blocking the content on the page then unfortunately requests won't work, as we can't interact with the page. Maybe Selenium is the best bet.
How can I use Python to search for a list of serial numbers in a document column, employ a search toggle (similar to the YouTube search toggle), and subsequently extract the results obtained for each serial number?
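One way to approach that, sketched under the assumption that the serial numbers sit in a CSV column and the site has an ordinary search box (all selectors and file names here are placeholders):

import csv
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get('https://example.com')  # placeholder site

with open('serials.csv') as f:
    serials = [row['serial'] for row in csv.DictReader(f)]

for serial in serials:
    box = driver.find_element(By.NAME, 'q')  # placeholder selector for the search box
    box.clear()
    box.send_keys(serial)
    box.send_keys(Keys.RETURN)
    # ...then locate and store the results for this serial, as in the video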
OMG that little .text is exactly what I was looking for. Thank you sir you solved my problem
Hello, thanks for the nice tutorials.
In a video you showed the "requests_html" library for scraping dynamic content, so how feasible is it in comparison to Selenium? (I understand the power of Selenium), but I have a few questions:
1) Can we scrape big sites like YouTube or Amazon with requests_html, and get the dynamic data?
2) How does requests_html render JS in the background? You passed sleep=1 in the HTML session. So is requests_html also using Selenium in the background to render the dynamic JS data?
Please answer my queries if you know, it can help me figure things out.
Thanks very much.
Hi! Sure, hopefully this helps - Selenium is a tool designed for testing websites by automating a browser, such as Chrome or Firefox, but because it controls this browser we can use it to scrape data (by loading up the page). requests_html is designed for scraping and uses Chromium (the open source version of Google Chrome) to render dynamic content and give us access to the data. We can't use it for automation, but we can use it to scrape the sites you mentioned.
@@JohnWatsonRooney Thanks for the information. 😊
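To make the comparison concrete, a minimal requests_html sketch (the URL is a placeholder; render() runs the page's JS in the bundled Chromium, not Selenium, as John describes):

from requests_html import HTMLSession

session = HTMLSession()
r = session.get('https://example.com')  # placeholder URL
r.html.render(sleep=1)  # renders the JS via Chromium/pyppeteer
for el in r.html.find('#video-title'):  # find() takes a CSS selector
    print(el.text)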
When you right clicked and copied the path it was like watching the person who discovered fire. #gamechanger 👏👏👏
Sir, can I ask which type of data clients usually ask for?
Beautiful, you helped me revise everything in a matter of minutes. Thanks John. Wish you could make a video on crawling through a large list of URLs supplied by an Excel sheet.
Great suggestion!
How do you click on link elements such as "
When I used XPath in the loop, it keeps printing the first element instead of moving forward.
Thank you for this very useful video!
Hey John, many thanks, this video helped me a lot. I have a question: I'm trying to scrape clothes stores from Google search and get the name, address, and contact number. Will the way to do this be much different from how you did the YouTube scrape?
title = video.find_elements_by_xpath('.//*[@id="video-title"]').text
AttributeError: 'list' object has no attribute 'text'
Any ideas why that happens? I have to add [0] before .text to get a result.
Hey John, how long did it take for you to learn web scraping? (In hours)
Hi, I have one problem: I'm getting a 'no such element' exception. I inspected the elements and tried XPath, CSS selector, and id to get the element, but I'm still getting that exception. Can you help me understand why that exception occurs?
Thanks for the enlightening video. Well done!
I'm trying to scrape a webpage that loads a table in steps of 10 entries as you scroll down the page. This method doesn't load the entire HTML for me. There are 2000 entries in the table. How do I force it to load all entries?
Hi Glenn, you can get Selenium to scroll down the page for you, check out this stack overflow link and try some of their suggestions:
stackoverflow.com/questions/20986631/how-can-i-scroll-a-web-page-using-selenium-webdriver-in-python
@@JohnWatsonRooney Thanks! This solution kinda worked for me. The page I'm trying to scrape apparently had infinite scrolling so I went ahead with the given solution. But the results that loaded depended on the sleep time value. Like, I couldn't always have all results loaded but sometimes it did. I guess, it also depends on my network. Tried to look for a workaround but I gave up. Nevertheless, I'm good.
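For reference, the pattern from those answers boils down to a sketch like this, assuming `driver` is the webdriver instance from the tutorial; the sleep value is exactly the knob that network speed affects:

import time

last_height = driver.execute_script('return document.body.scrollHeight')
while True:
    driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')
    time.sleep(2)  # give the next batch time to load; tune for slow networks
    new_height = driver.execute_script('return document.body.scrollHeight')
    if new_height == last_height:
        break  # page height stopped growing - nothing new loaded
    last_height = new_height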
Very helpful explanation!! Question: what if I wanted to get data within that link? (The description or comments, for example.)
Hello, I am having an issue: when I run the code everything seems to be working fine, but the dataframe is empty. Any reason why this would happen?
A big thank you sir for helping out with this! I was able to follow through your video step by step; however, the output gave only 30 entries and the page that I used as a URL has more than 100 videos. What could be the reason for it, sir?
Hey John!
Good video, very comprehensible and well explained.
But I have a question: how do I scrape many different dynamic websites (10-15) with Selenium in parallel?
Because it's realtime data I want to scrape, it has to be simultaneous and not one after the other.
Hi! Good question. You can run several instances of Selenium if you wanted to, however it will be resource heavy as it's just running lots of browsers at the same time. I'd check if you really need to use Selenium - this is a video on a potential alternative method: th-cam.com/video/DqtlR0y0suo/w-d-xo.html
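If it really does have to be Selenium, a sketch of the resource-heavy approach John describes - one browser per site, run from a thread pool (the URLs are placeholders):

from concurrent.futures import ThreadPoolExecutor
from selenium import webdriver

urls = ['https://example.com/a', 'https://example.com/b']  # placeholder list of 10-15 sites

def scrape(url):
    driver = webdriver.Chrome()  # each worker gets its own browser
    try:
        driver.get(url)
        return driver.title  # placeholder: collect the realtime data here
    finally:
        driver.quit()

with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    for result in pool.map(scrape, urls):
        print(result)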
How do you determine when to use find elements by XPath vs find elements by class name?
This video is a bit older now, but I use exclusively CSS selectors now for everything. XPath is good, but I just prefer the way CSS selectors look and feel, if that makes sense!
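For comparison, the tutorial's XPath lookups written as CSS selectors, assuming the same ids still exist and `video`/By are set up as in the earlier sketches:

title = video.find_element(By.CSS_SELECTOR, '#video-title').text
views = video.find_element(By.CSS_SELECTOR, '#metadata-line span:nth-of-type(1)').text
when = video.find_element(By.CSS_SELECTOR, '#metadata-line span:nth-of-type(2)').text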
Hey John, thanks for your video, just a quick question: when I try to print(title, views, when), I only get one output instead of all.
Do you know what's wrong with my code?

videos = driver.find_elements_by_class_name('style-scope ytd-grid-renderer')
for video in videos:
    title = video.find_element_by_xpath('.//*[@id="video-title"]').text
    views = video.find_element_by_xpath('.//*[@id="metadata-line"]/span[1]').text
    when = video.find_element_by_xpath('.//*[@id="metadata-line"]/span[2]').text
print(title, views, when)
Hi John, I have been trying this method on Facebook, but I can't seem to get it to work.
Hello, and many thanks! Is there a limit (imposed by YouTube or otherwise) on the number of videos whose data can be obtained? I mean, for example, you have many more than 12 videos - would it be possible to extract the data for all of them, regardless of the quantity, or is there a limit?
Great video, I was following along for my url and get a successful run in my for loop but no output is shown even though I have it printing my output. Have any suggestions for this?
When you print the data you get around 10 video details. But how do you get all the video details? Please answer.
Very good video. A question: how can I stop the browser window from showing on screen while the script runs?
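If that means keeping the browser window from popping up, headless mode does exactly that. A sketch assuming Chrome (the URL is a placeholder):

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')  # run Chrome without opening a window
driver = webdriver.Chrome(options=options)
driver.get('https://example.com')  # placeholder URL
print(driver.title)
driver.quit()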
Thank you so much John.
Thank you sooo much for this tutorial! Can you also do LinkedIn profile scraping?
When I include the . with the XPath I get an error message (Message: no such element: Unable to locate element). However, when I don't use the . I get the same results for each element I am looping through.
This is good stuff... Can you get into more in-depth stuff like controlling scrolls and more complex JS-rendered pages? Sites like Netflix and stuff.
Sure! I’m a planning a more advanced version of this vid to come.
@@JohnWatsonRooney Along the lines of Yatish's question, do you know why mine only returned 8 results? What aspect of this script tells itself to stop scraping data? It doesn't end with an error per se, but it stops short of reporting all results. Is this in fact a scrolling issue? i.e. is driver.find_elements_by_class_name a function that can only scrape what the human eye could see on the page? BTW this is once again a fantastically explained and very helpful video. Thanks John!
Guys if you're having trouble, maybe it's the chromedriver.
Great video. very helpful.
Why does my script keep returning an empty list? What could I be doing wrong?
Could you perhaps do a video of how to use scrapy AND selenium 🤔 to scrape dynamic websites. I had a hiccup in a technical interview where I was asked to scrape a dynamic website and he wanted me to use scrapy with selenium. I've used both of them separately but never together. And I never got a call back. But it still has me wondering now how that would work.
Sure - I’ve done it with Scrapy and playwright, and have a video. There’s a package called scrapy-selenium I think that helps connect them together
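For anyone curious how they connect, a sketch based on the scrapy-selenium package John mentions, following its README (treat the exact settings and names as assumptions to verify):

# settings.py
SELENIUM_DRIVER_NAME = 'chrome'
SELENIUM_DRIVER_EXECUTABLE_PATH = '/path/to/chromedriver'  # placeholder path
SELENIUM_DRIVER_ARGUMENTS = ['--headless']
DOWNLOADER_MIDDLEWARES = {'scrapy_selenium.SeleniumMiddleware': 800}

# spider: yield SeleniumRequest instead of scrapy.Request
import scrapy
from scrapy_selenium import SeleniumRequest

class DynamicSpider(scrapy.Spider):
    name = 'dynamic'

    def start_requests(self):
        yield SeleniumRequest(url='https://example.com', callback=self.parse)  # placeholder URL

    def parse(self, response):
        # response now contains the browser-rendered HTML
        yield {'title': response.css('title::text').get()}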
Hey John, great video as usual! Do you have any tips on how to scrape a table without any class attributes? (I tried using this code with find by XPath as the top line before the for loop, but it only prints out the first element.)
Thanks a lot !
Same problem I'm facing
Great stuff. Have you ever incurred error connection timeouts when scraping different sites within a for loop? If so, how did you code/account for bypassing that site to the next one?
You could track how many attempts were made for each site, and after some threshold delete it from the queue completely.
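A sketch of that skip-and-move-on idea, assuming `driver` and a plain `urls` list:

from selenium.common.exceptions import TimeoutException

driver.set_page_load_timeout(30)  # fail fast instead of hanging forever
for url in urls:
    try:
        driver.get(url)
    except TimeoutException:
        print(f'timed out on {url}, skipping')
        continue
    # ...scrape the page as normal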
I've found that Selenium can get terribly slow while locating elements, especially when you try to locate by XPath. Finding by class name or tag name is seemingly faster. Still, it took about 15-20 mins to process 2000 entries and write to file.
It is slow unfortunately, but sometimes the only option other than doing the task manually. Have you tried helium? It’s a selenium wrapper so won’t be faster but can be easier code to write
@@JohnWatsonRooney Thanks for letting me know about Helium! I watched your video on it and tried it out and it actually does seem faster (and easier for sure) than selenium (in case of my task) even though the underlying calls are to the selenium API itself. I guess it does it more efficiently than the script I wrote using selenium directly.
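For anyone curious, a helium version of the lookup is roughly this, assuming helium's documented helpers (start_chrome, S, find_all):

from helium import start_chrome, find_all, S, kill_browser

start_chrome('https://example.com')  # placeholder URL
for el in find_all(S('#video-title')):  # S() takes a CSS (or XPath) selector
    print(el.web_element.text)  # each match wraps a selenium WebElement
kill_browser()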
Also, sites can get a redesign, and then you will have to rewrite everything... There is another approach, though, which can help with some subset of redesigns.
Can anyone please explain why an import error for webdriver occurs for me?
Nice video, but how can I get the results into Excel along with the link of the thing you are trying to scrape?
Very simply explained, thanks. I'm going to try it tomorrow.
Awesome thanks for the feedback!
If anybody has problems with your video titles/views printing, try using time.sleep(1) after driver.get(url). I assume this allows the webpage to fully load before it tries scraping for your elements. This fixed my issue
Great tip thanks for sharing