This is a better way than bs4. Good job
I agree! It is better since Playwright provides more functionality: it can simulate basically everything we can do in a browser, and it's lighter than Selenium. Thank you for your interest!
I swearrr this helped me so much! Thank youuuu
How can I scrape more data for each hotel? There is a lot more data when you click each hotel listed in the search, like most popular facilities, address, etc.
Thank you so much for sharing such valuable information. You are Genius.👏👏
Your videos are very interesting, so I followed you. It’s a shame that your last video was uploaded 11 months ago. This channel has potential to grow a lot more!
Thanks a lot @rrvbin6354 will be back very soon 🙏
Also, you may run into the cookie banner: if you don't click the accept button, the banner will prevent you from scraping some data, especially if you want to go to the next page, because the banner hides the next page button. If you want the script to run automatically, you must automate clicking the accept cookie button. In Selenium, this is the script:
driver.find_element(By.XPATH, '//button[contains(text(), "Accept")]').click()
But you should wait for the page to fully load, so that the cookie banner has popped up, before clicking the accept button.
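A rough Playwright take on the same idea, as a minimal sketch: wait for the banner button to show up, click it if it does, and carry on if it never appears. The helper name `accept_cookies`, the timeout values, and the `button:has-text("Accept")` selector are assumptions (the button text is borrowed from the Selenium line above), so check the actual banner on the page.

```python
import time


def accept_cookies(page, timeout_ms=5000, poll_ms=250):
    """Click the cookie banner's accept button if it appears within timeout_ms.

    `page` is expected to be a Playwright sync-API Page.
    Returns True if the button was clicked, False if it never showed up.
    """
    # Assumed selector: first <button> whose text contains "Accept".
    button = page.locator('button:has-text("Accept")').first
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        if button.is_visible():
            button.click()
            return True
        if time.monotonic() >= deadline:
            return False  # no banner (or different wording): keep scraping
        page.wait_for_timeout(poll_ms)  # give the banner time to render
```

Call it right after `page.goto(...)` and before looking for the next page button. Polling with `is_visible()` keeps the helper free of imports; the more usual Playwright pattern is `button.wait_for(state="visible")` wrapped in a try/except for Playwright's own `TimeoutError`.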
I am getting a timeout error with Amin's code, and I think that's the reason why the output files are not being generated, although it runs and prints the number of results. Would you know why that is? (Just asking because I noticed you clearly have a better command of this than I do.)
Good observations! Will update the code to that.
Will fix the code shortly, was away from YouTube for a while.
@AminBoutarfi Hello! Just found your video today, thank you for helping us! Did you already fix this cookie error?
@flaviacittadini5017 I was having this problem, then I noticed it was because of the check-in and check-out dates, which were in April, and we are in July, so the URL wasn't working. You gotta change the dates.
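One way to avoid stale dates is to compute them at run time instead of hard-coding them. This is only a sketch: it assumes the search URL carries the dates in `checkin` and `checkout` query parameters in `YYYY-MM-DD` form (check your own URL), and the function name and the 30-day default are made up for illustration.

```python
from datetime import date, timedelta
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_future_dates(url, days_ahead=30, nights=1, today=None):
    """Return `url` with checkin/checkout replaced by dates still in the future."""
    today = today or date.today()
    checkin = today + timedelta(days=days_ahead)
    checkout = checkin + timedelta(days=nights)

    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    # Assumed parameter names; overwrite whatever dates were in the URL.
    query["checkin"] = checkin.isoformat()
    query["checkout"] = checkout.isoformat()
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Pass the URL from the script through this before `page.goto(...)`, so the search always asks for dates that have not passed yet.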
Hi! This is an amazing tutorial! I have one quick question: why are only 30 hotels scraped?
Can you tell us how to scrape the Stars rating? So the number of stars a hotel has?
Can we scrape the reviews?
Could you do this for rental car searches?
You legend! Thank you!
Maybe it's because this is older and they have changed some things, but I'm getting an error trying to scrape the price. When I comment out the line of code for the price it works just fine, but of course that is a very important piece. How can I work around this?
The booking page asks me to log in every time, so the script doesn't work. Any solutions? Thank you!
Hi! Sorry, I am a REAL beginner. You included a Proxy in the comments but never mentioned it in the video. What should I do with that one?
Hey, implementing proxies in your code depends on the provider. Usually you send a code or proxy numbers in the header of the request. Will make a special video about it.
Proxy providers usually have documentation and code examples, so check those out.
Hey really nice tutorial, thanks :)
PS: How would you scrape several cities at the same time?
Didn't implement that yet! Will do it in the future for sure. Right now you can add a list of cities to the script and loop over them. The city is currently static in the URL (Paris). You need to make it dynamic.
@AminBoutarfi How do you make it dynamic? My goal would be to enter a precise location and search for hotels within X km. I'm struggling with that.
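Making the city dynamic could look roughly like this: build the URL from a variable instead of hard-coding Paris, then loop over a list. This is a sketch under assumptions: `BASE_URL` and the `ss`, `checkin`, and `checkout` parameter names are guesses at what the search URL uses, and `city_search_urls` is a made-up helper. It does not cover the "within X km" filter.

```python
from urllib.parse import urlencode

# Assumed search endpoint; copy the real one from your browser's address bar.
BASE_URL = "https://www.booking.com/searchresults.html"


def city_search_urls(cities, checkin, checkout):
    """Build one search URL per city (dates as YYYY-MM-DD strings)."""
    urls = {}
    for city in cities:
        # `ss` is assumed to be the destination parameter.
        params = {"ss": city, "checkin": checkin, "checkout": checkout}
        urls[city] = f"{BASE_URL}?{urlencode(params)}"
    return urls


for city, url in city_search_urls(["Paris", "Lyon"], "2030-01-10", "2030-01-11").items():
    # In the real script: page.goto(url), then scrape exactly as for Paris.
    print(city, url)
```

`urlencode` also takes care of spaces and accents in city names, so "New York" becomes `ss=New+York`.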
So how can we scrape for more data??
You need to go through multiple pages (deal with pagination). You have 2 options:
1- Tell Playwright to click on the next button below each time (google how to click buttons using Playwright, it's very easy).
2- As of now, Booking.com uses "&offset=" in the URL for pagination. If you go to page 2, you will find that the URL is the same as page 1; the only difference is that "&offset=25" is added, then "&offset=50" for page 3, and so on. So we just need the first URL: loop over the pages, add "&offset=..." each time, and scrape the data.
Hope it helps!
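Option 2 can be sketched as a small helper that turns the first page's URL into one URL per page. The step of 25 comes from the comment above and may change on Booking.com's side; the function name `page_urls` and the example URL are only illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

RESULTS_PER_PAGE = 25  # offset step quoted above; verify it against the live site


def page_urls(first_page_url, pages):
    """Return one URL per results page by setting the `offset` query parameter."""
    parts = urlsplit(first_page_url)
    # Drop any offset already in the URL so it is never duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "offset"]
    urls = []
    for page_index in range(pages):
        offset = page_index * RESULTS_PER_PAGE
        # Page 1 has no offset parameter; later pages get 25, 50, ...
        page_query = query + ([("offset", str(offset))] if offset else [])
        urls.append(urlunsplit(parts._replace(query=urlencode(page_query))))
    return urls


for url in page_urls("https://www.booking.com/searchresults.html?ss=Paris", 3):
    # In the real script: page.goto(url), then scrape the hotel cards as before.
    print(url)
```

The number of pages still has to be chosen by hand here; the helper only builds the URLs.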
Hi! Did you manage to do it? I tried the second option that Amin suggested, but I can only scrape 2 pages at a time and then it times out :/
@AminBoutarfi How can I find out how many pages there are for a specific location, so I can loop over a specific number of pages?
Brilliant one
Hi there, really amazing tutorial, thank you so much for this!
I've got a bit of an issue, though:
Whenever I launch the script, it never creates the Excel/CSV files.
It prints out the amount of hotels within the console, though.
But I think it crashes after that, because it also doesn't close the browser window.
Do you know what might cause this issue?
Hello Amin, can you teach how to scrape data from a flight booking website to Excel? Thank you! 😊
Great idea! Will do that in a future video.
Thank you! I'll wait for that. 😊
I'm the first commenter. I really like this video.
Thank you! I really appreciate it!
How can we scrape all hotel URLs?
Not sure if I understood, but this script will get you data from the first page only. You need to put a pagination mechanism in place. Will do that in the future!
@AminBoutarfi I mean scrape the gallery photos inside each hotel and other data.