But if the Raspberry Pi "proxy" is on the same local network, isn't it using the same public IP address to do the web scraping? Shouldn't this Raspberry Pi proxy be on a network with a different ISP-provided public IP so that it can be leveraged correctly to avoid throttling?
That's a great question! You're right: a proxy on the same local network normally goes out through the same public IP address. You can still have some success with web scraping that way, especially if the websites being scraped aren't actively blocking or throttling requests by IP address. For more reliable scraping, though, it's generally recommended to use a proxy with a different public IP address, which spreads the requests out and reduces the likelihood of being blocked or throttled. If your setup has worked on your local network, it's probably because the sites you're scraping haven't implemented strict IP-based blocking or throttling. Just keep in mind that as your scraping volume grows or you target different websites, you may run into limits or restrictions.
Great tutorial! I would love to connect a few of these to a single RPI to monitor my filament containers. Do you know if that is possible? And, if so, maybe where there is a howto on something like that?
Hey man, no, I do not know off the top of my head how to do that. You probably need more sensor apparatus than just a simple Pi. If you would like to discuss this in detail, feel free to book a consulting slot through the Buy Me a Coffee link on my YouTube profile. Sounds involved yet interesting!
I loved using the tail -f command on the Squid log, so I could see in real time the IPs and connections the client was making. It was very helpful for diagnosing which IPs I needed to block in the firewall.
Thanks for the info!
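For anyone who wants a summary instead of a scrolling log, here is a minimal Python sketch of the same idea. It assumes Squid's default native access.log format (timestamp, elapsed time, client, result/status, size, method, URL, user, hierarchy/peer, content type); if you changed `logformat` in squid.conf the field positions will differ. The function names are made up for this example.

```python
from collections import Counter


def parse_access_line(line):
    """Split one native-format Squid access.log line into named fields.

    Returns None for lines that do not have the expected 10 fields.
    """
    fields = line.split()
    if len(fields) < 10:
        return None
    # Field 3 looks like "TCP_MISS/200": Squid result code, then HTTP status.
    result, _, code = fields[3].partition("/")
    # Field 8 looks like "HIER_DIRECT/93.184.216.34": hierarchy code, then peer.
    peer = fields[8].partition("/")[2]
    return {
        "client": fields[2],
        "result": result,
        "code": code,
        "method": fields[5],
        "url": fields[6],
        "peer": peer,
    }


def count_destinations(lines):
    """Count how often each client reached each destination address."""
    counts = Counter()
    for line in lines:
        entry = parse_access_line(line)
        if entry is not None:
            counts[(entry["client"], entry["peer"])] += 1
    return counts
```

You could feed it the lines of the log file (on Raspberry Pi OS it is usually /var/log/squid/access.log, but check your own install) and print `count_destinations(...).most_common(10)` to see which destinations come up most before deciding what to block.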
Great tutorial! I was wondering if we could use 3-4 4G LTE modems and rotate the IPs whenever there's a block. Do you have any suggestions on how to achieve this? Please provide some guidance.
Wow, you definitely could. Sounds like an interesting project. You would need some error handling in your Python code. I have done something similar, and I do not think it is that complicated.
How can you adjust the squid.conf to allow remote use of the proxy? If I set the proxy I made as my browser proxy on my laptop, it works great when I'm connected to the same WiFi network. But if I'm at a friend's house on their WiFi and try to use it as my browser proxy, it does not prompt for user authentication, so the remote connection is not allowed.
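For anyone who wants to try the modem idea, here is a minimal sketch of that kind of error handling. It is not the author's actual code. It assumes each LTE modem is exposed as its own HTTP proxy (the addresses in `PROXIES` are made up) and that HTTP 403 and 429 mean "blocked or throttled", which will not hold for every site. It only uses the Python standard library.

```python
import itertools
import urllib.error
import urllib.request

# Hypothetical proxy endpoints, one per LTE modem; replace with your own.
PROXIES = [
    "http://192.168.1.101:3128",
    "http://192.168.1.102:3128",
    "http://192.168.1.103:3128",
]

# Status codes treated as a block or throttle (an assumption; sites differ).
BLOCK_CODES = {403, 429}


def fetch_via_proxy(url, proxy, timeout=10):
    """Fetch url through a single HTTP proxy and return the response body."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_rotation(url, proxies, fetch=fetch_via_proxy, max_attempts=None):
    """Try the proxies in order, moving to the next one on a block or network error.

    Returns (proxy_used, body). Raises RuntimeError if every attempt fails.
    """
    if max_attempts is None:
        max_attempts = len(proxies)
    last_error = None
    for proxy in itertools.islice(itertools.cycle(proxies), max_attempts):
        try:
            return proxy, fetch(url, proxy)
        except urllib.error.HTTPError as err:
            if err.code not in BLOCK_CODES:
                raise  # a real error, not a block; do not hide it
            last_error = err
        except (urllib.error.URLError, TimeoutError) as err:
            last_error = err  # proxy unreachable or timed out; try the next one
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

Calling `fetch_with_rotation("http://example.com/", PROXIES)` would try each modem once. The `fetch` argument is there so the rotation logic can be tested with a fake fetch function, without any modems attached.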
You can do these sorts of things easily with Tailscale. I recommend looking into that.
Can you make two Raspberry Pi VMs, one acting as the client and the other as the Squid proxy server?
Inshallah
My internet connection goes through a proxy at 192.168.49.1:8000, and I have to enter that setting to get online. How do I configure the same thing on the Raspberry Pi 4? I have not been able to use the internet over WiFi: the Pi connects to the WiFi and gets an IP address automatically, but it cannot browse because I have not configured the proxy the way I would on a normal client.
I am not sure, my friend.
Did you try manually setting a static IP in the router admin page?
@JamminJosh7 I have done that before, yes.