Super slick! Been following you for a year now. Your content and skills are constantly evolving - it's hard to keep up with you :D
hey thanks a lot! really appreciate it
Loving the content John, keep em coming.
Hehe, finally getting that "devops" in to scrape the world. Good job as always. If I'd had this video a year ago it would have saved me many hours. Either way, it's a perfect guide for anyone who wants to run a scraper in the cloud.
Hah yeah I know, thanks
love your content!
Cool. I think you (or your viewers) may also want to explore the Functions product that D.O. offers. I see that you can also use cron-like triggers to run your Python (or other language) code.
Just like GCP (Cloud Functions) there is a free tier, so you can get many runs in before they start to charge you. But in this case, for the log file, I guess you would also have to buy D.O. storage to keep that file somewhere. I see it's $5/mo for 250GB, which would be worthwhile if all the functions one manages need much more than the basic droplet's 10GB.
I have to say the more manual method you are showing us does offer much more flexibility in what you can do and how to do it.
I am more of a GCP guy, but I will definitely also try your method, depending on the use case I have.
Thanks for the good info, as usual.
Cheers
Good video!! Thank you for sharing!! 😁👍
14:20 - about that 2>&1...
1 is the file descriptor for stdout (which is the default when you use some command >> outfile).
Descriptor 2 is stderr, where errors (usually) go - "usually" because some coders are too lazy and print everything to stdout with printf(...) instead of using fprintf(stderr, ...) for error output.
With 2>&1 you are redirecting stderr to stdout, so both messages AND errors go to your log.
You could do something like 2>> error.log (no space between the 2 and the >>) to get a separate error log file.
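A minimal concrete example (the script and log names are just placeholders, not the exact ones from the video):

```bash
# stdout and stderr both appended to one log
python3 scraper.py >> scraper.log 2>&1

# or keep errors in a separate file (note: no space between the 2 and the >>)
python3 scraper.py >> scraper.log 2>> errors.log
```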
thank you for clarifying - I knew I wasn't quite right with that; honestly I just did it because that's how I was shown!
Don't know about zsh, but usually you can just type "cd" to navigate to the user's home dir. Also, at apt install, if it asks for Y/n the capital letter is the default, so you can just press Enter if the Y is capitalized.
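For example (the package name is just illustrative):

```bash
cd                                # no argument: jump to the current user's home directory
sudo apt install python3-venv     # at "Do you want to continue? [Y/n]" just press Enter - capital Y is the default
sudo apt install -y python3-venv  # or pass -y to skip the prompt entirely
```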
Wow... I've learnt something new that I'll use with another project
Great video! The videos you make are so much more transparent than other coding channels who just assume familiarity with this and that technology. (I’ve been watching for years and this is the first time I’ve commented!) Been running very similar workflows to this and Digital Ocean seems so much easier and cheaper than EC2, etc. Ansible is also great for automating the command lines, installs, etc. and gives full reproducibility if you want to set up multiple similar instances. One question: where do you recommend keeping your object store and/or SQL database for storing the scraped data? On something backed up by Digital Ocean, or your local machine, or a server in your homelab, or…? Cheers!
thanks! for my own projects I have most stuff on my home server, but for other work I usually just use a managed DB on Digital Ocean. It's just easier not to have to think or worry about it.
Many thanks for an amazing tutorial! It would be great to see how to push logs to a remote service like Sentry.
Cool suggestion, I'll have a look and see if I can do a follow up
thanks for the video mate! one question after watching this though: why didn't you wrap all that up in a Docker Compose setup to spin up your environment quickly?
I wanted to keep it as simple as possible for those who haven't got this far yet
I've really learnt a lot from you...
Thank you
Thanks for watching
@@JohnWatsonRooney I'd love it if you could share your neovim setup...
There are a lot out there but I'm sure if you make one, it'll be different
Use "tail -f" for watching logs instead of "cat" - that way you get a live view of the log.
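For example (log name is illustrative):

```bash
tail -f scraper.log      # follow the log live as new lines are appended
tail -n 100 scraper.log  # or just print the last 100 lines
```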
Thanks, old habits die hard
Question: if you frequently work with similar tasks, why not make a Bash script?
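For anyone curious, a rough sketch of what such a script could look like - package names, paths, and the schedule are assumptions, not the exact steps from the video:

```bash
#!/usr/bin/env bash
set -e

# one-time droplet setup for a Python scraper (adjust to your project)
sudo apt update && sudo apt install -y python3-venv python3-pip
python3 -m venv venv
./venv/bin/pip install -r requirements.txt

# add a daily 06:00 cron entry, keeping any existing crontab lines
(crontab -l 2>/dev/null; echo "0 6 * * * cd $PWD && ./venv/bin/python scraper.py >> scraper.log 2>&1") | crontab -
```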
How do you combine Scrapy with Selenium?
There's a scrapy-selenium package on PyPI, and on the GitHub repo one of the issue reports shows how to update it so it works.
Good sir🎉🎉
What about Selenium bots?
I usually use a hosted Selenium Grid and connect to it remotely, but it's not something I do very often. You can run headless Chrome on a VPS too, via Docker or similar.
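A minimal sketch of the Docker route (standard Selenium image and port, but treat the exact setup as an assumption):

```bash
# run a standalone Selenium server with headless Chrome on the VPS
docker run -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome

# the scraper then connects to it remotely, e.g.
#   webdriver.Remote(command_executor="http://<vps-ip>:4444/wd/hub", options=chrome_options)
```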
@@JohnWatsonRooney Thanks, there are many websites that I sometimes can't scrape with a headless approach. There are no APIs or hidden JSON endpoints to scrape them from either. An open browser seems to be the only solution.
Hey John, really nice video! I was wondering if I could help you with higher-quality editing of your videos, make more engaging thumbnails, and help with your overall YouTube strategy and growth. Please let me know what you think!
Hey as much as I’d like an editor etc my channel doesn’t earn enough to pay for that I’m afraid
I use screen to manage multiple instances
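For example (session name is arbitrary):

```bash
screen -S scraper    # start a named session and run the script inside it
# detach with Ctrl-A then d; the process keeps running
screen -ls           # list running sessions
screen -r scraper    # reattach later
```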
My man, cron jobs to run scripts have existed since the '80s, way before the cloud. I have never heard of anyone running daily scripts from their PC manually, wth?!?!?
lol... so you don't know that people convert their scripts to .bat files and run cron locally 😅
You never run locally? Just debug straight in prod? Now THAT is from the 80s 😂
@@nuel_d_dev I'm assuming you aren't running your product on your home laptop, of course I mean on whatever servers you have available.
@@personofnote1571 read again, he never runs them manually
Great ❤
Haha yes!
Hm. Linux.
“Every day”, not “everyday”.