TLDR: Database admin added a clustered index to prevent full table scans.
And developers should be thinking about this stuff as a matter of course, especially at scale.
Thank you.
Well fancy meeting you here! 😊
@@JohnKerrashVirgo We must both be suckers for clickbait-y video titles and thumbnails.
@@JJKebab9 gotta do something in my lunch break 😝
Note to anyone writing SQL for production: don't use SELECT *. It's convenient when you are trying to analyze a database table or view, but it's not meant to be used in production queries. I have seen too many bugs because some dev decided to use SELECT *. On the same note, don't use cross joins either unless you know the tables being joined will not grow beyond a few hundred rows, and even then I would never use them in production.
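One way the SELECT * bugs mentioned above can show up, as a minimal sketch using Python's built-in sqlite3 module. The `users` table, its columns, and both loader functions are made up for illustration; the point is only that code which depends on the column list of SELECT * can break when someone adds a column, while an explicit column list keeps working.

```python
import sqlite3

# Hypothetical table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")


def load_users_star(conn):
    # Fragile: the unpacking assumes the table has exactly two columns.
    return [(uid, email) for uid, email in conn.execute("SELECT * FROM users")]


def load_users_explicit(conn):
    # Robust: the query names the columns the code actually needs.
    return [(uid, email) for uid, email in conn.execute("SELECT id, email FROM users")]


before = load_users_star(conn)

# A later migration adds a column; nothing in the application code changed.
conn.execute("ALTER TABLE users ADD COLUMN created_at TEXT")

try:
    load_users_star(conn)
    star_broke = False
except ValueError:
    # Each row now has three values, so the two-name unpacking fails.
    star_broke = True

after = load_users_explicit(conn)
print(star_broke, after)
```

The explicit version also only pulls the columns it needs, which matters more on engines that bill by the amount of data read.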
So glad I found this channel. I'm a backend dev, so I'm interested in content on building systems with high volume and data management. But so much on YouTube is about frontend dev only. Your content is refreshing, keep doing what you're doing 💙
Check out Designing Data intensive applications!
@@jppbkm I am 😊 I also second the recommendation to any future readers
I heard recently about someone doing the same tweak at a different company, which saved them about 300-400k a year. It's quite impressive.
I once spent 3 months rewriting some software specific code for a bank and reduced the run time from around 7 hrs to under 3 minutes along with huge cost savings. After my first quick review of the code I'd told them the saving and speed up would be dramatic, but when I first demonstrated the improvements, by starting the processing at the very start of a meeting, they flat out didn't believe me when I told them it had completed. They were convinced I had cheated. In reality I'm no 10x programmer, not even close, but their previous code had clearly been written by someone with little experience and no idea about performance whatsoever. It was a well paid gig, but I'd rather have been offered a percentage of their savings!
This is one of the most informative videos, and it could save you from a lot of trouble. You should make more videos like this, where we learn the best ways to optimize or avoid mistakes.
I would love to!
what programming language should i learn to cost my company 1,000,000 dollars?
This is why SQL DBMSes have the “EXPLAIN” command.
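A small sketch of what that looks like in practice, using SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module. The `orders` table and the index name are hypothetical, and the exact plan wording differs between SQLite versions and between database engines, so treat the printed text as indicative only.

```python
import sqlite3

# Hypothetical table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE customer_id = 42"


def plan(conn, sql):
    # The human-readable description is the last column of each plan row.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index the filter has to walk the whole table.
plan_before = plan(conn, QUERY)

# With an index on the filtered column the planner can seek instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

Checking the plan before shipping a query is cheap; the same habit applies to other engines, each with its own EXPLAIN syntax and output format.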
Suggestions:
Make a video on which programming language one shouldn't consider learning in 2023
Make a video on best tech startup ideas
Deal
First one might get shat on by the community but I would also like the second one.
You mean "which programming language one shouldn't consider learning in 2023 if your goal is to get a job". Because if your goal is to... actually write computer programs, the answer is not necessarily the same.
Second one should get shat on too. Come up with your own startup ideas. If it were worth anything to you, no one would publish a video on it and tell thousands of people.
My favourite tech YouTube channel atm! 😊 love your sense of humour and how informative the vids are. Keep it up mate!
Use the LIMIT clause in your queries, and if you're not familiar with the tables you're querying, first check how many columns they have and what types they are.
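A rough sketch of that workflow with Python's sqlite3 module: look at the schema first, then preview a handful of rows with named columns. The `events` table is made up, and `PRAGMA table_info` is SQLite-specific (other engines expose the schema differently, for example through `INFORMATION_SCHEMA`). Also worth keeping in mind: on warehouses that bill by bytes read, a LIMIT may not reduce what you are charged, so check the engine's documentation.

```python
import sqlite3

# Hypothetical table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events (kind, payload) VALUES (?, ?)",
    [("click", "x" * 10) for _ in range(500)],
)

# Step 1: inspect the schema without reading any table data.
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(events)")]

# Step 2: preview a few rows, naming only the columns of interest.
preview = conn.execute("SELECT id, kind FROM events LIMIT 5").fetchall()

print(columns)
print(preview)
```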
Really informative video for us programmers, thanks a lot! 💪
This "optimization" is in one of the first paragraphs in BigQuery docs.
Literally a few months ago we optimized a big query aggregation process to use time windows instead of all time. Saved the company about $10k a month
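To make the time-window idea concrete, here is a back-of-the-envelope sketch. It assumes a table partitioned by day and a service that bills by bytes scanned; the price per TiB, the partition sizes, and the helper names are all hypothetical and are not taken from the comment or from any provider's price list.

```python
from datetime import date, timedelta

PRICE_PER_TIB_USD = 5.0  # hypothetical on-demand price, for illustration only
TIB = 1024 ** 4


def scanned_bytes(partitions, start=None):
    # Sum the size of every daily partition the query has to read.
    return sum(
        size for day, size in partitions.items() if start is None or day >= start
    )


def cost_usd(nbytes):
    return nbytes / TIB * PRICE_PER_TIB_USD


# Assume one year of daily partitions at 0.5 TiB each.
today = date(2023, 1, 31)
partitions = {today - timedelta(days=n): TIB // 2 for n in range(365)}

all_time = scanned_bytes(partitions)
last_week = scanned_bytes(partitions, start=today - timedelta(days=6))

print(f"all time:  ${cost_usd(all_time):.2f} per run")
print(f"last week: ${cost_usd(last_week):.2f} per run")
```

Under these made-up numbers the windowed query reads 7 partitions instead of 365, so each run costs roughly 2% of the all-time version. Real savings depend on how the table is partitioned and how the provider bills.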
Unoptimized SQL queries are the cancer of software development as a whole, but that's completely normal nowadays. Whole companies run on AWS etc. and do not know how much more they pay without optimization.
Lol at 9:30 I can totally attest to that. My comment section is the same. 😂 Great video!
So, basically they didn't have any indices on the table and were doing the full table scan? Nice.
The Symbol on the very right @ 5:41 is kinda........ wrong oop
The same problem happened to me when I used Snowflake. I didn't use the cluster key efficiently and used too many expensive aggregations (like string and window functions), and instead of one query that could fetch everything, I split the work into a bunch of small but expensive queries. So instead of 1 query that lasted about 5-6 minutes, I nuked the table by creating 600 to 1,200 queries that each lasted 10-30s. In the end we received a bill for 15k USD, which is a big punch in the face even though I worked at a big corp where a cost like this could be forgiven. In the dev phase, we didn't expect that the user requests and didn't. Luckily Snowflake somehow could still handle that large a number of requests, and we survived that month.
We had the same issue with BigQuery.
Our problem was with transferring all our DB data (from all our databases, each of which might be anywhere from 5TB to 100TB),
and because of a small mistake and an extra query, we were charged more than $12k in the first 12 hours after releasing the integration.
That can be critical even for a medium-sized business (like our Scandinavian e-commerce company).
I got a big query ad while watching this
Shopify has learned from their bugs! amazing video very informative
Also have to give props to the amazing engineers over there for sharing their discoveries!
Shopify loses that kind of money in between couch cushions every week.
I've no clue who the best CS YouTuber is; for me it's either you or Fireship
Fireship is the 🐐
Things like this are why I dislike the focus on fullstack. Fullstack is great for MVPs and such, but "eventually" (which arrives far sooner than the bean counters think) comes quick.
Found your Channel today
And the content is amazing.
I subscribed to both your channel and newsletter 🙌
Great long video. I found you through YT Shorts. I have to say the pacing on this longer format needs a bit more work, but great work.
Would love your feedback :D
Ugh, shopify would've loose a lot of money
Loose
Very Informative 🙏. Thanks
Also if you create an infinite loop in the cloud it will infinitly scale up to consume an infinite amount of money.
ChatGPT cameo @02:00
Can you make a video about what the best programming languages for freelancers are, and why?
Thanks in advance
Thanks Snyk for sponsoring today's video: snyk.co/lewis
What company would you like me to do next!? 🤔
Hey, Lewis!
I was watching your video, fully interested, when suddenly my wife had to cover her ears, claiming that a very high pitched beep was hurting her ears. I went back and forth and we confirmed it was coming from the audio of your video.
I decided to do some further investigation, so I downloaded the audio and ran it through a spectrum analyzer. It turns out that precisely at the 1:17 mark the audio emits a short high-pitched beep at around 11200Hz (with smaller peaks at 14000 and 18200Hz).
It is loud. Loud enough to hurt animals and people with hearing sensitivity. If YouTube allows you to do so, I'd strongly suggest that you check the source of the noise, cut/EQ it out and reupload the sound.
It would certainly make it a lot less painful for a lot of people (and their pets). A not very obvious but much-needed step towards accessibility.
Hope it makes a difference. Thank you for your amazing work!
Well done super informative!
Great video Lewis, they just get better
Thank you so much 😀
@@CodingWithLewis anytime
Video contains ultrahigh frequency audio.
Ok but, where's the "one line of code that costs 1 million dollars"?
bro i love your videos i wanna become as good as a programmer as you when i grow up keep it up please i love your videos
Nice video, but bro, you need to get some acoustic absorbers and diffusers. There's too much room noise in your mic
BigQuery has no storage cost? 65 PB of data would by itself cost you $1M per month in storage.
Got a shopify ad for this video lmao
🤣
I love your explanation. I hope you will make a tutorial on SQL complex queries like left join, inner join.
Can we talk about how GCP cost estimates are complete dog shit and how a multibillion dollar company can’t show realtime usage? Shit I’d settle for the past hour usage. It takes 12-24 hours for usage to show up. It’s bullshit.
If they had used a NoSQL database instead of a SQL database, what would the difference in cost be?
They didn't give too much information regarding the differences!
Yep, I've seen this before. It cost a few jobs where I work today.
This is always because of change of scope deep in the dev cycle
So basically 'THE CLOUD' is one big money making machine for these cloud companies (Amazon, Google, etc....)?
Interesting video. But I disagree about the 'Select *'. At least in MS SQL. What you put in the 'Where' is important, but it all rests on good database normalisation, proper indexing...
Select just tells the query what to bring back and it is better to only get precise data that is required. Cheers
My thought too, but I guess we're probably talking about queries that are more complex than a single SELECT ... FROM ... WHERE.
U should do "Telegram" next. I think Telegram has the most mysterious tech stack
Hi friends, I'm quite inexperienced with everything Lewis mentioned in this video. 😅 Is this video about Cloud Engineering? Cuz I really wanna get myself into Cloud stuff. I've built a few APIs and several apps for Android and iOS. Just wondering, where should I start? Learn web dev first? Or dive head first into Cloud?
It's about infrastructure mostly! In your scenario, I would say to keep building out some great APIs and then learn how to migrate them to the cloud.
@@CodingWithLewis Is learning frontend required too in this case? Also for backend, which lang do u recommend? I wanna shift from Java (Spring) but I don't really see any compiled lang other than Go. JS, php, python, Go, a recommendation would do best. Thanks in advance, Lewis.
@@CC-bl7yf Node.js is the most popular out of all of them and probably the easiest to learn and use.
Great vid bro
Cool, this is the kind of work I'm helping my colleagues with on a daily basis. Not on the same scale though.
First time I'm watching one of your videos, I'll definitely come back :)
Are you REALLY thinking that humans (engineers) should be seen as monkeys at 1:09 and 11:55? This is so disrespectful and rude.
1:08 A team of engineers consisting of monkeys scratching their heads. LOL
Who the hell built the first query and went "That's good enough"?!
10:21 they were really doing that? Amazing they just get better
somebody noticed this strange symbol at 5:42 hmmm
Isn't this the same as indexing in a traditional RDBMS?
Amazing content that you provide! thanks a lot
My pleasure!
This is why I run my own servers for development. A lot of that software is available to be installed on your own local machines.
It's crazy how much companies are spending to paywall their own data.
Hey I subscribed to your newsletter but didn't get any post after registration
Beginner programmer here:
I'm taking my first class on databases and have been running queries in SQL to preview data, since this obviously isn't a smart move, how are you supposed to do it?
@250CC I wasn't really trying to say that using SQL was a bad idea, but he made it sound like using queries to preview data was a bad idea because it was inefficient. I was wondering if there were other ways of previewing data that are more efficient
@@knflux9840 there's nothing wrong with probing data if you're doing that in some staging / dev environment, not production one
Preprocess the data before inserting. Index all your tables based on the most commonly queried columns. Use time windows (past day, past week) instead of queries against all time. BigQuery in fact offers time-based indexing by default. Also, there's a "LIMIT 100" statement the query window will add by default, so that once your query acquires the selected number of entries it will return.
Depending on your application's needs, SQL might not be the thing you want, but there's nothing wrong with peeking at your data. As you mention you're a beginner, BigQuery, clustering and all that are a few steps into the future already, so don't be too eager to dive in there yet… good luck :)
I love your Videos and your animations
Thanks so much!
So they query the data they needed? wow what a great revelation. Second, if you need such a workload, the best you can do is make your own; it should be way cheaper, especially 'coz the high upfront investment can be paid by Shopify, which has lots of money. Clouds don't make sense when you can make your own.
Creating worldwide data centers isn't a great investment unless you're someone like Amazon (who already does AWS). So using the Cloud is a much better solution for Shopify. Building their own solution wouldn't be a great use of engineering time.
@@CodingWithLewis False 100%. For a data set of 260 TB, like 10 HDDs, you only need a single server, in a rack, connected to the Internet, with standard security; you don't need a data center; your perspective is totally delusional and favors laziness, the reason I'm not subbed to you anymore
@@CodingWithLewis exactly. To add to it, I don't think Shopify has any competitive reason to do that... It's much easier to say "let's build it from scratch" than to actually build it.
Amazing 👏.
Can you make a video explaining the cloud and some services they offer, thank you.
Spend on QA and QC, have an independent unit testing team, and don't cut costs
Thanks for sharing
You said you linked a playlist below at the end of your video however I did not see one :\
Ajax now that's a term I've not heard for a long long time.
this bad boy will cost you everything: DROP DATABASE
Wasn't it an "inframe"?
Hey how do you edit your videos?
Well presented
What is big query? Me who got an ad for it
What if they did linear search
Amazing
This whole video sounds like an ad
sir if you were to suggest me what to master then what would be your suggestion: AI/ML or web3/blockchain
please don't deny my question.
I thought only my company's devs wrote shitty queries.
interesting love ya videos
hollywood running in the background?
Good catch!
Saying this is a single $1,000,000 query is very misleading. It's a single query run 2.6 million times that cost a million in total, which is a far cry from a single query.
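Taking the figures in the comment above at face value (they are the commenter's numbers, not verified here), the per-run cost works out as a quick calculation:

```python
# Figures as quoted in the comment above; treated as assumptions.
total_cost_usd = 1_000_000
runs = 2_600_000

cost_per_run = total_cost_usd / runs
print(f"${cost_per_run:.2f} per run")
```

So each individual run would cost well under a dollar; the large total comes from how often the query runs.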
Cover Robin Hood or Dropbox
Just a pre-watch statement since this is going into my watch later playlist, but I wanted to say that the thumbnail and the video title are spot on. As a self-funded fintech app developer, it is literally impossible for me to push my codebase into full functionality testing, since it relies on extremely large datasets from various data brokers as well as on cloud hosting. The app(s) would be great tools imho, but sadly the lack of funding for them is, well, kind of heart-breaking, since I can't do anything with them, meaning months and months of research and coding all for nothing. Although it's just as well, considering that if they work as intended they would break the US financial system, since the core of the apps is an AI-powered prediction machine for the US stock market.
9:56 I just bought a cheap Linux machine to run my servers lol
I use videos like these to find the motivation to learn and code
This seems too obvious.
Don't use Kafka, it's Java crap and it will fail in production .... NATS Streaming
ya lost me at the sponsor
Jonny harries
Airbnb
5:41 as a German I approve
nice.,
Too basic
👍
This is so obvious it's not even funny
you run too many ads
lol
358th
early woo
Welcome