I discovered your channel a few days ago; it's been amazing. Keep up the excellent work.
Thank you! I appreciate that a lot
Prepared statements should not be used because they're faster. They should be used because they're much safer.
The speed increase is just a free bonus.
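A minimal sketch in Postgres to show the safety part (table and column names here are made up): the user-supplied value is bound as a parameter, so it can never be parsed as SQL.

    -- the value travels as a parameter, not spliced into the SQL text
    PREPARE find_user (text) AS
        SELECT id, name FROM users WHERE email = $1;

    -- even an input like ' OR 1=1 -- is treated as a plain string here
    EXECUTE find_user('alice@example.com');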
This is a great point.
As a Java developer, I don't even remember the last time I used an unprepared statement.
@@tipeon I have been reliably informed that Oracle have imprisoned a Night Elf in their compiler and whenever it sees a java.sql.Statement, he screams "YOU ARE NOT PREPARED", teleports to it and starts wailing.
It may be true, but I haven't used a PreparedStatement since Blizzard released Heroes of the Storm, so...
While that is correct, I am pretty sure prepared statements were initially developed for performance.
Our source has a lot of unprepared statements ❤ I gave up on any security concerns at my company 😂 at least we only have a small number of accounts. It's not perfect if someone targets us to exploit data, but I haven't written that bullshit
Suggestion for the next SQL video: "how to vectorize an SQL database for fast searching"
This is a great suggestion!
@@dreamsofcode Have you done it?
Slight addition to COPY: you can use \copy from the client if you don't have access to store input files on the server, i.e. you can stream a local CSV to the server.
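Roughly, the difference looks like this (table and paths are just examples):

    -- COPY runs on the server, so the file must already be on the server's disk
    COPY users FROM '/var/lib/postgresql/import/users.csv' WITH (FORMAT csv, HEADER);

    -- \copy is a psql meta-command: it reads the local file and streams it
    -- over the existing connection, no server filesystem access needed
    \copy users FROM 'users.csv' WITH (FORMAT csv, HEADER)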
Can you explain this in a better way?
Thanks!
Thank you so much!
Love your videos. Just one small request, please share which theme(terminal, text editor) you are using in somewhere in your description. Thank you
Great info which was valuable to me, and your channel is top notch anyway.
And to add some humor into the mix: Beware, if you have more microservices than users, your
system probably does not need indexing or the rest 🙂 stay healthy, stay fresh, and good luck out there
Really well made video. Staying here for more!
Any book recommendations on how to optimize PostgreSQL?
1. The Art of PostgreSQL
2. SQL Antipatterns
Great Video. Thank you!
I want to be one of your first subscribers, so when you reach a million next year I will comment that I was here when you were getting started (I was here at 5k)
Haha that would be awesome
Your index explanation is not entirely correct. Postgres does offer hash-based indexes, which are a lot closer to your explanation, but the default index type (which you used in your creation example) is a B-Tree index; the data structure is very different.
Partitions don't do anything meaningful to speed up writes; they would only speed up reads. Instead of scanning a whole table for a record, you only need to scan the single partition (assuming you're querying a single key) where you know your record lives. It's the same concept as database sharding, but on one machine instead of multiple.
Thanks for your comment! I'm not sure what you're referencing, but yes, btree is the default index type, which uses a balanced tree structure for its lookups.
Partitions can speed up writes, usually when associated with B-tree indexes.
One factor is B-tree rebalancing, which on a partitioned table is usually less intensive than on the entire data set. Another performance gain comes from deletes on the partition column: dropping the partition rather than deleting the rows of a table prevents rebalancing from taking place.
This is common in time series data and dramatically improves write performance.
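As a rough sketch of that time-series pattern (hypothetical table):

    -- each month lives in its own partition, with its own smaller B-tree
    CREATE TABLE events (
        id         bigint GENERATED ALWAYS AS IDENTITY,
        created_at timestamptz NOT NULL,
        payload    jsonb
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE events_2024_01 PARTITION OF events
        FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

    -- retiring old data is a metadata operation: no per-row deletes
    -- and no index rebalancing on the remaining partitions
    DROP TABLE events_2024_01;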
@@dreamsofcode great point, when dealing with 100 000 000+ rows, I need to try this!
Bravo, subbed!
Who considered Mongo 'fancy'!? I thought everyone had got over the NoSQL silliness.
"You can lower the roof and feel the wind in your hair", I love Dreams of Code, I love PostgreSQL
Thank you for the detailed explanation
Suppose I used the COPY command to insert all rows from a CSV; won't this affect my indexing?
I have found the channel ! awesome thank you
Seems kinda wild that the first point is prepared statements. It's not even a drop in the bucket for performance optimizations compared to indexing
@@xetera Depends on the workload. Apps do silly things, like asking for the current user interface theme some 25 million times per hour. The properties table with 20 rows will not benefit much from an index, but 25 million prepared statements and a query cache will have a nice impact.
There is no link to your code in the video description. Very interesting video!
Oh shoot! Thank you for letting me know. I'll fix that now.
@ThePaulCraft Fixed! Thank you again.
Nice and informative vid.
Care to share what mic you are using? Sounds very nice
Thank you! I appreciate the feedback.
For this video I used the Electro-Voice RE20. I also recorded in a sound-treated room this time, which made a difference!
@@dreamsofcode Ah, that explains! I am looking to upgrade my gear but RE20 is really out of my budget 😂 Thanks for the reply
@@xucongzhan9151 It's pricey! I think the Shure SM57 is pretty decent as well and much cheaper, I use that one whenever I travel!
Why would you use prepared statements instead of stored procedures? They are automatically "prepared" and don't need to be recreated in every session
Stored procedures are wonderful, but prepared statements have the advantage of being more dynamic in nature. Imagine e.g. a web page displaying a list with multiple columns each with different filters and sorting options. It would be a nightmare to implement with a stored procedure, but using a prepared statement you can dynamically build the necessary query.
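For example (hypothetical table, sketch only): the app assembles the statement text from whichever filters the user picked, while the values themselves still travel as parameters:

    -- one of many variants the app might build: name filter + sort + paging
    PREPARE user_search (text, int) AS
        SELECT id, name, created_at
        FROM users
        WHERE name ILIKE $1
        ORDER BY created_at DESC
        LIMIT $2;

    EXECUTE user_search('%smith%', 25);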
@@Lightbeerer Kinda disagree here. Especially with web pages, the connections/sessions are very short and only exist for the short time the page is rendered. And if you only execute the query once per request or paging request, preparing the statement makes it slower. And you can absolutely implement dynamic filtering/sorting etc. with an SP... and with a lot fewer SQL injection dangers...
@@Lightbeerer Of course it's all trade-offs, but especially for web pages, preparing doesn't make sense if you don't run the query multiple times.
With connection pooling, prepared statements make sense because the connections are actually long lived.
@@tipeon That's not what connection pooling is for, and I would consider this bad design. Connection pooling is for mitigating connection overhead, but you're not supposed to assume that the connection is pooled or in any particular state; you should assume it's a new or effectively reset connection. So you would have to first ask the server whether there's already a prepared statement in the session and, if not, recreate it, which would make things slow again. But it's probably not even possible, because you can't reattach to a prepared statement after you reconnect. Which API allows you to reconnect to an already prepared statement on the server once you let go of the statement object you held? I'm not aware of any. So for this to work you'd need to implement your own connection pooling and keep track of the statement... and that's an even worse idea.
Awesome video
Thank you 🙏
Which language would I write a PostgreSQL extension in? PL/SQL? ECMA? Python?
SQL and C code are typically used for creating an extension. Mainly SQL code, and C if you need something more powerful!
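A pure-SQL extension is just a control file plus a versioned SQL script; a toy sketch (all names made up):

    -- myext.control:
    --   comment = 'toy extension'
    --   default_version = '1.0'

    -- myext--1.0.sql:
    CREATE FUNCTION add_one(integer) RETURNS integer
        AS $$ SELECT $1 + 1 $$
        LANGUAGE SQL IMMUTABLE;

After installing the files, CREATE EXTENSION myext; loads it. C comes in when you need new data types, index methods, or background workers.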
I would like to see content on possible and good ways to implement multi-tenancy on Postgres
🔥any good resources to learn more?
There's very little out there really on optimizing PostgreSQL. If it's something of interest I can dedicate some more videos into optimization!
@@dreamsofcode yes pls!!!
@@dreamsofcode YES that would be really helpful
I am wondering how you inserted 20 million rows into a table. Where did you get that data from?
I just randomly generated it using a mock data library in Go.
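You can also do it in plain SQL with generate_series if you don't want a separate program; a sketch against a hypothetical users table:

    INSERT INTO users (name, email, created_at)
    SELECT 'user_' || i,
           'user_' || i || '@example.com',
           now() - (i * interval '1 second')
    FROM generate_series(1, 20000000) AS s(i);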
I have restricted my studies to data manipulation tasks. There is still a lot to look at in data definition and control!
Thanks for the video, very good content and well edited, I'd just recommend putting more dynamism in your voice to match the pacing
Thanks for the tips!
Mahn, please do the dadbod plugins for NvChad
I'm from the Oracle world, a lot of familiar concepts
What tool are you using? The terminal looks so good
Alacritty
you should add a timestamp for the copy statement part of the video
Nice! Now I don’t have to use web3 and store my data on crypto and pay per request and have huge latencies and non acid transactions. 😂
Joined as a sub, excellent content, especially on read replicas
People are not using indexes in an SQL service?
Also, here's what people should learn about indexes:
if you create an index on a column, the db will search it faster.
If you have 3 WHERE conditions, for example, then you need to create an index on that 3-column combination for speed.
Definitely at some of the places I've worked.
Indexes are kind of interesting: they're not very useful for small data sizes, and there's a risk of over-optimizing for them.
>If you have 3 WHERE conditions, for example, then you need to create an index on that 3-column combination for speed
Query planners are smart: if you have very large data sets you can do multi-column indexes and make sure the set reduction is in the correct order, but in my experience, even up to a few billion records, just having B-trees on each column individually is enough.
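i.e. something like this (hypothetical table):

    -- one composite index, ordered so the most selective column comes first
    CREATE INDEX idx_orders_search ON orders (customer_id, status, created_at);

    -- or separate single-column indexes; the planner can bitmap-AND them
    CREATE INDEX idx_orders_customer ON orders (customer_id);
    CREATE INDEX idx_orders_status   ON orders (status);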
MOOOORE
A good database should be fast by default. If something requires deep knowledge to make it fast, it's a nerdy database.
Which databases would you consider "fast by default"?
So everything in programming is nerdy then cuz you need to learn to make things work.
wait, that indeed makes some sense...
@@dreamsofcode MySQL is faster but lacks features, so neither is great. MongoDB is pretty good; unfortunately it's not SQL, which is what most people need. That's why companies should make better databases: something that's plug and play and fast.
It would be brilliant to have some database that is fast by default, but I'm afraid that is not possible in every use case. Every choice in a database is a tradeoff. (Indexes, for instance, make some reads a lot faster, but every write a lot slower…)
I think the main selling point for PostgreSQL is that it is relatively easy to change these tradeoffs after you build your application.
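For example (hypothetical table), an index can be added or removed on a live system without blocking writes:

    -- build the index without taking a write lock on the table
    CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);

    -- and undo it later if the write penalty outweighs the read gain
    DROP INDEX CONCURRENTLY idx_orders_created_at;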
chef
mongodb is web scale
Prepared statements STILL don't work with pgbouncer and most other db proxies. No thanks.
MongoDB for speed? PostgreSQL is a faster document db than mongo and it’s not even the main focus.
Thanks!
Thank you so much for the support. It's really appreciated!!!