Optimize DELETE CASCADE

  • Published 23 Sep 2024
  • Optimizing DELETE CASCADE is not easy. Here are the best tips to optimize SQL Server data processing without blocking other users. Optimizing SQL is what I do every day; if you want me to optimize your SQL Server, just drop me a message. Even without code changes, a lot of things can be improved with relatively small effort by someone who knows the internals and tricks.
    blog.datamaster.hr

Comments • 15

  • @DanielLiuzzi
    @DanielLiuzzi 3 years ago +2

    So the recommendation would be to use FK constraints (to enforce referential integrity) but not to rely on them for deleting/updating data, to avoid performance issues in the long run. I must admit I did not know this; I always assumed this was safe as long as the child table has an index supporting the FK constraint. The more you know! Thanks for this great explanation. 👍
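    An index supporting the FK constraint, as mentioned here, might look like the following sketch (the table and column names are hypothetical):

    ```sql
    -- Hypothetical parent/child tables; the FK enforces referential
    -- integrity, and the nonclustered index lets SQL Server find child
    -- rows quickly when a parent row is deleted.
    CREATE TABLE dbo.Parent (Id INT IDENTITY PRIMARY KEY);
    CREATE TABLE dbo.Child
    (
        Id       INT IDENTITY PRIMARY KEY,
        ParentId INT NOT NULL
            CONSTRAINT FK_Child_Parent REFERENCES dbo.Parent (Id)
            -- no ON DELETE CASCADE: deletes are done explicitly instead
    );
    -- Supporting index, so parent deletes do not scan dbo.Child.
    CREATE NONCLUSTERED INDEX IX_Child_ParentId ON dbo.Child (ParentId);
    ```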

    • @VedranKesegicSQL
      @VedranKesegicSQL  3 years ago +1

      Thanks Daniel for the insightful comment, you summed it up really nicely!

  • @blackisblack22
    @blackisblack22 1 year ago

    Given a table with a PK over an identity ID column, do you recommend that every nonclustered index should INCLUDE this ID column for direct reference to the PK?

  • @milosbalaban5524
    @milosbalaban5524 2 years ago +1

    Very nice explanation, good content Vedran! Would like to see more of your videos! :)

  • @kresovrkic2562
    @kresovrkic2562 3 years ago +1

    Nice and educational. To be honest we are doing quite the opposite and using a lot of cascade deletes, but with indexes on FK columns in child tables which usually contain no more than 100 rows per parent row, so "prolazimo lišo" ("we get off easy") :).
    However, if the database allows access not only through a single application (developers, admins, other applications, etc.), stored procedures are not the ultimate solution that ensures the consistency of complex data objects when deleting data from a parent table.

    • @VedranKesegicSQL
      @VedranKesegicSQL  3 years ago

      There is no protection against an admin; they need to be trusted :) Everybody else (all apps) can be managed and forced to use stored procs through permission restrictions. Devs deleting data on prod... scary.

  • @jimmymac601
    @jimmymac601 3 years ago +1

    How would you handle a deeply nested parent-child table where the parent-child relationship is in the same table? In other words, the children could be parents and their children could also be parents. The depth would be unknown. I tried determining the maximum level in the tree and deleting from the bottom up, but performance is poor. Is there a way to improve performance?

    • @VedranKesegicSQL
      @VedranKesegicSQL  3 years ago

      I guess the situation is a delete of a hierarchy implemented as a self-referencing table, i.e. deleting from one table. It is always good to split this into two phases: (1) finding the rows to delete and (2) deleting them in batches of e.g. 1000 rows per loop iteration.
      Phase 1 would be a hierarchical query or a loop that traverses the hierarchy, gathering PKs into a temp table. It should also store a "level", where 1 = root, 2 = root's children, etc., like a floor number of a pyramid whose top level is 1 and which grows towards the bottom.
      Phase 2 is a delete loop. It starts with the IDs at the max level, deleting TOP(1000) until the entire level is cleared. Then the level is decreased by 1 and the same thing repeats for that upper level, until all levels are cleared. There are not many levels for sure: a hierarchy with more than 1000 levels is not very probable, and 1000 iterations is nothing, super-fast. This delete-by-level approach makes sure you delete bottom-up without violating FK constraints, but it is much faster than deleting per-parent because it grabs many more rows at once, an entire level basically.
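      The two phases described above could be sketched in T-SQL roughly as follows (the table dbo.Hierarchy, its columns, and @RootId are assumptions for illustration):

      ```sql
      -- Phase 1: traverse the self-referencing hierarchy, recording each
      -- row's PK and its level (1 = root, growing towards the bottom).
      DECLARE @RootId INT = 42;  -- hypothetical subtree to delete

      CREATE TABLE #ToDelete (Id INT PRIMARY KEY, Lvl INT NOT NULL);

      WITH Tree AS
      (
          SELECT h.Id, 1 AS Lvl
          FROM dbo.Hierarchy AS h
          WHERE h.Id = @RootId
          UNION ALL
          SELECT c.Id, t.Lvl + 1
          FROM dbo.Hierarchy AS c
          JOIN Tree AS t ON c.ParentId = t.Id
      )
      INSERT INTO #ToDelete (Id, Lvl)
      SELECT Id, Lvl FROM Tree
      OPTION (MAXRECURSION 0);     -- depth is unknown, so no recursion cap

      -- Phase 2: delete bottom-up, one level at a time, in small batches
      -- so other users are not blocked for long.
      DECLARE @Lvl INT = (SELECT MAX(Lvl) FROM #ToDelete);
      WHILE @Lvl >= 1
      BEGIN
          WHILE 1 = 1
          BEGIN
              DELETE TOP (1000) h
              FROM dbo.Hierarchy AS h
              JOIN #ToDelete AS d ON d.Id = h.Id
              WHERE d.Lvl = @Lvl;
              IF @@ROWCOUNT = 0 BREAK;   -- this level is cleared
          END;
          SET @Lvl -= 1;                 -- move one floor up the pyramid
      END;
      ```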
      Moreover, deleting in batches is efficient (fast) and does NOT block the entire table. Therefore other users can work without being blocked (except sometimes, but even that is limited to 1-2 seconds of blocking at most, and you can decrease the batch size to decrease the blocking time while keeping it high enough, e.g. 100+ rows, for efficient deletion speed).
      Also, a clustered index is very important to achieve good delete speed. Pack the rows you delete densely together in the clustered index. For example, such a hierarchy table would have its PK (identity) created as non-clustered, and then a clustered index created on the 2-column unique key (ParentId, Id). That keeps all children physically packed into a small number of pages, which makes the delete much, much faster than in a table with a clustered PK (identity) - a sad default on so many tables I see. And you immediately have the FK column covered with an index. It also helps to have very few or no non-clustered indexes. Also avoid triggers and the "cascade" option on foreign keys, as those make deletion slower.
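      That clustered index layout might be declared like this (a sketch; dbo.Hierarchy and its columns are hypothetical):

      ```sql
      -- PK stays on Id but as NONCLUSTERED; the clustered index on
      -- (ParentId, Id) packs each parent's children into few pages and
      -- also covers the FK column with an index.
      CREATE TABLE dbo.Hierarchy
      (
          Id       INT IDENTITY NOT NULL,
          ParentId INT NULL,            -- NULL = root of the hierarchy
          CONSTRAINT PK_Hierarchy PRIMARY KEY NONCLUSTERED (Id),
          CONSTRAINT FK_Hierarchy_Parent FOREIGN KEY (ParentId)
              REFERENCES dbo.Hierarchy (Id)
      );
      CREATE UNIQUE CLUSTERED INDEX CX_Hierarchy_ParentId_Id
          ON dbo.Hierarchy (ParentId, Id);
      ```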

    • @jimmymac601
      @jimmymac601 3 years ago +1

      @@VedranKesegicSQL Thank you. This is the approach I have taken with one exception. I have used a table variable in lieu of a temp table. And also since this is a design I have inherited I do have the dreaded clustered Identity primary key on every table. What I am missing is the 2-column unique key that covers the FK relationship.
      I will try that.
      Thank you very much for the suggestion.

    • @VedranKesegicSQL
      @VedranKesegicSQL  3 years ago

      @@jimmymac601 Don't use transactions. If interrupted, rerun it and it will continue where it left off. Because batches of fewer than 5000 rows are used, it won't block other users, at least not for long. Therefore it is not as important how long it takes; it is a background task that does not bother users. Just let it run, it will finish eventually. To speed it up, you might look at the execution plan (spot full scans of big tables, inefficient nested loops with >10 000 iterations, lookups as an opportunity for a covering index, etc.), reduce non-clustered indexes, make sure to SPOT whether any triggers/child FKs exist, check the wait types, etc.
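      Spotting triggers and child FKs on the table, as suggested above, can be done against the catalog views (dbo.Hierarchy is a hypothetical table name):

      ```sql
      -- Triggers on the table being deleted from; each one fires per
      -- delete batch and slows it down.
      SELECT t.name AS trigger_name, t.is_disabled
      FROM sys.triggers AS t
      WHERE t.parent_id = OBJECT_ID('dbo.Hierarchy');

      -- Foreign keys in child tables referencing this table; each must
      -- be checked (or cascaded) on every deleted row.
      SELECT fk.name AS fk_name,
             OBJECT_NAME(fk.parent_object_id) AS child_table,
             fk.delete_referential_action_desc
      FROM sys.foreign_keys AS fk
      WHERE fk.referenced_object_id = OBJECT_ID('dbo.Hierarchy');
      ```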

    • @blackisblack22
      @blackisblack22 1 year ago

      @@VedranKesegicSQL If it's supposed to avoid FKs, then how could we guarantee data integrity?

    • @VedranKesegicSQL
      @VedranKesegicSQL  1 year ago

      @@blackisblack22 What do you mean by "avoid FK"? FK is there all the time, it is not avoided.