SCD Type 1 and Type 2 using SQL | Implementation of Slowly Changing Dimensions
- Published on 14 Oct 2024
- In this video we will learn how to implement SCD Type 1 and Type 2 using SQL. This will help you build the logic using UPDATE and INSERT statements.
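Script:
-- Staging table: holds the latest product data from the source
CREATE TABLE product_stg (
    Product_id INT,
    Product_Name VARCHAR(50),
    Price DECIMAL(9,2)
);
-- SCD Type 1 dimension: one row per product, overwritten in place
CREATE TABLE product_dim (
    Product_id INT PRIMARY KEY,
    Product_Name VARCHAR(50),
    Price DECIMAL(9,2),
    last_update DATE
);
-- SCD Type 2 dimension: one row per product version.
-- This reuses the name product_dim, so drop the Type 1 table first (or give one of them a different name).
CREATE TABLE product_dim (
    product_key INT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key (SQL Server syntax)
    Product_id INT,
    Product_Name VARCHAR(50),
    Price DECIMAL(9,2),
    start_date DATE,
    end_date DATE
);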
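The description says the SCD logic is built from UPDATE and INSERT statements but does not include those statements. Below is a minimal sketch of one way a Type 2 load could look, not necessarily the queries used in the video. It runs against SQLite through Python's `sqlite3` module so it can be tried without SQL Server. The following are assumptions: `AUTOINCREMENT` stands in for `identity(1,1)`, the `9999-12-31` "forever" end date, closing the old row one day before the load date, tracking changes in `Price` only, one staging row per product, and the made-up product names and prices.

```python
import sqlite3

FOREVER = "9999-12-31"  # assumed marker for "still current"

def load_scd2(conn, load_date):
    """Expire changed rows, then insert new versions and new products."""
    params = {"d": load_date, "forever": FOREVER}
    # Step 1: close the open version of every product whose price changed.
    conn.execute(
        """
        UPDATE product_dim
        SET end_date = date(:d, '-1 day')
        WHERE end_date = :forever
          AND EXISTS (SELECT 1 FROM product_stg s
                      WHERE s.Product_id = product_dim.Product_id
                        AND s.Price <> product_dim.Price)
        """,
        params,
    )
    # Step 2: insert every staged product that has no open version
    # (brand-new products and the ones just closed in step 1).
    conn.execute(
        """
        INSERT INTO product_dim
            (Product_id, Product_Name, Price, start_date, end_date)
        SELECT s.Product_id, s.Product_Name, s.Price, :d, :forever
        FROM product_stg s
        WHERE NOT EXISTS (SELECT 1 FROM product_dim d
                          WHERE d.Product_id = s.Product_id
                            AND d.end_date = :forever)
        """,
        params,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product_stg (
        Product_id INT, Product_Name VARCHAR(50), Price DECIMAL(9,2));
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY AUTOINCREMENT,
        Product_id INT, Product_Name VARCHAR(50), Price DECIMAL(9,2),
        start_date DATE, end_date DATE);
    INSERT INTO product_stg VALUES (1, 'Laptop', 40000), (2, 'Phone', 70000);
    """
)
load_scd2(conn, "2024-01-01")  # first load: two open rows

# Second batch (hypothetical): product 1 gets cheaper, product 3 is new.
conn.executescript(
    """
    DELETE FROM product_stg;
    INSERT INTO product_stg VALUES (1, 'Laptop', 30000), (3, 'Tablet', 90000);
    """
)
load_scd2(conn, "2024-01-20")

rows = conn.execute(
    "SELECT Product_id, Price, start_date, end_date "
    "FROM product_dim ORDER BY Product_id, start_date"
).fetchall()
for row in rows:
    print(row)
```

The order matters: the UPDATE runs first so that the INSERT can treat "no open version exists" as the single condition for both changed and brand-new products.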
Zero to hero (Advanced) SQL Aggregation:
• All About SQL Aggregat...
Most Asked Join Based Interview Question:
• Most Asked SQL JOIN ba...
Solving 4 Tricky SQL problems:
• Solving 4 Tricky SQL P...
Data Analyst Spotify Case Study:
• Data Analyst Spotify C...
Top 10 SQL interview Questions:
• Top 10 SQL interview Q...
Interview Question based on FULL OUTER JOIN:
• SQL Interview Question...
Playlist to master SQL :
• Complex SQL Questions ...
Rank, Dense_Rank and Row_Number:
• RANK, DENSE_RANK, ROW_...
#sql #dataengineer
You won't believe it, I was learning this same concept from your Python course just this morning.
Great stuff. A must-learn for every data enthusiast.
Great way of explaining SCD types
Great, Ankit, thanks. I am completely new to this concept and it's very useful.
@ankit bansal: Great job explaining the concept. Quick question: instead of setting the end date to "forever", would it make sense to keep it as NULL and add another column, such as is_current_value, as a boolean field? When someone wants to track history in a report, an analyst can filter on start_date with end_date IS NOT NULL and is_current_value = 'n' to look at previous records, or on end_date IS NULL and is_current_value = 'y' to get the current record. You could even combine the two with an OR operator under the structure I'm proposing. Using "forever" as the end_date is frowned upon in the data warehousing world, IMHO.
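To make the proposal above concrete, here is a small sketch of a Type 2 table that keeps `end_date` as NULL for the current row and adds a flag column. It uses SQLite through Python's `sqlite3` module. The table name `product_dim_flag`, the column name `is_current`, the 0/1 encoding, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product_dim_flag (
        product_key INTEGER PRIMARY KEY AUTOINCREMENT,
        Product_id INT, Price DECIMAL(9,2),
        start_date DATE,
        end_date DATE,          -- NULL means "still current"
        is_current INT          -- 1 for the current row, 0 for history
    );
    INSERT INTO product_dim_flag
        (Product_id, Price, start_date, end_date, is_current)
    VALUES (1, 40000, '2024-01-01', '2024-01-19', 0),
           (1, 30000, '2024-01-20', NULL, 1);
    """
)

# Current rows: the flag alone is enough, no sentinel date needed.
current = conn.execute(
    "SELECT Product_id, Price FROM product_dim_flag WHERE is_current = 1"
).fetchall()

def price_as_of(conn, product_id, as_of):
    """Return the price valid on a given date, or None if no row matches."""
    row = conn.execute(
        """
        SELECT Price FROM product_dim_flag
        WHERE Product_id = ?
          AND start_date <= ?
          AND (end_date IS NULL OR end_date >= ?)
        """,
        (product_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None

print(current)
print(price_as_of(conn, 1, "2024-01-10"))
print(price_as_of(conn, 1, "2024-02-01"))
```

The trade-off shows up in `price_as_of`: with a NULL end date, point-in-time lookups need the extra `end_date IS NULL OR ...` branch, whereas a far-future end date lets a plain range comparison work.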
If only there were a way to love your videos and not just like them. Learning a lot, Ankit. Thanks!
Cheers 🥂
Very good information, and thanks for the content. How do we create the staging tables in the first place?
Hi Ankit, great explanation.
How do we handle the scenario in SCD Type 2 when an insert, an update, and a delete for the same record all arrive together in staging?
Assume we are using CDC to keep track of changes and using the CDC info to update the dim tables.
Then you need to create one more temp table when running the script, keeping only the rows where the timestamp in stg_table > max(timestamp) in dim_table, so that only the changed records land in the temp_stg table.
Now the data is in the temp table (which has only the latest records).
The dim table still has the old records at this point (we have not performed any transformations yet).
Now follow Ankit's procedure to keep track of history.
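The filtering step described in this reply can be sketched as follows, using SQLite through Python's `sqlite3` module. This is only an illustration under assumptions: the staging table is given a hypothetical `updated_at` column, the dimension's `last_update` plays the role of its timestamp, dates are ISO strings so they compare correctly as text, and the sample rows are made up. The second condition keeps only the latest staged row per product, which is one way to end up with "only the latest records" when several changes for the same key arrive together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product_stg (
        Product_id INT, Product_Name VARCHAR(50), Price DECIMAL(9,2),
        updated_at TEXT);       -- hypothetical change timestamp
    CREATE TABLE product_dim (
        Product_id INT PRIMARY KEY, Product_Name VARCHAR(50),
        Price DECIMAL(9,2), last_update TEXT);

    INSERT INTO product_dim VALUES (1, 'Laptop', 40000, '2024-01-10');
    INSERT INTO product_stg VALUES
        (1, 'Laptop', 40000, '2024-01-05'),   -- already loaded
        (1, 'Laptop', 35000, '2024-01-15'),   -- superseded change
        (1, 'Laptop', 30000, '2024-01-20'),   -- latest change
        (2, 'Phone',  70000, '2024-01-18');   -- new product
    """
)

conn.executescript(
    """
    CREATE TEMP TABLE temp_stg AS
    SELECT s.*
    FROM product_stg s
    -- only rows newer than anything already in the dimension
    WHERE s.updated_at > (SELECT COALESCE(MAX(last_update), '0001-01-01')
                          FROM product_dim)
      -- and only the most recent staged row for each product
      AND s.updated_at = (SELECT MAX(s2.updated_at)
                          FROM product_stg s2
                          WHERE s2.Product_id = s.Product_id);
    """
)

changed = conn.execute(
    "SELECT Product_id, Price, updated_at FROM temp_stg ORDER BY Product_id"
).fetchall()
print(changed)
```

The SCD update and insert steps would then read from `temp_stg` instead of the full staging table. Deletes are not covered here; they would need a separate flag or operation column from the CDC feed.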
Great video! Need more data modelling and data engineering videos, man!
Thank you for creating such quality content.
I have a question:
is it possible to implement this SCD2 using MERGE (where both an update and an insert are needed to maintain history, as in the example described in the video)?
Thanks in advance.
It can be done, but the MERGE operation can have performance issues.
@ankit Bhaiya, instead of doing the work manually with queries, we could also create an insert/update trigger, which would be a good way to automate it.
What do you say, brother? ☺
That would be too much load, because the trigger would fire for each row.
My question is: if we connect this data in Power BI Desktop, do we need to do this SCD 2 manually, or will it be updated automatically?
Best video. Thanks! If possible, please make videos on SQL performance tuning or launch a course.
Yay, I learned this just yesterday. Thanks!
In SCD1, once the first insert was completed we emptied the stg table. How can we apply changes to the dim table without emptying the stg table after the first insert?
Sir, can we implement SCD 1 via a MERGE statement? I mean to ask, is a MERGE statement essentially just SCD 1?
Hi Ankit sir, will you start a data engineering course?
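One possible answer to the question above, sketched with SQLite through Python's `sqlite3` module: if the UPDATE only touches rows whose price actually differs and the INSERT only adds products that are missing from the dimension, the load can be rerun against a staging table that was never emptied without creating duplicates. This is an assumption-laden illustration, not the video's script: it tracks `Price` only, expects one staging row per product, and uses made-up products and dates.

```python
import sqlite3

def load_scd1(conn, load_date):
    """Overwrite changed prices, then add products missing from the dim."""
    # Update in place, but only where the staged price differs.
    conn.execute(
        """
        UPDATE product_dim
        SET Price = (SELECT s.Price FROM product_stg s
                     WHERE s.Product_id = product_dim.Product_id),
            last_update = :d
        WHERE EXISTS (SELECT 1 FROM product_stg s
                      WHERE s.Product_id = product_dim.Product_id
                        AND s.Price <> product_dim.Price)
        """,
        {"d": load_date},
    )
    # Insert only the products the dimension has never seen.
    conn.execute(
        """
        INSERT INTO product_dim (Product_id, Product_Name, Price, last_update)
        SELECT s.Product_id, s.Product_Name, s.Price, :d
        FROM product_stg s
        WHERE NOT EXISTS (SELECT 1 FROM product_dim d
                          WHERE d.Product_id = s.Product_id)
        """,
        {"d": load_date},
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product_stg (
        Product_id INT, Product_Name VARCHAR(50), Price DECIMAL(9,2));
    CREATE TABLE product_dim (
        Product_id INT PRIMARY KEY, Product_Name VARCHAR(50),
        Price DECIMAL(9,2), last_update DATE);
    INSERT INTO product_stg VALUES (1, 'Laptop', 40000), (2, 'Phone', 70000);
    """
)
load_scd1(conn, "2024-01-01")

# Staging is NOT emptied: one price changes and one product is added.
conn.executescript(
    """
    UPDATE product_stg SET Price = 30000 WHERE Product_id = 1;
    INSERT INTO product_stg VALUES (3, 'Tablet', 90000);
    """
)
load_scd1(conn, "2024-01-20")
load_scd1(conn, "2024-01-25")  # rerun with unchanged staging: no effect

dim_rows = conn.execute(
    "SELECT Product_id, Price, last_update FROM product_dim ORDER BY Product_id"
).fetchall()
print(dim_rows)
```

Because unchanged rows are skipped, `last_update` for product 2 stays at the first load date and the third run changes nothing.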
Great Explanation !
Thanks for the video Ankit
Awesome Bro..
Superb explanation 👌 👏 👍
Needed this video, but 6 months ago... We did it together in the office with a friend back then 😀😺 using SQL.
Sir, which video in this course should I learn from first? I am starting my career, please help me.
th-cam.com/video/ejdIgYPfcV4/w-d-xo.html
Thank you Ankit Bro
Can't we use merge to perform the SCD2 implementation?
Performance is not good with merge.
Thanks @ankit
Million Thanks
The first table is an upsert, not a truncate-and-load, right?
Thank you so much Ankit ❤😊
My pleasure 😊
Can't we use a MERGE statement instead of two separate INSERT and UPDATE statements?
Performance is not good with MERGE.
❤❤❤
You have written ELT as extract, transform and load. It's extract, load and transform.
Can't we implement it using a MERGE statement?
What if the same record comes into the staging table again? How do we handle it? @ankit
That is the case of duplicate records. We can check whether the key and the values are the same, and if so ignore them.
Not sure why you have not implemented this using a MERGE statement.
For MySQL the query changes slightly:
set @updated_date='2024-01-20';
UPDATE product_type1_dim a, product_stg b
SET a.price = b.price, a.last_update = @updated_date
WHERE a.product_id = b.product_id ;
Bro, keep the pace slower. You speak too fast.
OK, next time. You can also reduce the playback speed from the settings.
Thank you Ankit