How do you do an incremental insert in dbt without allowing duplicates based on specific columns?
Any best practices for handling late-arriving data? How do you handle this with an incremental model in place?
Quick and simple way to explain dbt incremental models. Thanks for that!
Any idea how to avoid those duplicates when doing the full-refresh you showed at the end?
Strictly speaking, these are not duplicates but a history of every change, and I prefer to keep that history in the raw data. However, you could adapt the SQL in the dbt model so that only the most recent record per batch_id is inserted (the row with the max invoiced_at per batch_id).
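For anyone who wants to see the "max invoiced_at per batch_id" idea in action, here is a minimal sketch. It runs the query against an in-memory SQLite database from Python so it can be tried without a warehouse. The table name `raw_invoices`, the `amount` column, and the sample rows are assumptions made up for illustration; only `batch_id` and `invoiced_at` come from the discussion above.

```python
import sqlite3

# Hypothetical raw data: several rows per batch_id, one per change.
rows = [
    ("b1", "2023-01-01", 100),
    ("b1", "2023-01-05", 120),  # later change to batch b1
    ("b2", "2023-01-02", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table raw_invoices (batch_id text, invoiced_at text, amount integer)"
)
conn.executemany("insert into raw_invoices values (?, ?, ?)", rows)

# Keep only the row with the latest invoiced_at for each batch_id.
LATEST_PER_BATCH = """
select r.batch_id, r.invoiced_at, r.amount
from raw_invoices r
join (
    select batch_id, max(invoiced_at) as invoiced_at
    from raw_invoices
    group by batch_id
) latest
  on latest.batch_id = r.batch_id
 and latest.invoiced_at = r.invoiced_at
order by r.batch_id
"""

latest = conn.execute(LATEST_PER_BATCH).fetchall()
print(latest)
```

The same select could presumably serve as the body of the dbt model, with the source reference swapped in for `raw_invoices`. Note that if two rows share the same batch_id and the same invoiced_at, this join keeps both, so a tie-breaker column would be needed in that case.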
Great content, just as always
Much appreciated!
Can we use HAVING inside the {% if is_incremental() %} block?
How do you handle deletions in an incremental model?
Great content! I really appreciate all the tutorials on dbt! We have been using incremental models for a while, and the challenge we have is that some tables have a unique `id` identifier but no last-updated date. The incremental model adds new records fine but fails to update existing records when any of their columns change, because we do not have a field like updated_at. Do you have any suggestions for how we can solve this? For example, insert new records with new ids and update existing ones when there has been a change in column `X`.
Thank you!
Did you find a solution?
What if my dbt model groups or joins sources? Can I make it incremental?
Hello Kahan, I have been following you since the beginning, and now I am up to the dbt videos.
Please correct me if I'm wrong: we can do the same transformations with AWS Lambda functions, run by triggers or on a schedule. If so, which would you recommend on AWS, dbt or Lambda?
And if we want to use dbt, where should we host the dbt project so it can run as a cron job from GitHub Actions? Can we host dbt on GitHub and execute it as a cron job?
Thank you so much in advance.
Could you please make a video on how to read static files from S3 and use them to create a model in dbt? Thank you.
You need to use dbt seed. Do some research on it. I hope that helps.