Incremental Refresh with your Warehouse without a date in Microsoft Fabric

  • Published Oct 2, 2024
  • Want to incrementally refresh your data without using a date for your Synapse Data Warehouse in Microsoft Fabric? Patrick gives you a pattern you can leverage!
    Alex Powers' Blog
    CHANGE (IN THE HOUSE OF LAKES)
    itsnotaboutthe...
    📢 Become a member: guyinacu.be/me...
    *******************
    Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
    🎓 Guy in a Cube courses: guyinacu.be/co...
    *******************
    LET'S CONNECT!
    *******************
    -- / guyinacube
    -- / awsaxton
    -- / patrickdba
    -- guyinacube.com
    **Gear**
    🛠 Check out my Tools page - guyinacube.com...
    #MicrosoftFabric #Synapse #GuyInACube

Comments • 17

  • @lassenikolajsen6131
    @lassenikolajsen6131 8 months ago +17

    You have to be sure that people cannot make changes to existing orders to actually use this method. As much as I would like this to be true, in most cases it is not. This method will not pick up changes to existing orders if the ID is lower than the max ID. It is only useful with transactional tables that use counter transactions instead of changing values in line.

    • @ItsNotAboutTheCell
      @ItsNotAboutTheCell 8 months ago

      I'm hopeful this video sparks some ideas for folks who may need to increment on any data type, where Power BI has traditionally required a DateTime, with a bit of tweaking to cast differently for the situation. This sequential example is just one method; if you have a timestamp and a modified column, that adds a whole new range of opportunities. At the very least, starting to think of a metadata-driven pattern where you can drive your values is incredibly powerful, where in the past it has really been locked behind the incremental refresh UI or required more complex development with the XMLA endpoint and REST APIs.
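      The metadata-driven pattern described here can be sketched in T-SQL. This is a minimal illustration, not the video's actual code; the `meta.Watermarks`, `stg.Orders`, and `dw.Orders` names and columns are assumptions:

      ```sql
      -- Hypothetical watermark table: one row per table being loaded incrementally.
      CREATE TABLE meta.Watermarks
      (
          TableName VARCHAR(128) NOT NULL,
          MaxId     BIGINT       NOT NULL
      );

      -- Load only rows above the stored watermark...
      INSERT INTO dw.Orders (OrderID, CustomerID, Amount)
      SELECT s.OrderID, s.CustomerID, s.Amount
      FROM   stg.Orders AS s
      WHERE  s.OrderID > (SELECT MaxId FROM meta.Watermarks WHERE TableName = 'Orders');

      -- ...then advance the watermark so the next run starts where this one ended.
      UPDATE meta.Watermarks
      SET    MaxId = (SELECT MAX(OrderID) FROM dw.Orders)
      WHERE  TableName = 'Orders';
      ```

      The same two steps generalize to any monotonically increasing column, not just an ID.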

  • @jasminamarkovic6739
    @jasminamarkovic6739 7 months ago +2

    Hello, I heard that MERGE is not supported in Fabric Warehouse, and my fact tables can have rows that are updated without changing EntryID. Also, incremental refresh is not supported in pipelines, only append, which is not the same: it can create duplicates. So what are the available options? Thanks.
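    One common workaround while MERGE is unavailable is a delete-then-insert upsert inside a transaction. A rough sketch, assuming hypothetical `dw.FactOrders` and `stg.Orders` tables keyed on `EntryID`:

    ```sql
    BEGIN TRANSACTION;

    -- Remove target rows that have a fresher version in staging...
    DELETE FROM dw.FactOrders
    WHERE EntryID IN (SELECT EntryID FROM stg.Orders);

    -- ...then insert the staged rows, covering both updates and new arrivals.
    INSERT INTO dw.FactOrders (EntryID, CustomerID, Amount)
    SELECT EntryID, CustomerID, Amount
    FROM   stg.Orders;

    COMMIT;
    ```

    This handles updated rows that keep their EntryID, at the cost of rewriting every staged row rather than only the changed columns.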

  • @IshanDeshpande-gj6kd
    @IshanDeshpande-gj6kd 5 months ago +1

    What if I have some updates to existing records? How can we implement that?

  • @kdubinin
    @kdubinin 8 months ago +3

    Seriously? BIGINT as an ID in 2024, and then a procedure with the table name as a string parameter?

  • @Arikshtein
    @Arikshtein 8 months ago +1

    Isn't it easier to create a column (in a view or in Power Query) that is some dummy date (say 1900-01-01) plus [your ID column] seconds? Then you magically have a date column and incremental refresh will work out of the box 🧐
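    The trick proposed here can be sketched as a view in T-SQL (or equivalently as a computed column in Power Query); the `dbo.Orders` and `OrderID` names are assumptions:

    ```sql
    -- Synthesize a monotonically increasing datetime from a sequential ID,
    -- so the standard date-based incremental refresh UI can be used as-is.
    CREATE VIEW dbo.vw_Orders AS
    SELECT o.*,
           DATEADD(second, o.OrderID, CAST('1900-01-01' AS DATETIME2)) AS SyntheticDate
    FROM   dbo.Orders AS o;
    ```

    One caveat: DATEADD takes an INT increment, so IDs beyond roughly 2.1 billion (about 68 years of seconds) would overflow; for larger BIGINT IDs the increment would need to be split across units such as days and seconds.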

  • @SudipBala-k7q
    @SudipBala-k7q 7 months ago +2

    What if data is deleted and needs to be reflected in the tables?

  • @ItsNotAboutTheCell
    @ItsNotAboutTheCell 8 months ago +4

    Patrick makes the videos people need!

    • @szklydm
      @szklydm 8 months ago +1

      #teampatrick #iwatchpatrick

    • @ItsNotAboutTheCell
      @ItsNotAboutTheCell 8 months ago +1

      @@szklydm team #iwatchpatrick for sure!

  • @kel78v2
    @kel78v2 2 months ago

    This is just an incremental append. What we've been trying to solve is updates on data changes, since MERGE INTO is still not supported even in Fabric Data Warehouse. Would appreciate any help here.

  • @raj7sam
    @raj7sam 8 months ago +1

    I would like to understand more about how stored procedures and Power BI are connected; a more detailed video might help.

  • @prachijain1693
    @prachijain1693 8 months ago +1

    WOW, Hello! I have a question about working with a notebook. How can I store 128-bit data without it being automatically converted into scientific notation? Additionally, when I convert the data into a string, some values are stored in scientific format. How can I prevent this from happening?

  • @benkloosterman7934
    @benkloosterman7934 2 months ago

    Used the same approach to get data from a Lakehouse into a Warehouse, which is very poorly documented (I did a SQL query from the Lakehouse SQL endpoint, though I had to select Synapse as the source type). The change feed can add a change ID.
    This also covers live changes, as the job runs every 2 minutes.

  • @dvvolynkin
    @dvvolynkin 8 months ago +1

    📝 *Summary:*
    *Incremental Loading*
    💭 00:00:20 The video discusses the concept of incremental loading and its importance in data warehousing.
    💻 00:00:37 The process of incrementally loading a warehouse using Pipelines and dataflows is demonstrated.
    🔧 00:01:59 A stored procedure is created to accept a table name as a parameter and return the max ID for the given table.
    00:03:24 The return value from the stored procedure is used in Power Query to filter data based on the max ID.
    📈 00:04:36 The max IDs for each table are updated in a meta table to keep them in sync with the warehouse data.
    *Warehouse Setup*
    🏗 00:00:51 The creation of a meta table for the data warehouse, including two columns: table name and max ID.
    🔗 00:02:21 Connection to the source system and adoption of a pattern for incremental loading.
    💻 00:04:08 Creation of a dynamic stored procedure to accept table and column names and output the max ID.
    *Fabric Integration*
    🔗 00:02:24 Integration with Fabric and adoption of a pattern for incremental loading from the source system.
    🗄 00:03:34 Creation of a Pipeline to call the data flow and update max IDs for subsequent runs.
    🔧 00:05:10 Use of additional stored procedures to load dims and facts in Fabric.
    *Data Flow and Power Query*
    🔗 00:02:32 Connection to the orders table in the Northwind database using a view called DW orders.
    00:02:51 Creation of a function in Power Query for each table to return the max ID for incremental loading.
    📈 00:03:44 Execution of the stored procedure to update max IDs after staging data into the warehouse.
    *Warehouse Limitations*
    ⚠️ 00:05:03 Limitation of identity columns in the warehouse and Fabric, not currently supported but on the roadmap.
    *Pipeline Execution*
    ▶️ 00:05:18 The Pipeline execution is demonstrated, showing the update of max IDs for each table.
    🗄 00:05:33 A full Pipeline is created to clear out staging tables, execute the data flow, and load facts and dims.
    ▶️ 00:05:59 The concept of using a watermark pattern to incrementally load the data warehouse is explained.
    *Conclusion and Call to Action*
    💬 00:06:08 Encouragement for viewers to post questions and comments in the comments section below the video.
    👀 00:06:14 Invitation to explore a playlist for more information about Fabric integration.
    👋 00:06:17 Gratitude expressed from the presenters, Adam and Patrick, for watching the video.
    ------------------
    🔑 *Key takeaways:*
    *Incremental Loading Process*
    📈 00:00:37 Incremental loading involves using Pipelines and dataflows to load only new or updated data into the warehouse, improving efficiency and reducing processing time.
    🔧 00:01:59 The creation of a stored procedure to return the max ID for a given table enables dynamic filtering of data based on incremental updates.
    *Pipeline Execution and Warehouse Maintenance*
    🗄 00:05:33 The creation of a full Pipeline for clearing staging tables, updating max IDs, and loading facts and dims demonstrates a comprehensive approach to warehouse maintenance.
    ------------------
    *Summarized by TubeSum Chrome Extension*
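    The dynamic stored procedure summarized at 00:04:08 (accept table and column names, return the max ID) could look roughly like this. The procedure name and signature are guesses, not taken from the video:

    ```sql
    CREATE PROCEDURE dbo.GetMaxId
        @TableName  SYSNAME,
        @ColumnName SYSNAME
    AS
    BEGIN
        -- Build the query dynamically since identifiers cannot be parameterized;
        -- QUOTENAME guards both identifiers against SQL injection.
        DECLARE @sql NVARCHAR(MAX) =
            N'SELECT MAX(' + QUOTENAME(@ColumnName) + N') AS MaxId FROM '
            + QUOTENAME(@TableName) + N';';
        EXEC sp_executesql @sql;
    END;
    ```

    A dataflow or pipeline activity can then call, for example, `EXEC dbo.GetMaxId 'Orders', 'OrderID';` and use the returned value as the filter boundary.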

  • @Dano1000
    @Dano1000 3 months ago

    Great video! Interested in how this would work with a lakehouse for the source data and the metadata/stored procedure in a warehouse; not sure how to reference that in the dynamic stored proc…. Or would you have to move to a static per-table load?