endjin
Compelling Data Storytelling with Power BI: Titanic Survivors
Creative Walkthrough: Titanic Passenger Diagnostic Report in Power BI
Explore the Titanic Passenger Diagnostic Report created using Power BI and published to the Data Stories Gallery. In this video, Paul Waller walks you through the design decisions and data visualization techniques used, inspired by an interactive museum exhibit. Learn about the demographics, survival rates, and the aftermath of the Titanic disaster, all visualized with a sepia-tone aesthetic reminiscent of early 20th-century newspapers.
This report was showcased at SQLBits 2024 by Barry Smart. Discover how to add navigation, use colour-safe palettes, and more in your Power BI projects.
Find the report here: community.fabric.microsoft.com/t5/Data-Stories-Gallery/Learning-From-Disaster-Titanic-Passenger-Diagnostics/m-p/3852968
- 00:00 Introduction to Titanic Passenger Diagnostic Report
- 00:11 Data Source and Report Design Concept
- 00:26 Navigation and Creative Approach
- 01:19 Cover Page and Branding
- 01:51 Introduction Screen Overview
- 02:13 Demographics of Titanic Passengers
- 03:12 Who Survived: Analysis and Visuals
- 04:06 Conclusion and Presentation Tips
🔗Useful Links:
👉 Power BI Data Stories Gallery: Learning From Disaster - Titanic Passenger Diagnostics: community.fabric.microsoft.com/t5/Data-Stories-Gallery/Learning-From-Disaster-Titanic-Passenger-Diagnostics/m-p/3852968
👉 Power BI Data Stories Gallery: World Bank Health and Wealth Report: community.fabric.microsoft.com/t5/Data-Stories-Gallery/Accessible-Data-Storytelling-World-Bank-Heath-and-Wealth-Report/m-p/2927555
👉 Accessible Data Storytelling with Power BI: Design Concepts and Accessible Colours: endjin.com/what-we-think/talks/accessible-data-storytelling-with-power-bi-design-concepts-and-accessible-colours
👉 Data Storytelling with Power BI: The World Bank World Health and Wealth Report: endjin.com/what-we-think/talks/data-storytelling-with-power-bi-the-world-bank-world-health-and-wealth-report
👉 How to Build Navigation into Power BI: endjin.com/blog/2024/03/how-to-build-navigation-in-power-bi
👉 How to develop an accessible colour palette for Power BI: endjin.com/blog/2023/02/how-to-develop-an-accessible-colour-palette-for-power-bi
👉 How to enable data teams with the design assets required for impactful data storytelling in Power BI: endjin.com/blog/2022/09/how-to-enable-data-teams-with-the-design-assets-required-for-impactful-data-storytelling-in-power-bi
👉 How to Create Custom Buttons in Power BI: endjin.com/blog/2022/02/how-to-create-custom-buttons-in-power-bi
👉 How to Build a Branded Power BI Report Theme: endjin.com/blog/2022/01/how-to-build-a-branded-power-bi-report-theme
👉 Generating custom themes in Power BI - A designer's perspective: endjin.com/blog/2022/01/generating-custom-themes-in-power-bi-a-designers-perspective
📺 Related Videos:
👉 [Course] Microsoft Fabric: from Descriptive to Predictive Analytics: th-cam.com/video/uaRePHeqvQU/w-d-xo.html
👉 Design concepts and accessible colours for Power BI Reports: th-cam.com/video/lVUVqlUKOhU/w-d-xo.html
👉 Creating Content Images Using PowerPoint Templates for Power BI: th-cam.com/video/xEC9B_FjrIo/w-d-xo.html
👉 Creating Cover Page Images in a PowerPoint Template for Power BI: th-cam.com/video/hLXh7UA8O30/w-d-xo.html
#powerbi #datastorytelling #design #accessibility
Views: 201

Videos

10x Spark performance improvement in Microsoft Fabric
387 views · a month ago
Boosting Apache Spark Performance with Small JSON Files in Microsoft Fabric. Learn how to achieve a 10x performance improvement when ingesting small JSON files in Apache Spark hosted on Microsoft Fabric. Ian Griffiths, Technical Fellow at endjin, shares insights and techniques to overcome Spark's challenges with numerous small files, including parallelizing file discovery and optimizing data lo...
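The file-discovery parallelism mentioned in this description can be sketched in plain Python. This is illustrative only - the video's actual implementation runs inside Spark on Fabric, and `discover_files` plus the thread-pool fan-out below are my assumptions, not code from the video:

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def discover_files(root: str, max_workers: int = 8) -> list[str]:
    """List files under each top-level subdirectory of `root` in parallel.

    With very many small files, sequential listing becomes the bottleneck;
    fanning the listing out across workers amortises per-call latency
    (the win is far larger against a remote object store than a local disk).
    """
    subdirs = [e.path for e in os.scandir(root) if e.is_dir()]

    def list_one(d: str) -> list[str]:
        return [os.path.join(dirpath, name)
                for dirpath, _, files in os.walk(d)
                for name in files]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        nested = pool.map(list_one, subdirs)
    # Files sitting directly in the root are in no subdirectory, so add them too.
    top_level = [e.path for e in os.scandir(root) if e.is_file()]
    return top_level + [p for group in nested for p in group]

# Demo on a throwaway directory tree of small JSON files.
root = tempfile.mkdtemp()
for rel in ("a/x.json", "a/y.json", "b/z.json"):
    os.makedirs(os.path.join(root, os.path.dirname(rel)), exist_ok=True)
    open(os.path.join(root, rel), "w").close()
found = discover_files(root)
```

The same shape applies when the listing calls go to OneLake/ADLS instead of the local filesystem.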
Microsoft Fabric: Good Notebook Development Practices 📓 (End to End Demo - Part 8)
1.6K views · 2 months ago
Microsoft Fabric End to End Demo - Part 8 - Good Notebook Development Practices. Notebooks can very easily become a large, unstructured dump of code with a chain of dependencies so convoluted that it becomes very difficult to track lineage throughout your transformations. With a few simple steps, you can turn notebooks into a well-structured, easy-to-follow repository for your code. In this vide...
Microsoft Fabric: Machine Learning Tutorial - Part 2 - Data Validation with Great Expectations
661 views · 2 months ago
In part 2 of this course, Barry Smart, Director of Data and AI, walks through a demo showing how you can use Microsoft Fabric to set up a "data contract" that establishes minimum data quality standards for data that is being processed by a data pipeline. He deliberately passes bad data into the pipeline to show how the process can be set up to "fail elegantly" by dropping the bad rows and conti...
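The "data contract" idea described here - validate incoming rows, quarantine the failures, and let the pipeline "fail elegantly" by continuing with the good rows - can be sketched in a few lines. This is a minimal stand-in, not the Great Expectations API the video actually uses; `apply_contract` and the example rules are hypothetical:

```python
def apply_contract(rows, expectations):
    """Split rows into (good, rejected) according to the expectations.

    `expectations` maps a rule name to a predicate over a row dict.
    A row is kept only if every predicate passes; otherwise it is
    quarantined together with the names of the rules it broke.
    """
    good, rejected = [], []
    for row in rows:
        failed = [name for name, check in expectations.items() if not check(row)]
        if failed:
            rejected.append({"row": row, "failed_rules": failed})
        else:
            good.append(row)
    return good, rejected

# Hypothetical contract for Titanic-style passenger records.
contract = {
    "age_in_range": lambda r: r.get("age") is not None and 0 <= r["age"] <= 120,
    "fare_non_negative": lambda r: r.get("fare", 0) >= 0,
}
```

The quarantined rows would typically be written to a side table for inspection rather than silently dropped.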
Microsoft Fabric: Machine Learning Tutorial - Part 1 - Overview of the Course
684 views · 3 months ago
In this video Barry Smart, Director of Data and AI, provides an overview of the end to end demo of Microsoft Fabric that we will be providing as a series of videos over the coming weeks. The demo will use the popular Titanic data set to show off features across both the data engineering and data science experiences in Fabric. This will include Notebooks, Pipelines, Semantic Link, MLflow (Experi...
Data is a socio-technical endeavour
24 views · 3 months ago
Our experience shows that the most successful data projects rely heavily on building a multi-disciplinary team.
No Code/Low Code is Software DIY: How Do You Avoid a DIY Disaster?
65 views · 3 months ago
No-code/Low-code democratizes software development with little to no coding skills needed. But how do you evaluate if software DIY is the right choice for you? From the blog post: endjin.com/blog/2024/03/no-code-low-code-software-diy
How to Build Navigation into Power BI
62 views · 3 months ago
Explore a step-by-step guide on designing a side nav in Power BI, covering form, icons, states, actions, with a view to enhancing report design & UI. From the blog post: endjin.com/blog/2024/03/how-to-build-navigation-in-power-bi
Data & AI Engineering Maturity
22 views · 3 months ago
As data and AI become the engine of business change, we need to learn the lessons of the past to avoid expensive failures. From the blog post: endjin.com/blog/2024/03/data-ai-engineering-maturity
The Heart of Reactive Extensions for .NET (Rx.NET)
930 views · 4 months ago
Reactive Extensions for .NET (Rx.NET) is profoundly useful, and developers often end up loving it. Why is that? Endjin Technical Fellow Ian Griffiths explains how Rx.NET's deep roots in mathematical foundations make it both powerful and beautiful. If you'd like to learn more about Rx.NET, please check out the new FREE book Introduction to Rx.NET 2nd Edition (2024): introtorx.com/ #csharp #dotnet ...
Microsoft Fabric: Processing Bronze to Silver using Fabric Notebooks
4.4K views · 7 months ago
Notebooks in Fabric are a lot like notebook experiences in other tools you're probably already familiar with. They allow us to write code in a variety of languages and create a commentary alongside our code using interactive cells. Fabric notebooks also have built-in integration with Lakehouses, and provide a built-in filesystem that can be used to store arbitrary files somehow associated with ...
Microsoft Fabric: Role of the Silver Lakehouse in the Medallion Architecture
2K views · 7 months ago
The Silver layer is where we apply standardization to our source datasets. This standardization aligns field names across sources, applies common data cleaning operations and organizes the data into a well known structure. The Silver layer, which is generally stored at full-fidelity (i.e. the granularity of the data is the same in the Silver layer as it is in the Bronze layer) provides the foun...
Microsoft Fabric: Local OneLake Tools
2.4K views · 11 months ago
Sometimes you need to be able to interact with your cloud data locally - maybe it's to troubleshoot/diagnose an issue, or just to do some analysis using your favourite local tools. Ideally you'd be able to browse your cloud data locally and avoid always having to download new copies of your files just to do this. Fortunately, there are a couple of tools that allow you to do just this with your ...
Show & Tell: A Brief Intro to Tensors & GPT with TorchSharp
665 views · 11 months ago
Every Friday lunchtime, the whole of endjin gets together for a "Show & Tell" session. These are essentially lightning talks on any subject; they can be based on what someone has been working on that week, a topic someone has been researching, or a useful tip or trick; it can be technical or non-technical; it just has to be something the speaker has found interesting, and wants to ...
Microsoft Fabric: Creating a OneLake Shortcut to ADLS Gen2
4.7K views · 11 months ago
Microsoft Fabric and The Pace of Innovation - The Decision Maker's Guide - Part 3
739 views · a year ago
Microsoft Fabric & Generative AI - The Decision Maker's Guide - Part 2
1.1K views · a year ago
Hedging your Microsoft Fabric Bet - The Decision Maker's Guide - Part 1
2.2K views · a year ago
Microsoft Fabric: Ingesting 5GB into a Bronze Lakehouse using Data Factory - Part 3
6K views · a year ago
Microsoft Fabric: Inspecting 28 MILLION row dataset in Bronze Lakehouse - Part 2
6K views · a year ago
Microsoft Fabric: Lakehouse & Medallion Architecture - Part 1
13K views · a year ago
A 10 minute Tour Around Microsoft Fabric
5K views · a year ago
Microsoft Fabric Briefing - after 6 months of use on the private preview.
23K views · a year ago
Reactive Extensions API in depth: Marble Diagrams, Select() and Where()
106 views · a year ago
Rx .NET Workshop: 08 Schedulers
292 views · a year ago
Rx .NET Workshop: 07 Reactive Coincidence
193 views · a year ago
Rx .NET Workshop: 06 Parameterizing Concurrency
271 views · a year ago
Rx .NET Workshop: 05 Writing Queries
254 views · a year ago
Rx .NET Workshop: 04 Unified Programming Model
326 views · a year ago
Rx .NET Workshop: 03 Event Processing
548 views · a year ago

Comments

  • @MdRahman-wl6qi · a day ago

    How do you move data from one Lakehouse's table to another Lakehouse's table using PySpark?

  • @ullajutila1659 · 13 days ago

    The video was OK, but it did not explain how the ADLS storage account networking needs to be configured. More specifically, how to configure it in a secure manner, without allowing access from all networks.

  • @ravishkumar1739 · 15 days ago

    Hi @endjin, great videos. Have you uploaded the architecture diagram file anywhere that I can download and reuse for my own projects?

  • @MuhammadKhan-wp9zn · 23 days ago

    How can I contact you? Please let me know. Thanks!

  • @MuhammadKhan-wp9zn · a month ago

    This is framework-level work; I'm not sure how many will understand and appreciate the effort you put into creating this video, but I highly appreciate your thoughts and work. At one point I was wondering how I would do it if I got the chance to create a framework, and you gave a very nice guideline here. Once again, thank you for the video - I would like to see your other videos too.

  • @vinzent345 · a month ago

    Is there an option to connect from your local machine directly to the Synapse Spark cluster? It doesn't seem that debug-friendly, having to compile & upload it every time. It almost feels more sensible to host your own autoscaling Spark cluster in Azure Kubernetes Service. If I do so, I can interact directly with the cluster and build sessions locally. What do you think?

    • @idg10 · 28 days ago

      In this scenario, it would make more sense to run Spark locally. There are a few ways you can do that, but as you'd expect it's not entirely straightforward, and not something easily addressed in a comment.

  • @ManojKatasani · a month ago

    Very clean explanation; appreciate your efforts. Is there any chance we can get the code for each layer (Bronze to Silver, etc.)? Thanks in advance.

  • @ManojKatasani · a month ago

    You are the best.

  • @rodrihc · a month ago

    Thanks for the video! Is there any alternative to run the test notebooks of Synapse from the CI/CD pipeline in Azure DevOps?

    • @jamesbroome949 · a month ago

      There are a couple of ways to achieve this - neither is immediately obvious, but both are definitely possible! There's no API for just running a notebook in Synapse, but you can submit a Spark batch job via the API. However, this requires a Python file as input, so it might mean pulling your tests out of a Notebook and writing and storing them separately in an associated ADO repo: learn.microsoft.com/en-us/rest/api/synapse/data-plane/spark-batch/create-spark-batch-job?view=rest-synapse-data-plane-2020-12-01&tabs=HTTP

      Possibly an easier route would be to create a separate Synapse Pipeline definition that runs your test notebook(s) and use the API to trigger that pipeline run from your ADO pipeline. This is a straightforward REST API but it operates asynchronously, so you'd need to poll for completion while the pipeline/tests are running: learn.microsoft.com/en-us/rest/api/synapse/data-plane/pipeline/create-pipeline-run?view=rest-synapse-data-plane-2020-12-01&tabs=HTTP

      Hope that helps!
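The trigger-then-poll pattern described in this reply can be sketched as follows. This is a minimal sketch: `wait_for_run` and the injected `get_status` callback are illustrative stand-ins for real calls to the Synapse pipeline-run REST API, and the state names are assumptions:

```python
import time

def wait_for_run(get_status, timeout_s=600.0, poll_interval_s=1.0):
    """Poll `get_status()` until a terminal state or until the timeout expires.

    `get_status` stands in for an HTTP call such as GET .../pipelineruns/{runId};
    injecting it keeps this sketch free of any real network dependency.
    """
    terminal = {"Succeeded", "Failed", "Cancelled"}
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout_s}s")
        time.sleep(poll_interval_s)

# Demo with a faked status sequence instead of real HTTP calls.
states = iter(["Queued", "InProgress", "InProgress", "Succeeded"])
result = wait_for_run(lambda: next(states), timeout_s=5.0, poll_interval_s=0.0)
```

In a real ADO pipeline step you would replace the lambda with an authenticated request and probably add exponential back-off.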

  • @YvonneWurm · a month ago

    Do you know how to view the definition of a view or stored procedure?

    • @jamesbroome949 · a month ago

      Hi - I don't believe there's a way in Synapse Studio to automatically script out the definitions like you can in, say, SQL Server Management Studio. But you can see the column definitions for your View if you find your database under the Data tab and expand the nodes in the explorer. Hope that helps!

  • @datasets-rv7jf · 2 months ago

    Looking forward to many more of these!

    • @endjin · 2 months ago

      Barry has ~9 parts planned!

  • @ThePPhilo · 2 months ago

    Great videos 👍👍 Microsoft advocate using separate workspaces for bronze, silver and gold, but that seems to be harder to achieve due to some current limitations. If we go with a single workspace and a folder-based setup like the example, will it be hard to switch to separate workspaces in future? Is there any prep we can do to make this switch easier going forward (or would there be no need to switch to a workspace approach)?

  • @ramonsuarez6105 · 2 months ago

    Thanks a lot Barry. Great video. I couldn't find the repository for the series among your GitHub repos. Will there be one?

    • @vesperpiano · 2 months ago

      Thanks for the feedback. Glad to hear you are enjoying it. Yes - we are planning to release the code for this project on GitHub at some point soon.

  • @StefanoMagnasco-po5bb · 2 months ago

    Thanks for the great video, very useful. One question: you are using PySpark in your notebooks, but how would you recommend modularizing the code in Spark SQL? Maybe by defining UDFs in separate notebooks that are then called in the 'parent' notebook?

    • @endjin · 2 months ago

      Sadly you don't have that many options here without having to fall back to Python/Scala. You can modularize at a very basic level using notebooks as the "modules", containing a bunch of cells which contain Spark SQL commands. Then call these notebooks from the parent notebook. Otherwise, as you say, one step further would be defining UDFs using some Python and then using spark.udf.register to be able to invoke them from SQL. Ed

  • @applicitaaccount1258 · 2 months ago

    Really looking forward to this series; thanks for taking the time to put it together. I really enjoy the pace and detail level of the content @endjin puts together.

  • @applicitaaccount1258 · 2 months ago

    Great series. What's the naming convention you are using in the full version of the solution? I noticed the LH is prefixed with HPA.

    • @endjin · 2 months ago

      The HPA prefix stands for "House Price Analytics", although the architecture diagram in the second video has slightly old names, as you've probably noticed. The full version uses <medallion_layer>_Demo_LR, where LR stands for "Land Registry". Ed

    • @applicitaaccount1258 · 2 months ago

      @endjin - Thanks for the clarification, Ben.

  • @kingmharbayato643 · 2 months ago

    Finally someone made this video!! Thank you for doing this.

  • @malleshmjolad · 2 months ago

    Hi Ed, we have loaded a few tables using Synapse Link into ADLS Gen2 and created a shortcut to access the ADLS Gen2 files in Fabric. But while loading the files into tables, we are not getting the column names for the tables - they show as c0, c1, ... etc., which is causing an issue. Can you please give some insights on how to overcome this and load the tables with metadata as well?

    • @endjin · 2 months ago

      Hi - thanks for the comment! Which Synapse Link are you using? Dataverse? If so, this uses the CDM model.json format which doesn't include header rows in the underlying CSV files. You would have to read the shortcut data, apply the schema manually, and then write the data out to another table (inside a Fabric notebook or something) if you wanted to use that existing data. However, if you're using Synapse Link for Dataverse, you should instead consider using the new "Link to Microsoft Fabric" feature available in Dataverse: learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-view-in-fabric. This will include the correct schema.

  • @gpc39 · 2 months ago

    Very useful. One thing I would like to do is avoid having to add lakehouses to each notebook. Is there a way to do this within the notebook? Eventually, given two Lakehouses, Bronze and Silver, I would want to merge from the Bronze table into the Silver table. I have the merge statement working; it's just the adding of the lakehouses which I can't see. I'm doing most of the programming with SQL, as I am less adept with PySpark, but am learning. Thanks, Graham

    • @endjin · 2 months ago

      Hi Graham. Thanks for your comment! By "do this within the notebook" do you mean "attach a Lakehouse programmatically"? If so, take a look at this: community.fabric.microsoft.com/t5/General-Discussion/How-to-set-default-Lakehouse-in-the-notebook-programmatically/m-p/3732975

      By my understanding, a notebook needs to have at least one Lakehouse attached to it in order to run Spark SQL statements that read from Lakehouses. Once it has one Lakehouse, remember that you can reference other Lakehouses in your workspace by using two-part naming (`SELECT * FROM <lakehouse_name>.<table_name>`) without having to explicitly attach the other Lakehouses. And if you need to reference Lakehouses from other workspaces, you'll need to add a shortcut first and then use two-part naming. Ed
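A tiny helper for composing such two-part references before passing them to `spark.sql(...)` could look like this. It is purely illustrative - `two_part` is a made-up name, not a Fabric API - and it deliberately refuses names that would need quoting:

```python
import re

# Plain-identifier check: letters, digits and underscores, not starting
# with a digit. Anything else would need backtick-quoting in Spark SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def two_part(lakehouse: str, table: str) -> str:
    """Return "<lakehouse>.<table>" for use in a Spark SQL statement."""
    for name in (lakehouse, table):
        if not _IDENT.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return f"{lakehouse}.{table}"

# Hypothetical usage; in a notebook this string would go to spark.sql(...).
query = f"SELECT * FROM {two_part('Silver', 'trips')} LIMIT 10"
```

Rejecting odd names early gives a clearer error than a Spark SQL parse failure later.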

  • @jensenb9 · 2 months ago

    Great stuff. Is this E2E content hosted in a Git repo somewhere that we can access? Thanks!

    • @endjin · 2 months ago

      Not yet, but I believe that is the plan.

  • @Nalaka-Wanniarachchi · 2 months ago

    Nice share on the best-practices round-up.

  • @ramonsuarez6105 · 2 months ago

    Excellent video, thanks. Do you ever use a master notebook or pipeline to run the 3 stages one after the other for the initial upload or subsequent incremental uploads? Why not use Python files or Spark job definitions instead of some of the notebooks that only have classes and methods? How do you integrate these notebooks with testing in CI/CD?

    • @endjin · 2 months ago

      All great questions! I'll be covering most of your points in upcoming videos, but for now I'll try to answer them here.

      > Do you ever use a master notebook or pipeline to run the 3 stages one after the other for the initial upload or subsequent incremental uploads?

      Yes, we either use a single orchestration notebook or a pipeline to chain the stages together. On the notebook side, more recently we've been experimenting with the "new" mssparkutils.notebook.runMultiple() utility function to create our logical DAG for the pipeline. We've found this to be quite powerful so far. On the pipeline side, the simplest thing to do is to have multiple notebook activities chained together in the correct dependency tree. The benefit of the first method is that the same Spark Session is used. This is particularly appealing in Azure Synapse Analytics, where Spark Sessions take a while to provision (although this is less significant in Fabric, since sessions are provisioned much more quickly). The benefit of the pipeline approach is that it's nicer to visualise, but arguably harder to code review given its underlying structure. One thing we do in either option is make the process metadata-driven. So we'd have an input parameter object which captures the variable bits of configuration about how our pipeline should run, e.g. ingestToBronze = [true/false], ingestToBronzeConfig = {...}, processToSilver = [true/false], processToSilverConfig = {...}, and so on. This contains all the information we need to control the flow of the various permutations of processing we need. Stay tuned - there'll be a video on this later in the series!

      > Why not use Python files or Spark job definitions instead of some of the notebooks that only have classes and methods?

      We could, and we sometimes do! But the reality is that managing custom Python libraries and SJDs is a step up the maturity ladder that not every org is ready to adopt. So this video was meant to provide some inspiration about a happy middle ground - still using Notebooks, but structuring them in a way that follows good development practices and would make it easier to migrate to custom libraries in the future should that be necessary.

      > How do you integrate these notebooks with testing in CI/CD?

      Generally we create a separate set of notebooks that serve as our tests. See endjin.com/blog/2021/05/how-to-test-azure-synapse-notebooks. Naturally, development of these tests isn't great within a notebook, and it's a bit cumbersome to take advantage of some of the popular testing frameworks out there (pytest/behave). But some tests are better than no tests. Then, to integrate into CI/CD, we'd wrap our test notebooks in a Data Factory pipeline, and call that pipeline from ADO/GitHub: learn.microsoft.com/en-us/fabric/data-factory/pipeline-rest-api#run-on-demand-item-job.

      Sadly I can't cover absolutely everything in this series, so I hope this comment helps!
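The metadata-driven parameter object described in this reply can be sketched as follows. Only the ingestToBronze/processToSilver flag names come from the reply; `plan_stages`, the `projectToGold` flag, and the config contents are hypothetical illustrations:

```python
def plan_stages(params: dict) -> list[str]:
    """Turn a pipeline parameter object into the ordered list of stages to run.

    Each boolean flag in `params` switches one stage of the medallion
    processing on or off; the per-stage *Config objects would be passed
    to the corresponding notebook when it runs.
    """
    stage_flags = [
        ("ingest_to_bronze", params.get("ingestToBronze", False)),
        ("process_to_silver", params.get("processToSilver", False)),
        ("project_to_gold", params.get("projectToGold", False)),  # hypothetical
    ]
    return [stage for stage, enabled in stage_flags if enabled]

# Example parameter object; the config payloads are invented for illustration.
run_params = {
    "ingestToBronze": True,
    "ingestToBronzeConfig": {"source": "landing/"},
    "processToSilver": True,
    "processToSilverConfig": {"dedupe": True},
    "projectToGold": False,
}
```

An orchestration notebook (or a pipeline) would iterate over the planned stages and invoke the matching child notebook for each.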

    • @ramonsuarez6105 · 2 months ago

      @endjin Thanks a lot Ed. You guys do a great job with your videos and posts. Super helpful and inspiring :)

  • @endjin · 2 months ago

    Thank you for watching! If you enjoyed this episode, please hit like 👍, subscribe, and turn notifications on 🔔 - it helps us more than you know. 🙏

  • @john_britto_10x · 2 months ago

    In your architecture diagram, you have mentioned an HPA orchestrator pipeline. Is that a separate pipeline that needs to be created, beyond the ones at the process layers below?

    • @endjin · 2 months ago

      Thanks for the comment! Yes, the orchestrator pipeline is a separate pipeline that includes the configuration that controls the flow of the pipeline. I'll be showing that in an upcoming video, so stay tuned!

    • @john_britto_10x · 2 months ago

      @endjin Awesome. Awaiting it.

  • @john_britto_10x · 2 months ago

    @endjin have you published the 8th part video? If so, the link please?

    • @endjin · 2 months ago

      It's coming in the next week or so. We've just started a new series you might be interested in: th-cam.com/video/uaRePHeqvQU/w-d-xo.html

    • @john_britto_10x · 2 months ago

      Thank you so much for replying quickly.

    • @endjin · 2 months ago

      Part 8 - Good Notebook Development Practices - is now available: th-cam.com/video/UyS6ZUgh-Wc/w-d-xo.html

  • @sailing_life_ · 2 months ago

    Ohhh yea, video #2! Nice job @enjin team!

    • @endjin · 2 months ago

      Glad you enjoyed it. 7 more parts to come. Don't forget the "d" in endjin!

    • @sailing_life_ · 2 months ago

      @endjin hah, sorry!! Endjin* :) :)

  • @iamjameson · 3 months ago

    Nice, looking forward to this series.

    • @endjin · 3 months ago

      We're just editing part 2, which should be published on Thursday.

  • @rh.m6660 · 3 months ago

    Not bad, not bad at all. Could you point to a best-practice framework for this type of work, or is this just the way you like to work? I'm finding I can do what I want to the data, but am not sure if I'm being efficient and scalable. I like the idea of using helper notebooks; I believe custom Spark environments also support custom libraries now.

    • @endjin · 2 months ago

      This is the way we like to work based on a background of software engineering within the company. We believe data solutions should get the same treatment as software solutions, and for that reason we're huge proponents of applying DataOps practices when building data products. In fact, we've presented at Big Data LDN and SQLBits on this very topic: endjin.com/news/sqlbits-2024-workshop-dataops-how-to-deliver-data-faster-and-better-with-microsoft-cloud Sadly we don't have a specific recording or blog we can share with you (yet!), but part of the point of this series is to expose you to some of our guidance in this area. But stay tuned! Check out "Part 8 - Good Notebook Development Practices" which is now available: th-cam.com/video/UyS6ZUgh-Wc/w-d-xo.html --- And you're right that Environments in Fabric support custom libraries. This is another way you can package up your common code and more easily use it across workloads. But this increases the CICD complexity a little bit, so take that into account! Ed

  • @profanegardener · 3 months ago

    Great video... I'm very new to Synapse and it was really helpful.

  • @carlnascnyc · 3 months ago

    This is indeed a great series - the best one on all of YouTube IMO. It's direct, to the point and straightforward, with just the right amount of complexity. Can't wait for the next chapters (especially the gold zone design/loading). Please keep up the good work and many thanks!!! Cheers!!

    • @endjin · 2 months ago

      Part 8 - Good Notebook Development Practices - is now available: th-cam.com/video/UyS6ZUgh-Wc/w-d-xo.html

  • @NKMAKKUVA · 4 months ago

    Thanks for the explanation.

  • @GabrielSantos-qu6so · 4 months ago

    Why do you store the data in the Bronze layer in the Files folder and not populate the Tables instead? Or do both? Where I'm currently working, we ingest to the table, not to a file.

    • @endjin · 2 months ago

      The principle we like to take is "do as little as necessary to land your data into bronze". Any conversions/transformations/parsing you do when ingesting into bronze introduces scope for error, and ultimately modifies the "raw" state of the dataset. The reason "bronze" has also been known as "raw" in the past (i.e. before Databricks coined the Medallion Architecture) is because the data in that layer was an unadulterated form of your source dataset. The main benefits of this being: auditability, reprocessing capability, error recovery, and the flexibility it provides where different users can consume the same data in different ways.

      Naturally, you can't always store your Bronze data in the same format it's stored in the underlying source, especially if your source is a database, for example. So in this instance, some sort of conversion is necessary. What you convert it to is up to you: CSV is popular, as is Parquet.

      Ingesting into Delta tables goes a step further - you're applying a schema to the incoming data and enforcing a specific read/write protocol in the Bronze zone. I would argue this makes the Bronze data less accessible for people that need to consume it in the future (which may not be an issue for your use-case). If you have a valid reason to use Tables instead of Files (which I'm sure you do!), I would still recommend storing a copy of the raw file in the Files section in some format that suits you. Imagine if your Delta table gets corrupted for some reason, and the historical data is no longer available at source - what would you do then?

      Hope this helps! Ed

  • @welcomesithole1501 · 4 months ago

    Amazing video. I can see that he knows what he is talking about, unlike other YouTubers who just copy and paste code from somewhere and fail to give proper explanations. I wish there was a series of this man teaching TorchSharp, especially converting his Python GPT to C#.

  • @akthar3 · 4 months ago

    Worked beautifully - thank you!

  • @Tony-cc8ci · 4 months ago

    Hi, I have enjoyed the Microsoft Fabric playlist so far - very informative. This one was 2 months ago, so I just wanted to find out if you plan on continuing the series like you mentioned, with the helper notebooks?

    • @endjin · 4 months ago

      Yes, very much so! Ed (and Barry) are just heads-down busy preparing for their workshops and talks at the SQLBits conference next week. As soon as that's done they'll have more capacity to finish off this series of videos.

    • @endjin · 2 months ago

      Part 8 - Good Notebook Development Practices - is now available: th-cam.com/video/UyS6ZUgh-Wc/w-d-xo.html

  • @Power888w · 5 months ago

    Super video, thank you!

    • @endjin · 5 months ago

      Thank you! Glad you enjoyed it! There are more coming!

    • @endjin · 2 months ago

      Part 8 - Good Notebook Development Practices - is now available: th-cam.com/video/UyS6ZUgh-Wc/w-d-xo.html

  • @richprice5434 · 5 months ago

    Brilliant video - perfect for what I need to do. Subscribed 🎉

    • @endjin · 5 months ago

      That's great to hear! There are a few more videos dropping soon.

    • @endjin · 2 months ago

      Part 8 - Good Notebook Development Practices - is now available: th-cam.com/video/UyS6ZUgh-Wc/w-d-xo.html

  • @MucahitKatirci · 5 months ago

    Thanks

    • @endjin · 3 months ago

      Glad you're enjoying the videos.

    • @endjin · 2 months ago

      Part 8 - Good Notebook Development Practices - is now available: th-cam.com/video/UyS6ZUgh-Wc/w-d-xo.html

  • @MucahitKatirci · 5 months ago

    Thanks

    • @endjin · 3 months ago

      We're impressed you're making your way through the series!

  • @MucahitKatirci · 5 months ago

    Thanks

    • @endjin · 3 months ago

      There should be a new video dropping soon, seeing that you've binged everything so far!

  • @MucahitKatirci · 5 months ago

    Thanks

    • @endjin · 3 months ago

      You're most welcome. Check out some of the other videos on the channel. There are some really good talks about gathering requirements and testing.