Real Time Streaming with Azure Databricks and Event Hubs

  • Published on 8 Jul 2024
  • 🚀Join me in this tutorial as we build a real-time analytics streaming solution using Azure Databricks and Event Hubs.
    ⌚Timestamps:
    Project Overview: 00:00:00
    Solution Architecture: 00:00:44
    Event Hubs Overview and Implementation: 00:04:11
    Databricks Cluster and Library Set Up: 00:12:32
    Importing the Project Notebook: 00:16:35
    Initiating the Stream and Processing the Bronze Layer: 00:18:20
    Processing the Silver Layer: 00:33:08
    Processing the Gold Layer: 00:43:51
    Near Real-Time Power BI Report: 00:54:43
    Project Debrief: 00:59:41
    🔥Project Overview:
    We'll start by generating real-time data in Azure Event Hubs, then store and process it in our Databricks Lakehouse, implementing the Bronze, Silver, and Gold layers of the Medallion Architecture. The final step is a near real-time report in Power BI that showcases the power of streaming analytics.
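As a rough illustration of the Bronze, Silver, and Gold layering described above, here is a minimal plain-Python sketch. It is not the notebook from the video: the event fields (`device_id`, `temperature`, `timestamp`) and the function names are assumptions invented for this example, and in the actual project each layer would be a table populated by Spark Structured Streaming rather than an in-memory list.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical raw payloads, standing in for message bodies read from Event Hubs.
RAW_EVENTS = [
    '{"device_id": "a", "temperature": 20.0, "timestamp": "2024-07-08T10:00:05"}',
    '{"device_id": "a", "temperature": 22.0, "timestamp": "2024-07-08T10:00:40"}',
    '{"device_id": "b", "temperature": 30.0, "timestamp": "2024-07-08T10:00:10"}',
    'not valid json',
]

def to_bronze(raw_events):
    """Bronze: keep every payload exactly as received."""
    return [{"body": body} for body in raw_events]

def to_silver(bronze_rows):
    """Silver: parse JSON, apply types, and skip rows that cannot be parsed."""
    silver = []
    for row in bronze_rows:
        try:
            event = json.loads(row["body"])
            silver.append({
                "device_id": str(event["device_id"]),
                "temperature": float(event["temperature"]),
                "timestamp": datetime.fromisoformat(event["timestamp"]),
            })
        except (ValueError, KeyError, TypeError):
            continue  # malformed rows stay in Bronze only
    return silver

def to_gold(silver_rows):
    """Gold: aggregate to the average temperature per device."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in silver_rows:
        totals[row["device_id"]][0] += row["temperature"]
        totals[row["device_id"]][1] += 1
    return {device: total / count for device, (total, count) in totals.items()}

bronze = to_bronze(RAW_EVENTS)
silver = to_silver(bronze)
gold = to_gold(silver)
```

The point of the split is that Bronze never loses data (the malformed payload is still there), Silver holds only rows that fit the schema, and Gold holds the small aggregated result a report would read.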
    Links and Resources
    🔗 GitHub: github.com/malvik01/Real-Time...
    🔗 Spark Structured Streaming API: spark.apache.org/docs/latest/...
    🔗 Watermarking: www.databricks.com/blog/featu...
    🔗 Check Out My Udemy Courses: www.pathfinder-analytics.net/...
    📺 Don't forget to hit the like button, subscribe, and turn on notifications to stay updated with our latest tutorials and tech insights!
    📬 For more information, queries, or feedback, feel free to drop a comment below.
  • Science & Technology

Comments • 20

  • @RajeshRRajamani 25 days ago

    Precise, detailed, and to the point. Thanks.

  • @suniguha 3 months ago +1

    Excellent presentation. Thank You.

  • @shubhampawade2933 2 months ago

    Lovely! Thanks a ton. You've earned a subscriber.

  • @sraoarjun 3 months ago

    You make perfect videos, and this is really quality content!

  • @ranjansrivastava9256 6 months ago +1

    Great video! Could you please help us understand the tumbling window and watermark concepts when reading and writing real-time data, if possible?
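For readers with the same question, the two ideas can be sketched in plain Python. This is a simplified model, not Spark itself: the window length, the allowed delay, and the function names are assumptions chosen for illustration, and the watermark here advances after every event, whereas Spark Structured Streaming advances it between micro-batches.

```python
from collections import Counter

WINDOW_SECONDS = 60    # tumbling window length (assumed for this example)
WATERMARK_DELAY = 30   # how late an event may arrive, in seconds (assumed)

def window_start(event_time):
    """Tumbling windows are fixed-size and non-overlapping: [0, 60), [60, 120), ..."""
    return event_time - (event_time % WINDOW_SECONDS)

def count_per_window(event_times):
    """Count events per window, dropping events whose window has already closed.

    event_times are event timestamps in seconds, listed in arrival order.
    """
    counts = Counter()
    dropped = []
    max_event_time = 0
    for t in event_times:
        # The watermark trails the latest event time seen so far.
        watermark = max_event_time - WATERMARK_DELAY
        start = window_start(t)
        if start + WINDOW_SECONDS <= watermark:
            dropped.append(t)      # too late: its window ended before the watermark
        else:
            counts[start] += 1     # on time, or late but still within the delay
        max_event_time = max(max_event_time, t)
    return dict(counts), dropped

# Event 55 arrives after 70 but is still accepted; event 40 arrives after 130 and is dropped.
counts, dropped = count_per_window([5, 50, 70, 55, 130, 40])
```

In PySpark the corresponding building blocks are `DataFrame.withWatermark` and the `window` function from `pyspark.sql.functions`; the sketch above only mirrors their behaviour in spirit.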

  • @Hitesh939 3 months ago

    Awesome video Bhai :)

  • @ranjansrivastava9256 6 months ago

    You wrote the protocol name as "AQMP"; should it be AMQP? Kindly clarify.

  • @SudarshanThakurIRONPULLER 2 months ago

    Very useful. Can you also explain how to compress data, send it to Event Hubs, and read it in Spark?

  • @monalisachatterjee7222 6 months ago +1

    It's a gem of knowledge. I was searching for this and finally found it. Just one question: how do we do something similar if we don't have Unity Catalog?

    • @pathfinder-analytics 6 months ago

      Thank you! Without Unity Catalog you will need to use the Hive Metastore and Service Principals / Mount Points

    • @vemedia5850 5 months ago

      @pathfinder-analytics How is this done for this project? I also don't have Unity Catalog, as I'm on a student account.
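As a hedged sketch of the service principal and mount point route mentioned in this thread, the snippet below builds the OAuth settings that a Databricks mount of ADLS Gen2 expects. The helper names, the `streaming` container, the `examplestorage` account, and the placeholder credentials are all assumptions made up for the example; it is not taken from the video's notebook, and the `dbutils` call is left as a comment because it only exists on a Databricks cluster.

```python
def oauth_mount_configs(client_id, client_secret, tenant_id):
    """Build the extra_configs dict for mounting ADLS Gen2 with a service principal."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }

def abfss_source(container, storage_account):
    """Return the abfss:// URI for a container in an ADLS Gen2 storage account."""
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/"

# Placeholder values; in practice the secret would be read from a secret scope
# rather than written into the notebook.
configs = oauth_mount_configs("<application-id>", "<client-secret>", "<tenant-id>")
source = abfss_source("streaming", "examplestorage")

# On a Databricks cluster (not runnable elsewhere), the mount would then be:
# dbutils.fs.mount(source=source, mount_point="/mnt/streaming", extra_configs=configs)
```

Once mounted, stream outputs and checkpoint locations would point at paths under the mount point and tables would be registered in the Hive Metastore, in place of Unity Catalog external locations.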

  • @neelred10 4 months ago

    Is Event Hubs necessary? I think we can use Auto Loader directly to connect to Kafka. Please let me know if there are any limitations that would warrant using Event Hubs in between.

  • @user-ce9tn6qh5r 2 months ago

    Does the event hub have to be in the same resource group as the storage account?

  • @satwikkumar-eq6fm 3 months ago

    Hi, while creating the cluster I'm unable to add Unity Catalog. Please help.

  • @Ramakrishna410 3 months ago

    How do I capture the batch ID as a new column and write it to ADLS? Please help me.

  • @desifood1895 5 months ago

    Did you use Auto Loader here?

  • @Ramakrishna410 2 months ago

    How do I trigger my pipeline when a new message reaches the event hub?

  • @a2zhi976 4 months ago

    Can you please do a similar video with a Synapse Spark cluster instead of Databricks?

  • @abhishekkumartiwari1207 6 months ago

    How do I stop the job automatically?

  • @user-ce9tn6qh5r 2 months ago

    Hi,
    I mounted my container correctly, but df.writeStream gives the following error for checkpointLocation. What should I check?
    AnalysisException: [RequestId=0b4e8597-1d71-483d-9447-ab4e6c3d9470 ErrorClass=INVALID_STATE.UC_CLOUD_STORAGE_ACCESS_FAILURE] Failed to access cloud storage: [AbfsRestOperationException] error code: UNKNOWN, status code: -1 exceptionTraceId=de24bf5b-e34c-41a8-8855-710ae71aabd2
    Thank you.