Solid demo for an intro to data engineering!
Thanks for the demo. Do you all have a link to the slide deck and the data set please?
Can you provide the data file or source used for the practice shown in this video?
Hi, where can I get the code you are showing here?
Super nice!! Can you provide the datasets?
Strange: in step 1 (Bronze) you say you're loading data from blob storage, but the path is from S3? Am I missing something here?
where can i find this notebook ?
Nice. Is the notebook available to download and try?
Hello, thanks for the video. I couldn't follow along because of the Jupyter notebook. What do you recommend I follow in order to replicate what you did in this video? Thank you.
What about on-prem data and IoT data? Does DBX have ingestion capabilities for those?
You have not appended any metadata in the bronze layer, such as when each record was ingested and which file it came from.
The bronze layer should hold all historical data, no?
And what should be done next at the silver layer, so that only unprocessed data is promoted to the silver table?
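To illustrate the pattern the comment above is asking about, here is a minimal sketch of stamping ingestion metadata onto bronze records and using it as a high-watermark filter for silver. It is plain-Python pseudocode for the idea, not the notebook's actual Spark code, and the column names (`_ingested_at`, `_source_file`), function names, and the sample file name `orders_2024.csv` are all assumptions for illustration:

```python
import csv
import io
from datetime import datetime, timezone


def ingest_to_bronze(raw_csv: str, source_file: str) -> list[dict]:
    """Append ingestion metadata to each record on its way into the bronze layer."""
    ingested_at = datetime.now(timezone.utc).isoformat()
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        row["_ingested_at"] = ingested_at   # when the record landed in bronze
        row["_source_file"] = source_file   # which file it came from
        rows.append(row)
    return rows


def new_records_for_silver(bronze_rows: list[dict], last_processed: str) -> list[dict]:
    """Only rows ingested after the stored watermark flow on to silver."""
    # ISO-8601 UTC timestamps compare correctly as strings.
    return [r for r in bronze_rows if r["_ingested_at"] > last_processed]


# Usage: stamp incoming records, then pick up only what silver hasn't seen yet.
bronze = ingest_to_bronze("order_id,amount\n1,10\n2,20\n", "orders_2024.csv")
pending = new_records_for_silver(bronze, last_processed="")  # empty watermark -> everything
```

In Spark, the same idea is usually expressed with `current_timestamp()` and the file-name metadata column plus an incremental reader, but the watermark logic is the same: bronze keeps full history, and silver filters on the ingestion metadata.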
Very clear and quick tutorial. Well done, thanks!
SHA-1 creates a hash (the "secure" in its name is a misnomer; it is just longer than an MD5 hash), which should not be considered a replacement for encryption. Python supports the HMAC algorithm through its hmac module, which lets you mix in a secret key so that only key holders can produce or verify the digest.
Otherwise the presentation is thorough. Thank you!
In the video, the orders/spend data is exported as CSV files. Should source OLTP systems export data this way? Is it more practical than the other methods (JDBC, etc.)?
Thanks for the demo
Really helpful
Is this the recommended way of doing ETL with Databricks? I thought Delta Live Tables were the recommended approach now.
This is one of the ways to build a simple pipeline with Databricks - how one can easily get data from cloud storage and apply some transformations on it. Delta Live Tables (DLT) is the recommended approach for modern ETL/more complex workflows. We will publish an explainer video on DLT soon.
Great demo
Informative, great demo! Many thanks.
Thank you. 🙏
Good
So what is the challenge here? This is something a 12-year-old could set up; it's basically just organizing some tasks in sequential order.