Optimizing Your Data Infrastructure - How To Become A Better Data Engineer

  • Published on Jul 10, 2024
  • I once had an engineer tell me that they essentially didn’t want to consider cost as they were building a solution. I was baffled. Don’t get me wrong: when you’re building, you iterate and aim to improve your solution’s cost over time.
    But from my perspective, I don’t think completely ignoring costs from day one is a good plan.
    Cost plays a role in all forms of projects, whether you’re building bridges or writing code. How much budget is allocated to build and maintain a solution is important. In the real world, it can change the materials used, the timeline, or the final product’s design.
    In the tech world, cost plays a similar role, and accounting for it can help you build more optimal systems.
    Thanks all for watching.
    If you're looking to improve your data infrastructure or reduce your costs, you can set up a free consult below.
    calendly.com/ben-rogojan/cons...
    Also, if you'd like to learn more about Estuary and how you can use it in your data infrastructure, then check it out below.
    bit.ly/3Ed1RLe
    If you enjoyed this video, check out some of my other top videos.
    Top Courses To Become A Data Engineer
    What Is The Modern Data Stack - Intro To Data Infrastructure Part 1
    If you would like to learn more about data engineering, then check out Google's GCP certificate
    bit.ly/3NQVn7V
    If you'd like to read up on my updates about the data field, then you can sign up for our newsletter here.
    seattledataguy.substack.com/
    Or check out my blog
    www.theseattledataguy.com/
    And if you want to support the channel, then you can become a paid member of my newsletter
    seattledataguy.substack.com/s...
    Tags: Data engineering projects, Data engineer project ideas, data project sources, data analytics project sources, data project portfolio
    _____________________________________________________________
    Subscribe: / @seattledataguy
    _____________________________________________________________
    About me:
    I have spent my career focused on all forms of data. I have focused on developing algorithms to detect fraud, reduce patient readmission and redesign insurance provider policy to help reduce the overall cost of healthcare. I have also helped develop analytics for marketing and IT operations in order to optimize limited resources such as employees and budget. I privately consult on data science and engineering problems both solo as well as with a company called Acheron Analytics. I have experience both working hands-on with technical problems as well as helping leadership teams develop strategies to maximize their data.
    *I do participate in affiliate programs. If a link has an "*" by it, then I may receive a small portion of the proceeds at no extra cost to you.

Comments • 17

  • @insightsxdesign 4 months ago +3

    Excellent information! Great job 🙌

  • @dougdataeng 4 months ago +2

    Great job! Keep going and bring us valuable content

    • @SeattleDataGuy 4 months ago +1

      Thank you! Will do!

  • @nicky_rads 4 months ago

    great video, always important to understand the cost implications. Keep it up

  • @Yavin4 4 months ago +1

    Amazing video. One of your best. Time to data access is a huuuuuuuuggggggeeeee factor in your data cost. How often do you need to access the data? The accurate answer to that question can save a corporation literally millions. I've seen data that's hardly ever used accessible in a second instead of being put into deep storage or better yet completely deleted.

  • @thedatadoctor 4 months ago +2

    Another great one!
    I have also found Estuary to be a great solution to reduce cost on data loading... But as you said, usually we set it to run in batches to reduce cost on the warehouse side.

    • @SeattleDataGuy 4 months ago

      Do you have any real-time use cases you've implemented? I'm always curious to hear where people are using real time.

  • @firefoxmetzger9063 4 months ago

    Another good question is: "If we turn this off today, how much money would you lose?"
    This is a good way to get business people thinking about value. Paying 200k may be a lot, but if it's "cashflow positive" it's non-critical to fix. If it isn't, you now have the entire business supporting your optimization effort and will face far fewer roadblocks while exploring alternatives.

  • @avikchatterjee7854 4 months ago +1

    Amazing video as always. Sorry to be a bit cringe and annoying but do you think it will be better if you lower the exposure of your camera setting a bit?

    • @SeattleDataGuy 4 months ago

      Thanks for the support. Hmm, what do you mean, like remove the blur?

    • @avikchatterjee7854 4 months ago +1

      @SeattleDataGuy No, your face looks a bit too bright imo; the bg blur is fine

    • @SeattleDataGuy 4 months ago

      @avikchatterjee7854 Oh yeah, I filmed this one a while ago. I have tried to reduce the brightness in other videos; see this one: th-cam.com/video/gG7upg6QaBI/w-d-xo.html

  • @robertlong6 4 months ago +1

    You say in the video you'll talk about data modelling in January. You okay your videos a year in advance? 😂

    • @SeattleDataGuy 4 months ago

      hahahha! I filmed the video in January... I gotta stop referencing other videos or get faster at putting out videos... I wanted to put out this data modeling video: th-cam.com/video/gG7upg6QaBI/w-d-xo.html

  • @internetexplorer1593 4 months ago +1

    My organization has an issue with interdepartmental meetings involving lots and lots of different Excel spreadsheets/reports that live only on people's local devices. This turns into "my data" vs. "your data." Unfortunately, we haven't collectively agreed upon a data source "gospel." There are completely siloed legacy systems with permissions barriers, so there's no obvious standout source to immediately make the "gospel truth." Any suggestions on how to pitch a centralized data source that everyone can agree upon? Any advice helps. Great channel, amazing content