Data Pipeline Reality Rant R

  • Published on 13 Dec 2024

Comments • 7

  • @danaleeling 3 years ago +1

    Thanks, solid rant. Data issues can only get worse as the information age ages. Systems get older, and with each new decade a new generation discovers that data entry issues go back ever further in their data sets.

  • @MrSeminolefan1122 4 years ago +1

    It would be awesome to look at a case analysis of what kinds of questions can be answered, as well as applied.

  • @TomGarner99 4 years ago +2

    We must work at the same place! 🤣 I spend a lot of time gathering and cleaning data as well, and I catch grief because it takes so long. One of my important data streams is a SQL Server table with table and variable names in German. 🤬 I often have to read in spreadsheets people use to log data - crazy headers, merged cells, typos, don't get me started! Good stuff, Mark!

    • @CradleToGraveR 4 years ago

      I love it when they change the names of columns without letting downstream know. Ha.

  • @CAVescera 4 years ago +1

    Going through this very situation right now. Perfectly stated.

  • @aliramadan7425 4 years ago +1

    Very well said Mark! It takes a lot of time to clean data.