Great summary! Couldn't have described this better after almost 15 years as a data engineer/analyst. Seen all of that, many times, in large and small companies.
The argument that developers only stay for a few years also holds for management, thus for opinions on what process, tools, company hierarchies, and strategy should look like. And there go your pipelines, metric definitions and meaning of source data.
Yeah I don't think you can ever 100% retain that knowledge if you don't retain the people. Companies should value that knowledge and not just the skills but they often don't. The new person will never take all that time the initial person took on the project and won't really understand it as well unless they spend a large amount of time debugging it or rewriting some of it.
time to start all over again!
This was an interesting one. Been so focused on what the field entails and how to get in, but I never really wondered what problems the data world faces at the frontiers. Good stuff man, keep 'em coming!
Glad you found the video helpful!
I love your perspective. You so get it and articulate everything so clearly. Wow. Just wow. Thank you!
I find the reasons for most weird decisions regarding architecture, infrastructure and internal vs third party solutions, boil down to a lack of communication, individual KPIs, and the budget going to 'new shiny magical solution that will solve all our problems' instead of 'fixing what we already have'.
but but....we need blockchain
This hit home. I'm experiencing a lot of these right now and trying to find a way to communicate this to the team. Great video!
Glad you felt like it resonated! Any specific issues you are trying to target?
I like your comment about adding a column taking 6 months. I built a project mostly on my own and in about that time I was able to build the whole warehouse for a particular internal customer group. I used the business logic I know from the users who have worked there for years, so I knew how they define the metrics. Obviously the data isn't the same as what other teams are doing, but their data isn't fit for our purpose and their projects take 2 years. Customers will ask us to move categories around, remove some things, etc., so obviously the results can't be the same, but usually there's a good reason why they ask for these modifications, and it's better to have a different view of things than a useless one.
In my own opinion you can have a team that works slowly by committee and bureaucracy to build some core data but then you also need to have some other teams that work more flexibly to answer the immediate needs of the business. Then you could always go back and align yourself once the slow team has done their work if you're still around.
Yes, I think larger companies in particular benefit from splitting this work: one team that essentially creates infra, and others that handle ad hoc data request type work.
Hi Benjamin,
"But why?"
There are many reasons. But the number 1 reason is that the people doing the projects don't know what they are doing and can't be bothered to learn. It really is that simple.
Arghhhh company merging... have a horror experience with that. Even though it makes total sense from a business point of view and management tried to sell it to the employees, the employees themselves are very unsatisfied, and all the blockers from the different environments, bla bla, are killing the work.
I'm just starting in this world... happy to subscribe to your channel
Good luck on your journey!
Hey, I am thinking of switching my focus from data science to data engineering, since I like the technology you work with as a data engineer, and the goal, more. But as a data engineer, do you need any knowledge of model building or analytics-like skills from data science?
Some light analytics is always helpful because then you understand what your end-users need.
politics plus ego. FOC or FOMO.
OOf...yeah lots of politics + FOMO...