- 56
- 124 818
mehdio DataTV
United States
Joined Mar 8, 2020
I’m a data geek who’s passionate about Big data, Data Science, Web App, and Music. With more than 7 years of experience in the data domain, this channel is a place where I share what I learned from my journey so far. I hope you find it interesting and... fun. Because learning should be fun!
See you in the comments!
1 YAML file is ALL your need to start your data stack
I've been running two small data pipeline projects for a while with this setup and wanted to take the idea further. I'll continue experimenting, focusing on simplicity.
I hope it inspires some of you!
📓 Resources
The MAD data landscape used in the intro : mattturck.com/mad2024/
Code example 1 : github.com/mehd-io/duckdb-extension-radar
Code example 2 : github.com/mehd-io/pypi-duck-flow
✍️Follow me there for written content:
➡️Blog : mehdio.com/blog
➡️LinkedIn : www.linkedin.com/in/mehd-io/
➡️Bluesky : bsky.app/profile/mehdio.com
➡️X/Twitter : x.com/mehd_io
0:00 A modern data stack is complex!
01:12 CI/CD Tools for data pipelines
02:39 Code example 1
04:44 Code example 2
08:18 Monitoring
09:35 Pricing & Limits
11:28 Outro
#devops #dataengineering #githubactions #github #datapipelines
Views: 996
Videos
Most of the time, Python being slow does not matter
2.5K views · 21 days ago
I wanted to react to the latest programming language benchmark that took over the internet. Do the right thing, and measure what matters. 📋Resources The benchmark : benjdd.com/languages/ Python, the calculator of programming : x.com/thehirschibar/status/1861826133773127976?t=WjgZaDW9H9oqcHHxwLK5xw&s=09 Similar blog on why perf is not enough for database : motherduck.com/blog/perf-is-not-enough/...
Speed Up Your Python Workflow with UV
1.4K views · 28 days ago
I've been using uv for the past months, and I will use it for many more. 📋Resources uv's docs : docs.astral.sh/uv/ Talk from Charlie Marsh, Astral founder : th-cam.com/video/gSKTfG1GXYQ/w-d-xo.html ✍️Follow me there for written content: ➡️Blog : mehdio.com/blog ➡️LinkedIn : www.linkedin.com/in/mehd-io/ ➡️X/Twitter : x.com/mehd_io 0:00 Intro 1:08 What's uv 3:08 Hands-on uv 9:25 Benchmark uv vs Poetry ...
Why You Don't Need a Fancy SQL IDE: The Simplest Workflow Ever
2.6K views · months ago
Quick one about my SQL workflow - keep things simple! 📋Resources : THE shortcut { "key": "cmd+k", "command": "workbench.action.terminal.runSelectedText" }, ✍️Follow me there for written content: ➡️Blog : mehdio.com/blog ➡️LinkedIn : www.linkedin.com/in/mehd-io/ ➡️X/Twitter : x.com/mehd_io 0:00 Intro 0:35 The SQL workflow 2:31 Why - Reason 1 3:14 Why - Reason 2 3:37 Why - Reason 3 3:57 Outro #co...
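A minimal sketch of how the shortcut above could sit in VS Code's user keybindings.json. It assumes the key is written `cmd+k` (VS Code joins modifier and key with `+`) and adds a `when` clause, `editorTextFocus`, which is not in the original description and only limits the binding to a focused editor. Note that `cmd+k` is also the default prefix for many VS Code chords, so another key may suit some setups better.

```jsonc
// keybindings.json (user level). Comments are allowed in this file.
[
  {
    // Assumption: "cmd+k" is the intended key combination.
    "key": "cmd+k",
    // Sends the current editor selection to the active integrated terminal.
    "command": "workbench.action.terminal.runSelectedText",
    // Assumption, not from the original: only trigger while an editor has focus.
    "when": "editorTextFocus"
  }
]
```

With this in place, selecting a query in the editor and pressing the shortcut sends the selection to the active integrated terminal, for example one running a SQL CLI.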
macOS: Essential Productivity Hacks for Developers
66K views · months ago
I'm sharing my productivity hacks as a software engineer working on macOS. These involve using a couple of open-source tools (tiling window managers, etc.) with some smart configurations. You can find all my dotfiles here: github.com/mehd-io/dotfiles 📋Resources Skhd : github.com/koekeishiya/skhd Yabai : github.com/koekeishiya/yabai JankyBorders : github.com/FelixKratz/JankyBorders Raycast : www...
15 Python Libraries Every Data Engineer Needs!
1.5K views · 3 months ago
STOP Wasting Time Job Hunting The WRONG Way!
474 views · 3 months ago
FAANG or Bust: The Unfiltered Truth & Future-Proof Skills in Data Engineering | ft. Zach Wilson
1.8K views · last year
I Asked 14 Data Experts 14 Hard Data Questions
894 views · last year
Data + AI Summit 2023 San Francisco - Here we go
407 views · last year
5 Questions For 5 Data Experts [COALESCE 2022]
226 views · last year
DCCP E06 - From Webdev To Self-taught Data Engineer | @DarshilParmar
294 views · last year
Python Devs, It's Time To Get On The Rust Bandwagon!
3.9K views · 2 years ago
DCCP E05 - The Best Learning Path For Data | @averysmith
199 views · 2 years ago
10 Questions For 10 Data Experts [COALESCE 2022]
336 views · 2 years ago
Highlights [COALESCE 2022 by @dbt-labs ]
490 views · 2 years ago
What's The Fuzz About dbt ? From Zero to Hero
1.8K views · 2 years ago
DCCP E04 - Data Engineers Are Software Engineers | Kris Peeters from @Dataminded
297 views · 2 years ago
An Alternative Approach To Data Engineer Roadmap
2K views · 2 years ago
DCCP E03 - Data Stack In The IOT World | Mário Pereira
334 views · 2 years ago
DCCP E02 - To Notebook Or Not To Notebook ? | Jeremy Ravenel (@naas-ai )
268 views · 2 years ago
DCCP E01 - Security Concerns For Data Engineers | Matthew Weingarten
324 views · 2 years ago
How To Make Any Development Setup Ready In 1-Click With DevContainer
6K views · 2 years ago
Job Hopping As A Software Engineer - Should You Do It?
564 views · 2 years ago
Poor XML😂
Please, someone tell me how to completely get rid of animations on Mac, like when minimizing and maximizing apps.
Very interesting man! Nice job
I'm hooked not only on data engineering topics but on your presentation style. Are there any other resources you would recommend to a beginner? Also, I just learned pandas, should I jump to Spark or Polars? Thank you Mehdi. Subscribed!
The pdf is insane. There are so many technologies
raycast & aerospace video
20%
Man you got swag.❤
Ray cast video please 🙏🏻
You could use uv 😂
Brother, can you make a video on how to manage dotfiles from scratch on Mac?
boy from Brazil here, I met you a few months ago through Luciano, your videos are very good!
Let me know if you've been using such a setup for simple data pipelines-I'm curious to hear! Something else I haven't mentioned is the trend of declarative data pipelines using YAML, even with data orchestrators (e.g., Kestra). This brings both tooling even closer together.
Did you know you can use Poetry instead of uv? Would definitely recommend giving it a try!
cool! I just found this awesome walkthrough video about uv : th-cam.com/video/goIwKjsEPOI/w-d-xo.html
Do you know Aerospace?
People calling Python not slow haven't worked on large Python projects. For small things and scripting, I would argue Python is almost always the fastest language for getting from no result to a result. It's what I reach for every time. Especially since LLMs are optimized for Python above any other language. Large Python projects are something else. Oh boy. People get sloppy and slap in n^2 algorithms over objects with 1000+ elements here and there and yeah... Python apps suddenly become unbearably slow. Even more so on old hardware. Scaling Python is too much of a hassle, I've found, and this is coming from someone who beat the drum of full-stack Python. I have switched to using Go exclusively for backends, and I can guarantee even an absolutely terrible Go backend implementation will still be more performant and scalable than just about any Python.
It seems it's missing how to display the app icons of each space. 😁
Hey! Didn't go into details but feel free to check the dotfiles here : github.com/mehd-io/dotfiles/blob/master/sketchybar/plugins/yabai.sh#L41
I'd love to see the theme used in the thumbnail :D
will do a video on my terminal setup!
Do you have video about your prompt theme?
Will do a future video about my terminal setup!
Too bad, they commented to break the PR into multiple PRs 😑
Yeah, man! I use a similar philosophy in my KDE environment on linux.
Pretty good video, I'm one of those who started using DataGrip, then switched to DBeaver. But this video really makes me rethink the option to get rid of it.
The title should be "Most of the time, Python being slow does not matter". That said, it is slow.
Going for a walk from bed to kitchen isn't slow either. Now try to walk 40 km to the city and call it fast :D
What's slow is slow. No one cares about "fast syntax writing" or UX or whatever the hell because that part is subjective, when people mention speed they obviously relate to execution speed.
If you are compiling your core code in C++/C and calling it from extensions/dynamic libraries in Python, Python is slow and you're just being contrarian. Python, like Ruby and every other scripting language that has ever existed, is always slower than natively compiled code; it will never be faster. These languages, while slower, also have their places in the various markets and industries programmers and scripters work in. If you need to call most of your code from outside Python, the language is just a framework for you to call native code from.
Python isn't used in AI. Python uses AI. The AI and compute portion aren't done in Python. But developers create bindings for the non dev AI researchers to call the AI libraries.
Anyone who says that Python isn't slow never had to scale it. Python is fundamentally easy to write and hard to scale. At some point, and for certain projects, that makes it really hard to justify from a purely software engineering perspective, since the more you optimize, the more you usually end up losing readability, which is the whole point of using Python in the first place. And what for? I can make a Go program in roughly the same time that will run orders of magnitude faster.
Check out Airflow - an OSS data pipeline orchestrator, which is standard in the industry. It scaled at Airbnb with thousands of DAGs, and I worked at Klarna on the same setup. Purely Python framework. I could go on with the list, but it really depends on what you mean by scale.
I understand your point; however, what I don’t understand is how your point is relevant to the idea of the benchmark. The benchmark is straightforward: language runtime speed. It’s a race, yes, like an F1 race. It’s not about idea -> reality, it’s not about libraries, it’s not about the right tool for the right job, it is just to showcase raw language performance when all of them are trying to do the same task. So in this context, for the sake of this race, Python is slow, and it’s ok to be slow, because performance was never Python’s strong suit.
It's slow. Compared to fast languages, of course it's slow... You can't argue with that. The only way to make Python fast is... to use a fast language and call it from Python. So, yes, Python is slow. And more Python is hell for large projects.
What's your experience with Python's performance ? Love or hate ?
Terrible, especially when you use OO. Very clear from implementation point that it has the worst performance of all script languages. Even ruby is faster now.
So to sum up: Python is fast to write but slow in performance. It just depends on the project.
In practice it's fast on performance too. If you add compilation time, Rust is slower for scripts that won't be run many times. Python runs instantly without a compile step and takes advantage of libraries written in precompiled code written in other languages for anything that would be slow in pure Python, which makes the performance pretty optimal for the things it is primarily used for.
mehdio: "Python is not slow". also mehdio: "When Python is slow, people write code in C++ to make something actually fast, and then call it from Python". So yeah folks, Python _is_ slow...
What's the issue with that? You start with an easy-to-learn programming language that can scale across many developers (fast development). When you need compute performance optimizations, you bring in low-level to fine-tune things. Mojo makes similar promises, but I don't think it will take off, IMO.
@mehdio The problem is that writing Python extensions is absolutely terrible because of this 25-year-old, organically grown API. And if you want the extension to feel Pythonic, you can't just use a simple wrapper generator.
@llothar68, do you mean terrible in terms of developer experience or terrible in terms of performance? Do you have clear examples in mind ? In the example I gave in the video, Polars, for instance, is used for heavy compute (data pipelines), and it's a relatively recent project (with its core in Rust) that works quite smoothly with the Python binding.
I am with you bro
What tools or theme do you use for your clean terminal?
🔥
You're always the best
Great stuff Mehdi, learned a thing or two I hadn't on other videos. I really need to spend time on the docs, to learn more than the basics.
I'm gonna migrate my poetry projects to uv
Is there any reason why you WOULDN'T use uv as of today? They have a drop-in replacement for pip, so you can literally start speeding things up by just replacing pip install with uv pip install!
The only thing holding me back is the fact that uv and pyenv clash on the .python-version file and are incompatible. I am so used to working with pyenv and activating/deactivating virtual environments that I miss this feature with uv. And since working with PyTorch / CUDA can be hell when figuring out dependencies, I am not keen on losing the option to "stash" a working venv in order to try something in another and be able to go back (something I use in pyenv constantly).
Just discovered your channel! Nice content! I'll look @ more. Any chance you can share me the portable vesa you use for the monitor? Thanks 🙏
I'll do a video of my full nomad setup as other people requested! :)
@mehdio I'm leaving for a 2-week work trip in 2 weeks, can you write it to me in private so I get it in time lol?
I had the problem, help me plz. ➜ yabai -m space --focus 2 cannot focus space due to an error with the scripting-addition.
sudo yabai --load-sa
Dash to Dock for macOS, finally. Thanks!
Just use linux. And all the shortcut and no mouse...is more for a specialised workflow...just try to use blender with its million shortcut and your setup... This will be a mess in seconds... For coders ok, but the time I waste to use the MacOS basics is mostly the time you use to config and update all this. And less is more (better) for coders to eliminate possibilities, I guess.
Pretty cool, I was looking for something like this. I also switch between Linux machines and macOS and I'm a total disaster. This looks pretty similar to i3 :-)
You basically turn macOS into a Linux
I couldn't like the video because it took clicking and I threw out my mouse
there's a solution for that : chromewebstore.google.com/detail/vimium/dbepggeogbaibhgnhhndojpepiihcmeb?hl=en
Thanks bro, been jankingly piecing this setup with the defaults. This looks way better
"you don't need a fancy IDE". You lost me at "the first thing we're going to do is open VSCode"
If you had stayed 2 more seconds, I said that the strategy works with any coding editor. Worked with just a terminal, tmux and vim too!
My dude, you are so close to moving to Nvim ;)
Ahah - I've been dreaming about it. If devcontainer is supported smoothly, I may make the jump
@mehdio Do it! There are a few plugins, videos and blog posts about how to do this, though I don't need it myself so haven't tried them out :)
Getting started with Lazyvim was super easy. Then I was playing around with dadbod-ui for sql queries etc. Honestly will probably just map the same "send sql to terminal" command you use though