👋Hi Hussein, I'm a non-cs graduate (coming from Math major background), I've software development knowledge in NodeJS, Java, Spring Boot. But I never went though any formal operating system course before and now I want to learn about operating systems, both for my own development and for potential interview situations.. Should I take your OS course or I should take any university OS course?.............Pls any help me out!🙏
Thanks for simplifying Allegro's Kafka performance optimization blog! Your breakdown really helps us grasp complex topics quickly. Keep up the great work!
from my personal research on XFS, it has the concept of extents. Each extent is a part of the filesystem that the OS can do IO to in parallel. the effect is almost like that there is a seperate journal in each extent. XFS perform massively good with large files, for small files it needs to be tweaked.
Thanks @hnasr. The world needs more Hussein Nassers who can explain and help us understand such complex engineering problems/solutions. Motivates me to learn more n more about the system and software engineering in general.
Great video! I'm surprised that there was such a big win to be had with the file system, especially considering that ext4 is pretty much the standard in the Linux world right now; Although, when I saw the title, I immediately knew it was going to be XFS (I've heard that it was designed with concurrency in mind).
I would love to see a breakdown on why XFS was a better choice than BTRFS and ZFS. For their use case I can imagine, but the analysis would be very informative nevertheless
@@shimadabr I don’t know about BTRFS, but ZFS is slower than XFS, F2FS, EXT4 and BTRFS consistently. Performance is not why you choose ZFS. In fact, I’m pretty sure ZFS would always be a bad choice for Kafka.
20:02 The first thing I thought of when you started talking about ext’s journal and WALs in general was Jay Kreps’ world-changing post (about Kafka): “The Log: What every software engineer should know about real-time data’s unifying abstraction”. As above, so below :)
Ceph recommends using XFS and only XFS for the backing block devices of their object storage system. I think it was even more important when Filestore was the default store, rather than Bluestore. Edit: sorry, I was wrong. Only FileStore used a filesystem underneath. BlueStore is a block device backend to RADOS (object store), while FileStore is a file backend to RADOS, so BlueStore just operates on raw devices/partitions. I don’t know why I didn’t remember that detail. Also interesting would be the up and coming SeaStore, built for the new OSD service, crimson-osd, that will replace ceph-osd. SeaStore uses ScyllaDB’s Seastar framework underneath for even more performant block device handling and it will bring support for ZNS NVMe and leverages SPDK/DPDK for getting the maximum out of NVMe arrays.
Also, “Maciej” is pronounced like so: first syllable is emphasized and is the “ma” in “mama”, the second syllable is the “cha” in “chain” (long a sound). Great video about a great post!
i think that it will take long to recover when journaling is disabled because they may be writing the same data to multiple partitions or files by configuring replication factors
This is literally a general case for most db's and step 0 deploying a mongodb acually..... So they could avoid all that trouble by not skipping db deployment guidlines
XFS is faster because it has less safety features. :) (It's generally less tunable than ext4, but works great for a ton of small files. ext4 is generally better for large chunk writes, fast/thorough recovery, etc.) All depends on desired use case
I have a good idea for next video. So on my project i encountered a problem with perfomance, so there was a attempt to use cahching, but app architecture itself do not alllow to do cahcing efficient becasue a lot of additional requests was made behind the curtain at beckend depending on conditions of each user, so in result absolutely same query at frontend lead to absolutely different responses in final, so its some kined of idempotence problem, so it nearly impossible to implement caching without full refactoring of app and including all conditions in every beginning, in initial query. So it would be very interestiung to hear about how to architecture application it was fully fit for heavy caching usage and worked very fast.
In 13:46 you have made a mistake about the directories and files used by kafka to ensure storage,according to the articles you relying on,the directory is not the topic but the partition and the segment files are used to store messages,juste a clarification.
👋Hi everone, I'm a non-cs graduate (coming from Math major background), I've software development knowledge in NodeJS, Java, Spring Boot. But I never went though any formal operating system course before and now I want to learn about operating systems, both for my own development and for potential interview situations.. Should I take Hussein's OS course or I should take any university OS course?.............Pls any help me out!🙏
Fundamentals of Operating Systems course
oscourse.win
👋Hi Hussein, I'm a non-cs graduate (coming from Math major background), I've software development knowledge in NodeJS, Java, Spring Boot. But I never went though any formal operating system course before and now I want to learn about operating systems, both for my own development and for potential interview situations.. Should I take your OS course or I should take any university OS course?.............Pls any help me out!🙏
For those who wonder: Allegro is something like Amazon in Poland. Let's go! Good work guys 😄
Products are getting faster but I am getting braindead day by day
😂😂+1
Better yet, touch some grass 😉
@@goatslayer5957Drop the “gr” and it’s a party.
Thanks for simplifying Allegro's Kafka performance optimization blog! Your breakdown really helps us grasp complex topics quickly. Keep up the great work!
from my personal research on XFS, it has the concept of extents. Each extent is a part of the filesystem that the OS can do IO to in parallel. the effect is almost like that there is a seperate journal in each extent. XFS perform massively good with large files, for small files it needs to be tweaked.
Cool video. Many thanks to Hussein for help to dive into such things!)
Thanks @hnasr.
The world needs more Hussein Nassers who can explain and help us understand such complex engineering problems/solutions. Motivates me to learn more n more about the system and software engineering in general.
Wonderfully explained, love your content - keep it coming!
Beautifully explained!! Thank you so much publishing such a brilliant content.
As always, Great Stuff! Clean explanation.
Great video! I'm surprised that there was such a big win to be had with the file system, especially considering that ext4 is pretty much the standard in the Linux world right now; Although, when I saw the title, I immediately knew it was going to be XFS (I've heard that it was designed with concurrency in mind).
I would love to see a breakdown on why XFS was a better choice than BTRFS and ZFS. For their use case I can imagine, but the analysis would be very informative nevertheless
@@shimadabr I don’t know about BTRFS, but ZFS is slower than XFS, F2FS, EXT4 and BTRFS consistently. Performance is not why you choose ZFS. In fact, I’m pretty sure ZFS would always be a bad choice for Kafka.
Impresionante Kafka improvement. Tiene curso de Sistemas Operativos. Mencionado al final. Demoró 2 años en construirlo.
20:02 The first thing I thought of when you started talking about ext’s journal and WALs in general was Jay Kreps’ world-changing post (about Kafka): “The Log: What every software engineer should know about real-time data’s unifying abstraction”. As above, so below :)
Please do a refresh of that Kafka video !
Brilliant explanation!!!
Great video. What an awesome voice you have 😊 very soothing 😛
أشكرك كثيرًا أخي حسين على المعلومات القيمة والرائعة.
Ceph recommends using XFS and only XFS for the backing block devices of their object storage system. I think it was even more important when Filestore was the default store, rather than Bluestore.
Edit: sorry, I was wrong. Only FileStore used a filesystem underneath. BlueStore is a block device backend to RADOS (object store), while FileStore is a file backend to RADOS, so BlueStore just operates on raw devices/partitions. I don’t know why I didn’t remember that detail.
Also interesting would be the up and coming SeaStore, built for the new OSD service, crimson-osd, that will replace ceph-osd. SeaStore uses ScyllaDB’s Seastar framework underneath for even more performant block device handling and it will bring support for ZNS NVMe and leverages SPDK/DPDK for getting the maximum out of NVMe arrays.
Also, “Maciej” is pronounced like so: first syllable is emphasized and is the “ma” in “mama”, the second syllable is the “cha” in “chain” (long a sound). Great video about a great post!
i think that it will take long to recover when journaling is disabled because they may be writing the same data to multiple partitions or files by configuring replication factors
This is literally a general case for most db's and step 0 deploying a mongodb acually..... So they could avoid all that trouble by not skipping db deployment guidlines
You talk about something TLS packets decryption with curl. Can you point me to that lecture ?
it is so clean!
Impressive, but the gist is that the docs already prescribe using XFS….
It says xfs OR ext4
XFS is faster because it has less safety features. :)
(It's generally less tunable than ext4, but works great for a ton of small files. ext4 is generally better for large chunk writes, fast/thorough recovery, etc.)
All depends on desired use case
Xfs is optimized for real time streaming media files
ext4 better in handling a lot of smaller files, XFS works better/faster with larger files.
I have a good idea for next video. So on my project i encountered a problem with perfomance, so there was a attempt to use cahching, but app architecture itself do not alllow to do cahcing efficient becasue a lot of additional requests was made behind the curtain at beckend depending on conditions of each user, so in result absolutely same query at frontend lead to absolutely different responses in final, so its some kined of idempotence problem, so it nearly impossible to implement caching without full refactoring of app and including all conditions in every beginning, in initial query. So it would be very interestiung to hear about how to architecture application it was fully fit for heavy caching usage and worked very fast.
Thanks
Why this course is not available for business accounts
I live on the edge.
Poland mentioned 🇵🇱🇵🇱🇵🇱
In 13:46 you have made a mistake about the directories and files used by kafka to ensure storage,according to the articles you relying on,the directory is not the topic but the partition and the segment files are used to store messages,juste a clarification.
good catch, thanks !
Was looking for a netflix show and here I am lol
Watching with 1.5x speed
Ah, thought they did by by switching to „red panda“
👋Hi everone, I'm a non-cs graduate (coming from Math major background), I've software development knowledge in NodeJS, Java, Spring Boot. But I never went though any formal operating system course before and now I want to learn about operating systems, both for my own development and for potential interview situations.. Should I take Hussein's OS course or I should take any university OS course?.............Pls any help me out!🙏
Poland Strong!
I only want to know from which FS to which FS ?
Without wasting 30mins on the video
2:34 what did he say? japanese?
POLSKA GUROM
Funfact ext4 is newer than Xfs
Ii
Good info though i find your style of talking super annoying 🤣
2:34 did bro change into a Japanese hello fellow? 😩😂