WhatsApp handles 3 MILLION TCP Connections Per Server! How do they do it? Let us discuss

  • Published on 27 Oct 2024

Comments • 136

  • @MrLoggfreak
    @MrLoggfreak 3 years ago +164

    If you read a little about the Erlang language and the Erlang VM, you'll notice that it's specifically designed to scale in parallel.
    The Erlang VM is kind of like a supervisor managing thousands or millions of Erlang processes, which are kind of like mini-threads.
    If a single Erlang process dies (all these Erlang processes can run in one OS process, or across multiple processors/cores), the VM will notice and restart that single instance.
    Erlang is designed with a let-it-crash philosophy: processes are allowed to crash instead of handling weird edge cases with complex exception handling, and restarting the mini-process gets you back to a known safe state as quickly as possible.
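
The restart idea in the comment above can be sketched in a few lines. This is only an illustration in Python, not Erlang, and it is single-threaded: the names `supervise` and `start_worker` are made up for the example, and real Erlang processes run concurrently with isolated state, which this toy does not model.

```python
from typing import Callable, List, Optional, Tuple


def start_worker() -> Callable[[int], int]:
    """Hypothetical worker factory: each worker keeps its own private state."""
    state = {"handled": 0}

    def handle(job: int) -> int:
        if job < 0:
            # No defensive handling of the odd case: the worker simply crashes.
            raise ValueError("bad input")
        state["handled"] += 1
        return state["handled"]

    return handle


def supervise(factory: Callable[[], Callable[[int], int]],
              jobs: List[int],
              max_restarts: int = 3) -> Tuple[List[Optional[int]], int]:
    """Feed jobs to a worker; when it crashes, replace it with a fresh one."""
    worker = factory()
    results: List[Optional[int]] = []
    restarts = 0
    for job in jobs:
        try:
            results.append(worker(job))
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up if the worker keeps crashing
            worker = factory()    # fresh worker = known safe state
            results.append(None)  # the crashing job is dropped, not retried
    return results, restarts


results, restarts = supervise(start_worker, [1, 2, -1, 3])
print(results, restarts)
```

The worker started after the crash begins counting from scratch, which is the "known safe state" the comment refers to.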

  • @paulcassidy4559
    @paulcassidy4559 3 years ago +24

    Yo Hussein just want to say your content is really great man. As an early-stage/junior engineer these are the kind of conversations/lessons that you're lucky if you get to have once a week with veteran engineers. And you wear your expertise very humbly. An example to follow for sure. All the best!

    • @atraps7882
      @atraps7882 2 years ago +2

      Well said, man. I am also still a junior myself, having only been in this field for 2 years, and these sorts of "talks" by Hussein are gold for me. The technologies anyone can easily learn and implement, but those subtle things that Hussein includes in his videos can only come from being a veteran engineer.

    • @shaperbros3352
      @shaperbros3352 several months ago

      🎉 love that.

  • @Methodmanishe
    @Methodmanishe 4 years ago +52

    I like your energy man! Interesting topics and a little humor

    • @hnasr
      @hnasr 4 years ago +7

      Awesome! Thank you!

    • @iuc7254
      @iuc7254 2 years ago

      And a little humor :p

  • @farhanyousaf5616
    @farhanyousaf5616 4 years ago +9

    You have a knack for choosing really interesting topics. Keep 'em coming!

  • @sandyx7381
    @sandyx7381 3 years ago +27

    They use Erlang. The Erlang VM creates processes of about 1 KB and has a supervisor that can spin failed processes back up.

  • @nkpg
    @nkpg 4 years ago +8

    It's just my guess! Maybe they are using the OneConnect feature of the F5 load balancer (connection aggregation) to speed up connections. But the architecture sounds very, very well designed. I can't imagine the amount of effort put in by the architects.

  • @HemantKashniyal
    @HemantKashniyal 3 years ago +6

    Taking a page from storage systems, they might have servers running a custom kernel module with dedicated kernel threads mapping TCP data into user-mapped memory. All processing would then be done in user space without a kernel context switch.
    Storage systems achieve really high performance using this approach.

    • @SteinCodes
      @SteinCodes 2 years ago +4

      Very close to what is actually done.
      I was surprised by the video and how utterly the guy fails to see how it could work and where the bottlenecks lie, considering he has been working in the field for over a decade.
      Still, I respect the fact that at least he is asking interesting stuff. (Maybe I am extra salty because a company threw me out of an interview/selection back in my college days when I tried to explain to them how a single machine could handle 10 million+ concurrent connections...)
      Btw, this is 2022: Epyc CPUs with 96+ cores exist, with RAM capacity up to 12 TB.
      And 100GbE NICs are a dime a dozen; my consumer-grade system has 3 PCIe 4.0 ports, which makes that a possibility times 3.
      It may just do 25-50M concurrent connections these days with ease.
      Now let's discuss how, since as I already showed, the hardware is more than powerful enough.
      1. The bottleneck is the kernel, as you have already noted (I think).
      2. So just move to user space. This is done with a user-space data plane that provides an efficient and robust alternative to the Linux networking stack (which is pretty shit at scale).
      3. The most popular package for it is DPDK (made by Intel).
      4. The next step is to get a network stack; they are a dime a dozen as well, all based on DPDK. Check out F-Stack, made by Tencent peeps; it allowed 10M connections on Xeons back in 2013-15.
      5. The last step is your server. Modern servers support async and multi-core programming and can easily handle millions of connections. I recommend Nginx. Apache is shit btw.
      As for storage peeps, yeah, they do love their user-space DMA buffers; tbh, don't we all. I remember doing something similar for streaming in a game I was trying to make, because I wanted to load a huge file (scenes/levels stored as a binary resource) as fast as possible.
      Fun fact: I have seen people use implementations of memory-mapped I/O paths that are faster than what mmap and the Linux kernel provide.
      Generally though, the key idea in data-intensive situations is DMA and DMC, for which Linux mmap is sufficient.

  • @nicholaslaquerre2785
    @nicholaslaquerre2785 1 year ago +1

    Awesome video. You're the best, Hussein! Long-time fan.
    Congrats on discovering OBS!!
    I remember when I made the switch. I played around for hours and hours. You should make a video on how it works: I'm convinced it must implement some level of virtualization, along with very ground-up methods of encoding and Interaction with the OS. It seems to act independently from the influence of other applications in a way? I'd be interested in what you come up with regarding the ins and outs of its design and what seems to be unique abilities.
    Great job as always.

  • @badcommand
    @badcommand 3 years ago +1

    Listening to you is like listening to an old friend. Wonderful video!

  • @joostvhts
    @joostvhts 3 years ago +1

    You're good at this, I randomly found your channel when I was looking for an nginx tutorial but then subscribed as well, because, due to the quality, I expected your channel to be bigger than it currently is :)

    • @hnasr
      @hnasr 3 years ago +1

      Thank you dear appreciate the support! I am so grateful to have you all

  • @arpanghoshal2579
    @arpanghoshal2579 2 years ago

    This makes me happy, since I do Elixir, which is built on top of Erlang and has similar capabilities. The possibilities are just unlimited when you have such a powerful language. Concurrent and parallel programming is very easy with functional languages like Elixir/Erlang, since they don't have any shared state.

  • @grizzly_monkey
    @grizzly_monkey 1 year ago +1

    They use customized XMPP on heavily customized ejabberd

  • @yodude2493
    @yodude2493 2 years ago +1

    Love you Hussein!
    Thanks for your videos, maybe you won't be ex tech lead slash millionaire but you bring a lot of knowledge to us and help tons of people to understand a tech.
    See you in the next video comment section!

  • @jayeshsuthar5590
    @jayeshsuthar5590 3 years ago +1

    Such an interesting topic, man! Keep talking about these. Pretty interesting

  • @publicuser993
    @publicuser993 3 years ago +9

    You definitely should read about Telegram

  • @MrXperx
    @MrXperx 3 years ago +6

    Hi Hussein. I loved your DB Udemy course. Is it possible for you to make one on networking? I like your YouTube videos but find a course a little more coherent.

  • @MrDjRayner
    @MrDjRayner 3 years ago +4

    Hi, can you please use a whiteboard and pen, or a pen-enabled device, during the discussion? It helps visual learners like me.

  • @sannge7967
    @sannge7967 3 years ago

    I might be wrong, but they could have 2 connections from the frontend, both HTTP and TCP sockets to the backend, and then there would be a load balancer IP that the frontend points to. That load balancer would be linked with all the backend servers and would redirect users to their local servers, also based on connection type. Also, they can have multiple ports on one server, as you say.

  • @somasundarv
    @somasundarv 3 years ago +1

    I somehow stumbled upon one of your videos. Boom, I subscribed and became a fan of your content. I keep playing all your videos and learning a lot from them. Regarding this video, I have a noob question: how do we even decide on the number of initial machines to use during the design phase (I know we can scale out horizontally at a later stage)?

  • @prashanthb6521
    @prashanthb6521 1 year ago

    Yo man, a friendly suggestion; please take it positively, because I like some of the topics you talk about. Please stay on topic and narrate smoothly. Even 1.5x speed doesn't help me ignore you wandering off into irrelevant words/topics/humour/pauses. I still give this video a thumbs-up. Thanks.

  • @streetfashiontv9149
    @streetfashiontv9149 3 years ago +3

    Why did it surprise you that they had their servers in the US?

    • @ankitlakum1
      @ankitlakum1 3 years ago

      Hahaha

    • @debkr
      @debkr 2 years ago

      Haha :)

  • @sariksiddiqui6059
    @sariksiddiqui6059 4 years ago

    Another nice video, man. I think you have been using the terms socket and connection interchangeably. Normally a server listens on a single socket but has multiple connections. You can be limited on sockets by the number of ports available; connections are more a function of CPU, memory, and other resources.

    • @hnasr
      @hnasr 4 years ago +6

      Thanks for correcting me, Siddiqui! Appreciate it. Correct: the server listens on a socket on a port, e.g. 80, and connections to port 80 are uniquely identified by source IP/source port.

  • @GK-rl5du
    @GK-rl5du 4 years ago +6

    Dumb questions alert:
    1. Is the Linux kernel's stock TCP stack that scalable, or do they have their own customizations to it?
    2. Given that everything is E2E encrypted in WhatsApp, I am surprised that CPU consumption is so low at that scale. Is it possible?

    • @espeon91
      @espeon91 2 years ago +1

      In the linked blog post, they show that they were using FreeBSD, not Linux.
      That post predates E2E on WhatsApp. CPU usage should not matter for this anyway, as the server can't decrypt the messages.

  • @aamironline
    @aamironline 3 years ago +3

    At 1:19 you said that in the Middle East every single person uses WhatsApp. This statement looks factually incorrect because, as far as I know, in the Middle East, especially the UAE, WhatsApp, Slack, Discord and many other commonly used communication channels are banned!

    • @debkr
      @debkr 2 years ago

      This broadcaster has no idea what he is talking about.

  • @StyleTrick
    @StyleTrick 4 years ago

    Really interesting discussion. It would be great if you hosted some system design vids too!

    • @hnasr
      @hnasr 4 years ago +2

      I started a system design series; I will need to crank out some more content...
      th-cam.com/play/PLQnljOFTspQXSevtRqvMNycWfHM7cXc3d.html

  • @Textras
    @Textras 4 years ago

    Now we're cooking with fire Hussein! :) *Well you always are to be clear, but love these 'at scale' videos.

    • @hnasr
      @hnasr 4 years ago +1

      :D :D More to come! thanks Textras!!

  • @TestAutomationTV
    @TestAutomationTV 3 years ago +3

    AFAIK a server can have 65536 ports at one time. How are they managing millions of sockets on one server?

    • @hnasr
      @hnasr 3 years ago +3

      a server can listen on one port and can have many connections connecting to that port as long as the Source IP and Source port are unique.
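
The point in the reply above can be checked locally. The following is a rough sketch, not anything from the WhatsApp post: the helper name `open_connections` is made up, and it uses only the loopback interface. It opens several client connections to one listening port and records what distinguishes them on the server side.

```python
import socket


def open_connections(n: int):
    """Open n TCP connections to a single listening port on localhost."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(n)
    port = server.getsockname()[1]

    clients, accepted = [], []
    for _ in range(n):
        clients.append(socket.create_connection(("127.0.0.1", port)))
        conn, _peer = server.accept()
        accepted.append(conn)

    # Server-side view: local port of every accepted connection,
    # and the (source IP, source port) pair of every client.
    local_ports = {conn.getsockname()[1] for conn in accepted}
    peers = {conn.getpeername() for conn in accepted}

    for s in clients + accepted + [server]:
        s.close()
    return port, local_ports, peers


port, local_ports, peers = open_connections(5)
print(port, local_ports, len(peers))
```

Every accepted connection reports the same local port, while each one has a different (source IP, source port) pair, so the server does not consume one of its own ports per client.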

    • @TestAutomationTV
      @TestAutomationTV 3 years ago +3

      @@hnasr Here's what I made of your reply. They have multiple network interfaces installed on a server. Each one is listening on the maximum possible number of ports.
      A server can accept any number of client connections on a port as long as each client IP/port pair is unique.
      Is this understanding correct?

    • @debkr
      @debkr 2 years ago +2

      @@TestAutomationTV This @Hussein guy is really funny

    • @TestAutomationTV
      @TestAutomationTV 2 years ago

      @@debkr Yes, he's both technically sound and funny. I am also an instructor and a content creator, and I would love to incorporate his creative style into my work.

  • @anindyasundarmanna6683
    @anindyasundarmanna6683 3 years ago

    Is it a problem on my side, or is the highest resolution of this video 720p? It's a bit blurry. :(

  • @sectumsemparium
    @sectumsemparium 3 years ago +1

    I agree it has to be TCP. They can't maintain multiple connections, so they force you, or rather restrict you, to a single login in an app on any device.

  • @mangaldev
    @mangaldev 3 years ago +1

    Is it only me, or does anyone else also think he looks like Carry Minati? :)

  • @code_with_om
    @code_with_om 2 years ago +1

    Can't they use a front-end load balancer? 🙂 Please clarify if I am getting something wrong.

  • @ArunprasadRajkumar
    @ArunprasadRajkumar 3 years ago +1

    I think they started with XMPP and ejabberd; not sure what is there currently!

    • @debkr
      @debkr 2 years ago

      They are still using XMPP, but I heard that they have heavily modified the protocol to use JSON instead of XML. Not sure though.

  • @MithunKumar-xy9pp
    @MithunKumar-xy9pp 3 years ago +1

    How is session and connection management done? How does the back end maintain info for so many connections?

  • @srjshapthnktl4978
    @srjshapthnktl4978 2 years ago

    UDP and multicast subscriptions would need much less memory and can service a lot more requests..

  • @adrenaline.2530
    @adrenaline.2530 3 years ago

    Great discussion 👍

  • @shaperbros3352
    @shaperbros3352 several months ago

    Ty 😊

  • @hitmusicworldwide
    @hitmusicworldwide 3 years ago

    Very popular in EMEA. NOW I understand why Facebook bought it before anyone else did.

  • @rahul_bali
    @rahul_bali 3 years ago +2

    Guys, read about the EVM, the Erlang virtual machine.

  • @learnnow9598
    @learnnow9598 3 years ago

    Can we use UDP for a chat application? Can you describe the advantages and disadvantages of it?

  • @nightking4615
    @nightking4615 3 years ago +2

    Ever heard of ERLANG? Look into it, you will get all your answers.

    • @imacprousersam7306
      @imacprousersam7306 3 years ago

      Naseer is talking deep. You guessed Erlang, OK, fine, but how does Erlang achieve this internally?

    • @rahul_bali
      @rahul_bali 3 years ago

      @@imacprousersam7306 Man, why do you always talk this way? Just show some gratitude to this man.
      Then move on and do your own research.
      Erlang is pretty old shit; it has been around for longer than you have been alive.

    • @debkr
      @debkr 2 years ago

      @@imacprousersam7306 Oh, so did Naseer explain that in this video? I missed it, so sorry, lol.

  • @siyaram2855
    @siyaram2855 3 years ago

    Please make more videos on erlang

    • @debkr
      @debkr 2 years ago

      This video was not about erlang. He does not know himself what this video is about :(

  • @ravikumar-yq5df
    @ravikumar-yq5df 4 years ago

    Awesome content 👍👌

  • @techwithameer
    @techwithameer 4 years ago +2

    Can you make a video on GPT-3 and on whether these AI technologies will affect developer jobs too?

  • @seanwu5562
    @seanwu5562 3 years ago +2

    In the Far East we use Line

  • @JugaadTech
    @JugaadTech 3 years ago +1

    Million connections per second? Per minute? Per day? The rate is important.

    • @hnasr
      @hnasr 3 years ago +5

      These are stateful, long-lived connections: once they are created, they stay open. And the maximum number of connections is 3 million.
      We are not talking about requests, which would be described by a rate as you said.

    • @JugaadTech
      @JugaadTech 3 years ago

      @@hnasr understood

    • @debkr
      @debkr 2 years ago

      @@hnasr Read about Jabber and the XMPP protocol, please.

  • @AnimusAgent
    @AnimusAgent 2 years ago

    Everyone in Brazil also uses WhatsApp, so it's like 200 million more XD

  • @tekfreaks
    @tekfreaks 4 years ago

    Nice content once again. You were discussing most of the time. I felt that the actions you make while explaining were missed (since the screen was smaller). But I loved these discussions. Good to know about the design perspective.
    Always keep making videos. Thanks, Hussein

    • @hnasr
      @hnasr 4 years ago +2

      Glad you enjoyed it, Waseem! Right! I should have made my screen bigger when I annotate... will do next time :)

  • @abbasfais
    @abbasfais 3 years ago +1

    Here is a channel that talks about actual software engineering stuff. Then there are others that just like to talk about how coders are supposed to be and how cool coding is, their pride in being a supposed software engineer, showing off their offers and salaries from FAANG. Yeah, basically all BS except the work itself.

  • @Tldrx
    @Tldrx 3 years ago

    I thought there are only 60K ports on a server. How come there are 2 million connections?

    • @baxiry.
      @baxiry. 3 years ago

      When there are thousands of people connected to yalla.com? They are connected to it through one port ":80" or ":443"

    • @debkr
      @debkr 2 years ago

      The number of connections has nothing to do with the number of ports. Search for networking basics and you will find some wonderful tutorials. All the best.

  • @debkr
    @debkr 2 years ago

    What do you do sir ?

  • @srjshapthnktl4978
    @srjshapthnktl4978 2 years ago

    I don't see why UDP can't do the job. There is no need for TCP.

  • @jayjayma
    @jayjayma 3 years ago

    Hey, I'm relatively new to computer science. I was wondering, how does a server support more than 65535 simultaneous connections? Maybe I am misinterpreting this.

    • @ruhnet
      @ruhnet 3 years ago

      +John Marshall I believe they would be using multiple IPs. You can use 65535 ports on each IP.

    • @jayjayma
      @jayjayma 3 years ago

      @@ruhnet Interesting. How may I utilize other IPs apart from 127.0.0.1 on a local machine?

    • @ruhnet
      @ruhnet 3 years ago +2

      @@jayjayma 127.0.0.0/8 is a full class A block so in addition to 127.0.0.1 you can assign 127.0.0.2, 127.0.0.3... to your local interface. (Up to over 16 million addresses if you need/want.)
      😁
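
The "over 16 million addresses" figure in the reply above is easy to verify with Python's standard `ipaddress` module. This is just the arithmetic; whether extra loopback addresses are usable without additional configuration depends on the operating system, which is an assumption to check on your own machine.

```python
import ipaddress

# 127.0.0.0/8 leaves 32 - 8 = 24 host bits, i.e. 2**24 addresses.
loopback = ipaddress.ip_network("127.0.0.0/8")
address_count = loopback.num_addresses

# Addresses such as 127.0.0.2 fall inside the same block as 127.0.0.1.
second_address_in_block = ipaddress.ip_address("127.0.0.2") in loopback

print(address_count, second_address_in_block)
```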

    • @jayjayma
      @jayjayma 3 years ago +1

      @@ruhnet wow thanks!

  • @mitsmps6645
    @mitsmps6645 2 years ago

    In India:
    "Do you have an Aadhaar card (ID)?"
    "No."
    "Oh, it's fine.
    But you do have WhatsApp, right?"

  • @umesh.uk11
    @umesh.uk11 1 year ago

    The topic is good, but please explain with some drawings

  • @VinothKumar-zl2ht
    @VinothKumar-zl2ht 2 years ago

    super

  • @AnubhavShrivastava
    @AnubhavShrivastava 2 years ago

    Can the number of ports only be 65535?

  • @Ali_Alhajji
    @Ali_Alhajji 4 years ago

    Most Chinese people don't know WhatsApp. Actually, it was blocked recently.

  • @mrhidetf2
    @mrhidetf2 4 years ago +1

    Can't you check out the protocol with apk-tool? Then you at least know exactly what the client is sending

    • @debkr
      @debkr 2 years ago

      +1

  • @tarekali7064
    @tarekali7064 4 years ago +1

    Hey Hussein! If you need help setting up OBS, just @ me and I can assist you in attaining a nice setup.

    • @hnasr
      @hnasr 4 years ago

      Will do, thanks Tarek!

    • @tarekali7064
      @tarekali7064 4 years ago

      @@hnasr No problem!

  • @pammybcc
    @pammybcc 3 years ago +1

    i am able to understand at 1.75
    Welcome

  • @ankitlakum1
    @ankitlakum1 3 years ago

    Sockets between clients telling watsapp each other (indempotentialy)

  • @UjjwalKumar-wg4wu
    @UjjwalKumar-wg4wu 3 years ago

    Signal protocol?

  • @fxstreamer238
    @fxstreamer238 3 years ago

    Are 1M connections that hard though?

    • @putrafajarh
      @putrafajarh 3 years ago

      For a single server in 2012

  • @egor.okhterov
    @egor.okhterov 3 years ago +1

    If anyone has used ejabberd in production... well, you know it's a pain in the ass =)

    • @debkr
      @debkr 2 years ago +1

      Cannot agree more. The XMPP protocol itself is just too heavy for now.

  • @debkr
    @debkr 2 years ago

    Do you even understand technology? Of course they are using a single port. When you don't understand technology, why mislead young engineers? At the beginning you said they probably use some native protocol, then you are talking about Layer 7 networking. Do you know your videos are just filled with jargon? Please talk constructively. I am sure you don't understand technology.

  • @Derbauer
    @Derbauer 4 years ago

    For real?

  • @ca7986
    @ca7986 4 years ago

    ❤️

  • @ZelenoJabko
    @ZelenoJabko 4 years ago +2

    NGINX would probably be able to achieve 1 million connections as well. This is nothing special.

    • @brangi
      @brangi 4 years ago +5

      You clearly don't understand OTP

  • @noobian3314
    @noobian3314 3 years ago +7

    This guy said a whole lotta nothing...

    • @九世心
      @九世心 3 years ago +1

      Yes, you are right, but only the discussion itself is already interesting enough. It reminds me how powerful BEAM-based languages are.

    • @debkr
      @debkr 2 years ago +1

      Cannot agree more. Lol

    • @michael30000
      @michael30000 1 year ago

      That’s because you are a moron.

  • @maratmkhitaryan9723
    @maratmkhitaryan9723 3 years ago

    Yeah, microservices are a fucked-up thing. But it's more fucked up when you create microservices that communicate via RPC and end up with a microlith instead of well-decoupled microservices :)

  • @pajeetsingh
    @pajeetsingh 3 years ago

    Uses OTP.
    Proud. Lol.

  • @nginxplusenespanol8937
    @nginxplusenespanol8937 3 years ago +1

    NGINX

  • @nginxplusenespanol8937
    @nginxplusenespanol8937 3 years ago

    GSLB

  • @debkr
    @debkr 2 years ago

    Seriously, too much bullshit. Don't do this, please. Please, I am requesting you. Probably every tech guy in the world knows it is based on the Jabber XMPP protocol. Later, yes, they did a lot of tweaking. Either talk about their core architecture or talk about scalable network architecture. You have no idea what you are talking about.

  • @youtubegarbage4u
    @youtubegarbage4u 3 years ago

    The blog was written back in 2012... why are you discussing it now? And why not discuss what the stats are in 2020/2021?

  • @lucioleepileptique9195
    @lucioleepileptique9195 1 year ago

    This is not a low-level protocol, but many companies have implemented it:
    en.m.wikipedia.org/wiki/Signal_Protocol