14: Distributed Logging & Metrics Framework | Systems Design Interview Questions With Ex-Google SWE

  • Published on Jan 12, 2025

Comments • 73

  • @martinwindsor4424
    @martinwindsor4424 10 months ago +44

    Jordan might not be pregnant, but he never fails to deliver.

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 months ago +19

      I might be pregnant

    • @jhonsen9842
      @jhonsen9842 5 months ago +2

      Jordan is pregnant, and he delivers in all sys design.

  • @siddharth-gandhi
    @siddharth-gandhi 10 months ago +5

    Bro's single-handedly making me question studying ML over systems. Bravo on these videos!

  • @knightbird00
    @knightbird00 2 months ago +2

    Talking points
    Can open with push (low latency, needs app changes (better data), version skew) vs pull (highly scalable, no app changes (generic data), needs service discovery)
    2:07 High volume, time series, structured vs unstructured, text logs, data sink
    4:26 Data aggregation (tumbling vs hopping windows)
    6:30 Time series DB (hypertable, chunks, partition by (label, timestamp))
    10:40 Text logs (Elasticsearch)
    12:40 Structured data (better encoding, Protobuf, Avro, schema registry)
    16:05 Unstructured data (post-processing to structured data, Flink -> file -> ETL using Spark)
    17:40 Analytics data sink (columnar but not a TSDB, more like OLAP), use Parquet files for loading (S3 vs HDFS)
    23:45 Stream enrichment
    25:15 Diagram
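
The tumbling vs hopping distinction from the 4:26 talking point can be sketched as follows. This is a minimal illustration, not the video's implementation: the function names, integer-second timestamps, and windows aligned to time zero are all assumptions.

```python
from typing import List, Tuple


def tumbling_window(ts: int, size: int) -> Tuple[int, int]:
    """Return the single [start, end) window of width `size` that contains `ts`."""
    start = (ts // size) * size
    return (start, start + size)


def hopping_windows(ts: int, size: int, hop: int) -> List[Tuple[int, int]]:
    """Return every [start, end) window of width `size`, opened every `hop`
    seconds, that contains `ts`. When hop < size the windows overlap, so one
    event is counted in several windows."""
    windows = []
    # Latest hop-aligned window start at or before ts.
    start = (ts // hop) * hop
    # Walk backwards while the window still covers ts (start + size > ts).
    while start > ts - size:
        if start >= 0:
            windows.append((start, start + size))
        start -= hop
    return sorted(windows)
```

With `hop == size` the hopping case collapses to the tumbling case: each event lands in exactly one window.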

  • @chadcn
    @chadcn 10 months ago +10

    Congrats on 200 videos mate! Keep up the great work 🚀🚀

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 months ago +3

      Thanks man!! I guess I actually enjoy doing this 😄

  • @doobie91
    @doobie91 10 months ago +3

    Thanks a lot for your videos. Currently looking for a new job, brushing up/learning a lot about system design, watched lots of your videos recently. Appreciate your work. Keep it up!

  • @JapanoiseBreakfast
    @JapanoiseBreakfast 2 months ago +2

    How to build distributed logging and metrics in 3 easy steps:
    1) Join Google.
    2) Run blaze build //analog:all //monarch:all.
    3) Profit.
    Congratulations, you have now built distributed logging and metrics. Thank you for coming to my TED talk.

  • @VijayInani
    @VijayInani 5 months ago +2

    Why are you so underrated!!! You should have been famous by now (more than your current fame index!).

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago +3

      I'm famous in the right circles (OF feet creators)

  • @beecal4279
    @beecal4279 3 months ago +2

    Thanks for the video.
    At 22:00, when we say Parquet files are partitioned by time, do we mean partitioned by the file creation time?

    • @jordanhasnolife5163
      @jordanhasnolife5163  3 months ago +2

      I mean the time of the incoming data/message; however, that's probably similar to the file creation time
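
A small sketch of partitioning by the time of the incoming message rather than the file creation time. The `logs` prefix, the Hive-style `dt=`/`hour=` layout, and the hourly granularity are assumptions for illustration; the thread does not specify a layout.

```python
from datetime import datetime, timezone


def partition_path(event_ts: float, prefix: str = "logs") -> str:
    """Map an event's own timestamp (Unix seconds) to an hourly partition
    directory. Late-arriving events still land in the partition for the hour
    they occurred in, not the hour the file was written."""
    t = datetime.fromtimestamp(event_ts, tz=timezone.utc)
    return f"{prefix}/dt={t:%Y-%m-%d}/hour={t:%H}/"
```

A writer would place each Parquet file under the returned directory, so a query restricted to a time range only has to list the matching `dt=`/`hour=` directories.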

  • @SunilKumar-qh2ln
    @SunilKumar-qh2ln 10 months ago +2

    Very informative video as always.
    Was just thinking about how the metrics are pulled by Prometheus (which will eventually store them in the DB).
    How is responsibility for the different clients assigned to the aggregator pods so that metrics are pulled exactly once from each client pod?

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 months ago

      I'm not too familiar with Prometheus personally, feel free to expand on what you're mentioning here!

    • @georgekousouris4900
      @georgekousouris4900 3 months ago +1

      In this video you are using the push method, by having hosts connect to Kafka directly. This could be deemed too perturbing to the millions of hosts, so instead they can expose a /metrics endpoint that a consumer can use to fetch their current data.
      To answer the question above, we need to do some sort of consistent hashing to assign the millions of hosts to consumer instances and then put the data in Kafka (can create multiple messages, one for each metric).
      In the push method, we are putting the data directly to Kafka from each EC2 host where it is buffered before being consumed by our Spark Streaming instance that updates our DBs.

  • @prasenjitsutradhar3368
    @prasenjitsutradhar3368 10 months ago +3

    Great content! Please make a video on code deployment!

  • @DavidWoodMusic
    @DavidWoodMusic 10 months ago +5

    My interview is in 9 hours
    I hear your voice in my sleep
    I have filled a notebook with diagrams and concepts
    And I am taking a poopy at this very second
    We just prevail

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 months ago +4

      Just imagine me doing ASMR as I tell you about my day in a life

    • @bezimienny5
      @bezimienny5 8 months ago +3

      Yo how did it go? Are you in your dream team? I sure hope so

    • @DavidWoodMusic
      @DavidWoodMusic 8 months ago +1

      @bezimienny5 thanks friend. Offer was made but I turned it down. Turned out to be a really poor fit.

    • @bezimienny5
      @bezimienny5 8 months ago +2

      @DavidWoodMusic oh damn. That's a shame. I'm kinda struggling with a similar decision right now. I passed all the interview stages, but even at the offer stage I'm still learning new key pieces of info about the position that no one told me about before...
      But hey, you beat the systems design interview! That's an amazing win and now you know you can do it 😉

  • @guitarMartial
    @guitarMartial 2 months ago +2

    Jordan, can TSDBs run in a multi-leader configuration, since there are essentially never any write conflicts? And is it a typical pattern for a company to run multiple Prometheus leaders which just replicate data amongst themselves to get an eventually consistent view? Or is single-leader replication still preferred? I'm thinking multi-leader helps with write ingestion. Thanks!

    • @jordanhasnolife5163
      @jordanhasnolife5163  2 months ago +1

      There are many time series databases, so you'd have to look it up. But at the end of the day I think what you're looking for is better solved by just sharding per data source.
      That should help with ingestion speeds. If you're not really writing to the same key on multiple nodes, I'm hesitant to call it multi leader replication.

  • @abhishekmarriala7013
    @abhishekmarriala7013 8 days ago +1

    Can I consider this for something like designing Google Analytics?

  • @deepitapai2269
    @deepitapai2269 6 months ago +2

    Great video as always! Why do you store the files on S3 as well as a data warehouse? Why not just store on the data warehouse directly from Parquet files? Is it that we need a Spark consumer to transform the S3 files before putting the data into the data warehouse?

    • @jordanhasnolife5163
      @jordanhasnolife5163  6 months ago +1

      Depends on the format of the S3 data. If it's unstructured, then we'd likely need some additional ETL job to format it and load it into a data warehouse.

  • @shibhamalik1274
    @shibhamalik1274 9 months ago +2

    Hey Jordan, what is the data source in the last diagram here? Is it the VM pushing logs / serialised Java objects etc. to Kafka? You mean that when the application logs a statement, that statement is pushed to Kafka?
    Then what should the partition key of this Kafka cluster be? Should it be server ID, or a combination of server ID + app name, or how should we structure this partition key?

    • @jordanhasnolife5163
      @jordanhasnolife5163  9 months ago +1

      Yes, the application is pushing to Kafka. I think that you should probably use the app/microservice name as the Kafka topic, and then within that topic partition by server ID
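
A rough sketch of that routing: one topic per app, with the server ID as the partition key. The `logs.` topic prefix and the `route` helper are hypothetical, and `zlib.crc32` is only a stand-in hash; Kafka's default partitioner hashes the key bytes with murmur2, so real partition numbers would differ.

```python
import zlib
from typing import Tuple


def route(app_name: str, server_id: str, num_partitions: int) -> Tuple[str, int]:
    """Pick a (topic, partition) for a log line.

    Topic is derived from the app/microservice name; the partition is a
    deterministic hash of the server ID, so all logs from one server stay
    in order within a single partition.
    """
    topic = f"logs.{app_name}"  # hypothetical naming convention
    partition = zlib.crc32(server_id.encode()) % num_partitions  # stand-in hash
    return topic, partition
```

In practice a producer would usually pass the server ID as the message key and let the client library choose the partition; the explicit modulo here just makes the mapping visible.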

  • @shibhamalik1274
    @shibhamalik1274 9 months ago +2

    Hey Jordan, do you have a video on pull- vs push-based models of consumption? I believe Kafka is pull-based, but I want to understand who uses push-based

    • @jordanhasnolife5163
      @jordanhasnolife5163  9 months ago +1

      Nothing regarding which message brokers do push based messages, feel free to Google it and report back

    • @shibhamalik1274
      @shibhamalik1274 9 months ago +2

      Ok
      @jordan. I think Kafka is push-based, not pull-based.
      Pull-based could be custom implemented, I think...

  • @Dozer456123
    @Dozer456123 3 months ago +2

    Is it true that S3 files would still have to get loaded over the network for something like AWS Athena? That seems to be a data warehousing strategy that relies on native S3 and not loading all of it across the network

    • @jordanhasnolife5163
      @jordanhasnolife5163  3 months ago +1

      Unfortunately I can't claim to know very much about Athena, I'll have to look into it for a subsequent video.

    • @Dozer456123
      @Dozer456123 3 months ago

      @jordanhasnolife5163 Sorry, didn't mean to use you like Google :P.
      I researched it after I asked, and it's quite cool. Basically a serverless query engine that's direct-lined into S3

  • @31737
    @31737 7 months ago +2

    Hey Jordan, great video. Does this require any sort of API design? Given that we need to read through the metrics data, does it make sense to also describe the API structure? Let me know your thoughts, thanks.

    • @jordanhasnolife5163
      @jordanhasnolife5163  7 months ago

      Sure. You need an endpoint to read your metrics by time range, and it probably returns paginated results. (Perhaps taking in a list of servers)
      Anything else you're looking for?
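
One possible shape for that read endpoint, sketched as a plain function over an in-memory list. The `Point` type, the `read_metrics` name, and the offset-style cursor are assumptions for illustration; a real service would push the time-range filter down to the time series database.

```python
from dataclasses import dataclass
from typing import Collection, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Point:
    server: str
    ts: int  # Unix seconds
    value: float


def read_metrics(
    points: Sequence[Point],
    start: int,
    end: int,
    servers: Optional[Collection[str]] = None,
    cursor: int = 0,
    page_size: int = 100,
) -> Tuple[List[Point], Optional[int]]:
    """Return one page of points with start <= ts < end, optionally limited
    to the given servers, plus the cursor for the next page (None when the
    last page has been reached)."""
    matching = [
        p
        for p in sorted(points, key=lambda p: (p.ts, p.server))
        if start <= p.ts < end and (servers is None or p.server in servers)
    ]
    page = matching[cursor : cursor + page_size]
    next_cursor = cursor + page_size if cursor + page_size < len(matching) else None
    return page, next_cursor
```

A client keeps calling with the returned cursor until it comes back as `None`.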

    • @31737
      @31737 7 months ago +2

      @jordanhasnolife5163 right, also for the Elasticsearch results you're going to need an API, unless you want to combine it with metrics, which I don't think is a good idea

    • @31737
      @31737 7 months ago +2

      Also a request for a video on tracking autonomous cars + collecting other metrics, sensors, etc. Thanks man, your work is gold and I love the depth of these videos

  • @nithinsastrytellapuri291
    @nithinsastrytellapuri291 9 months ago +2

    Hi Jordan, I am trying to cover infrastructure-based system design questions like this one first. Can you please clarify if I need to watch videos 11, 12, 13 to understand this? Any prerequisites? (I have covered concepts 2.0.) Is it the same for videos 17, 18, 19 as well?

  • @shibhamalik1274
    @shibhamalik1274 9 months ago +2

    Hey Jordan, nice video. Do you have any video on which databases support CDC and how?

    • @jordanhasnolife5163
      @jordanhasnolife5163  9 months ago +1

      I think you can figure out a way to make it work on basically any of the major ones, don't have a video on it though

  • @mukundkrishnatrey3308
    @mukundkrishnatrey3308 6 months ago +2

    Hi Jordan,
    Regarding the post-processing of unstructured data, can we do the batch processing in Flink itself, as it does support that, or is it not suitable for large-scale data?
    What size of data can be dealt with by Flink itself, after which we might need to use HDFS/Spark?
    PS: Thanks for the amazing content, you're the best resource I've found to date for system design content :)

    • @jordanhasnolife5163
      @jordanhasnolife5163  6 months ago +2

      Flink isn't bounded in the amount of data it can handle, you can always add more nodes. The difference is that Flink is for stream processing. Feel free to watch the Flink concepts video, it may give you a better sense of what I mean here.

    • @mukundkrishnatrey3308
      @mukundkrishnatrey3308 6 months ago +2

      @jordanhasnolife5163 Okay, got it now, thanks a lot again!

  • @frostcs
    @frostcs 5 months ago +2

    Hypertable is more of a TimescaleDB concept, which is B+ tree based; not sure why you mention an LSM tree there at 9:00

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago +1

      Fair point, I guess if it's built on Postgres it would be a B-tree

  • @OmprakashYadav-nq8uj
    @OmprakashYadav-nq8uj 2 months ago +2

    Great system design content. One thing I noticed: the voice is not in sync with the video.

  • @helperclass
    @helperclass 9 months ago +2

    Great video man. Thanks.

  • @timavilov8712
    @timavilov8712 10 months ago +4

    U forgot to mention the tradeoff between polling and pushing for event producers

    • @timavilov8712
      @timavilov8712 10 months ago +3

      Great video tho !

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 months ago +1

      I'm assuming you mean event consumers, not producers. Yeah, this is one of those things where it's kinda built into the stream processing consumer that you use, so under the hood I assume we'll be using long polling.
      I don't know that I see the case made here for WebSockets, since we don't need bidirectional communication. Server-sent events may also not be great because we'll try to re-establish connections automatically, which may not be what we want if we rebalance our Kafka partitions.

  • @JapanoiseBreakfast
    @JapanoiseBreakfast 2 months ago +2

    Missed opportunity for a coughka pun.

  • @jordiesteve8693
    @jordiesteve8693 6 months ago +2

    thanks for your work!

  • @AdeshAtole
    @AdeshAtole 27 days ago +1

    Yes, of course I know all the terms used in this design.

  • @bimalgupta3648
    @bimalgupta3648 8 months ago +2

    Watching this while taking a dump

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago +1

      Responding to this while taking a dump

    • @bimalgupta3648
      @bimalgupta3648 8 months ago +1

      @jordanhasnolife5163 No wonder you have no life

  • @hoyinli7462
    @hoyinli7462 10 months ago +2

    great job!

  • @chaitanyatanwar8151
    @chaitanyatanwar8151 months ago +2

    Thank you!

  • @aryanpandey7835
    @aryanpandey7835 10 months ago +2

    Sir, please share the slides with us

  • @sohansingh2022
    @sohansingh2022 10 months ago +3

    Thanks

  • @zuowang5185
    @zuowang5185 8 months ago +1

    Do these new videos replace the old ones? th-cam.com/video/_KoiMoZZ3C8/w-d-xo.html

  • @user-se9zv8hq9r
    @user-se9zv8hq9r 10 months ago +4

    can we design onlyfans or fansly