debugging and fixing pyuwsgi in python 3.12 (advanced) anthony explains

Comments • 56

  • @jmson23
    @jmson23 5 months ago +29

    Impressive Anthony. That's quite the rabbit hole of debugging! Really enjoy your videos please keep posting.

  • @Reecepbcups
    @Reecepbcups 5 months ago +11

    Amazing!! Such an amazing debug session, thanks for the walkthrough here. You know it's good when you have to break out gdb across py versions, phew

  • @kRySt4LGaMeR
    @kRySt4LGaMeR 5 months ago +2

    When you said you had a follow-up, I didn't expect an hour-long in-depth look into the issue. Thanks so much!

  • @NestiGX
    @NestiGX 5 months ago +3

    Wow, that's impressive. You summed it up very well, quite enjoyable to watch :D

  • @shadowviper95
    @shadowviper95 5 months ago +2

    very cool to see a "real world" case like this in such detail!

  • @ilovepeaceandplaying8917
    @ilovepeaceandplaying8917 4 months ago +1

    An interesting bug for you to debug, and it was really interesting to watch you debug it step by step like a crime investigation. Loved it. Every time I watch your videos, I learn new things. Thanks for putting this out as a video.

  • @mrswats
    @mrswats 5 months ago +2

    This was super interesting! I watched the stream where you were setting all this up, happy to see the conclusion of it!

  • @MattLayman
    @MattLayman 4 months ago +1

    That was epic @anthonywritescode! I had to grab some popcorn. :) I love that you not only fixed the problem, but then kept going until you truly understood why things were ok in 3.11 but not in 3.12.

  • @djchainz
    @djchainz 5 months ago +1

    Wow, this was epic. Thank you for recording all this, I hope it was cathartic.

  • @RealMineManUK
    @RealMineManUK 5 months ago

    Great video, showing the whole process from the beginning till the end

  • @malteplath
    @malteplath 5 months ago

    Thanks for getting to the bottom of this. I watched your first video on chasing this bug, and I could not wait to see who the culprit was.

  • @xfilinxcl2481
    @xfilinxcl2481 5 months ago +10

    And then a manager will see this tiny code change and think you're not working, because the code for finding the bug wasn't counted anywhere :)

  • @redark7
    @redark7 5 months ago +1

    This was quite deep! Changing the thread state and doing a fork sounds very dangerous… without a very clear idea of what and why.

  • @ericng8807
    @ericng8807 5 months ago +30

    me wishing i was advanced enough to watch the video

    • @mrswats
      @mrswats 5 months ago +1

      Man, same

    • @Ash-qp2yw
      @Ash-qp2yw 5 months ago +2

      Me wishing this was the top comment. :p

  • @applePrincess
    @applePrincess 5 months ago +2

    Amazing video as always.
    Yes, fflush is required, since the buffering mode of stdout is platform dependent per the C spec and POSIX.
    Most *modern* platforms do line buffering, though.
    Or you could call setvbuf before calling printf to make stdout line buffered.
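
The buffering point above has a close analogue in Python's io module, sketched below. This is not the C code from the video: the in-memory sinks and the wrapper names are stand-ins chosen for illustration. A fully buffered text stream holds a line back until it is flushed explicitly (the role fflush plays in C), while a line-buffered stream pushes it out at each newline (roughly what setvbuf with _IOLBF asks for).

```python
import io

# Fully buffered text stream over an in-memory sink
# (the sink is a stand-in for a pipe or file; an assumption for this demo).
raw_full = io.BytesIO()
full = io.TextIOWrapper(
    io.BufferedWriter(raw_full),
    encoding="utf-8", newline="\n", line_buffering=False,
)
full.write("about to fork\n")
seen_before_flush = raw_full.getvalue()  # still sitting in the buffer
full.flush()                             # explicit flush, like fflush(stdout)
seen_after_flush = raw_full.getvalue()

# Line-buffered text stream: each newline triggers a flush on its own.
raw_line = io.BytesIO()
line = io.TextIOWrapper(
    io.BufferedWriter(raw_line),
    encoding="utf-8", newline="\n", line_buffering=True,
)
line.write("about to fork\n")
seen_line_buffered = raw_line.getvalue()
```

For a real script, `print(..., flush=True)` or `sys.stdout.reconfigure(line_buffering=True)` gives the same effect on stdout.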

  • @chadobrien3352
    @chadobrien3352 4 months ago +1

    I definitely expected a pen flip in this session. Video brought me back to Mr Swoyer's AP Calc classes.

    • @anthonywritescode
      @anthonywritescode 4 months ago

      yoooooo how are you doing!! I was actually doing some pen flips on my last stream :D hope you're doing well!

    • @chadobrien3352
      @chadobrien3352 4 months ago

      @@anthonywritescode I'm doing pretty well! Thanks. Our group just picked up (inherited) a python project so been using your channel to catch up on the python goodness. I actually found your channel through @ThePrimeagen who did a reaction video.

  • @teejaded
    @teejaded 4 months ago

    Great fun! Debugging interpreted languages can be very challenging! I spent quite a lot of time improving the compatibility of the gopherlua socket package. A lot of that reminds me of this: staring at the luasocket C code and comparing the behavior to the Go port.

  • @weekendforever
    @weekendforever 5 months ago +6

    Thanks for sharing. It's a shame one can only subscribe once to a channel and not do it twice for more content. ;)

    • @anthonywritescode
      @anthonywritescode 5 months ago +2

      there's always twitch and @anthonywritescode-vods ;)

  • @GiovanniBarillari-z9f
    @GiovanniBarillari-z9f 3 months ago +1

    On one side, this was super-interesting to watch, and a good "excuse" to learn about several CPython internals. On the other side, I don't get why such a vast portion of the Python community still relies on uwsgi given the amount of weird stuff it does with interpreter state and the GIL - not to mention the usual segfaults - in place of some modern alternatives :/

    • @anthonywritescode
      @anthonywritescode 3 months ago

      there isn't a non-async performant replacement.

    • @anthonywritescode
      @anthonywritescode 3 months ago

      and arguably any equivalent thing (say apache mod-wsgi) has to do exactly the same dance

  • @er63438
    @er63438 4 months ago

    Hats off! Spectacular.

  • @wagneralberto5456
    @wagneralberto5456 5 months ago +1

    Amazing video, thank you, I learned so much.

  • @InkFPS
    @InkFPS 5 months ago

    Very impressive engineering. 👏

  • @BohdanBorkivskyi
    @BohdanBorkivskyi 4 months ago

    I've got a question about the final fix. So before the fix there was pyuwsgi_setup that was calling uwsgi_setup and two other functions pyuwsgi_init and pyuwsgi_run are calling pyuwsgi_setup meaning they were calling uwsgi_setup. Now that uwsgi_setup is moved out, behavior of pyuwsgi_init and pyuwsgi_run did not change - they still call both pyuwsgi_setup and uwsgi_setup. But calling pyuwsgi_setup does not result in calling uwsgi_setup anymore - is it ok?

    • @anthonywritescode
      @anthonywritescode 4 months ago +2

      technically in the old code if you ran setup without anything else you'd create zombie children -- calling setup without run didn't make any sense

  • @TheJobCompany
    @TheJobCompany 5 months ago +1

    Any particular reason why you named your functions with leading underscores in that bisect script? Surely lexical visibility shouldn't matter when you're quickly hacking together a temporary script... right?

    • @anthonywritescode
      @anthonywritescode 5 months ago +2

      no benefit to cutting corners. I try and do things professionally even if it's just a hacky script (like there's no reason for type annotations there either, or functions even!)

    • @TheJobCompany
      @TheJobCompany 5 months ago +1

      @@anthonywritescode love that, I'm guilty of doing the same thing. This is just the first time I've ever thought of every single function in a single-file script as part of its private api haha

    • @anthonywritescode
      @anthonywritescode 3 months ago +1

      today it's a script but you never know how it may evolve in the future. no reason to cut corners only some of the time

  • @ruroruro
    @ruroruro 5 months ago +1

    I still don't quite understand why the new version of PyThreadState_Swap causes this issue. If it ends up being called with oldts == newts, shouldn't it just unlock and then relock the same GIL? Or am I missing something?

    • @anthonywritescode
      @anthonywritescode 5 months ago +3

      the deadlock is in PyEval_RestoreThread (when it restores an already locked thread state), not in swap. the self-swap is weird but not the problem

    • @ruroruro
      @ruroruro 5 months ago +1

      @@anthonywritescode ah I see, I thought that you tried recompiling 3.12 with the NoGIL version of swap and that got rid of the deadlock, but instead segfaulted at a later point. I guess, I am still a bit confused about how PyThreadState_Swap is relevant to this bug. Or was it just a red herring all along?

    • @anthonywritescode
      @anthonywritescode 5 months ago +2

      (this is in the video but I'll reiterate) the swap happens in the parent process after the first fork so it affects the child only after reload. it stashes a locked thread state (where it didn't before)

    • @ruroruro
      @ruroruro 5 months ago +2

      @@anthonywritescode Ah, I think I finally get it. The docs for PyThreadState_Swap say that "The global interpreter lock must be held and is not released". Previously, uwsgi called it without holding the GIL, but that wasn't an issue, presumably because this only occurred during reloading (at which point the python threads aren't running, so there is no possibility of any race conditions). In 3.12 the Swap logic changed so that it released and reacquired the GIL. I am assuming that attempting to release the GIL when it is already released is a no-op, so the sum total effect of this was that now the Swap would acquire the GIL, if it was not already held (of course, calling it without holding the GIL is UB according to the docs).
      Did I get that right? If so, I wonder if PyThreadState_Swap should have a sanity check/assert that would verify that the caller does indeed already hold the GIL.
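
The shape of the deadlock discussed in this thread can be illustrated with a toy model. This is not CPython's or uwsgi's implementation: `toy_gil` and `toy_restore` are hypothetical stand-ins, and a plain non-reentrant `threading.Lock` plays the role of the GIL. The sketch only shows why restoring a thread state that was stashed while the lock was still held can never complete.

```python
import threading

# Toy stand-in for the GIL: a plain, non-reentrant lock (an assumption).
toy_gil = threading.Lock()


def toy_restore(timeout: float = 0.2) -> bool:
    """Toy 'restore a thread state': it has to take the lock first.

    The timeout exists only so this demo returns; real code would block forever.
    """
    return toy_gil.acquire(timeout=timeout)


# Expected flow: the state was stashed with the lock released,
# so the restore can take the lock.
restored_when_released = toy_restore()
toy_gil.release()

# Flow described in the thread: the state was stashed while the lock
# was still held, so the restore waits on a lock nobody will release.
toy_gil.acquire()
restored_when_held = toy_restore()  # times out instead of hanging
toy_gil.release()
```

In the toy model the second restore merely times out; with a blocking acquire it would hang, which matches the symptom described for the reloaded child.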

  • @Quarky_
    @Quarky_ 5 months ago +1

    Amazing debugging adventure! I'm wondering how long it took you from identifying the bug to fixing and understanding it? Did you work on other things in between, or were you on this uninterrupted?
    EDIT: I guess what I'm interested in is, if you had a lot of uninterrupted time for this adventure, or if you had to find time here and there. Me personally, I find any deep debugging difficult if I get interrupted.

    • @anthonywritescode
      @anthonywritescode 5 months ago +1

      I answered this in a comment below but most of it was in a 5 hour stream

    • @Quarky_
      @Quarky_ 5 months ago +1

      @@anthonywritescode thanks, found that comment! Thanks for sharing this kind of work in your videos :-)

  • @8b227f
    @8b227f 4 months ago +1

    To me the bug is actually in python. No need to swap thread states when oldts == newts. Early return instead.

    • @anthonywritescode
      @anthonywritescode 4 months ago +1

      there should never be a normal case where you swap to yourself

  • @con-f-use
    @con-f-use 5 months ago +2

    "micro whiskey" 😁

  • @TigerWalts
    @TigerWalts 5 months ago +1

    Just started watching and noticed the video length.
    If it takes that long to explain then it must have been a doozy.

  • @nexovec
    @nexovec 5 months ago

    How much time did you spend on this?

    • @anthonywritescode
      @anthonywritescode 5 months ago +7

      the video? about 15 minutes of prep before a one-take
      the whole fix? checking the vods it looks like 2.5 hours for the bisect and 5 hours for the actual fix. though that doesn't include the sprinkling of small bits of time over weeks trying to narrow down the actual cause

    • @nexovec
      @nexovec 5 months ago +4

      @@anthonywritescode It would probably take me 7 hours to set up the bisect 😆

  • @king40342
    @king40342 5 months ago

    Titillating write-up!

  • @FocusAccount-iv5xe
    @FocusAccount-iv5xe 5 months ago

    +

  • @Cyberbeni
    @Cyberbeni 4 months ago

    Could've just jumped from 10:17 to 57:30 by starting with reading the documentation.

    • @anthonywritescode
      @anthonywritescode 4 months ago +1

      eh not quite -- I don't think that line in the docs would have meant anything to me until I learned how all the other pieces fit together