Watch out for this (async) generator cleanup pitfall in Python

  • Published on Sep 20, 2024

Comments • 37

  • @amarasa2567
    @amarasa2567 1 day ago +36

    So basically, don't open and close resources in your generators, but give handles to those resources to your generators?
    Like don't let your generator open a file itself, but give it a file handle?
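
A minimal sketch of that idea, assuming a hypothetical read_lines generator and using io.StringIO as a stand-in for a real file so it runs anywhere. The generator only borrows the handle; the caller's with block decides when it is closed, even if iteration stops early.

```python
import io
from typing import Iterator, TextIO


def read_lines(handle: TextIO) -> Iterator[str]:
    """Yield stripped lines from an already-open handle; owns no resource."""
    for line in handle:
        yield line.rstrip("\n")


def first_line(handle: TextIO) -> str:
    # Stops iterating early; the abandoned generator holds nothing to clean up.
    for line in read_lines(handle):
        return line
    return ""


# The caller opens and closes the resource (StringIO is an illustrative stand-in).
with io.StringIO("alpha\nbeta\n") as handle:
    result = first_line(handle)
    open_inside_block = not handle.closed

closed_after_block = handle.closed
```

Because the handle's lifetime is tied to the with block in the caller, it no longer matters when the generator object itself gets finalized.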

    • @PeterZaitcev
      @PeterZaitcev 21 hours ago +2

      Exactly.

  • @marckiezeender
    @marckiezeender 1 day ago +54

    How about instead of "don't use generators" we just say "don't use 'with' or 'finally' blocks inside of generators"
    Edit: what I mean is that you should avoid context and resource management inside of generators. If the generator needs access to, say, a file, then the generator's caller should be responsible for opening and closing that file.
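
For readers who want to see why cleanup inside a generator is delicate, here is a small sketch (the numbers generator and the events list are made up for illustration). Breaking out of the loop only suspends the generator; its finally block runs when the generator is closed or finalized, not when the loop ends.

```python
events = []


def numbers():
    try:
        yield 1
        yield 2
    finally:
        # Stands in for releasing a resource (closing a file, a socket, ...).
        events.append("cleanup")


gen = numbers()
for n in gen:
    break  # abandon the generator after the first item

# The generator is still referenced and merely suspended at its first yield.
ran_after_break = "cleanup" in events

gen.close()  # raises GeneratorExit inside the generator, so finally runs now
ran_after_close = "cleanup" in events
```

Here ran_after_break is False and ran_after_close is True, since gen stays alive until close() is called explicitly.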

    • @AbstractObserver
      @AbstractObserver 1 day ago

      He explains in the video why that's not a good way to do it.

    • @themartdog
      @themartdog 1 day ago +9

      @AbstractObserver At what timestamp? I didn't hear him say that. Even if he did, I think it's a good idea to avoid with/finally in these contexts.

  • @aaliboyev
    @aaliboyev 1 day ago +4

    In conclusion: "Put every with statement that has a for loop inside it into a separate function that holds only that statement, to guarantee GC."

  • @PeterZaitcev
    @PeterZaitcev 21 hours ago +2

    What *could* be done with the mentioned PEP is to add support for these iterclose methods without activating them in any default classes, and instead provide a wrapper in contextlib that you can attach to your generator functions (or classes) to add the missing method.
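
This is not the wrapper proposed above, but contextlib already ships a related opt-in tool: contextlib.closing calls close() on whatever it wraps when the with block exits. A minimal sketch, with a made-up rows generator:

```python
from contextlib import closing

log = []


def rows():
    try:
        for i in range(3):
            yield i
    finally:
        log.append("closed")  # stands in for releasing a resource


# closing() calls it.close() on exit, even though the loop stops early.
with closing(rows()) as it:
    for row in it:
        break

closed_at_exit = log == ["closed"]
```

For async generators, contextlib.aclosing (Python 3.10 and later) plays the same role with async with.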

  • @aaronm6675
    @aaronm6675 1 day ago +22

    Wild

    • @endoflevelboss
      @endoflevelboss 1 day ago +7

      James Murphy for new BDFL 🎉

  • @xelaxander
    @xelaxander 1 day ago +5

    The resource existing at least as long as the generator seems to me like the natural solution. Imagine if it didn't, i.e. the generator was accessible while the resource wasn't. Suddenly it's even more cursed, since you can try to get an element from the generator, but it indeterminately fails to yield.

    • @xelaxander
      @xelaxander 1 day ago +1

      Resources not being garbage-collected immediately is a choice necessary to allow alternative Python implementations with different optimization goals.
      That said, at least in CPython, you can also just del g, and then gc.collect(). Although I’m not sure if the Python spec guarantees cleanup.
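
A sketch of the del g / gc.collect() approach, with a made-up resource_user generator. The immediate finalization shown here relies on CPython's reference counting; the language reference allows an implementation to postpone garbage collection or omit it altogether, so treat this as implementation behavior rather than a guarantee.

```python
import gc

log = []


def resource_user():
    try:
        yield "item"
    finally:
        log.append("finalized")  # stands in for releasing a resource


g = resource_user()
next(g)  # advance into the try block so there is something to clean up

del g         # CPython: last reference dropped, the generator is finalized here
gc.collect()  # may prompt collection elsewhere, but nothing is guaranteed
```

On CPython the finally block has run by the time del g returns; on other implementations it may run later, which is why an explicit g.close() is the portable option.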

    • @DrGreenGiant
      @DrGreenGiant 1 day ago +1

      @xelaxander Yeah, my question would be: is it in the spec, or is it an implementation detail? If the latter, then I'd happily be that guy who says it sounds like a non-CPython implementer's problem.

  • @motbus3
    @motbus3 21 hours ago

    You can also use a context manager which encloses both generator and resources
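
One way this could look, as a sketch only: open_lines is a hypothetical helper, io.StringIO stands in for a real file, and the opened list exists purely so the example can inspect the handle afterwards.

```python
import io
from contextlib import contextmanager

opened = []  # kept only so the sketch can inspect the handle afterwards


@contextmanager
def open_lines(text):
    """Hypothetical helper that owns both the resource and the generator."""
    handle = io.StringIO(text)  # stand-in for open(path)
    opened.append(handle)
    lines = (line.rstrip("\n") for line in handle)
    try:
        yield lines
    finally:
        lines.close()   # finish the generator first
        handle.close()  # then release the resource it was reading


with open_lines("a\nb\nc\n") as lines:
    first = next(lines)  # stop early; cleanup still happens at block exit
```

After the block, both the generator and the handle are closed, regardless of how far iteration got.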

    • @motbus3
      @motbus3 21 hours ago

      But that's annoying

  • @janhwillems10000
    @janhwillems10000 21 hours ago

    Thanks a lot James!

  • @RogerValor
    @RogerValor 14 hours ago

    Not sure how often I care about when resources are cleaned up, as long as they are cleaned up. But I guess this is good to know for the case when I encounter it. So thx!

  • @danwellington3571
    @danwellington3571 1 day ago +13

    Obligatory Rust comment that completely ignores the 30-odd years of design choices Python has to do its best to not break

    • @DrGreenGiant
      @DrGreenGiant 1 day ago +1

      Works fine in assembly too

  • @idoben-yair429
    @idoben-yair429 21 hours ago +6

    The ACTUAL solution is to write your code such that it doesn't rely on exactly when resources get cleaned up, and if the exact time of the cleanup is important, then express it EXPLICITLY in the code by calling a cleanup method. RAII-style context managers are not the right tool when the order and timing of cleanup is important. Generators are not the issue here. The engineering approach is the issue.
    To elaborate: RAII means Resource Acquisition Is Initialization. The converse, i.e., Reference Loss Is Resource Disposal, does NOT hold. This is an important implication of the reference counting style of resource management, unlike C++-style RAII where you can engineer your classes to enforce both RAII and RLIRD.

    • @ilmuoui
      @ilmuoui 19 hours ago

      Sounds like Rust with extra added steps

    • @idoben-yair429
      @idoben-yair429 19 hours ago +2

      @ilmuoui Every language has different tools. Stop worshipping languages; they're just that: tools.

    • @themartdog
      @themartdog 16 hours ago

      @ilmuoui Rust itself is programming with extra steps.

    • @RogerValor
      @RogerValor 13 hours ago

      @ilmuoui I love Rust, but after years of Python and other GC languages (Java, C#, ...), I never expect the GC to release things on time anyway; that expectation is something I always notice as a topic with C and C++ programmers. So the codebase avoids relying on it, and the resulting code is the simple, readable version of the problem. Only when you really need timely release, like writing to a file you just read, or deleting it, does the code have to use the explicit methods shown here. The with statement saves me from having to close the file explicitly, and that is what I want from it.
      To be honest, most Rust apps live inside frameworks anyway, with the resources in some injection system, or (or rather, and) simply have some central static structs with Arc-Mutexes as a backbone. With async, it is not unusual to move into controlling that yourself once you reach the level of complexity where it causes problems with an async main, but you clearly see which variable is temporary and which one is part of a system like that.
      And from experience, in comparison, you reach the point where you have to build complex type maintenance faster, simply because you have to be strict about the type system and express so much with it. However, you cannot simply forget those things in Python either, even if you are implicit about them, so you usually avoid singletons and act like memory is an ocean and abstraction is free.
      Python is expected to be quick to fix, quick to deploy, and robust against errors at runtime, while Rust is expected to be efficient, robust against errors at compile time, and usable as compiled systems code. However, both are usually memory safe, and in that regard Python and Rust feel weirdly close (of course, in Python you get an exception, but you still will not write into the wrong memory).
      So I totally get what you see as extra steps (which Zig and Go solve really well with defer), but I think each language, and the mindset behind it, has its own extra steps.

    • @darthwalsh1
      @darthwalsh1 11 hours ago

      Instead of using a context manager, you would use an explicit try: finally: that calls a cleanup? Wouldn't that have the same semantics as with:?
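
A quick sketch comparing the two forms in ordinary (non-generator) code, using a made-up Resource class. Both record the same sequence of events, which suggests the timing question is about where the block lives (inside a generator or in its caller), not about with versus try/finally as such.

```python
class Resource:
    """Illustrative resource that records its lifecycle in a shared log."""

    def __init__(self, log):
        self.log = log
        log.append("open")

    def close(self):
        self.log.append("close")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


def with_style():
    log = []
    with Resource(log):
        log.append("work")
    return log


def finally_style():
    log = []
    res = Resource(log)
    try:
        log.append("work")
    finally:
        res.close()  # explicit cleanup call
    return log
```

Both functions return ["open", "work", "close"].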

  • @tc14hd23
    @tc14hd23 1 day ago +17

    No problem was ever solved by adding 'async' in front of it

  • @fyaa23
    @fyaa23 20 hours ago

    Why do you expect garbage collection to behave like RAII in other languages? The point is that you don't have to take care to clean it up, but you can hardly control when it happens.

  • @tophat593
    @tophat593 1 day ago +11

    Sorry, beyond tired and ever so slightly drunk. Want to listen but need to sleep and tomorrow I won't remember this video exists.

    • @exploited410
      @exploited410 1 day ago +17

      Answering this so you have a notification tomorrow and will see this video on a clear mind

    • @MarianoBustos-i1f
      @MarianoBustos-i1f 15 hours ago

      wake up and watch the video.

  • @MrAlanCristhian
    @MrAlanCristhian 1 day ago +5

    That is the type of behavior that is detected with unit testing.

    • @austinnar4494
      @austinnar4494 1 day ago +5

      That's not true, the async case is literally non-deterministic. So you could have application code that expects the resource to be cleaned up, the unit tests pass, but then fails in production.

    • @MrAlanCristhian
      @MrAlanCristhian 1 day ago

      @austinnar4494 No, in that case the test will fail randomly, because the test suite is supposed to run multiple times before deployment. The test suite runs every time a change is made.

    • @themartdog
      @themartdog 1 day ago +4

      @MrAlanCristhian But it could randomly fail only 0.01% of the time...

    • @austinnar4494
      @austinnar4494 14 hours ago

      @MrAlanCristhian Non-deterministic in this sense does not necessarily mean "randomly." In an async context especially, the particular time that the cleanup code is run is highly sensitive to what other async code is being run and what else is scheduled on the event loop. In unit tests, the async generator could be the only thing being tested, so it always works as expected and passes. But in production, there could be a large number of async tasks on the event loop, so the cleanup code does not run immediately, causing a failure.
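
One way to take the event loop's scheduling out of the picture is to close the async generator explicitly instead of leaving it to finalization. A minimal sketch, with a made-up ticker generator:

```python
import asyncio

log = []


async def ticker():
    try:
        for i in range(3):
            yield i
    finally:
        log.append("cleanup")  # stands in for releasing a resource


async def main():
    agen = ticker()
    try:
        async for value in agen:
            break  # stop early; agen is suspended, not finished
    finally:
        # Runs the generator's finally block now, at a known point.
        await agen.aclose()
    return list(log)


snapshot = asyncio.run(main())
```

With the explicit aclose(), the cleanup has already happened by the time main() returns, regardless of what else is scheduled on the loop.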

  • @SpeedingFlare
    @SpeedingFlare 1 day ago

    Just use C and do if(error) goto cleanup; where cleanup cleans up anything that's non-null in whatever order you want.
    For async, I don't know