multiprocessing: fork() vs. spawn() (intermediate) anthony explains

  • Published on 28 Nov 2024

Comments • 31

  • @Jakub1989YTb
    @Jakub1989YTb 2 years ago +8

    At first, I wasn't really fond of the new video thumbnails. But I can say, they grew on me 🙂

  • @Itwasmeeeeeee
    @Itwasmeeeeeee 2 years ago +3

    Glad to see your paint game has improved!

  • @sadhlife
    @sadhlife 2 years ago +4

    I'm glad to see how simple the config parsing part of flake8 is. Pylint, on the other hand, has it split into so many parts that it's hard to even keep it all in your head.
    There are actually two different config parsing modules: one that uses argparse, and an old one that uses optparse. (Granted, they'll be releasing 3.0 soon, which removes the old one.)

    • @anthonywritescode
      @anthonywritescode 2 years ago +3

      it's most of what I've worked on the last few years

  • @justinmusti
    @justinmusti 1 year ago +2

    I should have watched this video a week earlier. It would have saved me all this rework. Thanks for the video. Great explanation.

  • @Quarky_
    @Quarky_ 2 years ago +4

    Thanks! I never understood the difference, now it's pretty clear :)

  • @Khushpich
    @Khushpich 9 months ago

    Thanks Anthony, great explanation

  • @d3stinYwOw
    @d3stinYwOw 2 years ago +1

    macOS is only compatible with ancient versions of POSIX, so...
    Plus, do you run Paint in a VM or in Wine? :D
    Great video as always!

  • @agarbanzo360
    @agarbanzo360 2 years ago +4

    How does fork's copy-on-write (COW) memory interact with reference counting or other intricacies of Python memory management?

    • @anthonywritescode
      @anthonywritescode 2 years ago +2

      in general, poorly -- an update to an object's reference count writes to the PyObject structure itself, which dirties the memory page holding it and forces that page to be copied. there are ways to help this with `gc.freeze()`, which I plan to cover in another video. I've also seen an alternative python implementation optimized for fork-based workloads (think uwsgi / apache) which moves the refcounts off of the objects and into a reserved area of memory -- though this trades off refcount lookup speed for memory, and memory is usually cheaper.

    • @agarbanzo360
      @agarbanzo360 2 years ago

      @anthonywritescode makes sense, thanks! I ran a test and found that with fork() memory use grew over time: not all the way to the level used by spawn(), but enough that I would rather pay the upfront cost and accept higher but stable memory use than rely on fork() to keep memory shared.
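
A minimal sketch of the `gc.freeze()` pattern mentioned in this thread, following the sequence the Python `gc` documentation suggests (disable collection in the parent, freeze right before forking, re-enable in the child). The helper names `_child_total` and `sum_in_forked_child` are hypothetical, and the sketch assumes a platform where the "fork" start method is available. Freezing only keeps the garbage collector from touching long-lived objects; ordinary refcount updates in the child can still cause pages to be copied.

```python
import gc
import multiprocessing as mp


def _child_total(conn, data):
    # Hypothetical child body: re-enable collection, then read inherited data.
    gc.enable()
    conn.send(sum(data))
    conn.close()


def sum_in_forked_child(data):
    """Fork a child that sums `data`; return (total, number of frozen objects)."""
    ctx = mp.get_context("fork")  # assumption: fork is available (not Windows)
    recv_conn, send_conn = ctx.Pipe(duplex=False)

    gc.disable()   # avoid collections that would leave holes in parent pages
    gc.freeze()    # move all currently tracked objects to a permanent generation
    frozen = gc.get_freeze_count()

    proc = ctx.Process(target=_child_total, args=(send_conn, data))
    proc.start()
    send_conn.close()          # parent keeps only the receiving end
    total = recv_conn.recv()
    proc.join()

    gc.unfreeze()  # restore normal collector behaviour in the parent
    gc.enable()
    return total, frozen
```

Calling `sum_in_forked_child(list(range(1000)))` returns the sum computed in the child together with the count of objects that were frozen at fork time.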

  • @AceofSpades5757
    @AceofSpades5757 2 years ago +1

    On Windows, we're used to being slow.

  • @malakarakesh3139
    @malakarakesh3139 10 months ago +1

    Hey, at 5:04, could you explain why the OOM issue happens with fork() and not with spawn()? Spawning also requires starting a whole new process, right? And memory is limited.

    • @anthonywritescode
      @anthonywritescode 10 months ago +1

      when using fork the original process is copied, with spawn it starts from 0

    • @malakarakesh3139
      @malakarakesh3139 10 months ago +1

      @anthonywritescode thanks.
      I'd also like to know how __name__ works with multiprocessing.
      I assume the spawned process also gets __name__ set to __main__.
      If so, shouldn't the child process recursively spawn another child process, and so on?

    • @anthonywritescode
      @anthonywritescode 10 months ago +1

      are you observing that? I think you can answer your own question

    • @malakarakesh3139
      @malakarakesh3139 10 months ago +1

      @anthonywritescode interestingly, that (recursive spawning) is not happening, and I was wondering why. Perhaps you could help me with it.

    • @anthonywritescode
      @anthonywritescode 10 months ago +1

      clearly it doesn't set `__name__` to `__main__`

  • @AdamMaczko
    @AdamMaczko 2 years ago +2

    Hi, I have to use spawn in one of my projects (because of CUDA). I need to spawn multiple processes, but when the first process is spawned it runs through main and terminates and no other process is spawned. I don't really get why this happens. Why could this be the case?

    • @anthonywritescode
      @anthonywritescode 2 years ago +2

      it's impossible to know without seeing your code -- maybe stop by the discord with a minimal example? th-cam.com/video/ritp4gAqNMI/w-d-xo.html

    • @AdamMaczko
      @AdamMaczko 2 years ago

      @anthonywritescode yeah definitely

  • @snek3205
    @snek3205 2 years ago +1

    What about in fork, if you modify the global variable between entering the Pool context and running the map? (i.e. I do `global glob; glob = 3` before calling print(list(p.map(...))))
    I would guess, based on the paint explanation, that the child process is still watching the memory of the parent process, so it will print [4,5,6]. But when I tried it I still got [3,4,5].
    Thanks!

    • @anthonywritescode
      @anthonywritescode 2 years ago +1

      once forked the memory spaces are separate -- writes in the parent won't be reflected in the child
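
The scenario in this thread can be sketched as follows. This is a hypothetical reconstruction, not the code from the video: the names `glob`, `add_glob` and `map_after_parent_write` and the specific numbers are assumptions, and "fork" is assumed to be available. The pool's workers are forked when the `Pool` is created, so a write the parent makes afterwards lands only in the parent's copy of the memory.

```python
import multiprocessing as mp

glob = 1  # value every worker inherits at fork time


def add_glob(x):
    # Uses whatever value of `glob` the current process holds.
    return x + glob


def map_after_parent_write():
    """Write to `glob` after the pool exists; compare child and parent results."""
    global glob
    glob = 1
    ctx = mp.get_context("fork")  # assumption: fork is available (not Windows)
    with ctx.Pool(2) as pool:     # workers are forked here, each with glob == 1
        glob = 100                # parent-only write, made after the fork
        in_children = pool.map(add_glob, [10, 20, 30])
    in_parent = [add_glob(x) for x in [10, 20, 30]]
    return in_children, in_parent
```

Here `map_after_parent_write()` returns `([11, 21, 31], [110, 120, 130])`: the workers still compute with the value they inherited, while the parent sees its own update.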

  • @TheAulto
    @TheAulto 2 years ago

    Ohh no! Thumbnail and video title are off by one! (491 vs 492)

    • @anthonywritescode
      @anthonywritescode 2 years ago +1

      oops at least it was just the title being wrong

  • @liangyu3771
    @liangyu3771 1 year ago

    love the video

  • @malakarakesh3139
    @malakarakesh3139 10 months ago

    from multiprocessing import Process
    global_variable = 10
    def modify_global():
        global global_variable
        global_variable += 5
        print(f"Child process: Modified global_variable to {global_variable}")
    if __name__ == "__main__":
        global_variable = 4
        print(f"Parent process: Original global_variable is {global_variable}")
        child_process = Process(target=modify_global)
        child_process.start()
        child_process.join()
        print(f"Parent process: After child process, global_variable is still {global_variable}")
    In the code above, the child process can still see the global_variable assignment made inside the `if __name__ == "__main__":` block.
    I thought the child process only cares about the program state from the line where it is spawned, plus the global variables. Here the assignment to global_variable happens before the child is started, and the child still saw the modified value.

    • @anthonywritescode
      @anthonywritescode 10 months ago

      it depends whether you're using fork or spawn. the default depends on your python version and operating system
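
To see the dependence on the start method, the snippet above can be adapted so the child sends its result back instead of printing it. This is a sketch under assumptions: `_modify_and_report` and `child_result` are hypothetical names, and the `parent_value` argument stands in for the assignment made in the `__main__` block of the original snippet.

```python
import multiprocessing as mp

global_variable = 10  # module-level value, set again on every fresh import


def _modify_and_report(conn):
    # Runs in the child: add 5 to whatever value the child started with.
    global global_variable
    global_variable += 5
    conn.send(global_variable)
    conn.close()


def child_result(method, parent_value):
    """Return (value seen by the child, parent's value after the child exits)."""
    global global_variable
    global_variable = parent_value  # plays the role of `global_variable = 4`
    ctx = mp.get_context(method)    # "fork" or "spawn"
    recv_conn, send_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_modify_and_report, args=(send_conn,))
    proc.start()
    send_conn.close()               # parent keeps only the receiving end
    value = recv_conn.recv()
    proc.join()
    return value, global_variable
```

With "fork", `child_result("fork", 4)` returns `(9, 4)`: the child starts from a copy of the parent's memory, including the reassigned value, and its own write does not reach the parent. With "spawn", called from under an `if __name__ == "__main__":` guard in a script file, the child re-imports the module and starts from 10, so it would report 15.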

  • @alexandrugheorghe5610
    @alexandrugheorghe5610 1 year ago

    Wholesome.

  • @amir.hessam
    @amir.hessam 2 years ago

    nice