Functional Programming in Python: Parallel Processing with "concurrent.futures"

  • Published on 5 Feb 2025

Comments • 47

  • @swadhikarc7858 7 years ago +2

    "invaded your appetite for parallel processing" absolutely true... Thanks much Dan. Once again great video

  • @nickp6987 3 years ago

    Simple explanation, simple video, simple minded me, absorbed the vid tut, so many thanks from me to you!

  • @wesselbindt 3 years ago +2

    5:20 but the same holds for the multiprocessing pool, swapping to threads instead of processes there is as simple as changing your Pool to a ThreadPool
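
The swap described in this comment can be sketched as follows. This is a minimal illustration, not code from the video: `slow_square` and `run_in_threads` are hypothetical names. `multiprocessing.pool.ThreadPool` exposes the same `map` interface as `multiprocessing.Pool`, but runs the tasks in threads of the current process.

```python
import time
from multiprocessing.pool import ThreadPool  # thread-based counterpart of multiprocessing.Pool


def slow_square(x):
    """Hypothetical stand-in for an I/O-bound task (not from the video)."""
    time.sleep(0.1)  # simulate waiting on I/O
    return x * x


def run_in_threads(values, workers=4):
    # Same map() call one would make on multiprocessing.Pool,
    # but the work runs in threads instead of separate processes.
    with ThreadPool(processes=workers) as pool:
        return pool.map(slow_square, values)


if __name__ == "__main__":
    print(run_in_threads([1, 2, 3, 4]))
```

Switching back to processes would mean replacing `ThreadPool` with `multiprocessing.Pool`; the `map` call stays the same.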

  • @kyleeames8229 5 years ago +1

    These were excellent functional programming tutorials. They earned you my subscription. Well done 👍

    • @realpython 5 years ago

      I'm glad you enjoyed the tutorials!

  • @philsteamwork 7 years ago

    Thanks a lot for this. I also figured out that ThreadPoolExecutor has advantages when it comes to debugging.

  • @구삐-i9u 10 months ago

    wonderful videos. cheers bro

  • @AbderrahmaneBenbachir 3 years ago +1

    Hi, great video. time.sleep(1) is not I/O; it is a sleep state that a thread enters while it waits on a timer. I/O is what causes a thread to block.

  • @KabbalahredemptionBlogspot 2 years ago

    Be blessed, brother.

  • @AkashJobanputra5 6 years ago

    I just loved it, didn't know it was that simple! Thanks, Dan.

  • @jaysanprogramming6818 3 years ago

    Thank you for this playlist. I learned a lot.

  • @teuton8363 4 years ago +1

    7:30 Isn't ProcessPoolExecutor only faster if you have more than one core? Otherwise there is no advantage over ThreadPoolExecutor. I'd rather say that on a single-core processor, ProcessPoolExecutor is slower than ThreadPoolExecutor, because process switching takes place, and process switching generally has a higher overhead than thread switching.

  • @MrPatak007 2 years ago

    Why didn't you add some number crunching to show that process pool executor would be faster? At least some basic for loop with trivial computations. Would have made the video a 10/10 if you ask me.
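
A comparison along the lines suggested here could look like the sketch below. It is not from the video: `count_primes`, `timed_map`, and the workload size are assumptions chosen for illustration. On a multi-core machine one would generally expect the process pool to finish a CPU-bound job like this sooner than the thread pool, but actual timings depend on the hardware and the workload.

```python
import concurrent.futures
import time


def count_primes(limit):
    """Hypothetical CPU-bound task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, limits):
    """Map count_primes over `limits` with the given executor class.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls() as executor:
        results = list(executor.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [50_000] * 8  # assumed workload size; tune for your machine
    for cls in (concurrent.futures.ThreadPoolExecutor,
                concurrent.futures.ProcessPoolExecutor):
        _, seconds = timed_map(cls, limits)
        print(f"{cls.__name__}: {seconds:.2f}s")
```

The `if __name__ == "__main__":` guard matters for the process pool: on platforms that start workers by re-importing the script, it keeps the workers from launching pools of their own.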

  • @Nobodyhells_YT 7 years ago +3

    Dan, there is a backport of concurrent.futures for Python 2.7; I've used it a few times so far. I think the package is called futures.

    • @realpython 7 years ago +1

      Ah great, that's good to know. Here's a link if anyone else needs this: pypi.python.org/pypi/futures

  • @azizfdagli7657 6 years ago

    Dan, thanks for the tutorial, it is a helpful video. I have a question. I am calling this function four times, with ThreadPoolExecutor and with ProcessPoolExecutor:

        def compress(filename):  # executor.map passes each item of files
            pid_num = os.getpid()
            os.system('sleep 8')

        files = ('1', '2', '3', '4')
        with concurrent.futures.ThreadPoolExecutor() as executor:
            result = executor.map(compress, files)

    The code completes in 32 seconds, which means the calls don't run on different cores; they are waiting for each other. Is this problem related to the GIL? Thank you :)
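
One way to check whether the calls overlap is to time the same work with different worker counts. The sketch below is an assumption-laden stand-in for the commenter's code: `compress` here just sleeps for a short, made-up delay, and `timed_run` is a hypothetical helper. Blocking calls such as `time.sleep` release the GIL, so with several workers the waits should overlap; a total close to the sum of the delays would suggest the calls ran one after another.

```python
import concurrent.futures
import time


def compress(filename, delay=0.2):
    """Hypothetical stand-in for the commenter's function; sleeps instead of compressing."""
    time.sleep(delay)  # blocking call; the GIL is released while sleeping
    return filename


def timed_run(files, workers):
    """Run compress over `files` in a thread pool; return (results, elapsed_seconds)."""
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
        # list() forces all results, so errors raised in workers surface here.
        results = list(executor.map(compress, files))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    files = ("1", "2", "3", "4")
    for workers in (1, 4):
        _, seconds = timed_run(files, workers)
        print(f"max_workers={workers}: {seconds:.2f}s")
```

Consuming the iterator returned by `executor.map` is worth doing in any case: an exception raised inside a worker is only re-raised when its result is retrieved.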

  • @viettienha7312 5 years ago

    Nice tutorial as always. But can you share some sources to help us understand data parallelism, on which these two features are based? That'd be great.

  • @yashsrivastava677 7 years ago

    If one of the processes in the pool crashes, there is a BrokenProcessPool error. How do I get rid of that?

  • @WolfBoy2700 6 years ago

    Nicely explained, thank you!

  • @ankitpatidar8993 6 years ago

    Nice videos, all of them really helpful.

    • @realpython 6 years ago

      Thanks for the kind words!

  • @rxzzh8074 5 years ago

    Thank you!

  • @adityachitrigemath762 6 years ago

    Is ThreadPoolExecutor better than ProcessPoolExecutor?

  • @gfdsarewq 4 years ago

    Hey! Nice video!
    How can I terminate the threads? They keep running even after the main program exits (Ctrl+C).
    Code sample:

        with concurrent.futures.ThreadPoolExecutor() as executor:
            result = executor.map(function_name, iterator)
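
Python has no public API for killing a thread from outside, and the interpreter waits for executor threads before it exits, so a common approach is cooperative cancellation. The sketch below is one possible pattern, not the video's code: `stop_event`, `worker`, and `run` are hypothetical names, and it assumes the task can check a shared flag between small units of work.

```python
import concurrent.futures
import threading
import time

stop_event = threading.Event()  # shared flag that the workers poll


def worker(item, steps=20):
    """Hypothetical long-running task that checks the stop flag between steps."""
    for _ in range(steps):
        if stop_event.is_set():
            return f"{item}: stopped early"
        time.sleep(0.05)  # one small unit of work
    return f"{item}: finished"


def run(items):
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(worker, item) for item in items]
        try:
            return [f.result() for f in futures]
        except KeyboardInterrupt:
            stop_event.set()  # ask running workers to return soon
            for f in futures:
                f.cancel()  # drop tasks that have not started yet
            raise


if __name__ == "__main__":
    print(run(["a", "b"]))
```

After Ctrl+C, the `with` block still waits for the running workers, but they return at their next flag check instead of running to completion.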

  • @RodolfoDeNadai 7 years ago +1

    Nice series...

  • @the_gacker_hub 6 years ago +1

    Yeah! I can actually see a big difference: my former code takes 500 seconds, and my latter code with concurrent.futures takes only 178 seconds. One question, though: I have set max_workers to 100, but I don't really know what that means. How do max workers work? Could you please explain it here briefly, or, if that is not possible, tell me whether my setting (a value of 100) is valid?

    • @the_gacker_hub 6 years ago

      Hope to hear from you soon. Thanks, and must say, was a very nice explanation.
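
On the `max_workers` question: for `ThreadPoolExecutor` it is an upper bound on how many threads the pool will use, so at most that many tasks run at the same time and the rest wait in a queue. The sketch below measures this with a hypothetical `ConcurrencyProbe` helper (not part of `concurrent.futures`); the task count and sleep time are arbitrary assumptions.

```python
import concurrent.futures
import threading
import time


class ConcurrencyProbe:
    """Hypothetical helper that records how many tasks run at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def task(self, _item):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.05)  # simulate waiting on I/O
        with self._lock:
            self._active -= 1


def peak_concurrency(max_workers, n_tasks=20):
    """Run n_tasks through a pool and report the highest number seen running at once."""
    probe = ConcurrencyProbe()
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        list(executor.map(probe.task, range(n_tasks)))
    return probe.peak


if __name__ == "__main__":
    for workers in (1, 5, 100):
        print(f"max_workers={workers}: peak concurrency {peak_concurrency(workers)}")
```

A value of 100 is valid; whether it helps depends on the workload. For tasks that mostly wait on the network or disk, many threads can overlap their waiting, while the peak can never exceed the number of tasks submitted.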

  • @kshitizomar6730 5 years ago

    You should also add the link to the code file.

  • @SudhakarChintapalli 3 years ago

    Nice video, but how do I pass other parameters to the function (when needed) besides the iterable (here, scientists)?
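
Two common ways to pass extra arguments through `executor.map` are sketched below. The function `age_in_year` and both helpers are hypothetical names made up for this illustration; `functools.partial` and `itertools.repeat` are standard library tools.

```python
import concurrent.futures
import functools
import itertools


def age_in_year(born, year, label="age"):
    """Hypothetical function that needs more than the item being mapped."""
    return {label: year - born}


def ages_with_partial(birth_years, year):
    # functools.partial pre-fills the extra argument by keyword.
    fn = functools.partial(age_in_year, year=year)
    with concurrent.futures.ThreadPoolExecutor() as executor:
        return list(executor.map(fn, birth_years))


def ages_with_iterables(birth_years, year):
    # executor.map accepts several iterables, like the built-in map;
    # itertools.repeat supplies the same value for every call.
    with concurrent.futures.ThreadPoolExecutor() as executor:
        return list(executor.map(age_in_year, birth_years, itertools.repeat(year)))


if __name__ == "__main__":
    print(ages_with_partial([1815, 1880], 2017))
    print(ages_with_iterables([1815, 1880], 2017))
```

Both helpers return the same list. The `partial` form tends to read better when the extra values are fixed, while the multiple-iterables form also covers extra values that vary per item.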

  • @venkatkumar8672 5 years ago

    Can anyone share the source code? I'm constantly getting a runtime error and I don't know where I'm making mistakes.

  • @TadasTalaikis 7 years ago

    Question. how to create namedtuple items grammatically? 'for', 'itertools'?

    • @EdrisSaberi 7 years ago

      What do you mean by 'grammatically'?
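
Assuming the question above is about building namedtuple items programmatically (that reading is a guess), a minimal sketch could look like this. The `rows` and `record` data are made up for illustration; `Scientist` uses the same field names that appear elsewhere in this thread.

```python
import collections

Scientist = collections.namedtuple("Scientist", ["name", "field", "born", "nobel"])

# Hypothetical raw rows, for example as read from a CSV file.
rows = [
    ("Ada Lovelace", "math", 1815, False),
    ("Marie Curie", "physics", 1867, True),
]

# _make() builds one namedtuple from any iterable of field values.
scientists = tuple(Scientist._make(row) for row in rows)

# A dict can be unpacked into keyword arguments instead.
record = {"name": "Sally Ride", "field": "physics", "born": 1951, "nobel": False}
sally = Scientist(**record)

if __name__ == "__main__":
    print(scientists)
    print(sally)
```

A plain `for` loop or a generator expression is enough here; `itertools` is not required.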

  • @wt5968 6 years ago

    @Dan Bader - Hey man, thanks for providing this great example. I ran the exact same code in Python and got the following error:
    "
    AttributeError: Can't get attribute 'Scientist' on
    Time to complete: 1.2904911041259766s
    "
    Any idea why it's happening? I am using Python 3.5.2.

    • @roietbd2992 3 years ago

      You must run this sort of code from a Python file, not an interpreter.
      And the output will show on the Command Prompt window as well (so IDLE's shell window won't show anything as it is a different process), so make sure you call input() or something if you want to see the result, or just run the file from a Command Prompt.

  • @adityachitrigemath762 6 years ago +1

    Hello Dan, does ThreadPoolExecutor make use of multiple cores? What is it that makes concurrent.futures better than the multiprocessing module?

    • @stxnw 3 years ago

      he literally explained it???

  • @bjugdbjk 7 years ago

    Great video... I have a question: do you mean that even when we use multithreading from the concurrent.futures module, it doesn't actually run in parallel in the background because of the GIL? Am I correct? Could you please confirm?

  • @bjugdbjk 7 years ago

    What's the next video?

  • @edgostyn 5 years ago

    I am having some trouble making these things work in IPython (Jupyter Notebook). I am testing Dill, which comes with the Pathos package... Please create a video on parallel processing under Jupyter!

  • @GoldPhoenix99 7 years ago

    This is sort of an interesting way of mimicking the functionality of a Jupyter Notebook. Do you not like Jupyter Notebooks because they cannot be integrated into Sublime Text? Personally, I really hope that once it's matured, Jupyter Labs will have the full range of functionality (PEP8 linter, Python highlighting for Python 3.x, and even highlighting and compiling for Cython, since I'd like to start playing around with that).

  • @lineup8 2 years ago

    Maybe not applicable or useful for everyone (or anyone), but I hit major issues running the code "as is" on Windows with Python 3.10. TL;DR: the code block had to be:

        import collections
        import concurrent.futures
        # import multiprocessing
        import os
        import time
        from pprint import pprint

        def transform(x):
            Scientist = collections.namedtuple('Scientist', ['name', 'field', 'born', 'nobel'])
            print(f'Process {os.getpid()} working record {x.name}')
            time.sleep(1)
            result = {'name': x.name, 'age': 2017 - x.born}
            print(f'Process {os.getpid()} done processing record {x.name}')
            return result

        Scientist = collections.namedtuple('Scientist', ['name', 'field', 'born', 'nobel'])

        if __name__ == "__main__":
            # slightly tweaked scientists tuple - just for testing
            scientists = (
                Scientist(name="Ada Lovelace", field="math", born=1815, nobel=False),
                Scientist(name="Emmy Noether", field="math", born=1880, nobel=True),
                Scientist(name="Marie Curie", field="math", born=1830, nobel=False),
                Scientist(name="Ernest Rutherford", field="physics", born=1873, nobel=True),
                Scientist(name="Sally Ride", field="chemistry", born=1912, nobel=False),
            )
            pprint(scientists)
            print()
            start = time.time()
            # left the multiprocessing in for easy swapping
            # pool = multiprocessing.Pool()
            # result = pool.map(transform, scientists)
            with concurrent.futures.ProcessPoolExecutor() as executor:
                result = executor.map(transform, scientists)
            # .... rest of code as per video

    The reasoning was: I had to carefully place the standard __name__ == '__main__' condition to prevent the parallel processes from trying to re-run the ENTIRE script (including launching the process pool again... 🤦‍♂). But it can't wrap the whole code, as the declaration of the "Scientist" type needs to be visible to all processes that go through the script. I think this is something coming from Python 3.9.2 onwards on Windows (based on some quick searches through Stack Overflow).