Dude please keep posting this stuff. You're very knowledgeable and have a great way of explaining stuff, and it's extremely helpful to early-career programmers like myself. Subscribed!!
Will do. Thanks for the kind words.
Junior Dev from Brazil here! Gotta say your video is the most practical, simple, direct and applicable video I watched and I've been watching a few of them! Great Job!
A while ago I attended a "prestigious" university and took an OS course with a so-called "legendary" Computer Science professor who taught us nothing and left me disgruntled and demotivated. This < 7 minute video blows an entire wasted semester there out of the water!! Thank you @EngineerMan for not only making things clear technically, but also motivating folks like me with your straight-forward lessons, calm demeanor, and real-world applications and examples. You rock - keep it up!!!
is it cs50?
Awesome Video. Good work. I had no clue what Threading was before the video. Thanks!
Yeah, I loved it.
I had no idea about the difference between multi threads and multi processes, great way to start the video!
First view = subscribed
You explained this more clearly and helped me understand this faster than my professor ever could lmao.. Thank you so much!
0:41 Just a note that, on Linux, there really isn’t that much difference between a “thread” and a “process”, since all their POSIX-defined creation calls are implemented in terms of the same underlying Linux-specific clone(2) call.
To be more specific, the POSIX fork(2) call creates a new execution context along with a new process context, whereas pthread_create(3) creates a new execution context within the current process context.
Lawrence D’Oliveiro you mean to the OS? Yeah Linux sees threads and processes as the same essentially, well really “tasks” but Python still is limited by GIL, so really what you’re talking about isn’t relevant here.
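For Python users, the practical side of the "new process context" vs "current process context" distinction is memory sharing. A minimal sketch, assuming only the standard library; `counter`, `bump` and `bump_in_thread` are hypothetical names made up for illustration:

```python
import threading
from multiprocessing import Process

counter = {"value": 0}  # hypothetical shared state, for illustration only


def bump():
    counter["value"] += 1


def bump_in_thread():
    # A thread runs inside the current process, so it mutates the same dict.
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


if __name__ == "__main__":
    print("after thread:", bump_in_thread())

    # A child process works on its own copy; the parent's dict is untouched.
    before = counter["value"]
    p = Process(target=bump)
    p.start()
    p.join()
    print("after process:", counter["value"], "(was", before, "before)")
```

The thread's increment is visible to the caller, while the child process increments its own copy and the parent's value stays the same.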
I subscribed to the channel after watching only 2 mins of this.
This is how good and simple your explanation is.
Welcome!
Sweet damn! Thanks engineerman, this cleared up a lot for me, especially since I'll be dealing with a bunch of threads myself in C# on Monday at work :3
Nice!
He just covered a 2.5 hr lecture in 6 minutes.... And Ironically more effectively... Great job man!!!
Love your content man. I'm new to Python and find your videos both interesting and educational. The code review you did recently was particularly interesting. Cool to see what jumps out at you as an expert looking at a beginner's code.
Sold! I mean Subscribed! Thank you. It's just the most awesome explanation among development videos I've seen so far.
Every year I come back here to remember this explanation. Thanks man
Great explanation, thank you!
Great runtime CPU usage example, that one really helped clear the idea.
Thank you very much for explaining the examples for both. It is really important to understand real-world use cases while learning this stuff.
Excellent video. This is just what I needed for my algo trading. Thanks a lot.
Best in class, and under 7 minutes, WoW 👌
You explained both and the differences better than all the other videos explaining only one, thanks so much!
Really great explanation. I would love to see how you could use threading and multiprocessing to execute two different things. In your example, it was all squaring numbers. But in your explanation, you said you charge a credit card, take in data and send an email, and the email could be sent before the other operations were completed. Can you show us how you could do that? Great job!
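One way this could look, as a rough sketch rather than what the video does: give each thread a different target function. `charge_card`, `send_email`, `handle_order` and the `events` list are all hypothetical stand-ins, and `time.sleep` only simulates a slow payment call.

```python
import threading
import time

events = []  # hypothetical log of what finished, in completion order


def charge_card(order_id):
    time.sleep(0.1)  # stands in for a slow call to a payment provider
    events.append(f"charged {order_id}")


def send_email(order_id):
    events.append(f"emailed {order_id}")


def handle_order(order_id):
    events.clear()
    # Two threads, two different jobs, both started before either is awaited.
    tasks = [
        threading.Thread(target=charge_card, args=(order_id,)),
        threading.Thread(target=send_email, args=(order_id,)),
    ]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    return list(events)


if __name__ == "__main__":
    print(handle_order(42))
```

Because the card charge waits while the email does not, the email entry will typically land in the log first even though its thread started second.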
Thank you! Good explanation on a subject I found pretty confusing at first. But after this video I feel free to use the threading module without worrying about threading being a poor alternative compared to multiprocessing. The GIL was very well explained.
I'm watching this in 2020 and really man this video is a gem!
Short, simple and precise. Good enough to start learning on this topic.
1:22 what's the difference between access to copy and copy
Great videos, explains it very quickly and efficiently
Clear, concise and straight to the point. Thank you kind sir.
Best youtube channel for me.
What happens if you set more processes than the cores you have?
Watched the video while eating food; Informational & entertaining
Version of processes.py that runs on Python v3.7.4:
from multiprocessing import Process
import os
import math

def calc(num):
    for i in range(0, 70000000):
        math.sqrt(i)

if __name__ == '__main__':
    processes = []
    for i in range(os.cpu_count()):
        print('registering process %d' % i)
        processes.append(Process(target=calc, args=(1,)))
    for process in processes:
        process.start()
    for process in processes:
        process.join()
That was a very clear and concise explanation, thanks!
Subscribed!
Thank you so much engineer man! Awesome explanation!
Learned something! Not ready to try without oversight. Thank you.
cool thanks, but what text editor is that? looks pretty cool
I'm running into a little problem with the multiprocessing code. When I run the code in Python IDLE, it only outputs 4 instances of multiprocessing.
Pycharm throws an error. Cannot import Process from multiprocess.
Don't know why it's doing what it's doing in either program
You're not an engineer man, you're a GOD!
Thank you Engineer Man!!! Very interesting topic and loved the way you explained it!!
Did you do a lesson on sharing data across processes? For example, taking continuously incoming serial data to use, store or display.
Your videos are amazing man, good work!
Could you explain why one would need mutexes for threading and not for processes?
I thought you need "lock = multiprocessing.Lock()" around variables that are used by multiple functions simultaneously (so in a process).
Mutexes in Python serve a different purpose, though they are functionally identical to mutexes in any other language. The Python GIL ensures only one thing is executed at a time, even when threads are used. The implication is that you'll never find a variable in a weird state from two threads accessing it at the exact same instant. That said, if the variable you expect to manipulate is manipulated over, say, 10 lines of code, the possibility exists that the interpreter will jump to another thread to execute some code there. If that code were to manipulate that same variable, then when the interpreter jumps back to what it was doing in the previous thread, there are now weird values. By locking at the start of those 10 lines and unlocking at the end (often called the critical section), one can guarantee that another thread won't interrupt and mess everything up.
@EngineerMan thanks for your clear answer!
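The critical-section idea can be sketched like this. It is a minimal example under assumptions: `balance`, `balance_lock`, `deposit` and `run_deposits` are hypothetical names, and the read-then-write is split over two lines on purpose to mimic a multi-line update.

```python
import threading

balance = 0  # hypothetical shared variable
balance_lock = threading.Lock()


def deposit(times):
    global balance
    for _ in range(times):
        # Critical section: the read and the write must not be interleaved
        # with another thread's read and write.
        with balance_lock:
            current = balance
            balance = current + 1


def run_deposits(thread_count=4, times=10_000):
    global balance
    balance = 0
    threads = [
        threading.Thread(target=deposit, args=(times,))
        for _ in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balance


if __name__ == "__main__":
    print(run_deposits())
```

With the `with balance_lock:` block in place, the final balance always equals `thread_count * times`; without it, a thread switch between the read and the write could lose updates.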
Short crisp videos, upvote the shit out of this.
To extend your web site analogy, what happens if someone else is also trying to buy something at the same time? Do they have to wait for the first person's transaction to finish? Or would you have something that spawns a process per request (up to a certain amount) and runs the threads for the credit card, the email etc. in there?
To my knowledge, each TCP connection to a webserver would be on its own thread, and there can be multiple more threads within each thread.
Very nice video. Can you please explain why you use os.cpu_count() for the multi-threading? Doesn't that return the number of CPUs instead of threads?
Thanks. Do you have, or could you make, a video about communication between processes? Imagine an example where a large piece of work is to be divided between processes and then the results must be given back to the main process. How do you asynchronously wait for any of the processes to finish and feed it a new subtask?
Very clear and precise explanation, though some things are technical enough to deserve an independent search to know what they mean (I'm not a pro).
I remember you did some multi processing from the command line in a previous video, so I guess it's not just a Python-concerned way of computing.
Thanks a lot.
You can also import `cpu_count` from `multiprocessing`. That way os isn't needed.
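A quick sketch of that import swap, using only the standard library; `logical_cores` is just an illustrative name:

```python
import os
from multiprocessing import cpu_count

# Both calls report logical CPUs (hyper-threads included), not physical cores.
logical_cores = cpu_count()
print("multiprocessing.cpu_count():", logical_cores)
print("os.cpu_count():", os.cpu_count())
```

One small difference worth knowing: `os.cpu_count()` can return `None` when the count cannot be determined, whereas `multiprocessing.cpu_count()` raises `NotImplementedError` in that case.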
5:37 That example is probably more simply and safely implemented with coroutines.
or literally any queue service
So, to put it in very simple terms:
Threading is good when you need to do things in parallel that are not critical in performance, e.g. run a user interface and run a timer in the background, so they don't block each other.
Multiprocessing is to distribute cpu-heavy operations to multiple cores.
Right?
Yep, you got it.
Why are the threads and the processes in the two examples 32? Is it a default maximum number? Thanks for the answer in advance!
Hey Engineer Man, I have a doubt.
So in multiprocessing a new process is started, but does that happen on every loop?
For example, I am doing image processing on multiple pages (200 pages), so does a new process start for each of those pages? Or, if I have a 4-core processor, would eight Python processes start in one loop and then, in the next loop, continue without restarting Python?
That was a brilliant, well explained lesson. Love your videos.
Good work explaining! Thank you.
A question. I have two functions and want to run each on its own processor. How can I retrieve the results of the two functions and sum them up? I see how to run two different functions here, but how can I use the output of all processors and do some stuff with it? Thanks.
Nice explanation! My only questions (which I recognize are beyond the scope of the video) are, how do you get data into the processes, and how do you get data out from them?
You can pass the data as an argument or you can pass an identifier of data that it should work on, such as process 1 should process rows 1-1000, process 2 should process rows 1001-2000. These processes would then get that slice of data to work on. For getting data out, I use queues. Check out my Blackjack video to see how queues work.
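A rough sketch of that pattern, assuming the "rows" are just the integers in a range: each process receives its slice boundaries as arguments and reports its result on a `multiprocessing.Queue`. `split_ranges` and `sum_squares` are hypothetical helpers invented for this example.

```python
from multiprocessing import Process, Queue


def split_ranges(total, parts):
    """Split range(total) into `parts` contiguous (start, stop) slices."""
    size, extra = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        stop = start + size + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def sum_squares(start, stop, results):
    # Each worker only touches its own slice, then reports back via the queue.
    results.put(sum(n * n for n in range(start, stop)))


if __name__ == "__main__":
    results = Queue()
    ranges = split_ranges(2000, 2)
    workers = [
        Process(target=sum_squares, args=(a, b, results)) for a, b in ranges
    ]
    for w in workers:
        w.start()
    # Read one result per worker before joining, so no worker is left
    # blocked trying to flush data into the queue.
    total = sum(results.get() for _ in workers)
    for w in workers:
        w.join()
    print(total)
```

The main process never shares a variable with the workers; data goes in through `args` and comes out through the queue.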
Really awesome video. Thank you! Can you do threading in multiprocessing? Is there any advantage of doing threading in each process on each core?
I think the Python interpreter, or any compiler, doesn't allow that, because you can't create a child from a daemon process.
Question: since each process has its own GIL, if we use multiple processes for payment, will the card be charged once per core? Is it bad to use multiprocessing in this case?
Precise. On Point. No Bullshit.
Is it possible to use multi processing to read frames of a video, for example, 10 frames per process, and somehow, get insights of those 10 frames of 3 processes altogether?
How does async / await fit into that context?
Those are facilities which can be thought of as single-threaded concurrency, similar to how Node.js does it. In that model, while something is awaiting a value (a blocking operation which often performs operations outside of the Python interpreter), the code can go work on other stuff while it waits.
@EngineerMan Isn't that just threading?
@shahlin30 I think he answered that above...
@shahlin30 No, it’s called “coroutines”. It is explicitly non-preemptive, unlike threading. No context switch occurs until you get to an “await” call (or the coroutine terminates).
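The "no switch until await" behaviour can be seen in a small asyncio sketch. The names `worker`, `main`, `run` and `log` are made up for the example, and `asyncio.sleep(0)` is used only as an explicit yield point.

```python
import asyncio

log = []  # records the order in which coroutine steps actually run


async def worker(name):
    log.append(f"{name} start")
    # The only place this coroutine can be suspended in favour of another.
    await asyncio.sleep(0)
    log.append(f"{name} end")


async def main():
    await asyncio.gather(worker("a"), worker("b"))


def run():
    log.clear()
    asyncio.run(main())
    return list(log)


if __name__ == "__main__":
    print(run())
```

Both workers log their start before either logs its end, because each one keeps the single thread until it reaches its `await`.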
I just stick to multiprocesses if I want more parallelism then start threads within those.
Hey dude, thanks for the video.
I have a question. I wanna do a cracking thing and I wanna know what you would recommend using: threading or multiprocessing?
@germanjesus1117 thanks dude
How to know how many cores are present in my system?
So the GIL also ensures that the interpreter only uses 1 core? Because, technically, threads could use different cores.
When will you upload the intermediate and expert python tutorial videos?
I don't have an exact date yet. Those series are very time consuming and I want to make sure I do them right.
I'm looking forward to the series, good luck with them :)
Hey man, you are the best thanks for this quick explanatory scenes. Quick question here, what happens if i use multiprocess or multithread within a "with f.open('somefile.txt',w):" statement ?
Could you have a thread that is executed in priority and then other threads are executed when there is time to do so ?
Yes, but this wouldn't be Python's responsibility. Thread scheduling is the job of the operating system.
How would processing be limited? A CPU can only run so many commands in parallel, so would the program run just as fast as threading if that were to happen?
Excellent, although I'm 3 years late. Can you give more complex examples of concurrent/parallel Python? Can you or anyone advise:
1. Are threads/processes suitable for lengthy or complex processing (other than simple calculations)?
2. Can a thread/process be used "endlessly" to process live data, i.e. with no completion? If yes, is .join() needed, and how can .join() work in a continuous live-data scenario after .start()? Thanks a lot.
By Tuesday /Wednesday you should pass 50k subs. Congrats.
I know, it's nuts. I'm in disbelief.
nope
What's the terminal command to see CPU usage?
How can he see his system information in the terminal?
How does the process approach deal with memory? Is the required amount significantly more?
Hey engineerman, while using threading in my tkinter software, my tkinter window is freezing up and going black. What should I do?
Currently, I am working on a YouTube video downloader project and it is very laggy.
Hi, I have a query.
I want a Python script to run at boot time, but not the whole program, just one thread. So can anyone please help me with that?
I was always afraid of this topic. You have an excellent way of simplifying things. I have a small issue: why does the processes file show only one process on my computer? Thank you.
Dumb question here from a python novice, but, if I wanted to run a python program multiple times at one time to speed up the process, which method is better?
Very clearly explained. Nice!
In multiprocessing, is it one's own PC that decides if it will use different cores per process (ideally)? Also, if your computer does not have 32 cores, why does top/task manager show 3200% CPU usage? I would have thought 400% would be the max total on a four-core machine.
The operating system handles distribution of processes over cores, or if only one core exists, prioritizes process execution within one core. The max cpu would show as 100% * number of logical cores. My server has 16 physical cores and 32 logical cores (made available via hyper threading). In practice though the server can only handle 16 cores worth of work.
@EngineerMan Thanks for the quick and clear response. Thanks also for the great video.
How can we use the process module on a Raspberry Pi, which is a single-core processor? :D
Thank you, short sharp and highly informative
How to display the cpu usage like yours?
What do you suggest using when a bunch of network requests need to be made? I'm guessing threads, from what you said in the video?
Strictly speaking, if you want to do a ton of stuff at once but nothing is really CPU intensive, threads are the way to go.
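As a hedged illustration of that advice, here is a thread pool fanning out several simulated requests. No real network is involved: `fake_request` is a hypothetical stand-in that just sleeps, and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(url):
    time.sleep(0.2)  # stands in for waiting on the network
    return f"response from {url}"


def fetch_all(urls):
    # One thread per URL; the threads spend their time waiting, not computing,
    # so the GIL is not the bottleneck here.
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(fake_request, urls))


if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(5)]
    started = time.perf_counter()
    responses = fetch_all(urls)
    print(responses)
    print(f"took {time.perf_counter() - started:.2f}s")
```

Run one after another, five 0.2 s waits would take about a second; in the pool they overlap, so the total stays close to a single wait.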
Engineer Man can you please do one more on asyncio. I think it would be very interesting topic. Thanks
Very clear. Thanks!
When should I use asyncio and when threads for concurrency?
Quick question, is there a good way (program or other) to scan a network for hackers or other issues?
What application do you use to see the per-core percentages?
What are you using to write your code in all these videos? It seems like some nice software.
Nice explanation, thanks!
Super informative 👌🏽
Really great vid man. I like your format. You could maybe pace it a bit more
So i don't understand... Why is everyone saying that python doesn't support multiple processing? I've heard this a couple of times... This video proves that it can just fine...
32 cores in 2018. That's impressive. Which processor were you using back then?
My script just stops every time, with the same code and everything. Can anyone help?
What program do you use to write the code? (Seems like some sort of notepad.)
Thank you for making sense. I didn't know that threading wasn't actually doing more than one thing at a time.
I guess I don't need to know this anymore but still if anyone can tell me what the difference between "_thread" and "threading" is, that would be great.
what is your theme color please?
awesome explanation!
This is awesome! I finally understand! Thx!
Thank you for the good explanation, I understood it!
Is multi-threading similar to asynchronous programming?
They are similar but with a couple of differences. Threading is when two sets of code execute independently from one another and often in parallel (Python excluded). Async is more like a task that gets worked on when resources are available, after which the caller is notified, often via a callback function, once it completes. Asynchronous execution often occurs on a single thread.