Intro to async Python | Writing a Web Crawler

  • Published Nov 21, 2024

Comments • 125

  • @thepaulcraft957
    @thepaulcraft957 1 year ago +176

    The idea of explaining async by printing out real world tasks is genius

    • @unperrier5998
      @unperrier5998 1 year ago +10

      You must already know async concepts, and asyncio specifically, because if I were a noob who didn't know much about async, let alone asyncio in particular, I'd be lost. Especially if I wasn't familiar with the Python language in the first place.
      At the very least I'd have to go through the video multiple times and read up wherever mCoding glosses over central concepts like the event loop, scheduling and coroutines.

    • @eeriemyxi
      @eeriemyxi 1 year ago +4

      It should not be considered a genius thing to do; every video explaining an abstract notion should try its best to show its uses in the real world. I think the content creators on YouTube who are just here for the fat wallets have caused this unfortunately lowered expectation of the average educational video.

    • @eeriemyxi
      @eeriemyxi 1 year ago +2

      @@unperrier5998 I don't agree that a well-understood textual definition is always necessary to learn a concept clearly. Even if they didn't explain what _coroutines_ are, they did show what one is; and I think that is still easier to understand than all the fancy words used by the authors of many articles on any topic. My argument is not that this is _always_ the case, though; for a fair share of concepts, it is better to read an article than to just watch how it all works.

    • @unperrier5998
      @unperrier5998 1 year ago +2

      @@eeriemyxi James is too fast; he speaks at a high pace, which may be okay for native English speakers, but we're not all that lucky. He also doesn't explain very well, unfortunately. It's missing visual cues as to how the tasks are scheduled, and a graph of messages going through the queues would help.
      Personally I don't need it because I've been doing this for years, but I had difficulty following the pace of the video. So I imagine noobs will struggle.

    • @minernooberz68
      @minernooberz68 1 year ago +2

      @@unperrier5998 I agree, I do not think a beginner would understand this video; but I think JavaScript would teach you async/await better.

  • @quitethecontrary1846
    @quitethecontrary1846 1 year ago +4

    Thank you...the idea of waiting on multiple things at once is what made it "click" for me

  • @packetpunter
    @packetpunter 1 year ago +35

    having additional things to try out at the end of the video is awesome. your content is always so great! a sincere thank you :)

    • @mCoding
      @mCoding  1 year ago +20

      It's a great compliment as an instructor that my students want to see the homework!

    • @Aceptron
      @Aceptron 10 months ago

      @@mCoding Yes, it is absolutely helpful!!!

  • @Cucuska2
    @Cucuska2 1 year ago +13

    Great video, for a while I have been hoping that I would just stumble upon a Python-specific async tutorial.

  • @Phaust94
    @Phaust94 1 year ago +6

    This channel is like top 3 channels for Python on the web.
    Thanks for this and keep it up!

    • @abdelghafourfid8216
      @abdelghafourfid8216 1 year ago +5

      What are the other two?

    • @mjaysmileofficial
      @mjaysmileofficial 1 year ago +1

      @Abdelghafour Fid I'm not this guy, but Arjan Codes is pretty good. He explains things using examples closer to the real world.

    • @notead
      @notead 1 year ago +2

      @@mjaysmileofficial I feel like the quality of his videos has declined. Giving heavy content creator vibes now.

    • @mjaysmileofficial
      @mjaysmileofficial 1 year ago

      @Note Maybe so. I haven't watched his latest content, but the videos about async Python are pretty good.

    • @Just2Dimes
      @Just2Dimes 1 year ago

      @@abdelghafourfid8216 besides Arjan Codes, which has been mentioned above, Corey Schafer and Anthonywritescode are worth looking into - if you didn't know them already.

  • @WebMafia
    @WebMafia 1 year ago +2

    wow, this is by far the best asyncio tutorial I have encountered. thank you!

  • @lawrencedoliveiro9104
    @lawrencedoliveiro9104 1 year ago +27

    2:53 time.sleep() blocks the current thread; asyncio.sleep() only blocks the current task.
    Remember that a “task” is not a concept of the Python language itself: it is purely a figment of the asyncio library, a wrapper around a coroutine object, which _is_ a concept of the Python language. Tasks are schedulable entities managed by asyncio.
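
That difference is easy to demonstrate by timing two tasks that sleep concurrently; a minimal sketch (durations chosen arbitrarily, timings approximate):

```python
import asyncio
import time

async def blocking_sleeper():
    # time.sleep() suspends the whole thread: nothing else can run meanwhile
    time.sleep(0.2)

async def cooperative_sleeper():
    # asyncio.sleep() suspends only this task; the event loop runs other tasks
    await asyncio.sleep(0.2)

async def timed(*coros):
    # run the given coroutines concurrently and measure the wall-clock time
    start = time.perf_counter()
    await asyncio.gather(*coros)
    return time.perf_counter() - start

async def main():
    # two cooperative sleeps overlap: roughly 0.2 s total
    cooperative = await timed(cooperative_sleeper(), cooperative_sleeper())
    # two blocking sleeps serialize the event loop: roughly 0.4 s total
    blocking = await timed(blocking_sleeper(), blocking_sleeper())
    return cooperative, blocking

cooperative, blocking = asyncio.run(main())
print(f"cooperative: {cooperative:.2f}s, blocking: {blocking:.2f}s")
```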

    • @Graham_Wideman
      @Graham_Wideman 1 year ago +1

      Am I right to infer that your point is that sleep() blocks the OS thread in which Python is running, hence Python has no way to do anything else while the OS has Python suspended?

    • @ed_iz_ed
      @ed_iz_ed 1 year ago

      sleep blocks execution at the Python level while being non-blocking at the OS level; this means the actual thread can be used for something else, just not by your Python program.

  • @brycedevoe9814
    @brycedevoe9814 1 year ago +44

    The best advice I can give anyone regarding Python's async/asyncio is to read the documentation, as there are a lot of edge cases and they're well documented.
    For example, you need to save the task objects returned by create_task somewhere (e.g. in a class attribute or a set) or they may get garbage collected.
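
The reference-keeping pattern recommended in the asyncio docs can be sketched like this (the `work` coroutine is made up for illustration):

```python
import asyncio

background_tasks = set()  # strong references: without these, an in-flight task may be garbage collected

async def work(n):
    await asyncio.sleep(0.01)
    return n * 2

async def main():
    for i in range(3):
        task = asyncio.create_task(work(i))
        background_tasks.add(task)                        # keep the task alive
        task.add_done_callback(background_tasks.discard)  # drop the reference once done
    # wait for everything we spawned; sorted() because the set has no order
    results = await asyncio.gather(*background_tasks)
    return sorted(results)

out = asyncio.run(main())
print(out)  # [0, 2, 4]
```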

    • @unperrier5998
      @unperrier5998 1 year ago +7

      Yeah, the weak references can be a problem. Overall I find the asyncio API too low-level to be really useful.
      I've written applications, and once you mix in multiprocessing (external processes having their own ioloop) it becomes too complicated; you have to manage everything yourself. To give an example, it's as if Python only exposed a threading API where you have to create the thread yourself, create local storage and manage a queue for each thread yourself. That'd be terrible to use... well, asyncio is at that level imo, not high enough (yet) to be awesome.

  • @hriscuvalerica4814
    @hriscuvalerica4814 1 year ago +1

    Everything I needed and more. Async content is lacking on YouTube.

  • @lawrencedoliveiro9104
    @lawrencedoliveiro9104 1 year ago +9

    4:46 You can have a nonempty pending list even in the absence of a timeout. There is another optional argument to asyncio.wait() which specifies whether to wait for all tasks to complete (the default), or just the first one.
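
A sketch of that second argument, `return_when` (the delays here are illustrative):

```python
import asyncio

async def sleeper(delay):
    await asyncio.sleep(delay)
    return delay

async def main():
    tasks = {asyncio.create_task(sleeper(d)) for d in (0.05, 0.5, 0.5)}
    # With return_when=FIRST_COMPLETED, wait() returns as soon as any task
    # finishes, leaving the rest in `pending` even though no timeout was given.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # let the cancelled tasks finish unwinding before the loop closes
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)

n_done, n_pending = asyncio.run(main())
print(n_done, n_pending)  # 1 2
```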

  •  1 year ago +3

    Wow, this is crazy similar to Rust! The difference is pretty much only function names. And such a clear explanation with a good analogy, I can share this with Rust newbies.

  • @ziggyspaz
    @ziggyspaz 1 year ago

    Hands down the best async python tutorial

    • @mCoding
      @mCoding  1 year ago

      Wow, thanks!

  • @clouck59
    @clouck59 1 year ago

    I thought I was experienced in python. I discovered "await" and "async" with this video.

  • @CritiKaster
    @CritiKaster 1 year ago +5

    Dude your videos are so great - making hard stuff easy - thanks!

    • @mCoding
      @mCoding  1 year ago

      I appreciate the praise!

  • @Al-.-ex
    @Al-.-ex 1 year ago +7

    Great intro to async, though definitely feels like something I need to do myself a couple times before I understand it/remember it fully. Really appreciate the suggestions at 13:18 for that!!

  • @ali-om4uv
    @ali-om4uv 1 year ago +7

    Wow, that is some great content. But for me some of the web crawling concepts are a little distracting. It would be great if you could add a video on async work for data pipelines (get some csv files from an FTP server, transform them and bulk load them into a relational db, for example, including logging, task handling and retry logic). I think this would be a great lesson for many data scientists who are not really educated in data engineering topics. Anyhow, thanks for the homework, I hope I will not fail it :-)

    • @mCoding
      @mCoding  1 year ago +5

      Great suggestion! Since data pipelines are often compute-bound rather than io-bound, multiprocessing is also a common option to use instead of asyncio. Check out my video on unlocking all your cores where I show a basic ETL workflow using multiprocessing.

  • @probaddie456
    @probaddie456 1 year ago +42

    10:54 You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts. so many times but it is not getting to me. Even enhanced irregular regular expressions as used by Perl are not up to the task of parsing HTML. You will never make me crack. HTML is a language of sufficient complexity that it cannot be parsed by regular expressions. Even Jon Skeet cannot parse HTML using regular expressions. Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide. The cannot hold it is too late. The force of regex and HTML together in the same conceptual space will destroy your mind like so much watery putty. If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes. HTML-plus-regexp will liquify the n​erves of the sentient whilst you observe, your psyche withering in the onslaught of horror. 
Rege̿̔̉x-based HTML parsers are the cancer that is killing StackOverflow it is too late it is too late we cannot be saved the transgression of a chi͡ld ensures regex will consume all living tissue (except for HTML which it cannot, as previously prophesied) dear lord help us how can anyone survive this scourge using regex to parse HTML has doomed humanity to an eternity of dread torture and security holes using regex as a tool to process HTML establishes a breach between this world and the dread realm of c͒ͪo͛ͫrrupt entities (like SGML entities, but more corrupt) a mere glimpse of the world of reg​ex parsers for HTML will ins​tantly transport a programmer's consciousness into a world of ceaseless screaming, he comes, the pestilent slithy regex-infection wil​l devour your HT​ML parser, application and existence for all time like Visual Basic only worse he comes he comes do not fi​ght he com̡e̶s, ̕h̵i​s un̨ho͞ly radiańcé destro҉ying all enli̍̈́̂̈́ghtenment, HTML tags lea͠ki̧n͘g fr̶ǫm ̡yo​͟ur eye͢s̸ ̛l̕ik͏e liq​uid pain, the song of re̸gular exp​ression parsing will exti​nguish the voices of mor​tal man from the sp​here I can see it can you see ̲͚̖͔̙î̩́t̲͎̩̱͔́̋̀ it is beautiful t​he final snuffing of the lie​s of Man ALL IS LOŚ͖̩͇̗̪̏̈́T ALL I​S LOST the pon̷y he comes he c̶̮omes he comes the ich​or permeates all MY FACE MY FACE ᵒh god no NO NOO̼O​O NΘ stop the an​*̶͑̾̾​̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e n​ot rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

    • @mCoding
      @mCoding  1 year ago +16

      You got the reference :)

    • @prringa6099
      @prringa6099 1 year ago

      Nicely copied from StackOverflow. ;)

    • @charlesclampitt7865
      @charlesclampitt7865 1 year ago +7

      This post looks exactly as it is supposed to look - there are no problems with its content. Please do not flag it for our attention.

  • @unusedTV
    @unusedTV 1 year ago +11

    I did not know Python had a built-in HTML parser! Then why do so many people reach for BeautifulSoup? Can you do a comparison, and explain whether or not they are used for the same goals?

    • @eeriemyxi
      @eeriemyxi 1 year ago

      I personally use Selectolax due to its speed and CSS selector. I don't know much about the built-in HTML Parser, but I don't think it has CSS selectors.

    • @Graham_Wideman
      @Graham_Wideman 1 year ago +2

      BeautifulSoup is not a parser, it relies on your choice of html parsers (possibly Python's built-in one) to do its work. What BeautifulSoup adds is a suite of methods to navigate and search the resulting document tree to extract the information you want.

  • @indraxios
    @indraxios 1 year ago +1

    I hit subscribe within the first 40 secs; I knew I was finally going to understand this async madness

  • @maxthomarino
    @maxthomarino 3 months ago

    incredible educational content, you are a great teacher because you have so much experience

  • @b33thr33kay
    @b33thr33kay 6 months ago

    Frankly, if you've never done asynchronous programming, this video is not very introductory. It's taking me several days to understand how it all works. However, if you take the time to understand the fundamentals and then come back to this example, this is a good video.
    If you're confused by the Queue thing: put() and get() are coroutines because they might yield control back to the event loop if the queue is full or empty, respectively. Internally, they create and store a future and then await it: awaiting a future yields control to the event loop (it's literally a "yield" statement). The opposite method then resolves the future by calling its set_result() method, which unblocks the first one.
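
That put()/get() handoff is easy to watch with a bounded queue; a sketch (maxsize=1 forces put() to actually suspend, and the None sentinel is just one convention):

```python
import asyncio

async def producer(queue):
    for i in range(3):
        # put() is a coroutine: on a full bounded queue it suspends until space frees up
        await queue.put(i)
    await queue.put(None)  # sentinel: no more items

async def consumer(queue):
    results = []
    while True:
        # get() suspends on an empty queue until the producer puts something,
        # which internally resolves a stored future via set_result()
        item = await queue.get()
        if item is None:
            break
        results.append(item)
    return results

async def main():
    queue = asyncio.Queue(maxsize=1)
    _, results = await asyncio.gather(producer(queue), consumer(queue))
    return results

out = asyncio.run(main())
print(out)  # [0, 1, 2]
```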

  • @royz_1
    @royz_1 1 year ago +4

    The syntax seems to have improved a bit since I last tried async programming in Python 3.6.
    Still, JavaScript has the best syntax for async programming in my opinion.

  • @lawrencedoliveiro9104
    @lawrencedoliveiro9104 1 year ago +2

    Another interesting topic would be using asyncio and coroutines in GUI programming. Because every GUI framework implements its own event loop. The key thing about asyncio is it provides a standard event-loop API for Python. You can replace its default event loop with wrappers around alternative event loops, and thereby use coroutine tasks in GUI apps.

  • @efaile3431
    @efaile3431 1 year ago +2

    Yes, HTML is not a regular language; however, your problem might be. For example, if you are searching for the only id="..." on the page, you can totally do that with a regex.

    • @DrewTNaylor
      @DrewTNaylor 1 year ago +5

      Oh no, parsing HTML with regex...

    • @ConstantlyDamaged
      @ConstantlyDamaged 1 year ago +5

      I feel that it's at this point you make soup. Delicious, beautiful soup.

    • @vigge83
      @vigge83 1 year ago

      the an​*̶͑̾̾​̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e n​ot rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

    • @talideon
      @talideon 1 year ago

      No, still not worth it. You'll find plenty of pages that will parse in a browser but fail horribly if you attempt to apply a regex to link extraction. You'll find unexpected attributes, malformed syntax, and so many other problems. You can't even imagine how gross vaguely parsable HTML can get. You need a parser, and regexes are not parsers.

  • @ConstantlyDamaged
    @ConstantlyDamaged 1 year ago

    An excursion into my usual playground! Very nicely and succinctly explained.
    Perhaps another related one could cover the concept of async wrappers?

  • @Graham_Wideman
    @Graham_Wideman 1 year ago +4

    While async is not threading, it would be great if James could clarify how different tasks may communicate properly, for example what happens with variables shared between tasks.

    • @talideon
      @talideon 1 year ago +1

      It's in-process cooperative multitasking, so assuming you don't hand off to another task in the middle of mutating something, it's fine. Not good, but fine.

  • @unperrier5998
    @unperrier5998 1 year ago +14

    Nice, but I find your example too complicated for a video, and I'm an experienced async programmer.
    I can follow, but at least a diagram of the queues would be helpful for overall comprehension.
    And a diagram of how the ioloop schedules coroutines, because there are a lot of assumptions and I believe it's hard for beginners to follow without visual cues.

  • @lethalavidity
    @lethalavidity 1 year ago

    - Am I awaited?
    - You are awaited, shiny and chrome!

  • @arjix8738
    @arjix8738 4 months ago

    For async crawling I prefer rust with tokio, really nice stuff.

  • @SirDonald
    @SirDonald 1 year ago

    Wow that's crazy I was just searching for this yesterday

  • @omgwtfafterparty
    @omgwtfafterparty 1 year ago +1

    I am looking for tips on how to properly use the logging module with asyncio.
    Supposing all the coroutine functions emit log messages, it will quickly become very difficult to find out which log message comes from which task, as they are all executed concurrently. Or am I missing something? :)

    • @mCoding
      @mCoding  1 year ago

      If you want to know what task each message is associated with, include an identifier of the task (like its name) in the log message.
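
One way to sketch that: name the tasks at creation and pull the name via asyncio.current_task() (the "crawler"/"worker-N" names are made up):

```python
import asyncio
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("crawler")

messages = []  # collected here only so the example is checkable

async def worker():
    # asyncio.current_task() returns the Task wrapping this coroutine;
    # tagging each message with its name keeps interleaved logs traceable
    name = asyncio.current_task().get_name()
    msg = f"[{name}] fetching"
    messages.append(msg)
    log.info(msg)
    await asyncio.sleep(0)

async def main():
    await asyncio.gather(
        asyncio.create_task(worker(), name="worker-0"),
        asyncio.create_task(worker(), name="worker-1"),
    )

asyncio.run(main())
```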

  • @mattlau04
    @mattlau04 1 year ago

    Very interesting video, really helps understand how async works

  • @darske1
    @darske1 1 year ago +2

    I remember one day I saw an example of this with a script that downloads videos from the internet: instead of waiting for each one to download sequentially, they used asyncio. But I was wondering, if the downloads happen simultaneously, won't that give less bandwidth to each video and thus make each one download slower? How is this approach still beneficial? (Or maybe my understanding of how things are downloaded is not right hehe)

    • @mdemdemde
      @mdemdemde 1 year ago +2

      I assume it depends on the bottleneck in download speed. If the speed is uncapped, I think what you say is true, but if the server limits the per-connection download speed you should see an improvement using asyncio. Does this make sense?

    • @darske1
      @darske1 1 year ago

      @@mdemdemde Completely, thanks for your answer! :D

  • @unperrier5998
    @unperrier5998 1 year ago +5

    I've written asyncio applications, and it quickly becomes complicated once you mix in multiprocessing (external processes having their own ioloop) and listening on sockets and pipes (each having a slightly different implementation that requires you to write different code for each); your code becomes too complicated, and you have to manage everything yourself and navigate abstractions that do the same things but are different (reading lines coming from sockets vs pipes).
    To give a comparison, it's as if Python only exposed an API for threading where you have to create the thread yourself, create local storage yourself and schedule work to run in the thread via a queue, for every thread, all by yourself. That'd be a terrible API to use... well, asyncio is at that level imo, not high enough (yet) to be awesome.

    • @talideon
      @talideon 1 year ago

      It's in-process cooperative multitasking. None of this is surprising when you're familiar with the constraints around cooperative multitasking.

  • @nikolausluhrs
    @nikolausluhrs 1 year ago

    I have been looking forward to this

  • @macampo
    @macampo 1 year ago

    great video, thank you very much

  • @PeterZaitcev
    @PeterZaitcev 1 year ago

    Well, nothing is wrong with the original problem. Your only cook's hat got dirty, so you bring it to the laundry, but without it you can't bake. However, today somewhere between 10:00 and 15:00 you've got a courier delivery and thus must be at home. That leads to the situation where you first wait for the Amazon delivery (a task with a 3h timeout), then you go to the laundry and wait until your clothes wash and dry, and only then can you start baking a cake.

  • @Khushpich
    @Khushpich 1 year ago

    Great video as always

  • @abdelghafourfid8216
    @abdelghafourfid8216 1 year ago

    if someone has done the improvements suggested at the end I would love to check out the final version

  • @aprilli531
    @aprilli531 2 months ago

    is it safe to access the same integer and set across multiple coroutines though?

  • @DanielMaidment
    @DanielMaidment 1 year ago +1

    Nice intro, a bit fast into those queues.

  • @HerChip
    @HerChip 1 year ago +1

    this async does not directly imply multithreading/core, right?

    • @mCoding
      @mCoding  1 year ago +1

      Correct. We are only using 1 thread on 1 core and only 1 process.

  • @caongocdavidlong
    @caongocdavidlong 1 year ago

    Please do a video on the aiomultiprocess library.

  • @joshinils
    @joshinils 1 year ago +1

    The biggest problem I had when trying to implement concurrency was working on the same data from different threads, i.e. multiple threads inserting into the same dict/list when they are done with their task.

    • @Musava
      @Musava 1 year ago +1

      You shouldn't mix threading and async. As explained, asyncio should be used when you're slowed down by IO (waiting for things over the internet or hardware jobs), and threading when you actually need parallelism. Asyncio is not parallel, it is single-threaded; it's just more efficient time management (ideally, there's always something running)

    • @joshinils
      @joshinils 1 year ago

      @@Musava right, but the same issue applies to both, doesn't it?
      And I was waiting on requests from the internet, and parsing the results.

    • @Musava
      @Musava 1 year ago

      @@joshinils Nope, with pure async you will never have two functions running in parallel. Imagine looking at your code with a laser pointer: if you use only asyncio, you will always be able to track where the code currently is. It may be a lot of jumping, but it executes only one function at a time. If you used threading, you'd need more pointers. In the web crawler example, imagine an ideal internet where every page takes exactly 5 seconds to load, and the data is huge so it takes 0.5 seconds to parse. Asyncio will send all the requests and then wait; since they will all respond at the same time, you will only be able to parse one at a time, because the parsing is not bottlenecked by the connection or anything (the code never waits for anything while parsing, it is just doing heavy work).
      Edit: that's why you use threading, or sometimes even multiprocessing, instead of async for heavy computation.
      Async is to save time spent waiting (use the time during which the code would wait for something); threading is to use more of your PC's power.

    • @volbla
      @volbla 1 year ago

      The task/worker doesn't need to update the collection itself. You can put the parsed result in a queue and update the list synchronously. I think.

  • @VirtuousSaint
    @VirtuousSaint 1 year ago

    any particular reason you didn't use aiohttp?

  • @yomajo
    @yomajo 1 year ago

    5 yrs in Python, never did threading, multiprocessing or async. Am I alone? I keep postponing learning it; it seems too difficult.

  • @re.liable
    @re.liable 1 year ago +1

    "asyncio is very much an afterthought" so true :( I did my own delve into asyncio very recently and I didn't have a good time. Maybe it's just my poor docs reading skills tho

    • @mCoding
      @mCoding  1 year ago +1

      I think the docs could really use some improvement. Most Python docs are very understandable but I feel like the async docs are a bit chaotic.

  • @akhileshchander5307
    @akhileshchander5307 7 months ago

    A very common error, raised when an event loop is already running (e.g. in a notebook):

        if events._get_running_loop() is not None:
            raise RuntimeError(
                "asyncio.run() cannot be called from a running event loop")

    Then use:
        await main()
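
A sketch of handling both environments; the `main` coroutine here is a stand-in for whatever you want to run:

```python
import asyncio

async def main():
    # stand-in for your real top-level coroutine
    await asyncio.sleep(0)
    return "done"

try:
    asyncio.get_running_loop()
    loop_running = True   # e.g. inside Jupyter/IPython, which runs its own loop
except RuntimeError:
    loop_running = False  # plain script: no loop yet

if loop_running:
    # asyncio.run() would raise here; schedule or await the coroutine instead
    task = asyncio.ensure_future(main())
else:
    result = asyncio.run(main())
    print(result)
```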

  • @trag1czny
    @trag1czny 1 year ago +4

    finally an async video 🤩
    discord gang 🤙

  • @lawrencedoliveiro9104
    @lawrencedoliveiro9104 1 year ago

    1:57 Best to call it a “coroutine object”.

  • @MrFluteboy1980
    @MrFluteboy1980 1 year ago

    Does the HTML parser also support XML? Is there also one for JSON??

    • @eeriemyxi
      @eeriemyxi 1 year ago +1

      You can use the `xml` and `json` built-in libraries for such use cases.

    • @volbla
      @volbla 1 year ago +1

      Have you tried to parse json without importing json? Oh no... you poor soul.

  • @mingqingTeng
    @mingqingTeng 1 year ago

    good async

  • @MithicSpirit
    @MithicSpirit 1 year ago +3

    Discord gang!

  • @_ramen
    @_ramen 1 year ago

    great vid

  • @HoSza1
    @HoSza1 1 year ago +1

    Web crawlers should handle AJAX (XMLHttpRequest) generated dynamic content, shouldn't they? (I wonder how much of the total content that is these days; I'd guess not a trivial percentage...) Nice video BTW.

    • @mCoding
      @mCoding  1 year ago

      You are certainly welcome to try your hand at implementing it :)

    • @HoSza1
      @HoSza1 1 year ago

      @@mCoding Whenever I try to implement just about anything, I find that someone has already done it, or they ask me why I am reinventing the wheel. So frustrating. :) The existing solution is probably that a headless browser gets queried for the page source or the DOM document of the page in question.

    • @Graham_Wideman
      @Graham_Wideman 1 year ago

      @@HoSza1 That's what Selenium does for you.

  • @chndrl5649
    @chndrl5649 1 year ago

    How much CPU does it cost? I'm actually curious.

    • @sonice9020
      @sonice9020 1 year ago

      1 night with your mother

  • @jgtb0pl
    @jgtb0pl 1 year ago +1

    I'm hyped up

  • @CrushedAsian255
    @CrushedAsian255 4 months ago

    I made my own “asyncio” using an array of generator functions and while fun(): yield nonsense

  • @dan00b8
    @dan00b8 1 year ago

    "it doesn't matter if you use C or Python, facebook won't respond any faster"
    rustaceans be like: let's rewrite the whole of facebook in rust now

  • @haroombe123
    @haroombe123 1 year ago

    Please make one about fast API calls, and not some low-latency Wikipedia API calls.

  • @henrybigelow3570
    @henrybigelow3570 10 months ago +1

    Why not have one task for each crawl() and have crawl() call asyncio.create_task(crawl(url)) for each found url? Then you don't need worker functions or a queue.
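
That design works; here is a sketch on a toy in-memory "site" (the `PAGES` graph and URLs are made up, standing in for real HTTP fetches), spawning a task per discovered URL and keeping strong references so the tasks aren't garbage collected:

```python
import asyncio

# toy in-memory "web": page -> links found on that page
PAGES = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": [],
    "/c": ["/"],
}

seen = set()
tasks = set()  # strong references keep spawned tasks alive until they finish

async def crawl(url):
    await asyncio.sleep(0)  # stands in for an awaited HTTP request
    for link in PAGES.get(url, []):
        if link not in seen:
            seen.add(link)
            # spawn a task per discovered URL instead of a worker pool + queue
            t = asyncio.create_task(crawl(link))
            tasks.add(t)
            t.add_done_callback(tasks.discard)

async def main():
    seen.add("/")
    await crawl("/")
    # wait for the whole task tree; new tasks may spawn while we wait
    while tasks:
        await asyncio.gather(*list(tasks))
    return sorted(seen)

out = asyncio.run(main())
print(out)  # ['/', '/a', '/b', '/c']
```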

  • @grzegorzryznar5101
    @grzegorzryznar5101 1 year ago

    Btw, this background looks very gray compared to the one from the new UI

  • @charlesclampitt7865
    @charlesclampitt7865 1 year ago

    How did you fight off Tony the Pony?

    • @mCoding
      @mCoding  1 year ago

      Commented on the wrong video?

    • @charlesclampitt7865
      @charlesclampitt7865 1 year ago

      @@mCoding It's a reference to the most famous SO question of all time. I'm pretty sure you know it.

  • @Jeyekomon
    @Jeyekomon 1 year ago

    I really try, but somehow async programming is just too much for my brain.

  • @bettercalldelta
    @bettercalldelta 1 year ago +2

    notification gang

  • @aeoteng
    @aeoteng 1 year ago

    Title: "Intro to async..."
    ..."I claim that you already understand asynchronous programming"
    Yes, but no

  • @SP-db6sh
    @SP-db6sh 1 year ago

    Waiting for part 2: adding this scraped data into GraphQL with strawberry orm (source could be: stackoverflow issues)...

  • @JadeJuno
    @JadeJuno 1 year ago

    0:15 ... welp... another video I just cannot watch.

  • @mikhaililin5534
    @mikhaililin5534 1 year ago

    What would be the advantage of using async instead of Thread?

        from threading import Thread
        import time
        import random

        def slow_function(thread_index):
            time.sleep(random.randint(1, 10))  # simulates waiting time for an API call response, for example
            print("Thread {} done!".format(thread_index))

        def run_threads():
            threads = []
            for thread_index in range(5):
                individual_thread = Thread(target=slow_function, args=(thread_index,))
                threads.append(individual_thread)
                individual_thread.start()
            # at this point threads are running independently from each other and the main flow of application
            print("Main flow of application")
            for individual_thread in threads:
                individual_thread.join()
            # joining threads ensures that all threads are done before moving further in the flow of application
            print("All threads are done")

        run_threads()

    p.s. great video as always!

    • @mCoding
      @mCoding  1 year ago +1

      There are two main benefits (tradeoffs). Firstly, async is cooperative whereas threads are not, meaning that with async the only time you have to worry about someone else interrupting your execution and seeing potentially broken invariants is when you explicitly say "await" or use another explicit waiting mechanism. In contrast, a thread could be interrupted anywhere at any time, and you must use locks to avoid other threads seeing broken invariants or racing for data. The second tradeoff is how heavy a thread is versus a coroutine. Threads are much heavier objects, and your system may grind to a halt if you create more than a few thousand threads. You can have millions of coroutines pending without issue, though.
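
The weight difference is easy to feel by spawning a large number of concurrent sleeps (the count of 10,000 is chosen arbitrarily; the same number of OS threads would be a serious burden on most systems):

```python
import asyncio
import time

async def tick():
    await asyncio.sleep(0.1)

async def main():
    start = time.perf_counter()
    # 10,000 coroutines all sleeping concurrently: total time stays near 0.1 s
    await asyncio.gather(*(tick() for _ in range(10_000)))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(f"{elapsed:.2f}s for 10,000 concurrent sleeps")
```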

    • @mikhaililin5534
      @mikhaililin5534 1 year ago

      @@mCoding Thank you for such a fast and extensive reply! From here I can dig deeper into the cooperative vs preemptive multitasking subject. I was not aware it was a thing until now.

  • @rlkandela
    @rlkandela 1 year ago +1

    The initial code is how an ADHD brain works 🥲😂

  • @khuntasaurus88
    @khuntasaurus88 1 year ago

    3:35 is why I think asyncio in Python is pointless. You can EASILY replicate this behaviour with more control using multithreading or multiprocessing. It will be more readable too.

    • @phenanrithe
      @phenanrithe 1 year ago +1

      Why would it be pointless? Coroutines don't have the overhead of multithreading (let alone multiprocessing), which is actually why they came to be in the first place. Threads are also limited by the number of cores.