AFAIU: a Protocol lets you specify a communication contract. If an object implements the methods required by a specific protocol, we can communicate with this object over the given protocol. E.g., if we have two independent classes that both implement a protocol, we can communicate with either of them. If we used inheritance instead, both classes would need a common parent class, with its associated methods, and thus the two classes wouldn't be independent anymore. E.g., independent car and plane classes that both implement a Protocol with a method "refuel".
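A minimal sketch of the car/plane idea above. The names Refuelable, Car, Plane and fill_up, and the liters parameter, are made up for illustration; only the "refuel" method comes from the comment.

```python
from typing import Protocol


class Refuelable(Protocol):
    """Contract: anything with a matching refuel() method qualifies."""

    def refuel(self, liters: float) -> float: ...


class Car:
    def __init__(self) -> None:
        self.tank = 0.0

    def refuel(self, liters: float) -> float:
        self.tank += liters
        return self.tank


class Plane:
    def __init__(self) -> None:
        self.tank = 0.0

    def refuel(self, liters: float) -> float:
        self.tank += liters
        return self.tank


def fill_up(vehicle: Refuelable, liters: float) -> float:
    # Only the protocol's method is used, so Car and Plane need no common parent.
    return vehicle.refuel(liters)
```

Neither class mentions Refuelable; a static type checker accepts both because their structure matches.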
@@ArjanCodes Ok, you should be proud for that comment above, probably the most iconic comment I read in your channel. By the way, I feel the same about learning in your channel: you are teaching me better than any course or book I've ever read. Many thanks from Brazil!
My gut reaction to protocols is "yuck, this is less readable and explicit code". Even though it is less "pythonic", I prefer to make Python as close to a typed language as possible, to communicate the structure and intended relationships within my code. I hate using naming conventions, or god forbid comments, to inform users (often my future self) of relationships between classes and type expectations. The one thing protocols might be better for than abstract base classes is clear multiple inheritance. @arjan is there any particular disadvantage to subclassing from the protocol just to formalize the relationship (and get all the "compile time" typing system benefits)?
One way to enable passing different configuration params to the exporters is to treat configurations as factory functions rather than classes to instantiate. You can just pass a factory method, a lambda, or functools.partial to define all sorts of bizarre configurations.
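A rough sketch of that suggestion, assuming a hypothetical VideoExporter dataclass with codec and bitrate_kbps fields (neither is from the video). Each dictionary value is simply a zero-argument callable:

```python
from dataclasses import dataclass
from functools import partial
from typing import Callable


@dataclass
class VideoExporter:
    # Hypothetical configuration fields, for illustration only.
    codec: str = "h264"
    bitrate_kbps: int = 1000


# A "factory" here is any zero-argument callable that returns an exporter.
ExporterFactory = Callable[[], VideoExporter]

FACTORIES: dict[str, ExporterFactory] = {
    "low": VideoExporter,  # the class itself, default configuration
    "high": partial(VideoExporter, codec="h265", bitrate_kbps=8000),
    "master": lambda: VideoExporter(codec="prores", bitrate_kbps=50000),
}

exporter = FACTORIES["high"]()  # configuration is applied at call time
```

The lookup code never needs to know which exporters take parameters; that knowledge stays where the dictionary is defined.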
I have been trying to figure out the differences between Python's `abc.ABC` and `Protocols` for the last week, so this was a welcome explanation. Admittedly, I'm not sure that I like the lack of inheritance in the would-be subclasses because it makes it less clear that they are coupled to the Protocol. Protocols seem like a really great thing, but I feel like their implementation has only added more confusion than it resolves.
My one critique would be that overriding the '__call__' method can be a bit of an anti-pattern that tends to hide what code is actually doing. Avoiding it in most cases usually creates code that is more obvious, clearer, and even self-documenting. Like with your example, where you had added a comment:

# use the factory to create a media exporter
media_exporter = factory()

Without the comment, it can be a bit unintuitive on first pass what the factory object is doing by being called, or that it even is an object. It could just be a function, in which case what kind of function is "factory"? Even if you logically conclude that it must be an object with an overridden '__call__' method for creating MediaExporters, or check with your IDE, it has already taken far too much mental effort to decipher. However, if this were a named method instead, the code becomes much clearer and even completely self-documenting:

media_exporter = factory.create()

Or even better (code is read far more than it is written):

media_exporter = factory.create_media_exporter()

This way it is completely clear that factory is an object, exactly what this factory object is for, and what it is doing when you call its method!
Protocols are ideal when you want to pass both existing classes and your custom classes to any functions. For example, if the existing classes define the following methods: add, sub, div, and mul, you can define a Protocol having these methods and type hint functions to receive your Protocol as arguments. This way you can define new classes that adhere to the Protocol but don’t have to subclass/depend on other existing classes.
I think what is being missed by a lot of folks here is that Protocols allow a class to implement multiple of them while inheritance only lets you inherit from one class. Also, Protocols let you limit the scope of what a method requires in order to fulfill its purpose. If a method only cares about using a few properties of a parameter, it can make that explicit by accepting a Protocol that lists those properties. All classes that fulfill those properties would automatically start working for that method.
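To make the "limit the scope" point concrete, here is a small sketch. The names HasName, HasPrice, Product, label and with_tax are invented for illustration. Each function asks only for the one property it actually uses:

```python
from dataclasses import dataclass
from typing import Protocol


class HasName(Protocol):
    @property
    def name(self) -> str: ...


class HasPrice(Protocol):
    @property
    def price(self) -> float: ...


@dataclass(frozen=True)
class Product:
    # Satisfies both protocols without naming either of them.
    name: str
    price: float
    stock: int


def label(item: HasName) -> str:
    # Needs only .name, so anything with a name is acceptable.
    return f"<{item.name}>"


def with_tax(item: HasPrice, rate: float = 0.25) -> float:
    # Needs only .price; stock and name are irrelevant here.
    return item.price * (1 + rate)
```

A class that later gains a price attribute would start working with with_tax without any change to its definition.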
@@fennecbesixdouze1794 TIL! Thanks. It is worth noting that method and field resolution become pretty complex the more inheritance levels you add though!
In my opinion, it is best to use an abstract base class. When RabbitMQueue and SQSQueue are both subclasses of the Queue abstract class, you can tell just by looking at the class (actually the parent class name): "Okay! Since their parent class is Queue, I can pass them whenever a method accepts a parameter of type Queue", with confidence. You cannot do the same with a Protocol.
I am new to Protocols but I share the feeling that they may take decoupling a little too far. In addition to your argument, one may observe that in this video the protocols VideoExporter and AudioExporter are identical. As a consequence, one could pass a ConcreteVideoExporter where a ConcreteAudioExporter was expected and no (static) type checker could point out the mistake. On the other hand this flexibility may be desired behaviour if some class adheres to several protocols and you don't want to / are not in the position to specify all of them as superclasses. It feels like both the power and the pitfall of Protocols is that they facilitate duck typing in type checkers, if that makes sense?
Of course, type checkers are just a tool; in the example above they just illustrate part of what a developer may struggle with understanding / implementing (robustly) / validating.
Exactly! The same goes for the bunch of Kafka/MQTT clients, for which there are several different Python implementations that differ by a few commas here and there.
That's true. I wish you could specify the protocol in the classes that conform to it. In other languages like Swift and Java, classes that conform to a protocol (or an interface in Java) have to state it in their definition: class DoubleEndedQueue implements Queue {...} And that's because they are nominally typed. In TypeScript, as well as in Python with Protocols, objects are structurally typed.
Closures are also an option here. They hand off the entirety of your interface configuration to the function signature (by "function signature" I also mean return signature). You can create a factory that accepts some argument (or arguments) and then returns some functions that call back to some abstract actions and attributes that act on the injected/enclosed objects. This is only good if you really don't plan on doing a lot of extending or using dunder stuff. Basically if all you need is a factory with some inversion of responsibility. The benefit is that you don't have to worry about any fancy Python data model things happening behind your back. It is very restrictive and only gives the end user exactly what methods and data you want them to have.
I love these refactoring videos. They gave me the bug to go back and clean up some old and clunky code of mine, but it's a lot more work when one has unit tests. Would you do a video one day on refactoring code that is tested? 🙂
The problem is that Arjan doesn't really "refactor" the code, he complicates it and makes it less readable by using unnecessary things, like sticking dataclasses literally everywhere just because he can. The problem he presents in this video can be solved with one dictionary: FACTORIES = {'low': {'video': BadVideoExporter, 'audio': BadAudioExporter}, 'medium': ...}. Bam, problem solved. Clean and short, without cluttering the code with his unnecessary classes and factories.
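For reference, the dictionary approach from this comment could look roughly like the sketch below. BadVideoExporter and BadAudioExporter are the names used in the comment; the "high" entry, the Good* classes and create_exporters are hypothetical additions, and the elided 'medium' entry is left out.

```python
class BadVideoExporter: ...
class BadAudioExporter: ...
class GoodVideoExporter: ...  # hypothetical
class GoodAudioExporter: ...  # hypothetical


# Quality -> media kind -> exporter class (classes are stored, not instances).
FACTORIES: dict[str, dict[str, type]] = {
    "low": {"video": BadVideoExporter, "audio": BadAudioExporter},
    "high": {"video": GoodVideoExporter, "audio": GoodAudioExporter},
}


def create_exporters(quality: str) -> tuple[object, object]:
    classes = FACTORIES[quality]  # raises KeyError for an unknown quality
    return classes["video"](), classes["audio"]()
```

This is short, but the type checker only sees "type" and "object" here, which is part of what the replies below argue about.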
@@k283 the point of the videos isn't the literal change he's making so much as the idea behind it. These changes may be "heavy handed" for small code like this, but the **idea** is super helpful in real production code. Classes and dataclasses have many advantages for passing data over dictionaries. Simply being able to use dot notation to access element values saves a lot of time and is more readable than dict's brackets+quoted value. And the IDE offers autocomplete, which you don't get with dictionaries. I would just say that we should use the right tool for the job: for some things a dictionary will be fine, for others this is better.
@@virtualraider
> These changes may be "heavy handed" for small code like this, but the *idea* is super helpful in real production code.
Maybe, but to me it looks like terrible overengineering, which is just as bad as smelly code. The solution I proposed here requires one dict; his solution requires adding unnecessary classes and importing dataclasses. That is a prime example of overengineering. It's been said that once you learn to use a hammer, everything starts looking like a nail; well, here we see a dude that learned what dataclasses are, and now he writes even print('hello world') only in a dataclass. Because he learned it and is eager to use it :)
> being able to use dot notation to access element values saves a lot of time
Dunno, creating a class (and thus adding more code, thereby increasing the overall amount of code to read) just for the purpose of using dot instead of [key] seems like overkill to me, like shooting a surface-to-air missile to kill a sparrow. This also adds overhead and, in the end, getting the object's property uses a dictionary under the hood, like __dict__. So it turns out dict is used anyway. But if you don't like some_dict[some_key] notation, why don't you replace all the dictionaries in your code with classes, in order to use dot notation?.. Wait a second... Slow down, you aren't gonna tell me you actually create classes every time you could use a dict, are you? 😨 😱
> and is more readable than dict's brackets+quoted value
That is a very arguable point; a line like "vehicles[car]" seems no less readable and short than "vehicles dot car".
> I would just say that we should use the right tool for the job
With this I agree!
My take in other languages has been to use inheritance when I'm trying to create a type taxonomy. I use it as a tool to communicate intent, and default to interfaces in all other cases. Python is a bit special, but I think the Protocol is closest to interfaces in other languages, so I'll start applying that rule in my code. TL;DR:
- inheritance/ABC to communicate relationship
- protocols for everything else
Can you help me understand this rule of thumb? In my mind, a similar relationship exists whether Protocol or ABC is used (concrete class implements interface), but that relationship is just communicated less clearly with Protocols
I briefly looked into using Protocol instead of ABC, but what I needed was actually a partially implemented subclass of dict. I made a class called ExtendedDict that declares some helpful overrides (e.g. __eq__ enforces timing-attack safe comparison) and does some clever stuff with __setitem__ to enforce JSON serializability and prevent extraneous, non-property keys from being set. As a result, almost all classes in the project are serializable to JSON. (It is an implementation of MuSig on ed25519, which involves 3 rounds of interaction, so serializing properly was very important.) It's probably over-engineered and maybe unpythonic, but it is pretty nice to be able to reliably call json.dumps on any object. I suspect that having a Protocol that specifies abstract methods to_json and from_json would probably be easier to grok than a system that overrides __setitem__.
@@ArjanCodes in unis like mine you pay for having an education to be a dev, but honestly most of my industrial python skills I actually use at my job recently come from your videos. I really like what you're doing. Thank you, sir :)
Yeah if I'm writing a game, I often come across the issue of I kind of want my class to inherit from multiple objects. If I'm making a big bad evil guy, that's a vampire, I might want to use the statblock of other vampire enemies, but slightly change them, but also I might want the player to be able to engage in dialogue with this enemy, like with normal NPCs to make the whole thing more dramatic. Now if I were to solve this using Abstract Base Classes, I'm not sure how I would go about that, but thankfully, my BBEG can implement multiple protocols at once, which is one of the great things about protocols.
From reading the comments:
- ABC is better for subclassing because you and your IDE can see the signature of the class and its objects.
- Protocol is better for type hinting arguments that are passed to a class or function, without coupling them to a single type of class. You also have to adhere to your protocol and not use any methods other than what is defined in the protocol.
The main takeaway: don't define a class that adheres to a Protocol. Use ABC instead. Protocols are only for client code, not library code. E.g., RabbitQueue and RedisQueue both have an add() method, but they don't inherit from the same class and don't share any superclass (either you don't have control over those classes, or you don't want to play with multiple inheritance, or any other reason). You can define a Protocol with an add() method. This way you have more flexibility while having 100% IDE support, and it also makes sure that you don't assume anything about the passed object beyond being able to call .add() on it. Check @Oliver Voggenreiter's comment for more info. However, he is wrong about inheritance. Any criticism is welcome.
Vim tip, since you mentioned in a recent video that you’ve started using it (and perhaps you do this already): you can type di( while the cursor is inside a pair of parentheses to delete everything in between them. Or you can use da( to delete everything inside the parentheses and the parentheses themselves.
What about simply decoupling the generation of audio and video exporters so that they don't have to be generated together? I think this would solve a few issues without any tradeoffs (though correct me if I'm wrong):
1. You can now specify different qualities for video and audio exporters
2. Your factory mapping size won't explode with the square of the number of qualities
3. You can now easily pass config functions for video and audio as they each have their own function
From what I understand, The whole idea of (Abstract) Factory method is providing an interface for creating "families of related or dependent objects". Your points are correct, but then it won't be this pattern. I feel, some better example could have been taken to demonstrate this pattern. "Head First Design patterns" book has a great example around Pizza ingredients.
If you use protocols you won't get warnings for not implementing a given method. I use ABC for making sure the objects I use from third parties have all the methods I want (provided I use them as subclasses of ABC).
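A small sketch of the ABC behaviour described here. Queue, ListQueue, ForgetfulQueue and try_create are hypothetical names. A subclass that forgets an abstract method cannot be instantiated at all:

```python
from abc import ABC, abstractmethod


class Queue(ABC):
    @abstractmethod
    def add(self, item: str) -> None: ...


class ListQueue(Queue):
    def __init__(self) -> None:
        self.items: list[str] = []

    def add(self, item: str) -> None:
        self.items.append(item)


class ForgetfulQueue(Queue):
    pass  # add() is missing


def try_create(cls: type[Queue]) -> str:
    try:
        cls()
    except TypeError:
        # Python refuses to instantiate a class with unimplemented abstract methods.
        return "rejected at instantiation"
    return "created"
```

With a plain Protocol and no explicit subclassing, the equivalent mistake is only reported by a static type checker, at the place where the object is passed to something expecting the protocol.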
i feel like this problem would work well in a more functional approach, you can keep the QUALITY mapping for lookups, but each quality can be a mapping in itself to functions that handle the implementations themselves, and this problem clearly is already separated into steps: 1) prepare audio 2) prepare video 3) export audio 4) export video, and it already being split up would work well more functionally IMO
Use `__slots__` to reduce dataclass's speed disadvantage (vs tuple), as you showed in another video, Arjan. `@property` can be used to add a `_` before each class field's name, which is similar to being immutable -- but also likely makes it even slower.
@@AloisMahdal I don't recall the video exactly, but here's a general approach for a 'name' field that can be read without the '_' but written only with the '_' prefix:

@property
def name(self):
    return self._name
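Placed in a complete class, that property could look like the sketch below. The Person class is a hypothetical container for it:

```python
class Person:
    def __init__(self, name: str) -> None:
        self._name = name  # written only through the underscore-prefixed attribute

    @property
    def name(self) -> str:
        # Read access without the underscore; no setter is defined.
        return self._name
```

Assigning to person.name raises AttributeError because the property has no setter, while person._name can still be changed, so this is immutability by convention only.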
Novice Python programmer here. Protocols seem to sacrifice a lot of clarity for the sake of saving not that many characters. It seems counter to the Python credo of "explicit > implicit".
I truly enjoy watching your content as a I always learn something new and useful -- even if I've previously used or read about the topic. I had seen the protocol pattern used in "pythonic code" but was not aware of the Protocol class! I just assumed it was implemented as a coding convention :)
I like to create a special factory for each and every class. The factory takes one dependency: the dependency injection container. All factories use the container to create instances whenever possible. When an object of the class is needed to do work, the factory's create-instance method is invoked and you have an instance to complete your task. The dependencies of every class are these factories. The container can be configured so that each factory is only ever created once, and different factories will either return the same instance or a new instance. This prevents deep dependency trees from being created, as well as stack overflows when injecting dependency trees. When you do this, your code becomes super easy to unit test and to refactor. You can also throw away all abstract classes and replace them with the decorator pattern.
Why not use __new__() instead? It seems that it's what you want (i.e. a default factory) and I think my_class(dependency_container, *other_args) is clearer than my_class_factory(dependency_container, *other_args). The only reason not to would be to let the class be instantiated with manual dependency injection.
Using Protocol seems the same as interface implementation in golang, where the "implements" declaration is not explicit. One benefit is that if you have 3-4 interfaces and you implement all of them, you don't need to explicitly specify them in the class declaration. And your concrete implementation is completely decoupled from the interface. Even an import is not needed. Just implementing the methods in the interface is enough. There can also be an advantage in typecasting builtins to a super type without having explicit boxing classes (like interface{} in golang). Overall I feel Protocol is the way to go.
Dear Arjan, I actually think the best approach is to get rid of most classes, and use dictionaries and tuples to create factories. As we know, factories and design patterns are actually design principles with a proposed Java implementation in the OG book. Java is more verbose, and Python can be way more concise and readable without it being detrimental to maintainability. I think the rigid multiclass factory system you've implemented is not the optimal way to solve this issue in Python.
Coming from a strong-typing background, I like it if the classes that implement the protocol still explicitly stated `implements theProtocol` rather than relying on duck-typing... which is how we'd deal with an `interface` in PHP and sort-of-but-not-quite like how Rust uses `impl` blocks for `traits`. A wise man once said "Explicit is better than implicit." ;)
Personally I would have kept it as an abstract factory like in the beginning, because then the consuming code does not need to care whether it's an audio or video exporter, whereas using tuples sort of couples them together more.
I don't get the last approach with the dataclasses. You went full circle back to high coupling. The program is now tightly coupled to the MediaExporterFactory and its implementation. How am I supposed to use a MockMediaExporterFactory now for my tests? By inheriting from it? Then you need to know the exact details of its implementation, and especially the __call__ method, as it's not at all clear how the subclass is supposed to override it. Protocols were better.
Great video, thanks for sharing your knowledge! Is the final code available somewhere? Would be really interested to read through it for further understanding. Thanks in advance!
In the original example, the main use of ABCs was purely for static type checking, which Protocols are designed to aid. If you want to know whether LosslessVideoExporter implements VideoExporter, Python has the @runtime_checkable decorator, which allows for use of isinstance() and issubclass(). Edit: And, well, they can be used as regular abstract base classes.
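A sketch of both options mentioned in this comment. The method names prepare_export and do_export, and the classes ExplicitVideoExporter and NotAnExporter, are assumptions for illustration rather than the video's exact code:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class VideoExporter(Protocol):
    def prepare_export(self, video_data: str) -> None: ...
    def do_export(self, folder: str) -> None: ...


class LosslessVideoExporter:
    # No inheritance: matches the protocol purely by structure.
    def prepare_export(self, video_data: str) -> None:
        print("Preparing video data for lossless export.")

    def do_export(self, folder: str) -> None:
        print(f"Exporting video data in lossless format to {folder}.")


class ExplicitVideoExporter(VideoExporter):
    # Explicit subclassing also works and makes the relationship visible.
    def prepare_export(self, video_data: str) -> None:
        print("Preparing video data.")

    def do_export(self, folder: str) -> None:
        print(f"Exporting video data to {folder}.")


class NotAnExporter:
    # Has only one of the two required methods.
    def prepare_export(self, video_data: str) -> None: ...
```

Note that the runtime isinstance() check only looks at whether the members exist; it does not compare signatures or return types, so it is weaker than what a static type checker verifies.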
Seems like Protocols were meant to replace factories but were never "finished". I like what you showed in the video but the questions you raise regarding input parameters lead to problems with long term maintainability. It's nice to know what they are though, great video as always Arjan!
Usually, I also prefer to use ABC, but, for example, in Django I’m trying to write logic which can work both with model instances and data transfer objects and Protocols are good for typing in this case.
Protocol reminds me of the interfaces in Go. You don't state explicitly, like in Java, which type or contract you follow, but the method definitions will do at runtime.
Why don't you link to the PEP, PEP 544, so that people can read what the actual purpose of Protocols is? Lots of questions in the comments here are asking things that are very nicely detailed in the PEP: design rationale, when to use it and when not to use it.
Check if export_quality is in the dictionary. If not then print error message and add continue statement.
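One way to read that suggestion, as a sketch: QUALITIES stands in for the keys of the factory dictionary, and read_quality and its ask parameter are hypothetical (ask defaults to the built-in input so the loop can be driven by other input sources as well).

```python
from typing import Callable

QUALITIES = {"low", "high", "master"}  # assumed keys of the factory dictionary


def read_quality(ask: Callable[[str], str] = input) -> str:
    """Keep asking until the answer is a known quality."""
    while True:
        export_quality = ask("Enter desired output quality (low, high, master): ")
        if export_quality not in QUALITIES:
            print(f"Unknown output quality option: {export_quality}.")
            continue  # ask again instead of hitting a KeyError later
        return export_quality
```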
Hi Arjan, many thanks for your videos; it's probably my first go-to programming channel nowadays. I am wondering how you delete things so quickly in your code editor. It reminds me of vim, but in VSC. Is it some kind of plugin or just black magic? Hahaha, thanks again!
This factory requires explicitly extending the lookup dictionary. You can instead do a "self-registering factory" with:

schema_classes = { class_type.class_key(): class_type for class_type in SchemaProtocol.__subclasses__() }

That way, every class that inherits from the protocol is automatically registered in the factory. You only need to agree on an interface for how the lookup key works; no need to create a dictionary by hand, or even touch the factory. Just inherit, done. With this method, you can delete the entire function used to select the factory and the object, since the dictionary (better than a tuple) will be automatically generated and extended by the inheritance, plus it's dynamic. Imagine you have a mini repo and your factory is a dictionary of 100 classes... that dictionary is going to look suspicious if you do it by hand...
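A runnable sketch of the self-registering idea, using a plain base class instead of a Protocol. Schema, UserSchema, OrderSchema and build_registry are hypothetical names; class_key() is the interface named in the comment.

```python
class Schema:
    """Hypothetical base class; defining a subclass is what registers it."""

    @classmethod
    def class_key(cls) -> str:
        # Default lookup key: the lowercased class name.
        return cls.__name__.lower()


class UserSchema(Schema):
    pass


class OrderSchema(Schema):
    @classmethod
    def class_key(cls) -> str:
        return "order"  # a subclass may choose its own key


def build_registry() -> dict[str, type[Schema]]:
    # __subclasses__() returns only the direct subclasses defined so far,
    # so the registry is built on demand rather than at import time.
    return {cls.class_key(): cls for cls in Schema.__subclasses__()}
```

Two caveats with this approach: subclasses of subclasses are not included unless you walk the hierarchy yourself, and a class is only registered once its module has actually been imported.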
Arjan, are you familiar with the DCI (Data, Context, Interaction) architecture / late binding (Alan Kay - OOP, James Coplien - Trygve etc.). I'd love to see you do several episodes on those.
- E.g. realizing actual OOP à la Alan Kay in Python (not just class oriented / Data Structures with methods) where Objects themselves communicate with each other via messages instead of imperatively. - Building a simple DCI architected application, implementing roles, late binding and a context in Python - exploring how Python built-in concepts and protocols are already geared towards true Object orientation (Network of Objects, Behaviour focused programming)
Could you do a video on config management? Suppose there is json config, which users provide and there can be some mandatory and some optional fields. Optional fields might need to be filled with default values. I guess this is a common pattern ...
How do I deal with subclasses that have different parameters? In my case it is even more complicated because I can put them in the constructor, as those are machine learning models. One of them receives two strings and the other a list of numbers. How do I make a factory for these classes?
I still am not sure on the usefulness of protocols if not using a static type checker like mypy. How are they different from having no protocol at all? The function will accept any type and lets say it tries to call the ".fancy_method" on one of its arguments, if the method is not there, it will just raise the usual AttributeError. I tried this with and without protocol and unless I use mypy or the runtime_checkable+isinstance checks it really makes no difference on the code behavior. Is it just to define an interface instead of writing the expected API in the documentation?
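The observation can be reproduced with a few lines. Fancy, Good, Bad and use are made-up names; fancy_method is from the comment.

```python
from typing import Protocol


class Fancy(Protocol):
    def fancy_method(self) -> str: ...


class Good:
    def fancy_method(self) -> str:
        return "fancy"


class Bad:
    pass


def use(thing: Fancy) -> str:
    # The annotation is not enforced at runtime; this is ordinary duck typing.
    return thing.fancy_method()
```

Calling use(Bad()) fails with the usual AttributeError, exactly as it would without the Protocol. The difference is that a static checker such as mypy or Pylance can report that call before the program runs, and the Protocol documents the expected interface in code rather than in prose.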
I think for such issues, the most reliable way is to use pydantic. Basically it defines dataclasses too, but it enforces the type hints it gets (with a slight performance penalty, of course, which shouldn't be an issue for heavily IO-bound video/audio exporters anyway). mypy always has the disadvantage that if you don't use it from the start of a project, it will probably complain everywhere, and changing code everywhere might mean breaking it. You could of course decide to reduce the number of alerts from PR to PR, but since pydantic can be used without touching other code, it's a nice solution for adding new features to an existing code base. Arjan would probably have mentioned it himself (he has a more recent video about it, too), but it wasn't as well known and production-ready at the time this video was produced.
I enjoy your explanations; they are clear, thoughtful and insightful. But I think this video highlights the fact that much of programming activity is based on a subjective assessment of needs, and you often face a situation where personal style determines the design choices made. How far do you pursue the 'Pythonic', OOP patterns, or (dare I say it) testing, as opposed to defensive programming? Where do you step back and say 'this is good enough'?
True. At some point things do become a matter of taste. It's way more important to apply the design principles than the specific Python feature you choose to use for it. My idea for these kinds of videos is that I hope they help people understand also that following a design pattern exactly is not always necessary and that you can use a variety that better fits with your codebase and your way of working. And in the end, functions, classes and objects are all callables, so it's just syntactical sugar ;).
Thanks for the suggestion! The pluggable backend pattern is in fact an example of an Adapter. But it's definitely a good use case for it. I'll add it to the list :).
Question: Please correct me if I'm wrong. But I was not able to see that you get any syntax errors from NOT IMPLEMENTING a Protocol. This would be a huge downside to them if you ask me. You would still be able to catch exceptions down the line, but not catching syntax problems ahead of time leads to a really bad workflow where you spend a lot of time not realizing things were forgotten and have to go back. I think it becomes easier to read when things are explicitly stated, such as with abstract classes. Whereas from what I understand about Protocols, they can be considered valid for many scenarios as long as the requested method exists. I would not dare to use these as it would allow all sorts of unintended functionality. Since a class has no direct relationship with the Protocol, how is it compared against it? Kind of looks like automatic black magic to me, coming from a C++/C# background.
The relationship is set up differently when you use protocols versus inheritance. With inheritance the relationship is defined between the superclass and its subclasses. With protocols it's defined between the protocol and the thing that uses the protocol. If you pass an object of some sort to that thing, nothing checks the protocol when you run the program; it's plain duck typing, so a missing method only surfaces as a runtime error when it's called. If you use VSCode together with a tool like Pylance (which I highly recommend), then you are going to get a typing error in your IDE when you try to use an object that doesn't adhere to the Protocol's structure. So you will be able to catch these problems before you run the code.
@@ArjanCodes Ah, that is at least good. I mostly work in the game industry, so not detecting a problem before runtime can mean upwards of 15 minutes of wasted setup to start up the editor and run through content to get to the test case in some scenarios. We try to keep it as short as possible, but sometimes subsequent events matter a lot for the test case. So it stacks up quickly. Thanks for the answer.
In principle yes, but you need some way to keep track of the video and audio exporter classes to use. To achieve this, you could turn it into a function that returns a function, where the first function has the two classes as a parameter and it returns a function that can create instances of those classes.
Why wouldn't you have the factory accept Callable[[], AudioExporter] and Callable[[], VideoExporter] instead of classes? Then you don't need any extra classes to accomplish configuration. You just make a parameterless function that can initialize the exporter objects with the config you want.
You certainly could do that as well. Actually, classes and functions are both callables so this would allow you to choose either classes or functions to do the job.
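A sketch combining the two ideas in this thread: a function that returns a function, parameterised by zero-argument callables. The concrete exporters H264Exporter and AACExporter, the bitrate_kbps parameter, and make_factory are hypothetical.

```python
from typing import Callable, Protocol


class VideoExporter(Protocol):
    def do_export(self, folder: str) -> str: ...


class AudioExporter(Protocol):
    def do_export(self, folder: str) -> str: ...


class H264Exporter:
    def __init__(self, bitrate_kbps: int = 1000) -> None:
        self.bitrate_kbps = bitrate_kbps

    def do_export(self, folder: str) -> str:
        return f"{folder}/video.mp4@{self.bitrate_kbps}"


class AACExporter:
    def do_export(self, folder: str) -> str:
        return f"{folder}/audio.aac"


def make_factory(
    video: Callable[[], VideoExporter],
    audio: Callable[[], AudioExporter],
) -> Callable[[], tuple[VideoExporter, AudioExporter]]:
    def create() -> tuple[VideoExporter, AudioExporter]:
        # The enclosed callables are invoked only when exporters are needed.
        return video(), audio()

    return create


# A class is itself a zero-argument callable when all parameters have defaults.
low = make_factory(H264Exporter, AACExporter)
# A lambda supplies custom configuration without any extra class.
high = make_factory(lambda: H264Exporter(bitrate_kbps=8000), AACExporter)
```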
Both VideoExporter and AudioExporter are protocols with the exact same methods and signatures, aren't they? How does Python decide which one to match if you have 2 protocols with the exact same method names and signatures?
Protocols are matched at runtime via duck typing. So actually, it doesn't matter if two protocols overlap or are the same. The only thing that counts is that it defines the interface that is expected. As long as the interface of the class matches what's defined in the protocol that's expected, there's no issue.
In this particular case, I liked that ABC provided stronger typing - you couldn't use an AudioExporter where a VideoExporter was required. But in general I like Python's use of duck typing with dunders and Protocols seem to mirror that well. An alternative in Ruby is the respond_to? method, which is in my opinion the duckiest duck typing :)
I'm not entirely happy with the dictionary solution. I had hoped you'd build something more dynamic like a type where you register the exporters that maybe are decorated with some "tag" decorator specifying aliases for that exporter that the container uses to find the requested factory.
So I ran mypy over the "with_protocol.py" file and got the following error:

with_protocol.py:142: error: Incompatible return value type (got "object", expected "ExporterFactory")

Looking at the "type hints" related docs, it appears that the FACTORIES dictionary was missing a type hint for its contained key/values. Adding the following type hint at line 127 corrects the error:

FACTORIES: Mapping[str, ExporterFactory] = {
    "low": FastExporter(),
    "high": HighQualityExporter(),
    "master": MasterQualityExporter(),
}
Personally I don't think using Protocols is a good idea. With ABCs you have `SubClass(SuperClass)` where it is very clear that `function(argument: SuperClass)` will accept SubClass. With Protocols you have `Subclass`; One can only tell that `function(argument: SuperClass)` will accept SubClass by: 1. finding where SuperClass is, 2. looking at what functions it has, 3. and then checking if SubClass has them. One is basically giving up a lot of maintainability because they are too lazy to type `(SuperClass)`. To me, Protocols seem more like a "better than nothing, aka `function(argument)`" than a successor to ABC.
In the final variant, why even bother with creating class for factory, when it is literally a simple function to get data class instance? If there is no mutable state to manage, simple combination of frozen data classes and functions may save lines and debugging time imho
Thanks Arjan for this very clear explanation. I find that the class inference of the new Protocol (interface) leads to code that is less readable and more error-prone. I prefer to stick with ABC. Wondering what your thoughts are on this.
Hi Arjan, Thanks for the great videos as always. In various videos (ie. DS project refactoring video 1), you have been replacing ABC with protocols for interface. If possible, can you please make a video to explain the advantage of doing so? To me and some comments I saw below, it seems like the protocol makes the interface less explicit, but no obvious advantage I can see except for less code is written (no abstract method, no inheritance by the implementation class). I try to google protocol vs ABC but no one talks about it. Hope you can explain the pros and cons and when to use what.. Thanks!
In your (original and protocol-version of) read_exporter function and FACTORIES dictionary, you could save some memory and a few parentheses by not calling the constructor ("building the factory") before factory type is selected.
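A minimal sketch of that suggestion, combined with the Mapping type hint mentioned further up. The class bodies and the `read_factory` name are hypothetical stand-ins, not the code from the video; the exporters just return strings to keep it short.

```python
from typing import Mapping, Protocol, Type


class ExporterFactory(Protocol):
    """Structural interface; only the one method used here is declared."""

    def get_video_exporter(self) -> str: ...


class FastExporter:
    def get_video_exporter(self) -> str:
        return "H.264 (fast)"


class HighQualityExporter:
    def get_video_exporter(self) -> str:
        return "H.265 (high quality)"


# The values are classes, not instances, so nothing is constructed up front.
FACTORIES: Mapping[str, Type[ExporterFactory]] = {
    "low": FastExporter,
    "high": HighQualityExporter,
}


def read_factory(quality: str) -> ExporterFactory:
    # The constructor is called only for the quality that was selected.
    return FACTORIES[quality]()
```

With only a handful of stateless factories the memory saving is tiny, so this is mostly about not doing work before it is needed.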
Thanks! I do think the Demeter’s law violation here is acceptable since the class only acts as a container for easy access to the video and audio exporter and nothing else, so its job is to expose these things. But you’re right that in most cases this should be avoided.
Would have been better to use __new__ instead of __call__, and you wouldn't have to perform the awkward maneuver of making the instance of MediaExportFactory callable.
I must admit I'm watching this video third time and I still don't understand the Pythonic way you're describing. But I have the impression that some parts of your code (btw. it's a pity there's no link to the code) is completely unnecessary. Would it still work if you removed inheriting from Protocol? Or if you deleted FastExporter class? Or get_video_exporter functions? It would really help if you showed some simple diagram of what inherits what, what uses what, what goes where. A kingdom for a big picture, please!
As I think of the options, my unfortunate answer is "it depends". Protocols are powerful, but can hide the intention. I use them most when parts of an app need to be tested and DI demands that a dependency be externalized. Other data structures are useful when they do not get in the way of DI. The beauty of Python is its flexibility in this area - use the data structure that makes the most sense. And if performance is a significant consideration - TEST! Then make the most informed design decision.
2 comments: 1. I prefer ABC since makes code much more readable to other developer. 2. Factories shine when need to create complex objects, like example when we are building an usecase object that needs 4 services to do the job. So having to instantiate 4 objects to inject in an use case, i could use a factory for that.
I didn't use protocols yet but they seem pretty inconvenient. To be honest, I still don't see any advantage over ABCs. I mean, it defeats the zen "explicit is better than implicit" in any ways since it is no way obvious that any of the classes that use the protocol are logically derived of that protocol. The other thing is that it is meant to be used with type hints, which are meant to be optional.
Afaiu: Protocol lets you specify a communication contract. If an object implements the methods required by a specific protocol, we can communicate with this object over the given protocol. E.g. if we have two independent classes, but both classes implement a protocol, we can communicate with them. If we used inheritance instead, then both classes would need a common parent class, with its associated methods, and thus the two classes wouldn't be independent anymore. E.g., an independent car and plane class, both implementing the Protocol with a method "refuel".
Very well put!
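The car and plane example could be sketched roughly like this. The names `Refuelable`, `Car`, `Plane` and `fill_up` are made up for illustration, and the return strings are placeholders.

```python
from typing import Protocol


class Refuelable(Protocol):
    """The contract: anything with a matching refuel() method qualifies."""

    def refuel(self, liters: float) -> str: ...


class Car:  # note: no base class
    def refuel(self, liters: float) -> str:
        return f"Car took {liters} liters of petrol"


class Plane:  # note: no base class, unrelated to Car
    def refuel(self, liters: float) -> str:
        return f"Plane took {liters} liters of kerosene"


def fill_up(vehicle: Refuelable, liters: float) -> str:
    # Only relies on what the Protocol promises.
    return vehicle.refuel(liters)
```

Car and Plane never mention Refuelable, yet a static type checker should accept both as arguments to fill_up.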
For the first time, I'm learning more Python from videos than from books.
Thanks, glad the video was helpful!
@@ArjanCodes Ok, you should be proud for that comment above, probably the most iconic comment I read in your channel. By the way, I feel the same about learning in your channel: you are teaching me better than any course or book I've ever read. Many thanks from Brazil!
My gut reaction to protocols is "yuck, this is less readable and explicit code". Even though it is less "pythonic", I prefer to make python as close to a typed language as possible, to communicate the structure and intended relationships within my code. I hate using naming conventions, or god forbid comments, to inform users (often my future self) of relationships between classes and type expectations.
The one thing protocols might be better for than abstract base classes is clear multiple inheritance. @arjan is there any particular disadvantage to subclassing from the protocol just to formalize the relationship (and get all the "compile time" typing system benefits)
> or god forbid comments
Why? Well-commented code is always a good thing.
21:47 passing in frozen=True to the dataclass decorator will make the dataclass immutable.
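For anyone who wants to try it, a small sketch. `ExportSettings` and its fields are hypothetical, not taken from the video.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ExportSettings:
    quality: str
    bitrate_kbps: int


settings = ExportSettings(quality="high", bitrate_kbps=320)

try:
    # Assigning to a field of a frozen dataclass raises FrozenInstanceError.
    settings.bitrate_kbps = 128  # type: ignore[misc]
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```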
One way to enable passing different configuration params to the exporters is to treat configurations as factory functions rather than classes to instantiate. You can just pass a factory method, lambda or functools.partial to define all the bizarre configurations.
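A rough sketch of what that could look like. The `VideoExporter` dataclass, its `codec` and `bitrate_kbps` fields, and the values are assumptions for illustration only.

```python
from dataclasses import dataclass
from functools import partial
from typing import Callable, Dict


@dataclass
class VideoExporter:
    codec: str
    bitrate_kbps: int


# A factory is simply "something callable with no arguments that returns an exporter".
ExporterFactory = Callable[[], VideoExporter]

FACTORIES: Dict[str, ExporterFactory] = {
    "low": partial(VideoExporter, codec="h264", bitrate_kbps=500),
    "high": partial(VideoExporter, codec="h265", bitrate_kbps=8000),
    "custom": lambda: VideoExporter(codec="vp9", bitrate_kbps=2500),
}

exporter = FACTORIES["high"]()
```

Each entry carries its own configuration, so the calling code never needs to know which parameters a particular exporter takes.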
I have been trying to figure out the differences between Python's `abc.ABC` and `Protocols` for the last week, so this was a welcome explanation. Admittedly, I'm not sure that I like the lack of inheritance in the would-be subclasses because it makes it less clear that they are coupled to the Protocol. Protocols seem like a really great thing, but I feel like their implementation has only added more confusion than it resolves.
Well they are subclasses in only that you have to derive from an interface in order to implement it. I am using C# and Java in that regard.
My one critique would be that overriding the '__call__' method can be a bit of an anti-pattern that tends to hide what code is actually doing. Avoiding using it in most cases usually creates code that is a bit more obvious and clear and even self-documenting.
Like with your example, where you had added a comment:
# use the factory to create a media exporter
media_exporter = factory()
Without the comment, it can be a bit unintuitive on first pass what the factory object is doing by being called, or that it even is an object. It could just be a function, in which case what kind of function is "factory"? Even if you logically conclude it then must be an object with an overridden '__call__' method, or check with your IDE, for creating MediaExporters, it's already taken far too much mental effort to decipher.
However, if this was a named method instead, this code becomes much more clear and even completely self-documenting:
media_exporter = factory.create()
Or even better (code is read far more than it is written):
media_exporter = factory.create_media_exporter()
This way it is completely clear that factory is an object, and exactly what this factory object is for, and what it is doing when you call its method!
exactly what i was going to say, explicit is better than implicit, there is really no need to use the magic __call__ method to confuse people
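Here is roughly what the named-method version might look like. The classes are empty stand-ins and the field names are assumptions; this is not the code from the video.

```python
from dataclasses import dataclass
from typing import Type


class VideoExporter:
    """Stand-in for a real video exporter."""


class AudioExporter:
    """Stand-in for a real audio exporter."""


@dataclass
class MediaExporter:
    video: VideoExporter
    audio: AudioExporter


@dataclass
class MediaExporterFactory:
    video_class: Type[VideoExporter]
    audio_class: Type[AudioExporter]

    def create_media_exporter(self) -> MediaExporter:
        # A named method instead of __call__: the call site says what happens.
        return MediaExporter(self.video_class(), self.audio_class())


factory = MediaExporterFactory(VideoExporter, AudioExporter)
media_exporter = factory.create_media_exporter()
```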
Protocols are ideal when you want to pass both existing classes and your custom classes to any functions. For example, if the existing classes define the following methods: add, sub, div, and mul, you can define a Protocol having these methods and type hint functions to receive your Protocol as arguments. This way you can define new classes that adhere to the Protocol but don’t have to subclass/depend on other existing classes.
you and Zander from the channel "very academy" should have a child. A very pythonic child indeed. You both are very clear teaching guys, love it!
I think what is being missed by a lot of folks here is that Protocols allow a class to implement multiple of them while inheritance only lets you inherit from one class. Also, Protocols allow to limit the scope of what a method requires in order to fulfills its purpose. If a method only cares about using a few properties of a parameter, it can make that explicit by accepting a Protocol that lists those properties. All classes that fulfill those properties would automatically start working for that method.
Good point! That aspect of Protocols is very close to how interfaces in other programming languages work.
Python allows for multiple inheritance.
The rest of what you said I agree with, however.
@@fennecbesixdouze1794 TIL! Thanks. It is worth noting that method and field resolution become pretty complex the more inheritance levels you add though!
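To make the "limit the scope" point concrete, a small sketch with made-up names (`Named`, `Priced`, `Product`, `Customer`). Each function only asks for the one property it needs, and Product happens to satisfy both protocols without declaring either.

```python
from dataclasses import dataclass
from typing import Protocol


class Named(Protocol):
    @property
    def name(self) -> str: ...


class Priced(Protocol):
    @property
    def price(self) -> float: ...


@dataclass
class Product:  # satisfies both Named and Priced
    name: str
    price: float
    stock: int


@dataclass
class Customer:  # satisfies only Named
    name: str
    email: str


def greet(item: Named) -> str:
    return f"Hello, {item.name}"


def with_tax(item: Priced, rate: float = 0.25) -> float:
    return item.price * (1 + rate)
```

A type checker should accept greet(Customer(...)) and reject with_tax(Customer(...)), since Customer has no price.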
These videos are wonderful! Please don't stop
According to me, it is best to use abstract base class. When RabbitMQueue and SQSQueue are both subclasses of Queue abstract class, looking at the class (actually the parent class name) you come to know that , "Okay! Since their parent class is Queue, I can pass them whenever a method accepts a parameter of type Queue" with confidence. You cannot do the same with Protocol
I am new to Protocols but I share the feeling that they may take decoupling a little too far.
In addition to your argument, one may observe that in this video the protocols VideoExporter and AudioExporter are identical. As a consequence, one could pass a ConcreteVideoExporter where a ConcreteAudioExporter was expected and no (static) type checker could point out the mistake.
On the other hand this flexibility may be desired behaviour if some class adheres to several protocols and you don't want to / are not in the position to specify all of them as superclasses.
It feels like both the power and the pitfall of Protocols is that they facilitate duck typing in type checkers, if that makes sense?
Of course, type checkers are just a tool; in the example above they just illustrate part of what a developer may struggle with understanding / implementing (robustly) / validating.
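The mix-up described above can be reproduced in a few lines. The method name `prepare_export` and the `WAVAudioExporter` class are assumptions modelled loosely on the video, reduced to one method each.

```python
from typing import List, Protocol


class VideoExporter(Protocol):
    def prepare_export(self, data: str) -> None: ...


class AudioExporter(Protocol):
    # Structurally identical to VideoExporter.
    def prepare_export(self, data: str) -> None: ...


class WAVAudioExporter:
    def __init__(self) -> None:
        self.prepared: List[str] = []

    def prepare_export(self, data: str) -> None:
        self.prepared.append(data)


def export_video(exporter: VideoExporter, data: str) -> None:
    exporter.prepare_export(data)


wav = WAVAudioExporter()
# An audio exporter passed where a video exporter is expected: since the two
# protocols have the same shape, a structural type checker has nothing to object to.
export_video(wav, "video frames")
```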
Exactly! The same goes for the bunch of Kafka/MQTT clients, for which there are several different Python implementations that differ by a few commas here and there.
That's true. I wish you could specify the protocol in the classes that conform to it.
In other languages like Swift and Java classes that conform to a protocol (or an interface in java) have to state it in their definition:
class DoubleEndedQueue implements Queue {...}
And that's because they are nominally typed. In Go and TypeScript, as well as with Python's Protocols, objects are structurally typed.
Closures are also an option here.
They hand off the entirety of your interface configuration to the function signature (by "function signature" I also mean return signature). You can create a factory that accepts some argument (or arguments) and then returns some functions that call back to some abstract actions and attributes that act on the injected/enclosed objects.
This is only good if you really don't plan on doing a lot of extending or using dunder stuff. Basically if all you need is a factory with some inversion of responsibility. The benefit is that you don't have to worry about any fancy python datamodel things happening behind your back. It is very restrictive and only gives the end user exactly what methods and data you want them to have.
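A tiny sketch of the closure approach, with hypothetical names (`make_exporter`, `prepare`, `export`). The factory takes the configuration and hands back only the functions the caller is meant to use.

```python
from typing import Callable, List, Tuple


def make_exporter(folder: str) -> Tuple[Callable[[str], None], Callable[[], List[str]]]:
    queued: List[str] = []  # enclosed state, not reachable as an attribute

    def prepare(name: str) -> None:
        queued.append(name)

    def export() -> List[str]:
        return [f"{folder}/{name}" for name in queued]

    return prepare, export


prepare, export = make_exporter("out")
prepare("clip.mp4")
exported = export()
```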
I love these refactoring videos. They gave me the bug to go back and clean up some old and clunky code of mine, but it's a lot more work when one has unit testing.
Would you do a video one day on refactoring code that is tested? 🙂
The problem is that Arjan doesn't really "refactor" the code, he complicates it and makes it less readable by using unnecessary things, like sticking dataclasses literally everywhere just because he can. The problem he presents in this video can be solved with one dictionary: FACTORIES = {'low': {'video': BadVideoExporter, 'audio': BadAudioExporter}, 'medium': ...}. Bam, problem solved. Clean and short, without cluttering the code with his unnecessary classes and factories.
@@k283 the point of the videos isn't the literal change he's making so much as the idea behind it. These changes may be "heavy handed" for small code like this, but the **idea** is super helpful in real production code.
Classes and dataclasses have many advantages for passing data over dictionaries. Simply being able to use dot notation to access element values saves a lot of time and is more readable than dict's brackets+quoted value. And the IDE offers autocomplete, which you don't get with dictionaries.
I would just say that we should use the right tool for the job: for some things a dictionary will be fine, for others this is better.
@@virtualraider > These changes may be "heavy handed" for small code line this, but the *idea* is super helpful in real production code.
Maybe, but to me it looks like terrible overengineering. Which is just as bad as smelly code. The solution I proposed here requires one dict; his solution requires adding unnecessary classes and importing dataclasses. That is a prime example of overengineering.
It's been said that once you learn to use a hammer, everything starts looking like a nail; well, here we see a dude that learned what dataclasses is, now he writes even print('hello world') only in a dataclass. Because he learned it and is eager to use it :)
> being able to use dot notation to access element values saves a lot of time
Dunno, creating a class (and thus adding more code, thereby increasing the overall amount of code to read) just for the purpose of using dot instead of [key] seems an overkill to me, like shooting a surface-to-air missile to kill a sparrow. This also injects overheads and, in the end, getting the object's property uses a dictionary under the hood, like __dict__. So it turns out dict is used anyway.
But if you don't like some_dict[some_key] notation, why don't you replace all the dictionaries in your code with classes, in order to use dot notation?.. Wait a second... Slow down, you aren't gonna tell me you actually create classes every time you could use a dict, are you? 😨 😱
> and is more readable than dict's brackets+quoted value
That is a very arguable point; a line like "vehicles[car]" seems no less readable and short than "vehicles dot car".
> I would just say that we should use the right tool for the job
With this I agree!
My take in other languages has been to use inheritance when Im trying to create a type taxonomy. I use it as a tool to communicate intent, and default to interfaces to all the other cases. Python is a bit special, but I think the Protocol is closest to interfaces in other languages so I'll start applying that rule in my code.
Tl;DR:
- inheritance/ABC to communicate relationship
- protocols for everything else
I agree, that's a pretty good approach!
Can you help me understand this rule of thumb? In my mind, a similar relationship exists whether Protocol or ABC is used (concrete class implements interface), but that relationship is just communicated less clearly with Protocols
I briefly looked into using Protocol instead of ABC, but what I needed was actually a partially implemented subclass of dict. I made a class called ExtendedDict that declares some helpful overrides (e.g. __eq__ enforces timing-attack safe comparison) and does some clever stuff with __setitem__ to enforce json serializability and preventing extraneous, non-property keys from being set. As a result, almost all classes in the project are serializable to json. (It is an implementation of MuSig on ed25519, which involves 3 rounds of interaction, so serializing properly was very important.) It's probably over-engineered and maybe unpythonic, but it is pretty nice to be able to reliably call json.dumps on any object. I suspect that having a Protocol that specifies abstract methods to_json and from_json would probably be easier to grokk than a system that overrides __setitem__.
omg this content so good I cannot believe it's free
Thanks so much, glad you like it!
@@ArjanCodes in unis like mine you pay for having an education to be a dev, but honestly most of my industrial python skills I actually use at my job recently come from your videos. I really like what you're doing. Thank you, sir :)
Protocols seem to save some time writing in exchange for some time reading.
I prefer clarity over simplicity. But I think is the matter of taste :)
Yeah if I'm writing a game, I often come across the issue of I kind of want my class to inherit from multiple objects. If I'm making a big bad evil guy, that's a vampire, I might want to use the statblock of other vampire enemies, but slightly change them, but also I might want the player to be able to engage in dialogue with this enemy, like with normal NPCs to make the whole thing more dramatic. Now if I were to solve this using Abstract Base Classes, I'm not sure how I would go about that, but thankfully, my BBEG can implement multiple protocols at once, which is one of the great things about protocols.
From reading the comments:
- ABC is better for subclassing because you and your IDE can see the signature of the class and its objects.
- Protocol is better for type hinting arguments that are passed to a class or function, without coupling them to a single type of class.
You also have to adhere to your protocol and not use any methods other than what is defined in the protocol.
The main takeaway: don't define a class that adheres to a Protocol. Use ABC instead.
Protocols are only for client code, not library code.
I.e. RabbitQueue and RedisQueue both have an add() method, but they don't inherit from the same class and don't share any superclass (either you don't have control over those classes, or you don't want to play with multiple inheritance, or any other reason).
You can define a Protocol with add() method.
This way you have more flexibility while having 100% IDE support, and also makes sure that you dont assume anything from the passed object more than calling .add() on it.
Check @Oliver Voggenreiter comment for more info.
However he is wrong about inheritance
Any criticism is welcome
I am currently reading the GoF book. It's pretty dense. I may revisit it every time I need to.
Also, Im interested in your editing shortcuts
Vim tip since you mentioned in a recent video that you’ve started using it, and perhaps you do this now. You can type di( while cursor in inside a pair of parentheses to delete everything in between them. Or you can use da( to delete everything inside the parentheses and the parentheses themselves.
as a new vim user I thank you
Or if you want to just change what's inside the parentheses, a ci( command will do it, too :-)
What about simply decoupling the generation of audio and video exporters so that they don't have to be generated together? I think this would solve a few issues without any tradeoffs (though correct me if I'm wrong):
1. You can now specify different qualities for video and audio exporters
2. Your factory mapping size won't explode with the square of the number of qualities
3. You can now easily pass config functions for video and audio as they each have their own function
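A sketch of the decoupled version, with placeholder classes and hypothetical mapping names. Two mappings of n entries each replace one mapping with n squared combinations.

```python
from typing import Callable, Dict, Tuple


class LowVideo: ...
class HighVideo: ...
class LowAudio: ...
class HighAudio: ...


# One lookup table per kind of exporter.
VIDEO_EXPORTERS: Dict[str, Callable[[], object]] = {"low": LowVideo, "high": HighVideo}
AUDIO_EXPORTERS: Dict[str, Callable[[], object]] = {"low": LowAudio, "high": HighAudio}


def make_exporters(video_quality: str, audio_quality: str) -> Tuple[object, object]:
    # The two qualities are chosen independently of each other.
    return VIDEO_EXPORTERS[video_quality](), AUDIO_EXPORTERS[audio_quality]()


video, audio = make_exporters("high", "low")
```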
From what I understand, The whole idea of (Abstract) Factory method is providing an interface for creating "families of related or dependent objects". Your points are correct, but then it won't be this pattern. I feel, some better example could have been taken to demonstrate this pattern. "Head First Design patterns" book has a great example around Pizza ingredients.
If you use protocols you won't get warnings for not implementing a given method. I use ABC for making sure the objects I use from third parties have all the methods I want (provided I use them as subclasses of ABC).
Yes - that's a very good reason to use ABCs instead of Protocol. Though you could in principle also inherit from Protocol to get the same warnings.
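A sketch comparing the two, with hypothetical class names. One assumption worth flagging: to get the error at instantiation time when inheriting from a Protocol, the method in the Protocol is marked with @abstractmethod here; without that decorator the runtime does not complain.

```python
from abc import ABC, abstractmethod
from typing import Protocol


class ExporterABC(ABC):
    @abstractmethod
    def do_export(self) -> str: ...


class ExporterProtocol(Protocol):
    @abstractmethod
    def do_export(self) -> str: ...


class IncompleteFromABC(ExporterABC):
    pass  # forgot do_export


class IncompleteFromProtocol(ExporterProtocol):
    pass  # forgot do_export, but inherits from the Protocol explicitly


class IncompleteImplicit:
    pass  # forgot do_export, no inheritance at all


def raises_type_error(cls: type) -> bool:
    try:
        cls()
    except TypeError:
        return True
    return False


abc_raised = raises_type_error(IncompleteFromABC)
protocol_raised = raises_type_error(IncompleteFromProtocol)
implicit_raised = raises_type_error(IncompleteImplicit)
```

The class with no inheritance instantiates fine; there the missing method only shows up in a type checker or when the method is actually called.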
i feel like this problem would work well in a more functional approach, you can keep the QUALITY mapping for lookups, but each quality can be a mapping in itself to functions that handle the implementations themselves,
and this problem clearly is already separated into steps: 1) prepare audio 2) prepare video 3) export audio 4) export video, and it already being split up would work well more functionally IMO
thank you so much for taking time out to make such an informative video
Code review holy war power be with you, the new features users.
I hit Like before even watching on this channel nowadays 😊
Use `__slots__` to reduce dataclass's speed disadvantage (vs tuple), as you showed in another video, Arjan. `@property` can be used to add a `_` before each class field's name, which is similar to being immutable -- but also likely makes it even slower.
Can you please provide example of how would you use `@property` and `_`?
@@AloisMahdal I don't recall the video exactly, but here's a general approach for a 'name' field that can be read without the '_' but written only with the '_' prefix:
@property
def name(self):
    return self._name
@@yehoshualevine Ah, I misread your comment, I thought you were referring some dataclasses magic I was not aware of 🪄🔍 🙂
Novice Python programmer here.
Protocols seem to sacrifice a lot of clarity for the sake of saving not that many characters. It seems counter to the Python credo of "explicit > implicit".
One more great useful video! Thank you Arjan!
I prefer Data Classes at 9 of 10 cases.
Thanks Antonis, happy you’re enjoying the content!
I truly enjoy watching your content as a I always learn something new and useful -- even if I've previously used or read about the topic. I had seen the protocol pattern used in "pythonic code" but was not aware of the Protocol class! I just assumed it was implemented as a coding convention :)
Saw Vim being used. Here’s an unsolicited tip:
dab to delete around brackets.
Also the more explicits: da( da{ da[ da" da' ( and inner equivalents di( di{ di[ di" di' ) . Love your channel btw
A command I use constantly
Are there a dib and cib command macros? (delete/change) in brackets
@@JohnWasinger They are not macros, but yes. They do exist. And do what you're expecting them to do.
@@thichquang1011 Thank you! :D
I like to create a special factory for each and every class. The factory takes one dependency, the dependency injection container. All factories use the container to create all instances when possible. When an instance of the class is needed to do work, the factory's create-instance method is invoked and you have an instance to complete your task.
All dependencies of all classes are these factories. The container can be configured so each factory is only ever created once, and different factories will return either the same instance or a new instance.
This prevents deep dependency trees from being created, and it prevents stack overflows when injecting dependency trees.
When you do this, your code becomes super easy to unit test and to refactor. You can also throw away all abstract classes and replace them with the decorator pattern.
Why not use __new__() instead? It seems that it's what you want (i.e. a default factory) and I think my_class(dependency_container, *other_args) is clearer than my_class_factory(dependency_container, *other_args). The only reason not to would be to let the class be instantiated with manual dependency injection.
Using protocol seems same as interface impl in golang where the "implements" declaration is not explicit.
One benefit is that if you have 3-4 interfaces and you implement all of them, you don't need to explicitly specify that in the class declaration. And your concrete implementation is completely decoupled from the interface. Even the import is not needed. Just implementing the methods in the interface is enough. There can be an advantage in typecasting builtins to a super type without having explicit boxing classes (like interface{} in golang).
Overall I feel Protocol is the way to go.
Dear Arjan, I actually think the best approach is to get rid of most classes, and use dictionaries and tuples to create factories. As we know, factories and design patterns are actually design principles with a proposed Java implementation in the OG book. Java is more verbose, and Python can be way more concise and readable without it being detrimental to maintainability. I think the rigid multiclass factory system you've implemented is not the optimal way to solve this issue in Python.
Do you have an example you could throw together?
Coming from a strong-typing background, I like it if the classes that implement the protocol still explicitly stated `implements theProtocol` rather than relying on duck-typing... which is how we'd deal with an `interface` in PHP and sort-of-but-not-quite like how Rust uses `impl` blocks for `traits`. A wise man once said "Explicit is better than implicit." ;)
Nice video as always!
Good presentation and well thought out and insightful ideas. Thanks, subscribed.
Thank you so much, glad you like the videos!
Personally I would have kept it as an abstract factory like in the beginning, because then the consume code does not need to care if its an audio or video exporter, where as using tuples sort of couples them together more.
I don't get the last approach with the dataclasses. You went full circle back to high coupling. The program is now tightly coupled to the MediaExporterFactory and its implementation. How am I supposed to use a MockMediaExporterFactory now for my tests? By inheriting from it? Then you need to know the exact details of its implementation and especially the __call__ method as its not at all clear how the sub class is supposed to override it. Protocols were better
Great video, thanks for sharing your knowledge!
Is the final code available somewhere? Would be really interested to read through it for further understanding. Thanks in advance!
Why do I have to define a Protocol class if it does nothing? How do I know whether LosslessVideoExporter implements the VideoExporter class?
This is needed for mypy static analysis and type hints in your IDE.
i get the point, now my ide knows if some class implements same method this should be sub class, but for me it is not clean. am i getting wrong?
In the original example, the main use of ABCs was purely for static type checking, which Protocols are designed to aid. If you want to know whether LosslessVideoExporter implements VideoExporter, Python has the @runtime_checkable decorator, which allows for use of isinstance() and issubclass().
Edit: And, well, they can be used as regular abstract base classes.
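A short sketch of the @runtime_checkable route. The method names are assumptions loosely based on the video, and `NotAnExporter` is a made-up counterexample.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class VideoExporter(Protocol):
    def prepare_export(self, video_data: str) -> None: ...
    def do_export(self, folder: str) -> None: ...


class LosslessVideoExporter:  # no inheritance from VideoExporter
    def prepare_export(self, video_data: str) -> None:
        pass

    def do_export(self, folder: str) -> None:
        pass


class NotAnExporter:  # do_export is missing
    def prepare_export(self, video_data: str) -> None:
        pass


is_exporter = isinstance(LosslessVideoExporter(), VideoExporter)
is_not_exporter = isinstance(NotAnExporter(), VideoExporter)
```

Keep in mind that the runtime check only looks at whether the methods exist, not at their signatures, so it is weaker than what a static type checker verifies.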
@@NateROCKS112 thanks
Seems like Protocols were meant to replace factories but were never "finished". I like what you showed in the video but the questions you raise regarding input parameters lead to problems with long term maintainability. It's nice to know what they are though, great video as always Arjan!
Usually, I also prefer to use ABC, but, for example, in Django I’m trying to write logic which can work both with model instances and data transfer objects and Protocols are good for typing in this case.
You can use @dataclass(frozen=True) for immutable dataclasses ?
Protocol reminds me to the interfaces in go. You don't tell explicitly like in java which type or contract you follow but the method definition will do in runtime
Is there a purpose to protocols if you're not using type annotation?
Thank you. Amazing stuff again!
Thank you - happy you enjoyed it!
Golang uses protocols. It does not have abstract base classes. Only compositional OO.
Don't forget to make dataclass frozen. It's faster at reading the object, nested properties and executing functions.
Why don't you link to the PEP, PEP 544, so that people can read what the actual purpose of Protocols are?
Lots of questions in the comments here are asking things that are very nicely detailed in the PEP. Design rationale, when to use it and when not to use it.
Check if export_quality is in the dictionary. If not then print error message and add continue statement.
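Something along these lines, as a sketch. To keep it runnable without user input, the answers come from an iterable instead of input(), and `VALID_QUALITIES` and `pick_quality` are hypothetical names.

```python
from typing import Iterable

VALID_QUALITIES = {"low", "high", "master"}


def pick_quality(answers: Iterable[str]) -> str:
    for export_quality in answers:
        if export_quality not in VALID_QUALITIES:
            print(f"Unknown output quality option: {export_quality}.")
            continue  # ask again instead of failing on a KeyError later
        return export_quality
    raise ValueError("no valid quality was supplied")


chosen = pick_quality(["ultra", "high"])
```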
Hi Arjan, many thanks for your videos, it's probably my first go-to programming channel nowadays. I am wondering how you delete things so quickly in your code editor. It reminds me of Vim, but in VSC. Is it a kind of plugin or just black magic? hahaha thanks again!
This factory requires explicitly extending the lookup dictionary. You can just do a "self-registering factory" instead:
schema_classes = {
    class_type.class_key(): class_type
    for class_type in SchemaProtocol.__subclasses__()
}
That way, every class that inherits from the protocol is automatically registered in the factory. You only need to agree on an interface for how the lookup key works; no need to create a dictionary by hand, nor even touch the factory. Just inherit, done.
With this method, you can delete the entire function used to select the factory and the object, since the dictionary (better than a tuple) will be automatically generated and extended by the inheritance, plus it's dynamic. Imagine you have a mini repo and your factory is a dictionary of 100 classes... that dictionary is going to look suspicious if you do it by hand...
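A runnable version of that idea, with hypothetical schema classes. Two caveats: __subclasses__() only lists classes that explicitly inherit from the protocol (purely structural implementers are not found), and only direct subclasses that have already been imported.

```python
from typing import Dict, Protocol, Type


class SchemaProtocol(Protocol):
    @classmethod
    def class_key(cls) -> str: ...


class UserSchema(SchemaProtocol):
    @classmethod
    def class_key(cls) -> str:
        return "user"


class OrderSchema(SchemaProtocol):
    @classmethod
    def class_key(cls) -> str:
        return "order"


# Built from whatever direct subclasses exist at this point.
schema_classes: Dict[str, Type[SchemaProtocol]] = {
    class_type.class_key(): class_type
    for class_type in SchemaProtocol.__subclasses__()
}
```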
What is going on with your line count ? How did you do it, and what is the point ?
Arjan, are you familiar with the DCI (Data, Context, Interaction) architecture / late binding (Alan Kay - OOP, James Coplien - Trygve etc.). I'd love to see you do several episodes on those.
- E.g. realizing actual OOP à la Alan Kay in Python (not just class oriented / Data Structures with methods) where Objects themselves communicate with each other via messages instead of imperatively.
- Building a simple DCI architected application, implementing roles, late binding and a context in Python
- exploring how Python built-in concepts and protocols are already geared towards true Object orientation (Network of Objects, Behaviour focused programming)
Could you do a video on config management? Suppose there is json config, which users provide and there can be some mandatory and some optional fields. Optional fields might need to be filled with default values. I guess this is a common pattern ...
Coming soon (I just recorded a video covering exactly this).
No github link?
How do I deal with subclasses that have different parameters? In my case is even more complicated because I can put them in the constructor as those are machine learning models. One of them receives a two strings and the other a list of numbers. How do I make a factory for these classes?
I still am not sure on the usefulness of protocols if not using a static type checker like mypy. How are they different from having no protocol at all? The function will accept any type and lets say it tries to call the ".fancy_method" on one of its arguments, if the method is not there, it will just raise the usual AttributeError. I tried this with and without protocol and unless I use mypy or the runtime_checkable+isinstance checks it really makes no difference on the code behavior. Is it just to define an interface instead of writing the expected API in the documentation?
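That observation is easy to reproduce. In this sketch (`Fancy`, `Plain` and `call_fancy` are made-up names) the Protocol annotation changes nothing at runtime; the failure is the ordinary AttributeError.

```python
from typing import Protocol


class Fancy(Protocol):
    def fancy_method(self) -> str: ...


class Plain:
    pass  # no fancy_method


def call_fancy(obj: Fancy) -> str:
    return obj.fancy_method()


try:
    # A static type checker would flag this call; Python itself does not.
    call_fancy(Plain())  # type: ignore[arg-type]
    error_name = ""
except AttributeError as exc:
    error_name = type(exc).__name__
```

So without a type checker or @runtime_checkable, the Protocol mostly serves as machine-readable documentation of the expected interface.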
I think for such issues, the most reliable way is to use pydantic.
Basically it defines dataclasses, too, but it enforces the type hints it gets (but of course with a slight performance penalty what shouldn't be an issue for high io bound video/audio exporters anyway).
mypy always has the disadvantage, that if you don't use it from the start of a project, it probably will ring everywhere and just changing code everywhere might mean breaking it. You could of course decide to reduce nr of alerts from PR to PR, but as pydantic could be used without touching other code, so that's a nice solution for adding new features to an existing code base.
Probably would have been mentioned by Arjan himself (he has a more recent video about it, too), but it wasn't so famous and production ready at the time of this video here produced.
Random question, what font do you use?
brilliant
I enjoy your explanations; they are clear. thoughtful and insightful. But I think this video highlights the fact that much of programming activity is based on a subjective assessment of needs, and you often face a situation where personal style determines the design choices made. How far do you pursue the 'Pythonic', OOP patterns (or, dare I say it), testing, as opposed to defensive programming? Where do you step back and say 'this is good enough'?
True. At some point things do become a matter of taste. It's way more important to apply the design principles than the specific Python feature you choose to use for it. My idea for these kinds of videos is that I hope they help people understand also that following a design pattern exactly is not always necessary and that you can use a variety that better fits with your codebase and your way of working. And in the end, functions, classes and objects are all callables, so it's just syntactical sugar ;).
You should do the pluggable backend pattern next
Thanks for the suggestion! The pluggable backend pattern is in fact an example of an Adapter. But it's definitely a good use case for it. I'll add it to the list :).
Question: Please correct me if I'm wrong. But I was not able to see that you get any syntax errors from NOT IMPLEMENTING a Protocol.
This would be a huge downside to them if you ask me. You would still be able to catch exceptions down the line, but not catching syntax problems ahead of time leads to a really bad workflow where you use a lot of time where you do not realize things were forgotten and have to go back.
I think it becomes easier to read when things are explicitly stated, such as with abstract classes.
Whereas from what I understand of Protocols, they can be considered valid for many scenarios as long as the requested method exists.
I would not dare to use these as it would allow all sorts of unintended functionality.
Since a class has no direct relationship with the Protocol, how is it compared against it?
Kind of looks like automatic black magic to me, coming from a C++/C# background.
The relationship is set up differently when you use protocols versus inheritance. With inheritance, the relationship is defined between the superclass and its subclasses. With protocols, it's defined between the protocol and the thing that uses the protocol. If you pass an object of some sort to that thing, Python itself doesn't verify the structure up front when you run the program. If the object lacks a required method, you get a runtime error (an AttributeError) at the point where that method is called.
If you use VSCode together with a tool like Pylance (which I highly recommend), then you are going to get a typing error in your IDE when you try to use an object that doesn't adhere to the Protocol's structure. So you will be able to catch these problems before you run the code.
@@ArjanCodes Ah, that is at least good. I mostly work in the game industry, so not detecting a problem before runtime can mean up to 15 minutes of wasted setup to start up the editor and run through content to get to the test case in some scenarios. We try to keep it as short as possible, but sometimes subsequent events matter a lot for the test case. So it stacks up quickly. Thanks for the answer.
Could the MediaExporterFactory also have just been a function, rather than a dataclass?
In principle yes, but you need some way to keep track of the video and audio exporter classes to use. To achieve this, you could turn it into a function that returns a function, where the first function has the two classes as a parameter and it returns a function that can create instances of those classes.
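A rough sketch of the "function that returns a function" idea from this reply. The exporter classes and the single do_export method are simplified stand-ins, not the code from the video:

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Type


class VideoExporter(Protocol):
    def do_export(self) -> str: ...


class AudioExporter(Protocol):
    def do_export(self) -> str: ...


@dataclass
class MediaExporter:
    video: VideoExporter
    audio: AudioExporter


class H264Exporter:  # hypothetical concrete video exporter
    def do_export(self) -> str:
        return "h264 video"


class AACExporter:  # hypothetical concrete audio exporter
    def do_export(self) -> str:
        return "aac audio"


def media_exporter_factory(
    video_class: Type[VideoExporter], audio_class: Type[AudioExporter]
) -> Callable[[], MediaExporter]:
    # The outer function remembers the two classes in a closure ...
    def create() -> MediaExporter:
        # ... and the inner function instantiates them on demand.
        return MediaExporter(video_class(), audio_class())

    return create
```

Usage would look like create = media_exporter_factory(H264Exporter, AACExporter), followed by create() whenever a fresh MediaExporter is needed.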
I am a bit lost with respect to __call__ ... what is its practical usage/advantage, other than that it makes an instance callable?
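One way to look at it, sketched with simplified stand-in classes (the method names are assumptions, not the video's exact code): an object with __call__ can be handed to anything that expects a plain function, while still carrying inspectable state. Creating the factory gives you a factory instance; calling that instance gives you the product.

```python
from dataclasses import dataclass
from typing import Protocol, Type


class VideoExporter(Protocol):
    def do_export(self) -> str: ...


class AudioExporter(Protocol):
    def do_export(self) -> str: ...


@dataclass
class MediaExporter:
    video: VideoExporter
    audio: AudioExporter


@dataclass
class MediaExporterFactory:
    video_class: Type[VideoExporter]
    audio_class: Type[AudioExporter]

    def __call__(self) -> MediaExporter:
        # Calling the *instance* builds the exporters it was configured with.
        return MediaExporter(self.video_class(), self.audio_class())


class H264Exporter:  # hypothetical concrete exporter
    def do_export(self) -> str:
        return "h264 video"


class AACExporter:  # hypothetical concrete exporter
    def do_export(self) -> str:
        return "aac audio"


factory = MediaExporterFactory(H264Exporter, AACExporter)  # a factory object
media = factory()  # a MediaExporter, produced by __call__
```

Compared with a closure, the dataclass version keeps video_class and audio_class visible as attributes, which can help with debugging and equality checks.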
Why wouldn't you have the factory accept Callable[[], AudioExporter] and Callable[[], VideoExporter] instead of classes? Then you don't need any extra classes to accomplish configuration. You just make a parameterless function that can initialize the exporter objects with the config you want.
You certainly could do that as well. Actually, classes and functions are both callables so this would allow you to choose either classes or functions to do the job.
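A small sketch of that suggestion, assuming a made-up H264Exporter with a bitrate setting. Because a class, a functools.partial, and a lambda are all zero-argument callables here, the consumer doesn't care which one it receives:

```python
from dataclasses import dataclass
from functools import partial
from typing import Callable, Protocol


class VideoExporter(Protocol):
    def do_export(self) -> str: ...


@dataclass
class H264Exporter:  # hypothetical exporter with one configuration parameter
    bitrate: int = 5000

    def do_export(self) -> str:
        return f"h264 at {self.bitrate} kbps"


def export(make_video: Callable[[], VideoExporter]) -> str:
    # Any zero-argument callable that returns a VideoExporter is acceptable.
    return make_video().do_export()


default_result = export(H264Exporter)  # the class itself is a callable
partial_result = export(partial(H264Exporter, bitrate=20000))
lambda_result = export(lambda: H264Exporter(bitrate=800))
```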
Why do you need `Type` at 16:58? Can you just use `video_class: VideoExporter` ?
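For what it's worth, the two annotations mean different things: `video_class: VideoExporter` promises an already-created exporter object, while `Type[VideoExporter]` promises the class, which the factory instantiates later. A minimal illustration with a hypothetical exporter:

```python
from typing import Protocol, Type


class VideoExporter(Protocol):
    def do_export(self) -> str: ...


class H264Exporter:  # hypothetical concrete exporter
    def do_export(self) -> str:
        return "h264 video"


def from_class(video_class: Type[VideoExporter]) -> VideoExporter:
    # Receives the class itself and creates the instance when asked.
    return video_class()


def from_instance(video: VideoExporter) -> VideoExporter:
    # Receives an object that already exists; nothing is instantiated here.
    return video


made_later = from_class(H264Exporter)
existing = H264Exporter()
passed_through = from_instance(existing)
```

With the plain `VideoExporter` annotation, a type checker would be expected to reject passing the class H264Exporter itself, since the class object has no bound do_export to call.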
What? A call for initialization of MediaExporterFactory does not actually return an instance of MediaExporterFactory?
Both VideoExporter and AudioExporter are protocols with the exact same methods and signatures, isn't it the case ? How does python decide which one to match to if you have 2 protocols with the exact same method names and signatures ?
Protocols are matched structurally, in the spirit of duck typing, rather than by name. So actually, it doesn't matter if two protocols overlap or are the same. The only thing that counts is that it defines the interface that is expected. As long as the interface of the class matches what's defined in the protocol that's expected, there's no issue.
In this particular case, I liked that ABC provided stronger typing - you couldn't use an AudioExporter where a VideoExporter was required. But in general I like Python's use of duck typing with dunders and Protocols seem to mirror that well. An alternative in Ruby is the respond_to? method, which is in my opinion the duckiest duck typing :)
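To make the point in this thread concrete, here is a sketch with two structurally identical protocols (simplified to one method, with a made-up WAVExporter). An audio exporter satisfies the video protocol just as well, which is exactly the looser typing mentioned above:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class VideoExporter(Protocol):
    def do_export(self) -> str: ...


@runtime_checkable
class AudioExporter(Protocol):
    def do_export(self) -> str: ...


class WAVExporter:  # hypothetical; intended as an audio exporter
    def do_export(self) -> str:
        return "wav audio"


def export_video(exporter: VideoExporter) -> str:
    # Only the shape of the object matters, not what it was meant to be.
    return exporter.do_export()


wav = WAVExporter()
matches_audio = isinstance(wav, AudioExporter)
matches_video = isinstance(wav, VideoExporter)
result = export_video(wav)
```

If mixing the two up must be prevented, the protocols need to differ in some member, or nominal typing via an ABC is the better fit.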
I'm not entirely happy with the dictionary solution. I had hoped you'd build something more dynamic like a type where you register the exporters that maybe are decorated with some "tag" decorator specifying aliases for that exporter that the container uses to find the requested factory.
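Something along these lines might be what this comment has in mind. It is only a sketch: the register decorator, the aliases, and the describe method are invented for illustration and are not from the video's code.

```python
from typing import Callable, Dict, Protocol, Type


class ExporterFactory(Protocol):
    def describe(self) -> str: ...  # hypothetical stand-in for the real methods


_REGISTRY: Dict[str, Type[ExporterFactory]] = {}


def register(
    *aliases: str,
) -> Callable[[Type[ExporterFactory]], Type[ExporterFactory]]:
    """Class decorator that files a factory class under one or more aliases."""

    def decorator(cls: Type[ExporterFactory]) -> Type[ExporterFactory]:
        for alias in aliases:
            _REGISTRY[alias] = cls
        return cls  # the class itself is left unchanged

    return decorator


@register("low", "fast")
class FastExporter:
    def describe(self) -> str:
        return "fast, low quality"


@register("high")
class HighQualityExporter:
    def describe(self) -> str:
        return "slow, high quality"


def create_factory(alias: str) -> ExporterFactory:
    # Look up the registered class and instantiate it only when requested.
    try:
        return _REGISTRY[alias]()
    except KeyError:
        raise ValueError(f"Unknown export quality: {alias!r}") from None
```

New exporters then register themselves where they are defined, so the lookup table never has to be edited by hand.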
Thank you
You’re welcome!
So a data class is basically a struct?
How I can get the code?
So I ran mypy over the "with_protocol.py" file and got the following error:
with_protocol.py:142: error: Incompatible return value type (got "object", expected "ExporterFactory")
Looking at the "type hints" related docs it appears that the FACTORIES dictionary was missing a type hint for its contained key/values. Adding the following type hint at line 127 corrects the error:
FACTORIES: Mapping[str, ExporterFactory] = {
    "low": FastExporter(),
    "high": HighQualityExporter(),
    "master": MasterQualityExporter(),
}
Personally I don't think using Protocols is a good idea.
With ABCs you have `SubClass(SuperClass)` where it is very clear that `function(argument: SuperClass)` will accept SubClass.
With Protocols you have just `SubClass`; one can only tell that `function(argument: SuperClass)` will accept SubClass by:
1. finding where SuperClass is,
2. looking at what functions it has,
3. and then checking if SubClass has them.
One is basically giving up a lot of maintainability because they are too lazy to type `(SuperClass)`.
To me, Protocols seem more like a "better than nothing, aka `function(argument)`" than a successor to ABC.
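Worth noting for this discussion: nothing stops a class from listing the protocol as a base class to make the relationship explicit. A hedged sketch with made-up names, combining Protocol with abstractmethod so that an incomplete explicit subclass fails at instantiation, much like with an ABC:

```python
from abc import abstractmethod
from typing import Protocol


class Exporter(Protocol):
    @abstractmethod
    def do_export(self) -> str: ...


class ExplicitExporter(Exporter):  # relationship is visible in the class header
    def do_export(self) -> str:
        return "explicit"


class ImplicitExporter:  # no base class; matches by structure only
    def do_export(self) -> str:
        return "implicit"


class IncompleteExporter(Exporter):  # forgot to implement do_export
    pass


def run(exporter: Exporter) -> str:
    return exporter.do_export()


try:
    IncompleteExporter()
    instantiation_error = None
except TypeError as exc:
    # Abstract protocol members block instantiation of explicit subclasses.
    instantiation_error = exc
```

Both ExplicitExporter and ImplicitExporter are accepted by run, so explicit subclassing can be adopted where readability matters without forcing it on third-party classes.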
In the final variant, why even bother with creating class for factory, when it is literally a simple function to get data class instance?
If there is no mutable state to manage, simple combination of frozen data classes and functions may save lines and debugging time imho
Thanks Arjan for this very clear explanation. I find that the class inference of the new Protocol (interface) leads to code that is less readable and more error-prone. I prefer to stick with ABC. Wondering what your thoughts are on this.
I believe protocols are better than anything, it just "export the video even in master quality, as expected" like it was nothing
Hi Arjan,
Thanks for the great videos as always.
In various videos (ie. DS project refactoring video 1), you have been replacing ABC with protocols for interface. If possible, can you please make a video to explain the advantage of doing so?
To me, and judging by some comments I saw below, it seems like the protocol makes the interface less explicit, with no obvious advantage that I can see except that less code is written (no abstract method, no inheritance by the implementation class). I tried to google protocol vs ABC but no one talks about it. Hope you can explain the pros and cons and when to use what. Thanks!
Watch out for this Friday’s video 😉.
@@ArjanCodes Thank you so much for the video, I have watched it, it's excellent 🥰
In your (original and protocol-version of) read_exporter function and FACTORIES dictionary, you could save some memory and a few parentheses by not calling the constructor ("building the factory") before factory type is selected.
Thanks - good point!
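A sketch of the suggestion above, under the assumption that the dictionary can simply hold the classes. The describe method and the get_factory helper are placeholders for illustration, not the video's read_exporter:

```python
from typing import Dict, Protocol, Type


class ExporterFactory(Protocol):
    def describe(self) -> str: ...  # hypothetical stand-in for the real methods


class FastExporter:
    def describe(self) -> str:
        return "fast"


class HighQualityExporter:
    def describe(self) -> str:
        return "high"


class MasterQualityExporter:
    def describe(self) -> str:
        return "master"


# Store the classes (no parentheses), so nothing is built up front.
FACTORIES: Dict[str, Type[ExporterFactory]] = {
    "low": FastExporter,
    "high": HighQualityExporter,
    "master": MasterQualityExporter,
}


def get_factory(quality: str) -> ExporterFactory:
    # Only the selected factory class is instantiated.
    return FACTORIES[quality]()
```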
Cool solution with dataclasses. The only thing I didn't like was the Demeter's Law violation when doing exporter.video.method()
Thanks! I do think the Demeter’s law violation here is acceptable since the class only acts as a container for easy access to the video and audio exporter and nothing else, so its job is to expose these things. But you’re right that in most cases this should be avoided.
@@ArjanCodes Fair enough, Arjan. Keep it up with the good work!
where is the code in this video
Oops, I forgot to add it to the description, it's in there now. Here's the link: github.com/ArjanCodes/2021-pythonic-factory.
@@ArjanCodes Thanks
It would have been better to use __new__ instead of __call__; then you wouldn't have to perform the awkward maneuver of making the instance of MediaExporterFactory callable.
I must admit I'm watching this video for the third time and I still don't understand the Pythonic way you're describing. But I have the impression that some parts of your code (btw. it's a pity there's no link to the code) are completely unnecessary. Would it still work if you removed inheriting from Protocol? Or if you deleted the FastExporter class? Or the get_video_exporter functions? It would really help if you showed a simple diagram of what inherits what, what uses what, what goes where. A kingdom for a big picture, please!
As I think of the options, my unfortunate answer is "it depends".
Protocols are powerful, but can hide the intention. I use them most when parts of an app need to be tested and DI demands that a dependency be externalized.
Other data structures are useful when they do not get in the way of DI.
The beauty of Python is its flexibility in this area - use the data structure that makes the most sense.
And if performance is a significant consideration - TEST!
Then make the most informed design decision.
2 comments:
1. I prefer ABC since it makes code much more readable to other developers.
2. Factories shine when you need to create complex objects, for example when building a use-case object that needs 4 services to do its job. Instead of instantiating 4 objects by hand to inject into a use case, I could use a factory for that.
if you were to type-hint your FACTORIES dict, how would it look? `FACTORIES: dict[str, ExporterFactory]` ?
What's your python config in vscode? Looks so responsive!
not giving you my email
And there's really a lot of copying and pasting the same code over and over again. This factory pattern is not dry...
your channel is awesome, can you recommend some awesome youtube channel to learn java?
I think you "pythonized" the _Abstract Factory_ design pattern and not the _Factory_ design pattern.
It’s quite common to shorten Abstract Factory to Factory. There’s also the Factory Method pattern, which is indeed something else.
I haven't used protocols yet, but they seem pretty inconvenient. To be honest, I still don't see any advantage over ABCs. I mean, it defeats the zen "explicit is better than implicit" in every way, since it is in no way obvious that any of the classes that use the protocol are logically derived from that protocol. The other thing is that it is meant to be used with type hints, which are meant to be optional.
The proximity effect of the mic spoils the voice.
Just watch in 2x you sound like whispering 😑
@typing.runtime_checkable
dt: