Python's "methodcaller" is very useful

Indently

27,618 views


Published on 23 Jan 2025

Comments • 119

@Indently 8 months ago ⁺¹¹
Hello everyone!
As a few of you pointed out, my benchmark was wrong. I didn't create the methodcaller() directly inside the test, which gave it an unfair advantage from the start. I'll try to get people to check my tests next time before uploading the video, sorry about that bit of inconsistency in this video!
@bartolhrg7609 7 months ago
You also didn't run anything, since filter doesn't iterate through the sequence (like a stream in Java) until you call something that does, such as list, or iterate over it yourself.
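The laziness point can be checked directly. This is a minimal sketch, not the code from the video: the `seen` list and the `starts_with_b` helper are names made up for illustration, and the names list is borrowed from a later comment in this thread.

```python
names = ["Bob", "James", "Billy", "Sandra", "Blake"]
seen = []  # hypothetical helper: records every name the predicate is called with


def starts_with_b(name):
    seen.append(name)
    return name.startswith("B")


lazy = filter(starts_with_b, names)  # builds an iterator; the predicate has not run yet
calls_before = len(seen)

result = list(lazy)  # consuming the iterator is what actually calls the predicate
calls_after = len(seen)

print(calls_before, calls_after, result)  # 0 5 ['Bob', 'Billy', 'Blake']
```

So a benchmark that only calls filter(...) times the construction of the filter object, not the filtering itself.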
@callbettersaul 8 months ago ⁺⁴²
I tried your exact tests and got 0.09 for methodcaller and 0.15 for lambda.
However, when I refactored the tests to make them more fair (move lambda declaration outside as well), they both resulted in 0.09.
@coladock 8 months ago ⁺²
Compiling to bytecode and comparing the instructions is also a good option. I'm not familiar with it, though 😊
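For anyone curious, here is a rough sketch of that bytecode comparison using the standard dis module. The two statement strings are assumptions modelled on the benchmark discussed in this thread, not the exact code from the video, and the opcode name is a CPython detail that may differ between versions.

```python
import dis

# Assumed shapes of the two timed statements (hypothetical, for illustration).
inline_stmt = "filter(lambda name: name.startswith('B'), names)"
hoisted_stmt = "filter(starts_with_b, names)"


def opnames(source):
    """Return the opcode names of the top-level code object for an expression."""
    code = compile(source, "<benchmark>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]


inline_ops = opnames(inline_stmt)
hoisted_ops = opnames(hoisted_stmt)

# The inline version builds a new function object every time the statement runs.
print("MAKE_FUNCTION" in inline_ops, "MAKE_FUNCTION" in hoisted_ops)
```

On CPython versions that use the MAKE_FUNCTION opcode, only the inline statement contains it, which matches the "lambda created on every iteration" explanation given elsewhere in the comments.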
@__wouks__ 8 months ago ⁺⁹
As far as I know, "filter" is lazy, so nothing actually ran. Those filter calls should also have been wrapped in list() in order to actually execute the filtering.
@spotlight-kyd 8 months ago ⁺⁶
Yeah, spotted that same test difference immediately and did the same refactoring and used 'sorted' instead of 'filter'. Timings for methodcaller vs lambda are virtually identical.
@jonathanfeinberg7304 8 months ago ⁺⁵²
Wouldn't a list comprehension be shorter and a bit more readable? Like
[name for name in names if name.startswith("b")]
No methodcaller, magic strings, filter, nor list casting. Probably faster too since fewer distinct operations take place.
For the sort example I agree that methodcaller could be useful.
@fcolecumberri 8 months ago ⁺⁵
Also since startswith in methodcaller is a string, it can't be validated by anything.
@oommggdude 8 months ago ⁺⁵
The point of this video is not to show the easiest way to perform the specific task of filtering names starting with B. It's simply a demo of methodcaller....
@dandyddz 8 months ago ⁺²
I think, "Probably faster too since fewer distinct operations take place" is right for 3.11+ , but I am not sure about the older versions of python. Here are the code samples I ran with python 3.11 with the results:
1:
from timeit import repeat
list_to_filter = [
'Why', 'didnt', 'you', 'put', '"from operator import methodcaller"',
'and', 'the', 'code', 'related', 'to', 'the', 'initialization', 'of', 'the', '"methodcaller"', 'object',
'there?', 'I', 'aint', 'love', 'that'
]
warm_up_code = 'for i in range(10): pass'
lambda_code = '''
list(filter(lambda x: x.startswith('a'), list_to_filter))
'''
operator_code = '''
from operator import methodcaller
list(filter(methodcaller('startswith', 'a'), list_to_filter))
'''
list_comprehension_code = '''
[s for s in list_to_filter if s.startswith('a')]
'''
g = {'list_to_filter': list_to_filter}
warm_up_r = min(repeat(warm_up_code))
print(f'{warm_up_r = :.5f}')
operator_r = min(repeat(operator_code, globals=g))
print(f'{operator_r = :.5f}')
lc_r = min(repeat(list_comprehension_code, globals=g))
print(f'{lc_r = :.5f}')
lambda_r = min(repeat(lambda_code, globals=g))
print(f'{lambda_r = :.5f}')
warm_up_r = 0.20486
operator_r = 3.08152
lc_r = 1.82864
lambda_r = 2.47760
2:
from timeit import repeat
list_to_filter = ['asd', 'asdmkalsdas', 'lsadmasd', 'klasd', 'amsd']
warm_up_code = 'for i in range(10): pass'
lambda_code = '''
list(filter(lambda x: x.startswith('a'), list_to_filter))
'''
operator_code = '''
from operator import methodcaller
list(filter(methodcaller('startswith', 'a'), list_to_filter))
'''
list_comprehension_code = '''
[s for s in list_to_filter if s.startswith('a')]
'''
g = {'list_to_filter': list_to_filter}
warm_up_r = min(repeat(warm_up_code))
print(f'{warm_up_r = :.5f}')
operator_r = min(repeat(operator_code, globals=g))
print(f'{operator_r = :.5f}')
lc_r = min(repeat(list_comprehension_code, globals=g))
print(f'{lc_r = :.5f}')
lambda_r = min(repeat(lambda_code, globals=g))
print(f'{lambda_r = :.5f}')
warm_up_r = 0.20873
operator_r = 1.51674
lc_r = 0.60170
lambda_r = 0.79107
@callbettersaul 8 months ago ⁺¹
@fcolecumberri No one is forcing you to pass the method as a magic string.
@dandyddz 8 months ago ⁺¹
@callbettersaul What? What do you think he is talking about? I guess, about the operator.methodcaller. And when you run this (3.11):
import operator
m = operator.methodcaller(str.startswith, 'A')
list(filter(m, ['asdsd', 'ASd']))
You get:
Traceback (most recent call last):
File "... .py", line 3, in <module>
m = operator.methodcaller(str.startswith, 'A')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: method name must be a string
@ZimoNitrome 8 months ago ⁺¹
someone must have already asked but
what about the pythonic way of list comprehension?
filtered = [n for n in names if n.startswith('B')]
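To make the comparison concrete, here is a small sketch showing that the three spellings discussed in the comments produce the same list. The names list is borrowed from a later comment; the sorted() call at the end is a hypothetical stand-in for the "sort example" mentioned above, not the one from the video.

```python
from operator import methodcaller

names = ["Bob", "James", "Billy", "Sandra", "Blake"]

with_methodcaller = list(filter(methodcaller("startswith", "B"), names))
with_lambda = list(filter(lambda name: name.startswith("B"), names))
with_comprehension = [name for name in names if name.startswith("B")]

print(with_methodcaller == with_lambda == with_comprehension)  # True

# Hypothetical sort example: methodcaller as a key function.
by_lowercase = sorted(["bob", "Alice", "carol"], key=methodcaller("lower"))
print(by_lowercase)  # ['Alice', 'bob', 'carol']
```

Which one reads best is a matter of taste; the comprehension avoids both the string method name and the extra list() call.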
@InforSpirit 8 months ago ⁺⁸
There are two ways to reference a class method in Python:
a: str
a.startswith()
and
str.startswith(a)
The latter is the form that an IDE can understand inside lambdas and type-check properly.
lambda a: str.startswith(a, 'B')
A thousand times cleaner and less error-prone than using methodcaller and strings, which is itself just a complex wrapper over the getattr() function.
@callbettersaul 8 months ago
or you could use str.startswith.__name__ with methodcaller to eliminate the magic strings and potential errors
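A short sketch of both points from this part of the thread: passing the method object itself fails (as in the traceback above), while its __name__ attribute gives the string that methodcaller expects. The variable names are made up for illustration.

```python
from operator import methodcaller

# The method descriptor carries its own name, so the string is not typed by hand.
method_name = str.startswith.__name__
starts_with_b = methodcaller(method_name, "B")
print(method_name, starts_with_b("Bob"))  # startswith True

# Passing the method object instead of its name is rejected.
error = None
try:
    methodcaller(str.startswith, "B")
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```

This catches a misspelled method name when the methodcaller is created, since str.startwith would raise AttributeError there. It does not give a type checker any information about the objects the methodcaller is later applied to.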
@Qbe_Root 8 months ago ⁺³
def methodcaller(methodname: str, *args, **kwargs):
    return lambda obj: getattr(obj, methodname)(*args, **kwargs)
@tacticalassaultanteater9678 8 months ago ⁺⁵²
It's just a lambda, jesus.
@gorandev 8 months ago ⁺³
3:18
@tacticalassaultanteater9678 8 months ago ⁺¹¹
@gorandev 1. No, once you know what a lambda is, a lambda is not more confusing than a custom std function that only python has, and of which python has so many that even seasoned developers don't know them all.
2. The performance difference arises from the lambda being re-instantiated despite none of its bindings changing. Promoting lambda construction to the scope of the lowest bound variable is an easy optimization that isn't implemented in Python because if you cared about that little difference you should probably not be using Python.
@tengun 8 months ago ⁺²
Just assign the lambda to a variable before using. It's gonna be the same performance.
@sadhlife 8 months ago ⁺³
@tacticalassaultanteater9678 It's not just that: methodcaller (and partial and itemgetter and others) are intentionally early binding while lambda is intentionally late binding, due to conventions in other languages and how people expected them to behave. And having two syntaxes for two separate behaviours is also kinda nice, as it lets you choose which one you need for your use case.
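The early versus late binding difference can be seen with a variable that is rebound after both callables are created. This is a minimal sketch with made-up names (`binding_demo`, `prefix`), and it only illustrates how the arguments are bound.

```python
from operator import methodcaller


def binding_demo():
    prefix = "B"
    by_lambda = lambda name: name.startswith(prefix)      # looks up prefix when called
    by_methodcaller = methodcaller("startswith", prefix)  # stores the value "B" now
    prefix = "J"  # rebind after both callables exist
    return by_lambda("Bob"), by_methodcaller("Bob")


lambda_result, methodcaller_result = binding_demo()
print(lambda_result, methodcaller_result)  # False True
```

The lambda sees the new value "J", while the methodcaller keeps the "B" it was given at creation time.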
@Master_of_Chess_Shorts 8 months ago ⁺¹
thanks again for this short, I like your approach for digging up the Python modules that are not well known or used
@iso_2013 7 months ago
The way in which you are using the lambda function is the reason it's slower -- every single time that line is called, it's creating a new function. That takes time.
If instead you did a similar thing to line 5 of your timing file, like this:
starts_with_b = lambda v: v.startswith('B')
Then the performance would probably have been identical or near identical.
@TheLolle97 8 months ago ⁺³¹
Why not use partial(str.startswith, "B") ? Would be easier to understand IMO and also not throw static code analysis out the window by using strings to access members...
@joshix833 8 months ago ⁺³
partial(str.startswith, 'B') does something different to methodcaller. That binds 'B' to self and because of that checks if 'B' startswith the other strings. That's not what you want...
@callbettersaul 8 months ago ⁺⁴
Easier to understand, perhaps. The only problem is that IT DOESN'T WORK. partial(str.startswith, "B") results in "B" being the self parameter (like in "B".startswith(...)), but we need it to be the second argument. You can't fill positional arguments the other way around with partial.
Also, you can easily use methodcaller without using magic strings as well (str.startswith.__name__).
@TheLolle97 8 months ago ⁺²
Ahh sorry I meant partial(str.startswith, value="B"). That should work.
@TheLolle97 8 months ago
Point is you can always achieve the same thing with partial, because methods are just functions with a bound self parameter in Python.
@TheLolle97 8 months ago ⁺²
Nevermind. I was wrong, sorry. functools.partial does not work for accessing positional args by name, which quite surprises me tbh. I still prefer the lambda solution though because attribute access via strings is the enemy 🙃
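The back-and-forth about partial in this thread can be reproduced in a few lines. This is a sketch of what each attempt does on CPython; the variable names are made up for illustration.

```python
from functools import partial

# Attempt 1: "B" is bound as self, so this checks "B".startswith(<argument>).
bound_self = partial(str.startswith, "B")
print(bound_self("Bob"))  # False, because "B" does not start with "Bob"

# Attempt 2: str.startswith does not accept keyword arguments, so the call fails.
keyword_error = None
try:
    partial(str.startswith, value="B")("Bob")
except TypeError as exc:
    keyword_error = exc
print(type(keyword_error).__name__)  # TypeError

# The lambda form places the arguments explicitly and avoids the string lookup.
starts_with_b = lambda s: str.startswith(s, "B")
print(starts_with_b("Bob"))  # True
```

The limitation is specific to filling a later positional-only parameter; partial works fine when the arguments can be supplied in signature order or by keyword.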
@felicytatomaszewska 8 months ago ⁺²
It's always good to know what exists in a programming language. You never know when you may need it.
@iestyn129 8 months ago ⁺¹¹
Hopefully in the future you’ll be able to use a method descriptor instead of just a string of the function name. Passing str.startswith instead of ‘startswith’ makes more sense to me; it would also ensure that you are using the correct type as well.
@joshix833 8 months ago
That would just be like partial but without providing the first argument
@DuncanBooth 8 months ago
For that I would use `partial(str.startswith, 'B')` rather than methodcaller. The only times you would need methodcaller would be if the list had mixed types that shared the same method (even if they have a common base class you need late binding to get the correct method for each element) or perhaps if the method name is actually the result of an expression (though even there with a single type in the list you can use partial with getattr).
@iestyn129 8 months ago
@DuncanBooth ah I see, you learn something new every day
@joshix833 8 months ago
@DuncanBooth partial(str.startswith, 'B') does something different to methodcaller. That binds 'B' to self and because of that checks if 'B' startswith the other strings. That's not what you want...
@callbettersaul 8 months ago ⁺²
You can pass in str.startswith.__name__ to methodcaller and it will work.
@herberttlbd 8 months ago ⁺³
Priming the interpreter in benchmarking is meant to account for cache effects. I don't know how timeit.repeat works, but since it is being given a string, I'd imagine it is reparsing on every run. Given that the time difference between the two is close to the size difference, I'd suggest not all variables are being addressed.
@rondamon4408 8 months ago ⁺¹
Can I use methodcaller to call a lambda function?
@twelvethis3979 8 months ago ⁺³
I'm not convinced to use methodcaller due to would-be performance improvements. 99 percent of the time Python's standard performance will be good enough. I was wondering, though: could you give an example where it has advantages to use methodcaller from a software engineering perspective?
@EyalShalev 8 months ago ⁺²
IMHO if you care about a 30ms performance improvement, you should switch to a more performant algorithm/language.
Either by rewriting your application or by using C bindings for specific operations.
@blahblah49000 8 months ago
And that's 30 ms over 1 million iterations, too.
@fyrweorm 8 months ago
I'm new to Python. Can someone explain what the warm-up is and what it does? Or point me to what I should search for in order to find out more?
@thisoldproperty 8 months ago
I sometimes wonder if there is any caching going on in the background when I perform the timeit test on something. Either at the programming or hardware layers. It'd be interesting to prove if this is, or is not the case.
Another great short, to-the-point, no-nonsense video. Thank you.
@Sinke_100 8 months ago ⁺¹
I think it's cool to sometimes use built-ins. For example, I used permutations in a recent project and it is much more optimised than writing your own. For this, though, it's easy to write an alternative, but it's nice to know that it has slightly better performance, even though I think the difference is minimal compared to permutations. I guess it depends on the project and what you use it for. Personally, since not everyone uses this, I think a lambda could be more readable.
@edivaldoluisbonfim3194 8 months ago
What IDE are you using?
@oommggdude 8 months ago ⁺¹
PyCharm
@timelschner8451 8 months ago
How does methodcaller know that it should use string methods? Or does it only work with strings?
@callbettersaul 8 months ago ⁺¹
It doesn't know what methods to use. It probably uses the built-in getattr function. getattr('Bob', 'startswith')('B') is the same as 'Bob'.startswith('B').
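To illustrate the answer, here is a rough pure-Python stand-in built on getattr. It is an assumption about the behaviour, not the actual operator implementation, and `methodcaller_sketch` is a hypothetical name. Because the method is looked up by name on each object, it works for any type that has a method with that name, not only strings.

```python
from operator import methodcaller


def methodcaller_sketch(name, *args, **kwargs):
    """Hypothetical stand-in: look the method up by name on each object, then call it."""
    def call(obj):
        return getattr(obj, name)(*args, **kwargs)
    return call


count_ones = methodcaller("count", 1)
sketch_count_ones = methodcaller_sketch("count", 1)

mixed = [[1, 1, 2], (1, 2, 3)]  # a list and a tuple both have a count() method
real_counts = [count_ones(item) for item in mixed]
sketch_counts = [sketch_count_ones(item) for item in mixed]
print(real_counts, sketch_counts)  # [2, 1] [2, 1]
```

If an object lacks the named method, the lookup fails with AttributeError at call time, which is the trade-off of referring to methods by string.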
@cmcintyre3600 8 months ago ⁺¹
Wait, what does “warm up the interpreter” mean?
@Ankara_pharao 8 months ago ⁺³
This is foreplay for programmers.
@emman100 8 months ago ⁺¹
@Ankara_pharao 🤣🤣
@frankfrei6848 8 months ago ⁺¹
Hiding functions in strings. Just had a flashback to the Eighties when we could run functions from a string in the ZX Spectrum's BASIC. Beautiful! 😂 Always irked me that C64 BASIC didn't have that.
@photon6156 8 months ago ⁺²
The testing part was... bad.
It doesn't evaluate filter, it just creates the filter object, so all the difference is in the compilation time (not the c-like compilation, python just creates byte code from function). So the main reason, why the second is much slower is because it has to compile lambda. And that's why one should avoid redeclaring the same functions/lambdas in a loop.
If you pass lambda as global, it should show that time is the same. Evaluating `filter` also should be the same.
Warming up the interpreter seems kinda unneeded, but I wouldn't be surprised if it yields more stable results.
About the method itself: I see it kind of negatively, mostly because the IDE won't hint the method name (since it's a string) and it also won't deduce the result type (and type hinting greatly reduces the number of bugs). Otherwise, the method seems convenient, though.
@Indently 8 months ago
It was a huge oversight on my part, I need to make sure people check my tests next time before I share them. Thanks for bringing it up :)
@Indently 8 months ago
I heard a lot of mixed opinions on warming up the interpreter. On my computer it leads to more consistent results, on my friend's computer it doesn't really do anything, so at the end of the day I just thought, it can't hurt to have it. Maybe it varies with Python version, implementation, computer specs, etc.
@tengun 8 months ago
Exactly.
@glensmith491 8 months ago
There are other ways to do this, but this is the easiest one to understand. For example, I had to be told what partial does, but I pretty much knew what methodcaller did just by looking at the syntax.
@Coding_Fun22 8 months ago ⁺²
Hi! I got 'Python: The Professional Guide For Beginners (2024 Edition)' and 'The Complete Guide To Mastering Python In 2024' and I am very excited to start my journey, but which of the two should I start with?
@Indently 8 months ago ⁺³
I would start with: The Professional Guide For Beginners
More realistic and better explanations since it's my most recent course. You don't need both though, so feel free to refund The Complete Guide To Mastering Python In 2024, since most of it repeats the same content :)
@Coding_Fun22 8 months ago
@Indently :D
@blahblah49000 8 months ago ⁺¹
Protip: Just dive in. Try one and then switch if it isn't helping you enough. The attitude of "ask, wait for answer, then try" will hold you back in programming and learning.
@joshix833 8 months ago ⁺⁴
I think your benchmark is flawed. You create the lambda in the benchmark and the methodcaller outside of it.
Additionally, taking the min is probably a bad idea. min, max, mean and average are all good to know. Why didn't you use timeit and directly pass function objects that do the things, instead of code as a string?
@callbettersaul 8 months ago ⁺⁴
Yup, they're flawed. I tried the benchmark exactly like he did and results were 0.09 for methodcaller and 0.15 for lambda, but upon declaring that lambda outside as well, the lambda time dropped to 0.09.
@Indently 8 months ago ⁺¹
Taking the average is a common misconception for benchmarks, it doesn't tell you the fastest possible time, but generalises it.
@Indently 8 months ago ⁺²
but you two are right, AGH, I should have created the method caller directly inside the test. Midnight coding got me once again.
@joshix833 8 months ago ⁺²
@Indently yes only average isn't enough, but only min also isn't.
@SimplyWondering 8 months ago ⁺¹
@Indently In future I think with testing it would also be good to use a more advanced profiler. I'm a huge fan of the timeit function; it will just give you a better variety of statistics and prevents some common mistakes in profiling.
PS: do you have a Discord? I'd be interested in joining.
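Putting the suggestions from this thread together, a fairer benchmark might look like the sketch below. It is one possible setup, not the video's code: both predicates are created once outside the timed functions, list() forces filter to run, callables are passed to timeit.repeat instead of strings, and both the minimum and the median are reported. The function names and the number/rounds values are arbitrary choices.

```python
from operator import methodcaller
from statistics import median
from timeit import repeat

names = ["Bob", "James", "Billy", "Sandra", "Blake"]

# Both predicates are built once, outside the timed code, so neither gets a head start.
starts_with_b_methodcaller = methodcaller("startswith", "B")
starts_with_b_lambda = lambda name: name.startswith("B")


def run_methodcaller():
    return list(filter(starts_with_b_methodcaller, names))  # list() consumes the lazy filter


def run_lambda():
    return list(filter(starts_with_b_lambda, names))


def bench(func, number=20_000, rounds=5):
    """Time func over several rounds; return (fastest round, median round) in seconds."""
    times = repeat(func, number=number, repeat=rounds)
    return min(times), median(times)


for label, func in (("methodcaller", run_methodcaller), ("lambda", run_lambda)):
    fastest, typical = bench(func)
    print(f"{label:>12}: min={fastest:.4f}s  median={typical:.4f}s")
```

The absolute numbers depend on the machine and Python version, so none are shown here; several commenters report near-identical timings for the two once the setup is equalised.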
@gardnmi 8 months ago
I could see this as a better alternative to exec. Definitely didn't know this existed.
@callbettersaul 8 months ago ⁺²
First of all, I've never seen a situation, where there isn't an alternative to exec. Secondly, this is rather an alternative to lambda and not necessarily a better one.
@oommggdude 8 months ago
For the people saying you can just use partial: it won't work on objects like the strings in this example. Plain functions are suitable for partial; functions called on an instance are suited to methodcaller.
@callbettersaul 8 months ago ⁺¹
What are you talking about? Partial works perfectly with objects and their methods. It just won't work in this case, because partial doesn't allow to specify the position of a positional argument. Like for example, you can't use partial to create an equivalent function to lambda x: str.startswith(x, 'b').
@oommggdude 8 months ago
@callbettersaul Exactly, like this example... Partial isn't great for funcs designed to be called on instances. The issue lies in whether the func supports using keyword arguments.
@callbettersaul 8 months ago
@oommggdude Partial is great for any function or method, for which you need to provide arguments in the same order as the function's/method's signature declares the params. Or if it allows keyword arguments. But whether it's a function or method is 100% irrelevant. A method bound to an instance is just another function, but with an instance as the first positional parameter.
@bradentoone895 8 months ago ⁺²
You don’t have to “warm up” the interpreter to my knowledge - the python interpreter only has a startup time delay that varies, not the actual interpreter speed.
@blahblah49000 8 months ago
AFAICT from timeit's docs, it handles that anyway. timeit.timeit is probably what should be generally used.
@samanazadi8952 8 months ago
I think the average or median of the times taken is a better criterion.
@FighterAceee94 8 months ago
Do you ever worry about these kinds of "syntactic sugar" functions being changed or deprecated in future Python releases? It could also get out of hand when using a lot of different ones, and then having to check each one in release notes when updating your project Python version.
@Indently 8 months ago ⁺¹
It's the operator module, it will probably get deprecated when Python becomes obsolete.
@tubero911 8 months ago
Another feature that programmers can misuse to create injection flaws in their code?
@longphan6548 8 months ago ⁺³
4:10 Was just about to say "So basically a glorified lambda 😂".
@joshix833 8 months ago
3:32 the annotation of the filtered variable is really bad. It loses all the information about the contents of the iterable. Annotating it as collections.abc.Iterable[str] would be better.
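A small sketch of the annotation suggested here. The exact annotation used in the video is not reproduced; the variable names and the names list are assumptions for illustration.

```python
from collections.abc import Iterable
from operator import methodcaller

names: list[str] = ["Bob", "James", "Billy", "Sandra", "Blake"]

# Iterable[str] keeps the element type visible to readers and tools.
filtered: Iterable[str] = filter(methodcaller("startswith", "B"), names)
result: list[str] = list(filtered)
print(result)  # ['Bob', 'Billy', 'Blake']
```

Since filter returns an iterator, collections.abc.Iterator[str] would be an equally valid and slightly more specific annotation.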
@bozo_456 8 months ago
What software
@nouche 8 months ago ⁺¹
Readability counts.
@qraxiss 8 months ago
I think lambda is better for SOLID principles
@rishiraj2548 8 months ago
Thanks 👍👍
@Nipppppppppp 8 months ago ⁺¹
This is actually pretty neat
@SusanAmberBruce 8 months ago
Very interesting
@dandyddz 8 months ago
Ran this with 3.11:
from timeit import repeat
list_to_filter = [
'Why', 'didnt', 'you', 'put', '"from operator import methodcaller"',
'and', 'the', 'code', 'related', 'to', 'the', 'initialization', 'of', 'the', '"methodcaller"', 'object',
'there?', 'I', 'aint', 'love', 'that'
]
warm_up_code = 'for i in range(10): pass'
lambda_code = '''
list(filter(lambda x: x.startswith('a'), list_to_filter))
'''
operator_code = '''
from operator import methodcaller
list(filter(methodcaller('startswith', 'a'), list_to_filter))
'''
list_comprehension_code = '''
[s for s in list_to_filter if s.startswith('a')]
'''
g = {'list_to_filter': list_to_filter}
warm_up_r = min(repeat(warm_up_code))
print(f'{warm_up_r = :.5f}')
operator_r = min(repeat(operator_code, globals=g))
print(f'{operator_r = :.5f}')
lc_r = min(repeat(list_comprehension_code, globals=g))
print(f'{lc_r = :.5f}')
lambda_r = min(repeat(lambda_code, globals=g))
print(f'{lambda_r = :.5f}')
Here is what I got:
warm_up_r = 0.20486
operator_r = 3.08152
lc_r = 1.82864
lambda_r = 2.47760
@saadzahem 8 months ago
import operator as op
@tinahalder8416 8 months ago ⁺¹
Just use lambda....
@Indently 8 months ago ⁺¹
If you watch the entire video, I cover that part :)
@emman100 8 months ago ⁺¹
🤣🤣😂😂
@iestyn129 8 months ago
python 🤠
@godwinv4838 8 months ago
hello sir
@bloody_albatross 8 months ago
Ok, I fixed the bugs in the test code (noticed by people in the comment section) that caused it to do nothing *and* to be unfair (constructing methodcaller outside of the test but lambda inside, filter() is lazy and thus the function was *never* called) and got these numbers (Python 3.9.9):
method caller: 1.045
lambda: 1.063
With names *= 10:
method caller: 6.386
lambda: 6.529
I.e. only a very minimal difference.
Here's the full code:
from collections import deque
from operator import methodcaller
from timeit import repeat
names = ['Bob', 'James', 'Billy', 'Sandra', 'Blake']
def consume(iterator):
    deque(iterator, maxlen=0)
starts_with_b_method_caller = methodcaller('startswith', 'B')
starts_with_b_lambda = lambda name: name.startswith('B')
warm_up = '''
consume(range(3))
'''
method_caller_test = '''
consume(filter(starts_with_b_method_caller, names))
'''
lambda_test = '''
consume(filter(starts_with_b_lambda, names))
'''
warm_up_time = min(repeat(warm_up, globals=globals()))
method_caller_time = min(repeat(method_caller_test, globals=globals()))
lambda_time = min(repeat(lambda_test, globals=globals()))
print(f'method caller: {method_caller_time:8.3f}')
print(f'lambda: {lambda_time:8.3f}')
