it makes sense when you're used to writing math things right-to-left, M_fi, which would be an amplitude to go from the initial state (i) to the final state (f)
Worth mentioning for no. 8: non-empty collections are considered truthy, plus the fact that the "and" operator returns the last evaluated value if all values are truthy.
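A quick sketch of that point, using made-up values: "and" and "or" hand back one of their operands rather than a bool, and they stop evaluating as soon as the answer is known.

```python
# `or` returns the first truthy operand, or the last operand if none is truthy.
name = "" or "anonymous"
nothing = 0 or [] or None

# `and` returns the first falsy operand, or the last operand if all are truthy.
last = "spam" and [1, 2] and 42
first_falsy = "spam" and [] and 42

# Short-circuiting: the division is never evaluated because 0 is already falsy.
guarded = 0 and 1 / 0

print(name, nothing, last, first_falsy, guarded)
```

Non-empty collections such as [1, 2] count as truthy here, which is why the chain keeps going until it reaches 42.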
I experimented with concStringPlus and concStringIO and got a result that, at first, contradicted your assumption, since for small and non-chaotic strings the plus method is faster. So I added some complexity to the strings and the numbers got crazy for the plus, jumping from 0.55s to 1.61s, while remaining the same for the StringIO func (near 0.61s for 100000 runs with 100 operations each). My conclusion is: if you're incrementing your string with very small parts, stick to +=. If it's random-sized or chaotic, StringIO will keep up the excellent work throughout.
CPython optimizes str+= if there are no other references to the left hand string, but that relies on it being able to reallocate the string data; if your other code was allocating new objects, the risk that reallocation requires a copy goes up drastically. StringIO can reduce these reallocations by keeping a margin for growth, much like list does. Your small and non-chaotic strings may have been previously allocated (e.g. interned), making the ideal case for str+= to reallocate without copying.
Totally agree with all of these, but the 2 that irk me the most are altering the iterator and using eval. There's almost always a better way to do it than to use eval, and I think in the age of computer generated code it's especially important to understand its dangers.
This is Good Stuff! 17 years of assembler followed by 20+ years of C have left me unprepared for these newfangled gadgets. Thanks ever so much for helping me see through the blur ....
6:30 I actually faced this recently with a project I’m working on. Do they both do the same thing under the hood (same byte code) or is one preferred over the other for different reasons? (In reference to using functional functions like map or filter over comprehensions).
been programming in python for years and consider myself somewhere between intermediate and advanced, and even your beginner-oriented videos have something new to me in them. there are so many beginner python creators out there and i'm thankful that it's so accessible now, but i definitely appreciate the higher-level "niche" you tend to make videos in, even when you claim it's for noobs. also #pycharm gimme that license mDaddy
As far as I know, something was changed in string concatenation. Concatenation of 10 million elements with StringIO beats basic string concatenation just by 10%
#5 is slightly misleading, since it depends on the python implementation. The statement is true for PyPy, but CPython cleverly uses a reference count of one to make the string mutable. The proposed fix has the exact same performance in CPython (replace 100 with 10000000 to get a proper measurement).
Is there still any point in using NamedTuples now that frozen dataclasses exist? I guess the one case I can think of is if you have a particular reason why your class needs to be a subtype of tuple, but other than that, dataclasses offer some very nice features such as being able to easily convert to a dictionary with the built-in asdict function, which makes dumping to JSON much easier.
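To make the comparison concrete, here is a minimal sketch. PointDC and PointNT are hypothetical names invented for this example; everything else is from the standard library.

```python
import json
from dataclasses import FrozenInstanceError, asdict, dataclass
from typing import NamedTuple


@dataclass(frozen=True)
class PointDC:  # hypothetical frozen dataclass
    x: int
    y: int


class PointNT(NamedTuple):  # hypothetical NamedTuple with the same fields
    x: int
    y: int


dc = PointDC(1, 2)
nt = PointNT(1, 2)

# Both convert to a dict; the dataclass uses asdict(), the NamedTuple uses _asdict().
dc_json = json.dumps(asdict(dc))
nt_json = json.dumps(nt._asdict())

# Dumped directly, a NamedTuple is just a tuple, so the field names are lost.
nt_plain_json = json.dumps(nt)

# Only the NamedTuple is an actual tuple, so only it supports unpacking.
x, y = nt

# A frozen dataclass rejects attribute assignment.
try:
    dc.x = 10
    frozen = False
except FrozenInstanceError:
    frozen = True
```

So the tuple subtype case mentioned above shows up in practice as unpacking, indexing and anything else that expects a real tuple.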
2:59 Weird to promote pydantic as a parsing library. It's just a data mapper with some poorly designed validation features.
For iterating and deleting, I often just iterate over a copy of it: d.copy().items(). That way I don't need the second loop, and the copied object will be deleted after the loop anyway, so memory usage should be the same.
I disagree on number 10: You are right in that applying both // and % is less performant than divmod, but I think the former is more readable (and actually could as well as should better be caught by the parser during byte code creation; e.g. by looking from // or % a few operations ahead if the other operation is called from the exact same arguments, too). I do not understand the coexistence of //, % together with divmod in python without any imports necessary. I think this violates the python zen ...
I once was creating a parser in python and, naturally, was using bs4 for finding stuff. This, however, resulted in a bottleneck, and replacing bs4 with regex made a 12x speed improvement, from a couple of seconds for each operation to a couple dozen milliseconds.
started using dataclasses because I rely heavily on sending dicts around as messages, and then I immediately got the issue of exporting/importing to json. Kind of an annoyance that whole pickle business. So why is using raw dicts a noobie practice btw? You just said not to do it, but not why.
#11 I don't actually like @property, when someone creates a setter it usually has a side effect. If I'm looking at where it is used, `obj.x = something` it is really surprising when I find out that this does something more than just changing the x variable. But if I see an `obj.set_x(something)` that immediately raises the alarm: `set_x()` probably does something more - why would anyone bother to write this function otherwise? I suppose if your setter really has no side effect it is safe to use property, but in my experience that is rare (maybe you want to log where it changed or something like that, it's fine). Usually however, it re-computes related values or notifies other objects that a value has been updated - you should be aware of these.
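A small sketch of the concern, assuming a hypothetical Rectangle class that keeps a cached area: the property setter and the explicit method do the same work, but only the method call hints at it when you read the call site.

```python
class Rectangle:  # hypothetical class, for illustration only
    def __init__(self, width, height):
        self._width = width
        self._height = height
        self.area = width * height

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        self._width = value
        # Side effect hidden behind a plain-looking assignment.
        self.area = self._width * self._height

    def set_height(self, value):
        self._height = value
        # Same side effect, but the call site reads as "this does something".
        self.area = self._width * self._height


r = Rectangle(2, 3)
r.width = 10          # looks like a simple attribute write, recomputes area
area_after_property = r.area
r.set_height(5)       # visibly a method call
area_after_method = r.area
```

Which of the two reads better is a matter of taste; the sketch only shows that both spellings can carry the same hidden work.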
Hey James, how do you feel about using `or` to give a value to an attribute if it is not set yet, like `self.some_dependency = dependency or DefaultDependency()` in the constructor when the parameter `dependency` defaults to `None`? Or using it in a property to lazy load and cache an expensive object: `self._connection = self._connection or DatabaseConnection()`?
Small annotation to 10: The reason you do this is that most processors use a divmod that is built in already. That means the processor will most likely perform the operation twice only to discard the mod in the first go around and the result in the second.
I disagree with #14. You shouldn't be using lambdas as arguments to map and filter, but functions not created just for the purpose of this statement are fine. A lot of people don't know you can pass goodies from the `operator` module(such as `itemgetter`) and unbound methods(such as `str.upper`) as the first argument.
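For anyone who hasn't seen this, a short sketch with made-up data: itemgetter from the operator module and the unbound method str.upper can both be passed straight to map, and filter accepts None to keep only truthy items.

```python
from operator import itemgetter

words = ["spam", "eggs"]
pairs = [("a", 3), ("b", 1), ("c", 2)]

# An unbound method works as the function argument: str.upper(word) for each word.
shouted = list(map(str.upper, words))

# itemgetter(1) builds a callable that returns item[1].
counts = list(map(itemgetter(1), pairs))

# filter(None, ...) keeps the truthy items, no lambda required.
non_empty = list(filter(None, ["x", "", "y"]))
```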
On #5, searching for answers gives some mixed messages, where the immediate evidence is that += is faster than the alternatives in most cases due to some python interpreter trickery, but in testing it's pretty clear that io.StringIO is faster. For 50,000 appends I get 0.00428s with += using perf_counter, and with StringIO I get 0.00299s. That's about 30% less time.
Not very bad. Path objects include the context that they are paths. For instance, path / subpath instead of os.path.join(path, subpath), and you can read an entire file with path.read_text(). pathlib is implemented using os.path and friends (path._flavour links to os.path, posixpath or ntpath).
"Concatenating strings with plus" I doubt it gives a meaningful performance improvement if you have string concatenation once or twice in a program. Sure, when I will be writing extra long strings I might actually use this, otherwise it's just unnecessary code obfuscation.
A "nooby" numpy habit is also doing inp = np.array(inp) at the beginning of a function to ensure that the input is numpy for the rest of the function. This is bad because the np.array() operation actually makes a new copy of the array even if it is already a numpy array, so if there is a lot of data it will take some time. #pycharm
I have a question. Parsing a data structure such as a dict from a string was mentioned as a bad thing, but I found it to be one of the best ways to store nested dictionaries in redis, parsed with json. Are there alternative ways? hset and hmset don't work well with nesting, and return all values as strings, which is not the case when using json dumps and loads.
I'll defend facilitating variable rw with global instead of adding it as an input and outputting it again. In some cases, anyway, it's the same thing in shorter syntax. Problem where?
Adding persons together at 6:53 is hilarious! Making friends is not an associative (or even commutative) operation! The friendship relationship is also not reflexive, symmetric or transitive. People are not numbers!
9: Some weird audio from the other side during this one from me? Maybe the ghosts like single letter variables. 13: d = {key: val for key, val in d.items() if not val % 2 == 0}
I saw production code somewhere where very good software engineers used global variables for doing multiprocessing with class functions. Any idea if this is necessary for performance? #pycharm
Depends on the language/framework. A modern framework should provide better alternatives than using global variables. But shared memory space is still common in C/C++, especially in embedded development.
I think filter/map functions are not that bad to read. List comprehension is good, but I would rather not rush rewriting all filter calls to list comp style #pycharm
Is it more efficient to create a set of keys to delete and loop a second time, or to create a copy of the dict's keys with list(d.keys()) and iterate over that? Because that's what I usually do. #PyCharm
Great video. Do you have any related content on #12 expensive attributes? I have been trying to subclass or in general make classes that I actually need to have access to secondary attributes. The problem is at times setting values to secondary attributes as self.attr1.attr2 = value does not work.
Law of Demeter code smell on that. You can use dunder setattr(self, attr2, value) so that when you write self.attr2 = value it throws an AttributeError (which is caught) and invokes dunder setattr when you explicitly tell it to: self.attr1.attr2 = value, or setattr(self.attr1, attr2, value). Same for getattr... it helps when your object is a chain of HAS-A objects.
I also agree that eval should not be used for parsing data, but you can pretty much completely sandbox it: eval("{'h':6}", {"__builtins__":{}}). Note: someone might still find a way to evaluate unsafe code, so don't do this. It's just a nice thing to know.
No global exists within this sandbox, if a vulnerability exists, it would abuse the built-in datatypes like `(5).__mul__(8)` But I can't think of a real exploit for this.
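For the narrow job of parsing a literal like the dict above, the standard library already has ast.literal_eval, which only accepts literals and so sidesteps the sandbox question. The sketch below just shows that behaviour; it is not a claim about what the video recommends. Note that emptying __builtins__ still leaves attribute access on literals available, which is why it is not usually treated as a real sandbox.

```python
import ast

# Literals (dicts, lists, numbers, strings, ...) are parsed without executing code.
data = ast.literal_eval("{'h': 6}")

# Anything that is not a plain literal, such as a function call, is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```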
Never knew about Python getters and setters having special properties so that will be my next amendment I have to make for my existing project #pycharm
I am a big fan of iterators, and I really disagree about map and filter. I get that sometimes they are less readable, but they also make it way easier to build pipelines of lazy data manipulation. Also, generators and comprehensions have poorer performance, due to the for keyword, afaik. I think this isn't a n00b habit, as most n00bs don't understand iterators and functional / declarative styles, just seems more like a question of taste.
1 kinda don't agree
7 I'm conflicted on this one. I've been in JS land and got used to the "file-as-module" pattern(?). Now to me it seems so wasteful to define a dedicated class just so I can avoid global variables. Happy to be proven wrong about this ofc. Maybe I should just embrace this instead lol
12 the results can also be stored in a "private" variable, yes? Assuming other conditions are taken care of
14 sometimes I use map/filter when I'm too lazy to think of a loop variable name lol. Though more often than that I consider using "_" instead. Dk which is worse heh
Historically, python didn't have bool, everything just used 1 or 0. When python went to introduce the bool type, they made it derive from int so all old code would still work when passed a bool.
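The inheritance is easy to check for yourself. A minimal sketch with made-up values:

```python
# bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic.
is_subclass = issubclass(bool, int)
total = True + True

# This is why summing a generator of comparisons counts the matches.
matches = sum(x > 2 for x in [1, 2, 3, 4])

# True == 1 and hash(True) == hash(1), so they are the same dict key.
lookup = {1: "one"}[True]
```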
I found a case that contradicts 2 or 3 of these. #5, #17 and possibly #3. Using os.path with r"c:\\" was giving me incorrect/broken results. You can't do r"c:\". r"c:" is still broken. "c:\\" works and r"c:" + "\\" works
Man, that behavior of and/or with falsy/truthy values is really strange - it almost reminds me of some of the type coercion weirdness that can occur in Javascript. Much as I like using Python (#pycharm) for certain projects, I definitely appreciate languages that can give compile-time errors for this sort of thing - same goes for deleting items from an iterable during iteration as well.
Meanwhile I struggle to explain how .get works, and that you can do things inside an f-string so you don't need to assign something to a var on the line above just to use it once in the entire code in that one f-string. Although this is better than the last person, who I gave up on at the step of asking him to write his name.
thanks mate, pretty useful information, some of these I’ve been doing myself, some I didn’t even know about. keep going with this great content. #pycharm
Definitely enjoyed this, and picked up some great tips! I have to say, I’m not a fan of main() as a construct. __main__ is great, but then having it just call a main() to do everything feels unnecessary unless there’s a really good reason, and then the person importing your code has to weed it back out to avoid pulling it in. (Plus, if I’m being honest, main() feels a bit too Java-like. :) )
the point of main() is to keep code out of the if/then block for dunder main...devs should see the module end in that block and it should be short, and then they know a main() or usage() [for scripts] is right above it.
Regexes really can't parse mathematical expressions or xml, I'll give you that. I learned that from people more formally educated in computer science than I am. I learned it AFTER presenting a working example that they indeed can, while being simpler and a whooooole lot faster than parsers. Yet somehow, this new realization didn't break my already written nooby code. 🤣
For #17 there is a minor gotcha: you can't end a raw string with a lone backslash (i.e. r"windows\path" is legal, r"windows\path\" is not). It's trivial to work around, but it might catch people out the first time they try prefixing a string with r.
That's a very weird restriction.
@@DanCojocaru2000 So it makes sense when you realise it’s not really a raw string, it’s a string with the escape sequences left unparsed. But a backslash still protects the quote that follows it when Python looks for the end of the string.
So r"abc\" isn’t legal because \" is taken as part of the text and the string is now missing its closing quote.
It’s also why r"abc\\" is legal.
@@notenoughmonkeys Quite weird design.
@@DanCojocaru2000 Completely agree. Caught me completely off guard when I first encountered it, but once you know, you know.
Not the first time I've prepended with r, but this actually happened to me today and I couldn't see why it shouldn't work. What a timely coincidence.
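The gotcha from this thread can be checked without breaking your own file by handing the source text to compile(). A minimal sketch; the compiles() helper is made up for this example.

```python
def compiles(source: str) -> bool:
    """Return True if `source` is a valid Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The source text is written as a normal string with doubled backslashes
# so that this file itself stays valid.
no_trailing_ok = compiles('r"windows\\path"')       # source: r"windows\path"
lone_trailing_ok = compiles('r"windows\\path\\"')   # source: r"windows\path\"
double_trailing_ok = compiles('r"abc\\\\"')         # source: r"abc\\"

# One workaround: append the final backslash from a normal string literal.
folder = r"windows\path" + "\\"

# In a raw string both trailing backslashes stay in the value.
doubled = r"abc\\"
```

Note that r"abc\\" keeps both backslashes, so doubling is not a way to get a single trailing one.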
Love the "don't parse HTML with regex" reference. One of my favorite pieces of programming humor.
I agree about map and filter. Comprehensions are usually more readable and avoid ugly function expressions, but there is a niche use case for map: if your generator comprehension would look like this: (f(a, b, c) for a, b, c in zip(A, B, C)), map(f, A, B, C) is a lot more elegant, and often more readable. For example, (a + b for a, b in zip(A, B)) becomes map(add, A, B). (need to import add from operator)
idk, I don't see anything wrong with map(int, arr) tbh
idk, at returning and passing in 3+ tuples you really should start thinking about using a namedtuple at that point imo
@@thirtysixnanoseconds1086 yeah, that's another good use for map. I guess I should have said there are a few niche use cases for map.
@@cleverclover7 none of these examples are returning 3 tuples. They all return iterators. Can you elaborate on the use of named tuples here?
not to mention that in certain situations map is a tiny bit faster...
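The two forms discussed in this thread, side by side, as a small sketch with made-up lists:

```python
from operator import add

A = [1, 2, 3]
B = [10, 20, 30]

# Generator expression over zip: explicit, but needs two throwaway names.
via_genexp = list(a + b for a, b in zip(A, B))

# map accepts several iterables and passes one item from each to the function.
via_map = list(map(add, A, B))

# The single-iterable case mentioned above.
as_ints = list(map(int, ["1", "2", "3"]))
```

Like zip, map with several iterables stops at the shortest one.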
I started learning python a couple years ago to fill a gap left by depression. I'm close to 50 now (not that I wasn't 2 years ago), in that time I've managed to write an AOL3.0+ compatible server - for nostalgia - in python utilizing your videos for guidance, and other content creators as well. I've recently switched to using #pycharm from vscode. Thank you very much for your content, it has helped me tremendously in improving my project and my comprehension of python.
Congratulations for your hard work!
Thanks for making a sequel to the first Nooby Habits video, would be awesome if you could do another one for C++ as well.
please keep the comments python pure. I don't need to hear the J word(s), nor anything involving the letter between B and D.
I've just found out that I never really understood the behavior of "and" and "or" for non-boolean types before this video. Awesome content, as always! #pycharm
It's called "Short Circuit Logic".
I love the “and” and “or” explanation, it’s so much easier to understand compared to all the resources out there on the web that have some convoluted explanation of it that’s really difficult to understand. #pycharm
#13 deleting while iterating - one can use
for key, val in list(d.items()):
    del d[key]
In fact you have shown this in one of your videos and I have been using it ever since. It makes the code much more readable than collecting the things to delete.
#pycharm
This is a gem! I love this
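A runnable version of that pattern, with a made-up dict and a condition added so that not everything is deleted. The second half shows what the snapshot protects you from in CPython.

```python
d = {"a": 1, "b": 2, "c": 3, "d": 4}

# list(...) takes a snapshot of the items, so deleting from d is safe.
for key, val in list(d.items()):
    if val % 2 == 0:
        del d[key]

# Without the snapshot, the dict notices the size change and raises.
e = {"a": 1, "b": 2}
error = None
try:
    for key in e:
        del e[key]
except RuntimeError as exc:
    error = exc
```

The snapshot is a shallow copy: it holds references to the same keys and values, not copies of them.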
3: I'm ALWAYS trying to tell people to use pathlib.Path instead of strings!!
10: Ooh divmod looks awesome! Can't wait to use it!
14: Interesting take on filter/map, I think I could agree.
17: I *roughly* knew about raw strings but feel more confident with them after your 20-second explanation (which I watched at 2x speed)
21: Yes, forcing prints at import is the worst!!
Love all your videos, thank you! #pycharm
pathlib just seems to be a sop to Microsoft Windows. String manipulation is perfectly straightforward for POSIX paths.
7:45 Fun fact: “/” works as a path separator even on Python on Windows.
@@lawrencedoliveiro9104 I've never worked in a pure POSIX environment, and now I work in an environment where code is deployed on both Windows and Linux systems. I don't know the difference between all the different types of forward and backward slashes and which operating systems use which types in which cases, and that's the way I want it.
@@lawrencedoliveiro9104 it's fine, but still pathlib is very convenient, it helps with splitting paths into their components and has a nice functional interface that allows chaining.
Also, it helps a lot if you need to handle paths that need to follow specs of a foreign OS (handling POSIX paths in Windows or vice versa).
@@slash_me Everything does POSIX these days. There’s no point developing for anything else.
@@lawrencedoliveiro9104 not true. I had a case recently where I needed to handle Windows paths on a POSIX system and pathlib was perfect for that. No, I couldn't use POSIX paths, I explicitly needed Windows paths.
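For the foreign-OS case described in this thread, pathlib's "pure" classes do the parsing without touching the filesystem, so they behave the same on any host. A minimal sketch; the paths are made up.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Parse a Windows path on any OS; no filesystem access is involved.
win = PureWindowsPath(r"C:\Users\alice\notes.txt")

drive = win.drive
name = win.name
suffix = win.suffix
parts = win.parts

# Joining with "/" still renders with Windows separators.
sibling = str(win.parent / "todo.txt")

# Convert to forward slashes when another tool needs them.
forward = win.as_posix()

# The same interface works for POSIX paths.
posix = str(PurePosixPath("/home/alice") / "notes.txt")
```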
Regarding #14: Mostly, one should go for generator expressions like you did, as writing out "lambda x: ...." as a function doesn't save many characters and makes the thing harder to read - granted.
However, when stacking multiple of those expressions (map, filter, reduce, etc.), they can be quite cool. Especially when all the variable names used in your generator add up to more than the usual line length (so you're forced to line-break), it looks cleaner to line-break between the arguments of a function (effectively creating an easily traceable staircase of transformation functions) than to line-break before the "if" of a generator - the latter looks much uglier when stacked into each other.
One could argue that I should define my generator expression, then use it when defining the next generator expression and so on, until I did all my transformations - fair point. However, this encourages defining multiple useless variables which serve no purpose, and it uses up the same number of lines as a stacked sum(map(fm, filter(ff, myList))) does.
I agree that real use-cases with enough transformations to make this worthwhile are rarer than isolated cases of a map or filter transformation. When confronted with the latter, you should obviously choose the generator-expression approach.
For that I'm currently writing a library. That'll help to handle this nesting mess without needing to define many variables. It's typed-stream on pypi. It's usable, but hasn't reached 1.0 yet.
e.g.:
Stream.counting().limit(100).map(operator.mul, 2).filter(lambda x: x%5).peek(print).sum()
So... Functional programming?
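To picture the "staircase" layout from the start of this thread next to the generator-expression version, here is a small sketch. is_even and square are hypothetical stand-ins for the ff and fm functions mentioned above.

```python
def is_even(n):  # stand-in for "ff"
    return n % 2 == 0


def square(n):  # stand-in for "fm"
    return n * n


numbers = range(10)

# Stacked calls, one transformation per line, read from the inside out.
stacked = sum(
    map(square,
        filter(is_even,
               numbers)))

# The equivalent generator expression.
genexp = sum(square(n) for n in numbers if is_even(n))
```

Both are lazy until sum consumes them, so the choice here is about layout rather than behaviour.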
pathlib is definitely something I could be using more and StringIO was completely new to me! Thank you, love your content #pycharm
2:17 for this particular example (though I'm not criticizing the tip), the following is even better (20% faster for n = 100000 for me -- and clearer):
s = "".join(f"some string {i}" for i in range(n))
I'll do old school:
func = "some string {}".format
s = ''.join(map(func, range(n)))
but I hate loops.
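For reference, here are the four spellings that have come up so far, checked against each other. This sketch only confirms they build the same string; it makes no timing claims, since those depend on the interpreter and the data.

```python
import io

n = 1000

# Repeated += on a str.
s_plus = ""
for i in range(n):
    s_plus += f"some string {i}"

# Writing into an in-memory text buffer.
buffer = io.StringIO()
for i in range(n):
    buffer.write(f"some string {i}")
s_io = buffer.getvalue()

# str.join over a generator expression.
s_join = "".join(f"some string {i}" for i in range(n))

# str.join over map with a bound str.format method.
s_map = "".join(map("some string {}".format, range(n)))
```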
Honestly, I am disagreeing with filter and lambda. I got so used to these functions (also due to other languages maybe) that I now find them more readable than list comprehensions, at least usually. Sometimes I still think list comprehensions are better, but nowadays it only rarely happens.
I have a question about #7. If I have some global constants, is it ok to use them directly in functions, or should I pass them through few layers of functions from main?
I think constants are fine, denote them by writing in UPPER_CASE_SNAKE_CASE
Great question! ALL these nooby habits are okay and they all have situations where they are, in fact, quite reasonable things to do. Just as you suspected, using a global constant is a perfectly reasonable thing to do.
I like:
STRICT = True
def func(*args, strict=STRICT):
    ...
so:
It works like a global
It can be overridden if necessary
the interface is exposed to users
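One detail worth knowing about this pattern: the default is read from the constant once, when the def statement runs, so rebinding the constant later does not change it. A minimal sketch; parse_int is a hypothetical function invented for this example.

```python
STRICT = True


def parse_int(text, strict=STRICT):
    """Hypothetical helper: parse digits, optionally tolerating bad input."""
    if text.isdigit():
        return int(text)
    if strict:
        raise ValueError(f"not a number: {text!r}")
    return None


# The caller can override the module-level default per call.
lenient = parse_int("abc", strict=False)

# The default was captured at definition time, so rebinding STRICT
# afterwards does not affect calls that rely on the default.
STRICT = False
try:
    parse_int("abc")
    still_strict = False
except ValueError:
    still_strict = True
```

If the function should follow later changes to the constant, a common alternative is strict=None with a lookup inside the body.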
I'm happy that I've learned a few things again here. It's not always being new at the language that makes you do less optimal stuff, sometimes it's being old at it. Format strings are, for me, fairly recent. I know they've been here for years, but I've been coding python for way longer than they've been here, and it's not always easy to let old habits die and to make sure you know all the new ways to do things in the new version you're starting to use. At least I've been using format strings, but now I'll have a look into the docs for those extra formatting parameters to up my game a little.
And thanks for the io.StringIO trick, I didn't know that one before. #pycharm
Old habits be like *dr_who_in_da_rain.gif*
totally, and my company is late with the versions for security reasons, so I'm always behind.
Awesome tips! Love your videos ☺️
Personally I prefer to iterate over a copy of the dict instead of the extra for loop. While it's less memory efficient, I don't mind it too much since it's a shallow copy and in most cases, I prioritize the readability of the code over efficiency.
Also, I'm a bit torn on the filter and map tips (well, less on the map part and more on the filter part 😉)
When there is no lambda involved, and depending on the specific variable names we are working with, the filter function can sometimes be closer to human language (though I admit this doesn't happen often).
I wonder if in the future there will be a clear runtime difference between the two, there is already a PEP open to make comprehensions more efficient. #pycharm
I’ve coded Python for about 3 years now and never knew divmod existed. I learn something new every time I watch your channel
No 7: also yes. If I want that functionality but think it may need to be flexible, I store the default as a module constant:
STRICT = True
def number7(*args, strict=STRICT):
I personally usually prefer filter or map if possible over list comprehensions, I think filter and map have quite an unfortunate syntax in python, as at least in my opinion the order is completely flipped compared to how you read it, I read it as “iterator, operation, function” and instead it reads as “operation, function, iterator”. This also means chaining them is a nightmare because it’s nested instead of appearing one after another, and neither one is ergonomic nor readable when chaining more than two… also the lambda syntax is overly verbose, and not particularly accessible especially to non native speakers as it’s quite an obscure word
I agree that map and filter's syntax in Python isn't the best. As much as I like list comprehensions, sometimes I find myself wanting to use map or filter, but I'm not a huge fan of Python's lambda syntax. I wish there were a better way to do multi-line inline functions in Python (plus don't even get me started on the Callable[] type hint). You might appreciate the PyFunctional library which adds a convenient way of chaining map/filter. Unfortunately, it doesn't currently have support for type hints so it's maybe not the best if you use Pylance's type checker.
I'm currently writing a library called typed-stream. That allows you to easily chain such operations. It's fully typed and checked with mypy
it makes sense when you're used to writing math things right-to-left, M_fi, which would be an amplitude to go from the initial state (i) to the final state (f)
7:03 But what if I want to use the "+" operator? Then, wouldn't I HAVE to implement the "add" dunder method, instead of just a regular method?
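For the question above: yes, the `+` operator is dispatched to `__add__`, so a regular method alone will not enable it. A small sketch, using a hypothetical `Vector` class rather than the class from the video:

```python
from dataclasses import dataclass


@dataclass
class Vector:
    x: float
    y: float

    def __add__(self, other):
        # Returning NotImplemented lets Python try other.__radd__,
        # and raise TypeError if nothing handles the operation.
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)


total = Vector(1, 2) + Vector(3, 4)  # calls Vector.__add__
```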
Worth mentioning for no 8: non-empty collections are considered truthy, and the and operator returns the LAST EVALUATED value if all values are true
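A short sketch of that behaviour, with made-up variable names. `and` and `or` return one of their operands, not necessarily a bool:

```python
items = [1, 2, 3]   # non-empty list: truthy
empty = []          # empty list: falsy

a = items and "has items"   # all operands truthy: the last one is returned
b = empty and "has items"   # stops at the first falsy operand and returns it
c = empty or "fallback"     # first operand falsy: the second is returned
d = items or "fallback"     # stops at the first truthy operand and returns it
```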
I experimented with concStringPlus and concStringIO and got a result that, at first, contradicted your assumption, since for small and non-chaotic strings the plus method is faster. So I added some complexity to the strings and the numbers got crazy for the plus, jumping from 0.55s to 1.61s, while remaining the same for the StringIO func (near 0.61s for 100000 runs with 100 operations each). My conclusion is: if you're incrementing your string with very small parts, stick to +=. If it's random-sized or chaotic, StringIO will keep the excellent work throughout.
CPython optimizes str+= if there are no other references to the left hand string, but that relies on it being able to reallocate the string data; if your other code was allocating new objects, the risk that reallocation requires a copy goes up drastically. StringIO can reduce these reallocations by keeping a margin for growth, much like list does. Your small and non-chaotic strings may have been previously allocated (e.g. interned), making the ideal case for str+= to reallocate without copying.
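For anyone who wants to repeat this kind of measurement, here is a rough timing sketch. The function names and sizes are my own choices, and the numbers will vary by interpreter, version, and what else is allocating memory, so no expected timings are shown.

```python
import io
import timeit


def concat_plus(parts):
    result = ""
    for part in parts:
        result += part  # CPython may resize in place when refcount is 1
    return result


def concat_stringio(parts):
    buffer = io.StringIO()
    for part in parts:
        buffer.write(part)
    return buffer.getvalue()


def concat_join(parts):
    return "".join(parts)


parts = [str(i) for i in range(1000)]

# Seconds for 50 runs of each approach on this machine
timings = {
    fn.__name__: timeit.timeit(lambda: fn(parts), number=50)
    for fn in (concat_plus, concat_stringio, concat_join)
}
```

Printing `timings` on your own setup is the only reliable way to compare; the in-place optimization for `+=` is a CPython implementation detail, not a language guarantee.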
Totally agree with all of these, but the 2 that irk me the most are altering the iterator and using eval. There's almost always a better way to do it than to use eval, and I think in the age of computer generated code it's especially important to understand its dangers.
This is Good Stuff! 17 years of assembler followed by 20+ years of C have left me unprepared for these newfangled gadgets. Thanks ever so much for helping me see through the blur ....
Syntactic Sugar is GOOD FOR YOU.
6:30 I actually faced this recently with a project I’m working on. Do they both do the same thing under the hood (same byte code) or is one preferred over the other for different reasons? (In reference to using functional functions like map or filter over comprehensions).
Love your channel! Have you considered making a video on 'async for' and 'async with'?
been programming in python for years and consider myself somewhere between intermediate and advanced, and even your beginner-oriented videos have something new to me in them. there are so many beginner python creators out there and i'm thankful that it's so accessible now, but i definitely appreciate the higher-level "niche" you tend to make videos in, even when you claim it's for noobs. also #pycharm gimme that license mDaddy
As far as I know, something was changed in string concatenation. Concatenation of 10 million elements with StringIO beats basic string concatenation just by 10%
#5 is slightly misleading, since it depends on the python implementation. The statement is true for PyPy, but CPython cleverly uses a reference count of one to make the string mutable. The proposed fix has the exact same performance in CPython (replace 100 with 10000000 to get a proper measurement).
Is there still any point in using NamedTuples now that frozen dataclasses exist? I guess the one case I can think of is if you have a particular reason why your class needs to be a subtype of tuple, but other than that, dataclasses offer some very nice features such as being able to easily convert to a dictionary with the built-in asdict function, which makes dumping to JSON much easier.
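A small side-by-side sketch of the trade-off raised above, using hypothetical `PointDC` and `PointNT` classes. It shows the `asdict` route to JSON for dataclasses and the tuple behaviour that only the NamedTuple has:

```python
import json
from dataclasses import asdict, dataclass
from typing import NamedTuple


@dataclass(frozen=True)
class PointDC:
    x: int
    y: int


class PointNT(NamedTuple):
    x: int
    y: int


dc = PointDC(1, 2)
nt = PointNT(1, 2)

dc_json = json.dumps(asdict(dc))         # dataclass -> dict -> JSON object
nt_json = json.dumps(nt)                 # a tuple serialises as a JSON array
nt_dict_json = json.dumps(nt._asdict())  # NamedTuple also has a dict view

x, y = nt  # unpacking and indexing work because nt is a real tuple
```

So the NamedTuple still fits when existing code expects a tuple (unpacking, indexing, comparing with plain tuples); otherwise the frozen dataclass covers most of the same ground.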
2:59 Weird to promote pydantic as a parsing library. It's just a data mapper with some poorly designed validation features.
For iterating and deleting, I often just iterate over a copy of it, e.g. d.copy().items(). That way I don't need the second loop, and the copy is discarded after the loop anyway, so memory usage should be about the same.
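A sketch of the two approaches that come up in this thread, with made-up data: iterating over a snapshot while deleting, and building a new dict instead.

```python
d = {"a": 1, "b": 2, "c": 3, "d": 4}

# list(...) takes a snapshot of the items, so deleting from d is safe here.
for key, val in list(d.items()):
    if val % 2 == 0:
        del d[key]

# Alternative: leave the original alone and build a filtered dict.
original = {"a": 1, "b": 2, "c": 3, "d": 4}
odds = {key: val for key, val in original.items() if val % 2 != 0}
```

Both are shallow copies of the keys and values, so the extra memory is references only, not copies of the stored objects.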
I disagree on number 10: You are right in that applying both // and % is less performant than divmod, but I think the former is more readable (and actually could as well as should better be caught by the parser during byte code creation; e.g. by looking from // or % a few operations ahead if the other operation is called from the exact same arguments, too). I do not understand the coexistence of //, % together with divmod in python without any imports necessary. I think this violates the python zen ...
which zen?
@@DrDeuteron the python zen -> google
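For reference, the two spellings discussed in #10 side by side. This is just a sketch with an arbitrary number of seconds; which one reads better is a matter of taste.

```python
total_seconds = 3725

# Two separate operators
minutes_a = total_seconds // 60
seconds_a = total_seconds % 60

# One call that returns both quotient and remainder
minutes_b, seconds_b = divmod(total_seconds, 60)
hours, minutes = divmod(minutes_b, 60)
```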
I once was creating a parser in python and, naturally, was using bs4 for finding stuff. This, however, resulted in a bottleneck and replacing bs4 with regex made a 12x speed improvement from a couple of seconds for each operation to a couple dozens of milliseconds
started using dataclasses because I rely heavily on sending dicts around as messages, and then I immediately got the issue of exporting/importing to json. Kind of an annoyance that whole pickle business.
So why is using raw dicts a noobie practice btw? You just said not to do it, but not why.
#11 I don't actually like @property, when someone creates a setter it usually has a side effect. If I'm looking at where it is used, `obj.x = something` it is really surprising when I find out that this does something more than just changing the x variable. But if I see an `obj.set_x(something)` that immediately raises the alarm: `set_x()` probably does something more - why would anyone bother to write this function otherwise?
I suppose if your setter really has no side effect it is safe to use property, but in my experience that is rare (maybe you want to log where it changed or something like that, it's fine). Usually however, it re-computes related values or notifies other objects that a value has been updated - you should be aware of these.
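To make the concern concrete, here is a hypothetical `Circle` class whose setter validates and has a side effect. The `recomputed` counter is only there to make the side effect visible.

```python
class Circle:
    def __init__(self, radius):
        self._radius = radius
        self.recomputed = 0  # counts how often the setter ran

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value
        self.recomputed += 1  # the hidden side effect


c = Circle(1.0)
c.radius = 2.0  # looks like a plain assignment, but runs the setter
```

At the call site nothing distinguishes `c.radius = 2.0` from an ordinary attribute write, which is exactly the readability trade-off described above.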
omg you actually solved a Python issue I am currently facing lol. Always a great watch. Thank you :D
Hey James, how do you feel about using `or` to give a value to an attribute if it is not set yet, like `self.some_dependency = dependency or DefaultDependency()` in the constructor when the parameter `dependency` defaults to `None`? Or using it in a property to lazy load and cache an expensive object: `self._connection = self._connection or DatabaseConnection()`?
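One caveat with the `or` pattern, sketched with hypothetical `Service` and `DefaultDependency` classes: `or` replaces any falsy argument, not just `None`, so an explicit `is not None` check behaves differently for things like empty containers.

```python
class DefaultDependency:
    """Stand-in for whatever the real default would be."""


class Service:
    def __init__(self, dependency=None):
        # Replaces ANY falsy value (None, [], 0, "") with the default
        self.via_or = dependency or DefaultDependency()
        # Replaces only None with the default
        self.via_is_none = (
            dependency if dependency is not None else DefaultDependency()
        )


empty_dep = []  # falsy, but deliberately passed in
svc = Service(empty_dep)
```

If a dependency can never be falsy, the two forms are equivalent and the `or` version is shorter.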
Small annotation to 10: The reason you do this is that most processors use a divmod that is built in already. That means the processor will most likely perform the operation twice only to discard the mod in the first go around and the result in the second.
yes definitely microoptimize your assembly code in python, the performance increase will be tremendous
I disagree with #14. You shouldn't be using lambdas as arguments to map and filter, but functions not created just for the purpose of this statement are fine. A lot of people don't know you can pass goodies from the `operator` module(such as `itemgetter`) and unbound methods(such as `str.upper`) as the first argument.
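A few examples of passing existing callables to map instead of a lambda, as suggested above. The data is made up for illustration.

```python
from operator import add, itemgetter

words = ["spam", "eggs"]
upper = list(map(str.upper, words))            # unbound method

pairs = [("a", 3), ("b", 1)]
second_items = list(map(itemgetter(1), pairs))  # operator helper

# map accepts several iterables and passes one item from each
sums = list(map(add, [1, 2, 3], [10, 20, 30]))
```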
Great tips, man! One of my favorite channels. Keep up the good work. #pycharm
On #5, searching for answers gives some mixed messages, where the immediate evidence is that += is faster than the alternatives in most cases due to some python interpreter trickery, but in testing it's pretty clear that io.StringIO is substantially faster. For 50,000 appends I get 0.00428s using perf_counter, and with StringIO I get 0.00299s. That's enormously faster.
I'm a noob and I've been using os.path to handle paths, how bad is it as opposed to using pathlib?
Not very bad. Path objects include the context that they are paths. For instance, path / subpath instead of os.path.join(path, subpath), and you can read an entire file with path.read_text(). pathlib is implemented using os.path and friends (path._flavour links to os.path, posixpath or ntpath).
@@0LoneTech thanks! what are the benefits of learning pathlib, then?
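A quick comparison sketch for the pathlib question, using made-up file names. `PurePosixPath` is used for the attribute examples so the results do not depend on the operating system.

```python
import os.path
from pathlib import Path, PurePosixPath

base = "project"

# Joining: function call versus the / operator
joined_os = os.path.join(base, "data", "input.txt")
joined_pl = Path(base) / "data" / "input.txt"

# Path objects carry their own helpers for common string surgery
p = PurePosixPath("project/data/input.tar.gz")
name = p.name                      # final component
suffix = p.suffix                  # last extension only
stem = p.stem                      # name without the last extension
parent = p.parent                  # containing directory
renamed = p.with_suffix(".zip")    # swap the last extension
```

os.path works fine; the main gain from pathlib is that paths become objects with methods such as `read_text()` and `exists()` instead of strings passed through free functions.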
"Concatenating strings with plus" I doubt it gives a meaningful performance improvement if you have string concatenation once or twice in a program. Sure, when I will be writing extra long strings I might actually use this, otherwise it's just unnecessary code obfuscation.
Not sure about avoiding map/filter all the time but awesome tips, thanks #pycharm
I like hiding loops, but I've written some itertools/map/filter constructions that are abominable
13. Would it be beneficial if I had another dict to collect the filtered ones and return that? #pycharm
My man lost his voice for me! Hell yes what a guy. Great vid bro thanks a lot!
A "nooby" numpy habit is also doing inp = np.array(inp) at the beginning of a function to ensure that the input is numpy for the rest of the function. This is bad because the np.array() operation actually makes a new copy of the array even if it is already a numpy array, so if there is a lot of data it will take some time. #pycharm
time to refactor.
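A small sketch of the copy behaviour described above, assuming NumPy is installed. `np.asarray` is the usual alternative when the goal is only to make sure the input is an array:

```python
import numpy as np

data = np.arange(5)

copied = np.array(data)    # copies by default, even for an ndarray input
same = np.asarray(data)    # returns the input itself if it is already an ndarray

shares_with_copy = np.shares_memory(copied, data)
```

Note that `np.asarray` still has to build a new array when the input is a list or needs a dtype conversion, so it only saves the copy for inputs that are already suitable arrays.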
I have a question. Parsing a data structure such as a dict from a string was mentioned as a bad thing, but I found it to be the best way to store nested dictionaries, parsed with json, in redis. Are there alternative ways? hset and hmset don't work well with nesting and return all values as strings, which is not the case when using json dumps and loads
I'll defend facilitating variable rw with global instead of adding it as an input and outputting it again. In some cases, anyway, it's the same thing in shorter syntax. Problem where?
Adding persons together at 6:53 is hilarious! Making friends is not an associative (or even commutative) operation! The friendship relationship is also not reflexive, symmetric or transitive. People are not numbers!
9: Some weird audio from the other side during this one from me? Maybe the ghosts like single letter variables.
13: d = {key: val for key, val in d.items() if not val % 2 == 0}
I saw production code somewhere where very good software engineers used global variables for doing multiprocessing with class functions. Any idea if this is necessary for performance? #pycharm
Depends on the language/framework. A modern framework should provide better alternatives than using global variables. But shared memory space is still common in C/C++, especially in embedded development.
Can you make a video about #pycharm itself? Why should I find it exciting?
Yes! Only 3 of 21. But honestly, some points are so advanced I never had the chance to do them wrong :D
Still found areas to improve. Thank you for your videos 👍
#pycharm
I think filter/map functions are not that bad to read. List comprehension is good, but I would rather not rush rewriting all filter calls to list comp style #pycharm
Exactly
Is it more efficient to create a set of keys to delete and looping a second time or to create a copy of the dict's keys with list(d.keys()) and iterating over that? Because that's what I usually do. #PyCharm
I would amend point #3 to encourage using os.path over pathlib. I personally find pathlib to be very slow
6:07 a lot of modules use filter and map in this way #pycharm
Damn, I thought I was proficient in my Python skills, but some nooby habits are still sticking. Thanks for making us realise that. #pycharm
Map and filter are fine. I like functional programming with functools and itertools. #pycharm
fewer loops is better, imho.
regarding 5: why that and not a list and then join at the end?
Great video. Do you have any related content on #12 expensive attributes? I have been trying to subclass or in general make classes that I actually need to have access to secondary attributes. The problem is at times setting values to secondary attributes as self.attr1.attr2 = value does not work.
Law of Demeter code smell on that.
You can use dunder setattr(self, attr2, value) so that you can write:
>>>self.attr2 = value
throws an AttributeError (which is caught) and invokes dunder setattr when you explicitly tell it to:
self.attr1.attr2 = value. or. setattr(self.attr1, attr2, value)
Same for getattr... it helps when your object is a chain of HAS-A objects.
Hello! Could you explain the io library? Thanks. Greetings from Brazil.
I noticed you used single letter variables in the example prior to your point about not using single letter variables.
I also agree that eval should not be used for parsing data, but you can pretty much completely sandbox it
eval("{'h':6}", {"__builtins__":{}})
Note: someone might still find a way to evaluate unsafe code, so don't do this.
It's just a nice thing to know.
No global exists within this sandbox, if a vulnerability exists, it would abuse the built-in datatypes like `(5).__mul__(8)`
But I can't think of a real exploit for this.
Never knew about Python getters and setters having special properties so that will be my next amendment I have to make for my existing project #pycharm
don't forget about "@attr.deleter" too.
3:59 overlapping sound
#pycharm
I am a big fan of iterators, and I really disagree about map and filter. I get that sometimes they are less readable, but they also make it way easier to build pipelines of lazy data manipulation. Also, generators and comprehensions have poorer performance, due to the for keyword, afaik. I think this isn't a n00b habit, as most n00bs don't understand iterators and functional / declarative styles, just seems more like a question of taste.
#19 what is wrong with passing structured data as a dictionary?
I thought str.join was the idiomatic/pythonic way to concatenate strings?
#pycharm
why would we still use named tuple if it looks and is used like a dataclass?
What's the benefit of StringIO over a list and "".join(the_list)?
Stackoverflow says StringIO can be used where a file-like object is expected.
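To illustrate the file-like point: a StringIO buffer can be handed to anything that writes to a file, which a list cannot. A minimal sketch with made-up lines:

```python
import io

# StringIO: works anywhere a text file object is expected
buffer = io.StringIO()
print("line one", file=buffer)   # print writes into the buffer
buffer.write("line two\n")
text = buffer.getvalue()

# list + join: fine when you only ever append strings yourself
parts = ["line one\n", "line two\n"]
joined = "".join(parts)
```

For plain string building the two give the same result, so the choice mostly comes down to whether a file-like interface is needed.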
Really cool video, as always! :) #pycharm
1 kinda don't agree
7 I'm conflicted on this one. I've been in JS land and got used to the "file-as-module" pattern(?). Now to me it seems so wasteful to define a dedicated class just so I can avoid global variables. Happy to be proven wrong about this ofc. Maybe I should just embrace this instead lol
12 the results can also be stored in a "private" variable, yes? Assuming other conditions are taken care of
14 sometimes I use map/filter when I'm too lazy to think of a loop variable name lol. Though more often than that I consider using "_" instead. Don't know which is worse heh
Sometimes, even after 8 years of python, you have some nooby habits :) . As always great video !
Hi mr mCoding,
I recently stumbled upon this:
isinstance(True, int) returns True.
What’s up with that?
Historically, python didn't have bool, everything just used 1 or 0.
When python went to introduce the bool type, they made it derive from int so all old code would still work when passed a bool.
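A few lines that show what this inheritance means in practice. The variable names are just for illustration:

```python
bool_is_int_subclass = issubclass(bool, int)

true_plus_true = True + True                  # bools behave as 1 and 0 in arithmetic
count_big = sum(x > 1 for x in [1, 2, 3])     # counting True values
picked = ["no", "yes"][True]                  # True works as index 1
```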
I found a case that contradicts 2 or 3 of these. #5, #17 and possibly #3.
Using os.path with r"c:\\" was giving me incorrect/broken results. You can't do r"c:\". r"c:" is still broken.
"c:\\" works and r"c:" + "\\" works
Man, that behavior of and/or with falsy/truthy values is really strange - it almost reminds me of some of the type coercion weirdness that can occur in Javascript. Much as I like using Python (#pycharm) for certain projects, I definitely appreciate languages that can give compile-time errors for this sort of thing - same goes for deleting items from an iterable during iteration as well.
short circuit logic reduces cyclomatic complexity.
Hi, I've been watching your videos lately. I have a suggestion as a noob: can you add more visuals, like don't and do signs? It could be helpful for someone, I'm sure
These videos are a huge help, man. Thank you. #pycharm
mean while i struggle to explain how .get works and that you can do things inside an f string so you dont need to assign it to a var on the line above an f string to use it only once in the entire code in that one f string.
although this is better than the last person i gave up on the step of asking him to write his name.
thanks mate, pretty useful information, some of these I’ve been doing myself, some I didn’t even know about. keep going with this great content.
#pycharm
I come to Python from R. I rarely use classes. Is this a bad practice? I rarely see uses for them. I mostly do data analysis.
05:5 why use set instead of list?
I want to make all of my junior engineers watch this video. Especially after the round of code reviews I went through today
Definitely enjoyed this, and picked up some great tips! I have to say, I’m not a fan of main() as a construct. __main__ is great, but then having it just call a main() to do everything feels unnecessary unless there’s a really good reason, and then the person importing your code has to weed it back out to avoid pulling it in. (Plus, if I’m being honest, main() feels a bit too Java-like. :) )
the point of main() is to keep code out of the if/then block for dunder main...devs should see the module end in that block and it should be short, and then they know a main() or usage() [for scripts] is right above it.
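For readers who have not seen the pattern being debated, a minimal sketch. The body of `main` is hypothetical; the point is only the shape of the last two lines.

```python
import sys


def main(argv=None):
    """Hypothetical entry point; argv defaults to the real command line."""
    args = sys.argv[1:] if argv is None else list(argv)
    return f"got {len(args)} argument(s)"


# Short block at the end of the module: runs only when executed as a script,
# not when the module is imported.
if __name__ == "__main__":
    print(main())
```

Keeping the logic in a function also means it can be called from tests or other modules with explicit arguments, and its variables stay local instead of becoming module globals.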
Why is HTML not regular? Is it proven using pumping lemma?
json.loads, as opposed to eval, won't parse a Python dict but JSON, which will often differ; for example, JSON has null where Python has None
True. ast.literal_eval() reads Python literals.
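A short sketch of the difference between the two parsers mentioned here, with made-up input strings:

```python
import ast
import json

from_json = json.loads('{"h": 6, "x": null}')          # JSON syntax
from_literal = ast.literal_eval("{'h': 6, 'x': None}")  # Python literal syntax

# JSON requires double quotes, so a Python-style dict string is rejected
try:
    json.loads("{'h': 6}")
    json_accepts_single_quotes = True
except json.JSONDecodeError:
    json_accepts_single_quotes = False

# literal_eval only accepts literals, so a function call is rejected
try:
    ast.literal_eval("__import__('os')")
    literal_eval_runs_calls = True
except ValueError:
    literal_eval_runs_calls = False
```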
Well, it's nice to see I'm not as much of a noob as I was when the last video was released. #pycharm
Thank you so much for these videos! #pycharm
#pycharm
"... That tech job you somehow got" oof, felt that!
Great videos though, these videos never fail to help!
2:16 is more of a problem with Python than a problem with the programmer. But yeah, you should use mutable strings when each string is only used once.
"A lot of noobs, especially people coming from Java" - love it (4:35)
Hey Java was my first language so I can say that!
maybe a video showing which version of #pycharm offer what ?
Thank you so much! I didnt even know about the named tuple! #pycharm
Regexes really can't parse mathematical expressions or xml, I give you that. I learned that from people more formally educated in computer science than I am. I learned it AFTER presenting a working example that they indeed can, while being simpler and a whooooole lot faster than parsers. Yet somehow, this new realization didn't break my already written nooby code. 🤣
Big fan of this channel. Can't tell you the number of times I've revisited the args, kwargs, /, and * video! #pycharm