21 more pandas tricks
- Published on Jul 22, 2024
- You're about to learn 21 tricks that will help you to work faster, write better pandas code, and impress your friends. These are the BEST tricks that I couldn't fit into my FIRST tricks video!
📔 JUPYTER NOTEBOOK:
nbviewer.org/github/justmarkh...
🔥 MY TOP 25 PANDAS TRICKS:
• My top 25 pandas tricks
🐼 MORE PANDAS VIDEOS:
• Data analysis in Pytho...
TRICKS:
0:00 Introduction
0:36 1. Check for equality
1:27 2. Check for equality (alternative)
2:38 3. Use NumPy without importing NumPy
3:42 4. Calculate memory usage
4:10 5. Count the number of words in a column
4:45 6. Convert one set of values to another
6:59 7. Convert continuous data into categorical data (alternative)
8:05 8. Create a cross-tabulation
8:55 9. Create a datetime column from multiple columns
9:34 10. Resample a datetime column
11:07 11. Read and write from compressed files
12:10 12. Fill missing values using interpolation
12:45 13. Check for duplicate merge keys
13:50 14. Transpose a wide DataFrame
14:47 15. Create an example DataFrame (alternative)
16:06 16. Identify rows that are missing from a DataFrame
17:09 17. Use query to avoid intermediate variables
19:06 18. Reshape a DataFrame from wide format to long format
21:19 19. Reverse row order (alternative)
22:25 20. Reverse column order (alternative)
23:21 21. Split a string into multiple columns (alternative)
NOTE: Tricks 3 and 15 were deprecated in pandas 1.0
LET'S CONNECT!
- Newsletter: www.dataschool.io/subscribe/
- Twitter: / justmarkham
- Facebook: / datascienceschool
- LinkedIn: / justmarkham
THANKS for watching! 🙌 Which trick are you most excited to start using? CLICK REPLY and let me know!
No. 7 and no 15
That was really Helpful, man! Glad that I saw your tweet in my timeline!
That’s great to hear, thanks Reza! 🙌
love it. I really enjoyed this video. thanks for the quality content
That’s awesome to hear, thank you! 🙏
Thank you so much! Your tutorials are the best
Thank you, Kris! 🙏
What a great video! Thanks!
You’re very welcome!
Thank you for your amazing videos! Super informative and helpful, as always.
That's so nice of you to say! You're very welcome! 🙌
I love these videos. I always learn so much.
What a nice thing to say, thank you so much! 🙏
Even after re-watching them, your videos are outstanding.
Thank you so much! 🙏
Lots of time savers here, thanks. Well done!
You're very welcome, Frank!
Thanks a lot for the excellent tricks (25 + 21) ! Really amazing and useful.
You're very welcome! 🙌
excellent as always
Thank you!
Great tips, holy! There is always something new to learn.
Agreed!
I was becoming sad that you were no longer making pandas videos. What a comeback, like that of Real Madrid :)). Thanks Kevin!!
You are very welcome!
Thanks so much, really helpful
Glad it was helpful to you! 🙌
I completed 36 videos in this series. Thanks a lot, Kevin, for such an amazing tutorial.
That is awesome to hear! Congratulations!
How do I use a text classification approach where the target column is a bigram word?
Can you please show us?
Great tutorial ❤
Thank you!
amazing content!!
Thank you!
Cool. pandas and numpy seem to have a ton of functions and it’s hard to remember them all. Would appreciate a video focused on multilevel data frames, as I always forget how to index, etc those.
I actually have a video about the MultiIndex! Here it is: th-cam.com/video/tcRGa2soc-c/w-d-xo.html
Great tips and tricks as always!
do you have a tutorial video about writing efficient pandas code? for example implementing vectorization?
I've watched plenty of your videos but I think I haven't seen one about the topic
Thanks very much! I think this video might be of interest to you: th-cam.com/video/dPwLlJkSHLo/w-d-xo.html
Awesome, will check that out!
Hi Kevin, thank you for creating such amazing content. These videos are really helpful for real-world projects. I wanted to request a video on a particular topic: how to use pandas to write, read, and edit Google Sheets. It could include putting values into a range of cells as well as a single cell in a Google Sheet, reading data, etc. If you have already made a video on this topic, let me know.
Thanks for your suggestion!
Sir, please make a video on user-defined functions with a pandas DataFrame
Thank you 😊😊
You’re welcome!
I miss your videos, man. Best like to the best tutor. 👌
Thank you!
Hi Kevin, thanks for the insightful video. I love your videos and courses; they are so subtle and impactful. Would it be possible for you to make a video on Python classes/objects? It's a daunting concept for someone like me (not from a coding background) who has a limited understanding of OOP. Additionally, I have observed that many coders use Python classes for ML scripts/pipelines and for script files on GitHub. So it would be helpful if you could make a video on the same.
Thanks in advance !!
Thanks for your suggestion! I'll consider it for the future. For now, maybe start here: th-cam.com/play/PL-osiE80TeTsqhIuOqKhwlXsIBIdSeYtc.html
Hope that helps!
Hi Kevin. The most important question of all: How to remember all of those things? Do you have any means? Any way that lets you retain the tricks/knowledge in your mind for longer? Would you please share any thoughts on that? Thanks.
Great question! I don't have a system for memorization, rather it just comes naturally the more I use something. However, I also don't worry about forgetting, because I usually remember where to look in order to refresh my memory. Thus, my advice is (1) practice, and (2) keep track of good resources so that you can look up things easily whenever you forget. Hope that helps!
Just doing your datacamp RI Police course. It's a really well laid out course and I think you're adorable. :-)
Thank you!
Thanks for the tips.
Q: How does crosstab produce the same results as the pivot table when no target value ('Survived') is provided?
I'm unable to view the notebook from the link provided. Please re-upload it.
Great question! With the pivot table, I just selected a column with no missing values (Survived in this case) and counted them. With crosstab, it automatically does a count, so you don't need to select a specific column.
Regarding the notebook, you can view it here: github.com/justmarkham/pandas-videos/blob/master/21_more_pandas_tricks.ipynb
Hope that helps!
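To make the comparison above concrete, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the Titanic data (it is not the dataset from the video), and the variable names are my own.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data (not the real dataset)
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Pclass": [3, 1, 3, 1, 3, 2],
    "Survived": [0, 1, 1, 0, 0, 1],
})

# pivot_table needs a column to aggregate, so we pick one with no
# missing values (Survived) and count its entries
pivot_counts = df.pivot_table(
    index="Sex", columns="Pclass", values="Survived",
    aggfunc="count", fill_value=0,
)

# crosstab counts each Sex/Pclass combination directly,
# so no values column is required
crosstab_counts = pd.crosstab(df["Sex"], df["Pclass"])

print(pivot_counts)
print(crosstab_counts)
```

Both tables should contain the same counts here; the only reason Survived appears in the pivot_table call is that pivot_table needs something to count.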
Why is the code below not giving any output?
pd.testing.assert_series_equal(df.c, df.d, check_names=False, check_dtype=False)
Great question! The assertion passed, thus no error was raised, thus no output was generated. If you're new to assertions, just try running assert(1==1) and assert(1==2) in Python, and you'll see that when an assertion passes, there is no output. Hope that helps!
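Here is a small sketch of that behavior. The columns c and d below are made-up example data (not the DataFrame from the video): they hold the same values with different names and dtypes.

```python
import pandas as pd

# Made-up example data: same values, different column names and dtypes
df = pd.DataFrame({"c": [1, 2, 3], "d": [1.0, 2.0, 3.0]})

# The assertion passes, so nothing is raised and the function returns None
result = pd.testing.assert_series_equal(
    df.c, df.d, check_names=False, check_dtype=False
)
print(result)

# When the values differ, an AssertionError is raised instead
try:
    pd.testing.assert_series_equal(df.c, df.c + 1, check_names=False)
    raised = False
except AssertionError:
    raised = True
print(raised)
```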
This resample method is incredibly useful
Thanks a ton
I agree! It really helped me once I started thinking of resample as a groupby for datetime data.
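A rough sketch of that mental model, using a tiny made-up sales table (the data and names are assumptions for illustration only):

```python
import pandas as pd

# Made-up sales data with a DatetimeIndex
sales = pd.DataFrame(
    {"amount": [10, 20, 30, 40, 50]},
    index=pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 15:00",
        "2024-01-02 10:00",
        "2024-01-03 08:00", "2024-01-03 20:00",
    ]),
)

# resample groups the rows into daily bins, then aggregates each bin
daily_resample = sales.resample("D")["amount"].sum()

# The same daily totals, written as an explicit groupby on the calendar date
daily_groupby = sales.groupby(sales.index.date)["amount"].sum()

print(daily_resample)
print(daily_groupby)
```

One difference worth knowing: resample also creates bins for dates that have no rows, whereas this groupby only returns dates that actually appear in the data.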
good
Thanks!
Hi Kevin, I hear a lot about API's. Maybe one day you can demo how to create a basic custom API.
Thanks for your suggestion!
Hi, for example I have a table, which I've got by left outer join:
person - vehicle
dad - car
dad - motorcycle
dad - bicycle
mom - car
mom - bicycle
son - None/NA/NaN/NaT
How to group by person and count with condition (car and motorcycle)?
When I use for example:
df = df.groupby(['person'])['vehicle'].apply(lambda x: x[x == 'car'].count())
But I can't use a list in condition lambda x: x[x in ['car']], pandas says:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
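One possible approach, offered as a sketch rather than a definitive answer: `x in [...]` asks Python for a single True/False about the whole Series, which is what triggers the ValueError, whereas `isin` compares element by element. The DataFrame below is rebuilt from the table in the question, and the `wanted` list is an assumed name.

```python
import pandas as pd

# Rebuilt from the table in the question (son has no vehicle)
df = pd.DataFrame({
    "person": ["dad", "dad", "dad", "mom", "mom", "son"],
    "vehicle": ["car", "motorcycle", "bicycle", "car", "bicycle", None],
})

wanted = ["car", "motorcycle"]  # assumed name for the condition list

# isin returns a boolean Series, and summing it counts the True values
counts = df.groupby("person")["vehicle"].apply(lambda x: x.isin(wanted).sum())

# Equivalent without apply: build the boolean mask first, then group it
counts_alt = df["vehicle"].isin(wanted).groupby(df["person"]).sum()

print(counts)
print(counts_alt)
```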
jupyter notebook link broken
You can use this link instead: github.com/justmarkham/pandas-videos/blob/master/21_more_pandas_tricks.ipynb
The meaning of 🐼 is 46!
😂
Hi Kevin, a quick question: why does the code (titanic.SibSp > 0).astype('int') convert different integers into 0 and 1?
Great question! The boolean value True gets converted to 1, and False gets converted to 0. Hope that helps!
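A quick sketch with made-up SibSp values (not the real Titanic data) shows the two steps separately:

```python
import pandas as pd

# Made-up SibSp values for illustration
titanic = pd.DataFrame({"SibSp": [0, 1, 3, 0, 5]})

# Step 1: the comparison produces a boolean Series
is_positive = titanic.SibSp > 0
print(is_positive.tolist())   # [False, True, True, False, True]

# Step 2: astype('int') converts True to 1 and False to 0
has_siblings = is_positive.astype("int")
print(has_siblings.tolist())  # [0, 1, 1, 0, 1]
```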
On trick 14, I think a better solution is to use df.style to display all columns and rows of your DataFrame. If you have a lot of rows and are only interested in the columns, just use df.head().style
Love it! Thank you for sharing! 🙌