Data Science Best Practices with pandas (PyCon 2019)

  • Published on Oct 15, 2024
  • The pandas library is a powerful tool for multiple phases of the data science workflow, including data cleaning, visualization, and exploratory data analysis. However, the size and complexity of the pandas library make it challenging to discover the best way to accomplish any given task.
    In this tutorial, you'll use pandas to answer questions about a real-world dataset. Through each exercise, you'll learn important data science skills as well as "best practices" for using pandas. By the end of the tutorial, you'll be more fluent at using pandas to correctly and efficiently answer your own data science questions.
    EXERCISES:
    05:14 1. Introduction to the TED Talks dataset
    10:45 2. Which talks provoke the most online discussion?
    18:58 3. Visualize the distribution of comments
    34:20 4. Plot the number of talks that took place each year
    50:30 5. What were the "best" events in TED history to attend?
    1:01:28 6. Unpack the ratings data
    1:13:36 7. Count the total number of ratings received by each talk
    1:22:55 8. Which occupations deliver the funniest TED talks on average?
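
To give a flavor of exercises 2, 6, and 7, here is a minimal sketch on a tiny made-up DataFrame. The column names (`title`, `comments`, `views`, `ratings`) and the string format of the ratings column are assumptions for illustration, not taken from the tutorial notebook, so check them against the actual dataset before reusing any of this.

```python
import ast

import pandas as pd

# Hypothetical stand-in rows; the real dataset's columns and values may differ.
talks = pd.DataFrame({
    "title": ["Talk A", "Talk B", "Talk C"],
    "comments": [100, 50, 20],
    "views": [1000, 10000, 100],
    # Assumed format: each cell is a string that looks like a list of dicts.
    "ratings": [
        "[{'name': 'Funny', 'count': 30}, {'name': 'Inspiring', 'count': 70}]",
        "[{'name': 'Funny', 'count': 5}, {'name': 'Inspiring', 'count': 15}]",
        "[{'name': 'Funny', 'count': 1}, {'name': 'Inspiring', 'count': 1}]",
    ],
})

# Exercise 2 (one possible approach): normalize comments by views so that
# widely watched talks do not dominate simply because of their audience size.
talks["comments_per_view"] = talks["comments"] / talks["views"]
most_discussed = (
    talks.sort_values("comments_per_view", ascending=False)["title"].tolist()
)

# Exercise 6 (one possible approach): parse each string into a real list of
# dicts. ast.literal_eval only evaluates Python literals, unlike eval.
talks["ratings_list"] = talks["ratings"].apply(ast.literal_eval)


def total_ratings(rating_dicts):
    """Sum the 'count' field across all rating categories for one talk."""
    return sum(d["count"] for d in rating_dicts)


# Exercise 7: total number of ratings received by each talk.
talks["num_ratings"] = talks["ratings_list"].apply(total_ratings)

print(most_discussed)
print(talks[["title", "num_ratings"]])
```

Keeping the parsed list in its own column (`ratings_list`, a name chosen here) leaves the original strings untouched, which makes it easier to re-check the parsing step later.
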
    DOWNLOAD the dataset and Jupyter notebook:
    github.com/jus...
    WATCH my introductory series, Data Analysis with pandas:
    • Data analysis in Pytho...
    JOIN the "Data School Insiders" community:
    / dataschool
    LET'S CONNECT!
    Email Newsletter: www.dataschool...
    LinkedIn: / justmarkham
    Twitter: / justmarkham
    Facebook: / datascienceschool
    YouTube: www.youtube.co...

Comments • 294