Was looking for a Seaborn tutorial which could really help me understand everything concisely and it's the best one I found. Thank you so much, please keep it up and introduce us to some more data visualization tools and techniques :)
Thanks a lot, ma'am.
I have recently begun my DS journey and have referred to many amazing tutorials for numpy, pandas, matplotlib, etc. But yours was the best one I have seen so far.
Thanks again
Thank you!!! I spent my time watching many tutorials that helped me very little until I found your YouTube channel. Your videos are definitely the best.
Wow -- thanks so much! Glad you are enjoying my videos and happy to hear they are helping!
Totally agree!
Thank you so much for sharing this content. I have watched several of your videos and they have made my life in the lab way better.
Thank you, that was very helpful!
Super -- very glad to hear that!
This video is very helpful. Thank you!
Most welcome! Glad it helped!
Excellent tutorial. Learned a lot. :)
At 1:32, what does the random_state argument mean? And why did you choose 22?
Hi there -- random_state just allows for reproducibility. Since I'm selecting a random subset of the data, using random_state sets the seed of my random number generator. That means if you run this code, you will get the same random rows as me.
And the number 22 is completely arbitrary! I often choose 42 in honor of "The Hitchhiker's Guide to the Galaxy." :) But you can pick any number you'd like.
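Here is a minimal sketch of the reproducibility point, assuming the subset in the video is drawn with pandas `DataFrame.sample`. The small DataFrame is a hypothetical stand-in for the diamonds data, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in data; the video samples from seaborn's diamonds dataset.
df = pd.DataFrame({
    "carat": [0.2, 0.3, 0.5, 0.7, 1.0, 1.2, 1.5, 2.0],
    "price": [350, 420, 1100, 2400, 4800, 6900, 9500, 15000],
})

# The same seed gives the same random rows on every run.
sample_a = df.sample(n=3, random_state=22)
sample_b = df.sample(n=3, random_state=22)
same_rows = sample_a.equals(sample_b)

# Any other integer works too; it simply fixes a different draw.
sample_c = df.sample(n=3, random_state=42)

print(same_rows)
```

Leaving `random_state` out entirely gives a fresh random subset each time the code runs.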
@@KimberlyFessel Thank you😃
Great set of videos! Could you also do one explaining which plots to use when? I know it comes with practice.
Thanks -- I love this idea! I have been thinking about doing a few videos showing my thought process when selecting which figure to use and how I select my styling. It definitely comes with practice and depends on the data story, but maybe it would be helpful to show the thought process for a few examples. 👍
@@KimberlyFessel Yes please! As a beginner, I mostly end up using bar plot, histograms and scatter plots. I would really appreciate if you could make a video on this.
When setting hue, cuts with zero rows, like 'Fair' and 'Ideal', still show up in the plot. How do I get rid of them?
hey nice video!
The 'cut' column contains only two categories, 'Premium' and 'Good', but the legend shows all the categories. How is that possible?
Yes, this is a relatively recent seaborn update. The data that come with seaborn have "category" data types for the strings. This means they have a property called .cat.categories. This gives all the categories (even the ones that aren't present), and this is what seaborn builds the legend from. You can override this either by not having the category data types (converting to strings, say) or by setting hue_order like I did in my recent countplot video here: th-cam.com/video/8U5h3EJuu8M/w-d-xo.html
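A rough sketch of the `hue_order` route, using a small made-up DataFrame in place of the diamonds data (the values are assumptions, only the column names follow the video). It shows that filtering keeps every category on the column, and that `hue_order` limits what the legend lists.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for diamonds, with "cut" stored as a category dtype.
cuts = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
df = pd.DataFrame({
    "carat": [0.3, 0.5, 0.7, 0.9, 1.1, 1.3],
    "price": [400, 1200, 2500, 3900, 5600, 7800],
    "cut": pd.Categorical(
        ["Premium", "Good", "Ideal", "Premium", "Fair", "Good"],
        categories=cuts,
    ),
})

# Filtering the rows does not drop the unused categories.
subset = df[df["cut"].isin(["Premium", "Good"])]
all_levels = list(subset["cut"].cat.categories)  # still all five cuts

# hue_order restricts the hue levels (and so the legend) to the ones listed.
fig, ax = plt.subplots()
sns.scatterplot(
    data=subset, x="carat", y="price",
    hue="cut", hue_order=["Premium", "Good"], ax=ax,
)
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```

The order given in `hue_order` also sets the order of the legend entries.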
I love your videos. You will make me a data analyst soon!
Yes! All the best on your data analyst journey!
Thanks!
Welcome!
Hi! Could you explain how you get the legend to show the correct 'custom' markers, if this is even possible? (Referring to 9:10)
Thanks a lot for these videos!
Hi there -- the legend in the figure you referenced was autogenerated by Seaborn, so you can match up the hue to the diamond cut and the marker shape to the diamond color. But a couple of things:
1. With the most recent update, Seaborn treats many columns as the "category" data type rather than strings. This means that if you do the filtering like I did in this video (with the old version of Seaborn), you will see many unused categories in the legend. You can either convert these columns to strings (with pandas .astype() method) or drop these unused categories (with pandas cat.remove_unused_categories() method: pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.cat.remove_unused_categories.html#pandas.Series.cat.remove_unused_categories) to make your legend look like mine.
2. Also -- you can create your own custom legend if you'd like to match up the color and marker and, say, use labels like "Premium, F", "Premium, D", etc. This StackOverflow post walks you through how you could do that: stackoverflow.com/questions/54682473/change-legend-location-and-labels-in-seaborn-scatter-plot
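The two fixes from point 1 can be sketched with pandas alone. The Series below is a made-up stand-in for the diamonds 'cut' column, so treat the values as assumptions.

```python
import pandas as pd

# Hypothetical stand-in for diamonds["cut"] as a category dtype.
cuts = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
cut = pd.Series(
    pd.Categorical(["Premium", "Good", "Ideal", "Premium", "Fair"], categories=cuts),
    name="cut",
)

# After filtering, the unused categories are still attached to the column,
# so value_counts() reports them with a count of 0.
kept = cut[cut.isin(["Premium", "Good"])]
counts_before = kept.value_counts()

# Option 1: keep the category dtype but drop the levels that no longer appear.
trimmed = kept.cat.remove_unused_categories()

# Option 2: convert to plain strings, which carry no category list at all.
as_text = kept.astype(str)

print(list(trimmed.cat.categories))
```

Option 1 keeps the original ordering of the remaining categories, which can matter for legend order; option 2 discards that ordering.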
Hi Kimberly, Thank you very much for the course! I especially like your clear explanations. I would like to add some information. I am using seaborn 0.11.2. In the diamonds dataset, cut, color and clarity are now of type category. So the filtered-out values also appear in the legend when hue is used. They also appear with a count of 0 when you type diamonds['cut'].value_counts(). To avoid this I changed these fields back to strings with diamonds['cut'] = diamonds['cut'].astype('str'). Do you have a better solution for this? And please add this information to your notes or to the new version of this video. Thank you again.
Yes, I had the same problem!
Thanks a lot!
thank you so much maam
thanks!! this helped a lot
Awesome -- glad it helped!
Hi, Dr. Fessel! Thank you so much for the video! How can I add a trend line on the scatter plot?
Try using seaborn's lmplot()
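A minimal sketch of that suggestion, with synthetic data standing in for the diamonds columns (the numbers are made up). `lmplot` draws the scatter points plus a fitted linear regression line.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, roughly linear data; an assumption for illustration only.
rng = np.random.default_rng(22)
carat = rng.uniform(0.2, 2.0, 50)
df = pd.DataFrame({
    "carat": carat,
    "price": 4000 * carat + rng.normal(0, 300, 50),
})

# lmplot is figure-level: it creates its own figure and returns a FacetGrid.
grid = sns.lmplot(data=df, x="carat", y="price")

# The fitted trend line is drawn as a regular line on the grid's only axes.
ax = grid.axes.flat[0]
n_lines = len(ax.lines)
```

If the trend line needs to go on an existing matplotlib axes instead, seaborn's axes-level `regplot` takes an `ax` argument.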
How do you make a midpoint for the colors?
How do I add my own text to a scatter plot?
Hi there - you can add text to seaborn figures by using matplotlib pyplot's text function. And here is my past video on that: th-cam.com/video/NBYzSaTbodM/w-d-xo.html You can even automatically update the text positions so they don't overlap on the scatterplot using a library called AdjustText (Video here: th-cam.com/video/xSS59Ga64rQ/w-d-xo.html)
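As a rough sketch of the `plt.text` approach: the DataFrame and its "label" column are hypothetical, and the small x offset is just one way to keep the text off the marker. AdjustText is not used here.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data with one text label per point.
df = pd.DataFrame({
    "carat": [0.3, 0.7, 1.1, 1.5],
    "price": [400, 2500, 5600, 9500],
    "label": ["A", "B", "C", "D"],
})

fig, ax = plt.subplots()
sns.scatterplot(data=df, x="carat", y="price", ax=ax)

# plt.text(x, y, s) places the string s at data coordinates (x, y)
# on the current axes, which is the scatterplot's axes here.
for row in df.itertuples():
    plt.text(row.carat + 0.02, row.price, row.label)

n_texts = len(ax.texts)
```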
Hi, may I know how to replace the hue legend with a colorbar?
Hi there - once you create your color bar you can just add it to the figure and remove the current legend. This reference should be able to help you out: stackoverflow.com/questions/62884183/trying-to-add-a-colorbar-to-a-seaborn-scatterplot
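One way this could look, along the lines of the linked StackOverflow thread. The data, the numeric "depth" hue column, and the "viridis" palette are all assumptions for the sketch.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

# Hypothetical data with a numeric column to map to color.
df = pd.DataFrame({
    "carat": [0.3, 0.5, 0.7, 0.9, 1.1, 1.3],
    "price": [400, 1200, 2500, 3900, 5600, 7800],
    "depth": [58.0, 59.5, 61.0, 62.5, 64.0, 65.5],
})

fig, ax = plt.subplots()

# Use one normalization for both the points and the colorbar so they agree.
norm = Normalize(vmin=df["depth"].min(), vmax=df["depth"].max())
sns.scatterplot(
    data=df, x="carat", y="price",
    hue="depth", palette="viridis", hue_norm=norm, ax=ax,
)

# Build the colorbar from a ScalarMappable with the same colormap and norm.
mappable = ScalarMappable(norm=norm, cmap="viridis")
mappable.set_array([])  # only needed on older matplotlib versions

# Remove the autogenerated legend, then add the colorbar to the figure.
ax.get_legend().remove()
cbar = fig.colorbar(mappable, ax=ax, label="depth")
```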
Please also explain the remaining parameters.
Tried to annotate the figure:
#plt.text(x_pos, y_pos, f"y = {m:.2f}x {b:.1f}", bbox=dict(facecolor='white', alpha=0.5))
#plt.text(5, 5, f'R$^2$ = {R_value:.4f}', bbox=dict(facecolor='red', alpha=0.5))
plt.annotate(
# Label and coordinate
'R$^2$ = {R_value:.4f}', xy=(5, 50), xytext=(0, 80),
# Custom arrow
bbox=dict(facecolor='red', alpha=0.5))
but no matter what I try, my y values don't move up or down.
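A hedged sketch of what might be going on in the snippet above, not a confirmed diagnosis. Two things stand out: the string passed to `plt.annotate` has no `f` prefix, so `{R_value:.4f}` is printed literally, and `xytext` is read in data coordinates by default, so (0, 80) may land outside the visible axis range. The value of `R_value` and the plotted line are placeholders.

```python
import matplotlib.pyplot as plt

R_value = 0.9876  # placeholder; the original value is not shown in the comment

fig, ax = plt.subplots()
ax.plot([0, 10], [0, 100])  # placeholder data so the coordinates are visible

note = ax.annotate(
    f"R$^2$ = {R_value:.4f}",          # f prefix so the value is filled in
    xy=(5, 50),                         # the point being annotated
    xytext=(0, 80),                     # where the text box sits
    textcoords="data",                  # xytext is in data units (the default)
    arrowprops=dict(arrowstyle="->"),   # without this, no arrow is drawn
    bbox=dict(facecolor="red", alpha=0.5),
)
label = note.get_text()
```

Changing the second number in `xytext` should move the box up or down as long as that value is within the y-axis limits; `textcoords="axes fraction"` is an alternative if the box should stay at a fixed spot in the axes.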
I love you Kimberly🤩🤩🤩🤩