Correlation | Heatmap | Exploratory data analysis
- Published on 17 Feb 2020
- In this video we do basic EDA on the House Price prediction dataset.
We will cover:
- How to find correlations
- Selecting the top correlated features
- Plotting a correlation heatmap
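The first two steps listed above can be sketched roughly as follows. This is only an illustration, not the notebook from the video: the DataFrame is synthetic, its values are made up, and the column names are borrowed from the output quoted in the comments. The `nlargest` call mirrors the one a commenter posted.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the house price data (values are invented for illustration).
rng = np.random.default_rng(0)
n = 200
qual = rng.integers(1, 11, size=n)
area = rng.normal(1500, 400, size=n)
data = pd.DataFrame({
    "OverallQual": qual,
    "GrLivArea": area,
    "YrSold": rng.integers(2006, 2011, size=n),      # unrelated to price in this toy data
    "Street": rng.choice(["Pave", "Grvl"], size=n),  # non-numeric, excluded below
})
data["SalePrice"] = 20000 * qual + 80 * area + rng.normal(0, 20000, size=n)

# Pairwise Pearson correlation between the numeric columns only.
correlation = data.select_dtypes(include="number").corr()

# The k columns most positively correlated with SalePrice.
# SalePrice correlates perfectly with itself, so it always comes first.
k = 3
cols = correlation.nlargest(k, "SalePrice")["SalePrice"].index
print(cols.tolist())
```

Note that `nlargest` ranks by signed value, so strongly negative correlations end up at the bottom of the list rather than the top.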
Statistics Tutorial for data science playlist : th-cam.com/users/playlist?list...
Python Tutorial to learn Python programming for Data Science with examples: bit.ly/2u4zzCz
Reach out to me if you have any questions:
Email: baskyutsav@gmail.com
Linkedin: / utsav-agg. .
Website: www.skilltoai.com - Science & Technology
You really deserve many more likes and views. You explained why we make heatmaps, how to analyze them, and how to make meaningful decisions from them. It was practical teaching, and I love it.
Thank you, Prachi. I hope to bring you more informative content like this 😊
I had hoped we could have access to the code, but thank you for all the videos; they are great!
Good one...👍
Woooooow!! Thank you so much sir!
You made it sound as if it were a game! For the first time in my life I've had fun while learning something.
Please keep posting because we'll keep watching ^_^
Thanks 😊 I'll keep posting
Hi, is there a video where you explained how to predict the price?
Hi, I am new to this. Can you please tell me where I can learn this type of coding?
Thanks
What about the columns with a strong negative correlation?
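One common way to handle this, sketched below under stated assumptions, is to rank features by the absolute value of their correlation with the target, so that a strong negative relationship is not pushed to the bottom of the list. The data is synthetic and `HouseAge` is a hypothetical column invented for the example; it is not taken from the video.

```python
import numpy as np
import pandas as pd

# Toy data: price falls with age (negative correlation) and rises with quality.
rng = np.random.default_rng(1)
n = 200
age = rng.uniform(0, 100, size=n)
qual = rng.integers(1, 11, size=n).astype(float)
data = pd.DataFrame({
    "HouseAge": age,               # hypothetical column name
    "OverallQual": qual,
    "Noise": rng.normal(size=n),   # unrelated to price
})
data["SalePrice"] = 200000 - 1500 * age + 10000 * qual + rng.normal(0, 15000, size=n)

# Signed correlation of every feature with the target.
target_corr = data.corr()["SalePrice"].drop("SalePrice")

# Order by magnitude but keep the sign, so the direction stays visible.
order = target_corr.abs().sort_values(ascending=False).index
ranked = target_corr.reindex(order)
print(ranked)
```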
thank u fam
Love you
Could you please share the notebooks?
hello utsav , i have one doubt
numerical_feature=[feature for feature in raw_data.columns if raw_data[feature].dtypes != 'O']
print('number of numerical variable',len(numerical_feature))
raw_data[numerical_feature]
discrete_feature=[feature for feature in numerical_feature if len(raw_data[feature].unique())
Could you please tell me above what correlation value in a heatmap two features are considered multicollinear?
A correlation greater than 0.8.
@@utsavaggarwal_ds thanks 🙂
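The 0.8 rule of thumb from the reply above can be applied to the correlation matrix directly. The sketch below is an assumption about how one might do it, not code from the video: it uses invented data, takes absolute values so negative pairs also count, and looks only at the upper triangle so each pair is reported once.

```python
import numpy as np
import pandas as pd

# Toy data in which GarageArea is nearly a rescaled copy of GarageCars.
rng = np.random.default_rng(2)
n = 300
garage_cars = rng.integers(0, 4, size=n).astype(float)
data = pd.DataFrame({
    "GarageCars": garage_cars,
    "GarageArea": 250 * garage_cars + rng.normal(0, 40, size=n),
    "YearBuilt": rng.integers(1900, 2010, size=n).astype(float),
})

threshold = 0.8  # rule of thumb quoted in the thread
corr = data.corr().abs()

# Mask the diagonal and the lower triangle so each pair appears once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Flatten to (feature, feature) pairs and keep those above the threshold.
pairs = upper.stack()
high_pairs = pairs[pairs > threshold]
print(high_pairs)
```

For each pair this reports, a typical follow-up is to keep whichever feature correlates more strongly with the target and drop the other.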
When I write the same code, I get an error instead of a heatmap. Any idea why?
f , ax = plt.subplots(figsize = (14,12))
plt.title('Correlation of Numeric Features with Sale Price',y=1,size=16)
sns.heatmap(correlation,square = True, vmax=0.8)
---------------------------
AttributeError Traceback (most recent call last)
in
----> 1 f , ax = plt.subplots(figsize = (14,12))
2
3 plt.title('Correlation of Numeric Features with Sale Price',y=1,size=16)
4
5 sns.heatmap(correlation,square = True, vmax=0.8)
AttributeError: module 'matplotlib' has no attribute 'subplots'
Can you send the complete notebook to my email: baskyutsav@gmail.com
Try this:
plt.figure(figsize=(10,10))
plt.title("Correlation of Numeric Features with Sale Price")
sns.heatmap(correlation, square=True, vmax=0.8)
Error again.
k=11
cols = correlation.nlargest(k,'SalePrice')['SalePrice'].index
print(cols)
cm = np.corrcoef(data[cols].values.T)
f , ax = plt.subplots(figsize = (14,12))
sns.heatmap(cm, vmax=.8, linewidths=0.01,square=True,annot=True,cmap='viridis',linecolor="white", xticklabels = cols.values ,annot_kws = {'size':12},yticklabels = cols.values)
----------------------------------
Index(['SalePrice', 'OverallQual', 'GrLivArea', 'GarageCars', 'GarageArea',
'TotalBsmtSF', '1stFlrSF', 'FullBath', 'TotRmsAbvGrd', 'YearBuilt',
'YearRemodAdd'],
dtype='object')
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in
3 print(cols)
4 cm = np.corrcoef(data[cols].values.T)
----> 5 f , ax = plt.subplots(figsize = (14,12))
6 sns.heatmap(cm, vmax=.8, linewidths=0.01,square=True,annot=True,cmap='viridis',linecolor="white", xticklabels = cols.values ,annot_kws = {'size':12},yticklabels = cols.values)
AttributeError: module 'matplotlib' has no attribute 'subplots'
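The message `module 'matplotlib' has no attribute 'subplots'` indicates that `plt` is bound to the top-level `matplotlib` package rather than to `matplotlib.pyplot`, most likely through `import matplotlib as plt`. That would also explain why the suggested `plt.figure(...)` variant failed. A minimal sketch with the usual import follows. The data is random and stands in for the real dataset, and the `Agg` backend and the output filename are assumptions made so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook

import matplotlib.pyplot as plt  # pyplot provides subplots(); the top-level package does not
import numpy as np
import pandas as pd
import seaborn as sns

# Random stand-in data; replace with the real DataFrame.
rng = np.random.default_rng(3)
data = pd.DataFrame(
    rng.normal(size=(100, 3)),
    columns=["OverallQual", "GrLivArea", "SalePrice"],
)
correlation = data.corr()

f, ax = plt.subplots(figsize=(14, 12))
plt.title("Correlation of Numeric Features with Sale Price", y=1, size=16)
sns.heatmap(correlation, square=True, vmax=0.8, ax=ax)
f.savefig("heatmap.png")  # hypothetical filename
```

In a notebook, running `print(plt.__name__)` is a quick check: it should print `matplotlib.pyplot`. If it prints `matplotlib`, fix the import and re-run the cell.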