Correlation | Heatmap | Exploratory data analysis

  • Published on 17 Feb 2020
  • In this video we do basic EDA on the House Price Prediction dataset.
    We will cover:
    - how to find correlations
    - selecting the top correlated features
    - plotting a correlation heatmap
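The first two steps listed above can be sketched with pandas alone. This is a minimal sketch, not the notebook from the video: the data is synthetic, and the column names (`GrLivArea`, `OverallQual`, `Unrelated`) and coefficients are assumptions chosen only so that the example runs on its own.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the house price data (assumed, not the real dataset).
rng = np.random.default_rng(0)
n = 200
area = rng.normal(1500, 300, n)
quality = rng.integers(1, 11, n).astype(float)
price = 100 * area + 20000 * quality + rng.normal(0, 20000, n)

data = pd.DataFrame({
    "GrLivArea": area,
    "OverallQual": quality,
    "Unrelated": rng.normal(0, 1, n),
    "SalePrice": price,
})

# Step 1: pairwise Pearson correlation between the numeric columns.
correlation = data.select_dtypes(include="number").corr()

# Step 2: the k columns most positively correlated with the target.
# The target itself always comes first because its self-correlation is 1.
k = 3
top_cols = correlation.nlargest(k, "SalePrice")["SalePrice"].index
print(top_cols.tolist())
```

Because `SalePrice` occupies the first slot, `k` has to be one larger than the number of features wanted.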
    Statistics tutorial for data science playlist: th-cam.com/users/playlist?list...
    Python tutorial to learn Python programming for data science, with examples: bit.ly/2u4zzCz
    Reach out to me if you have any questions:
    Email: baskyutsav@gmail.com
    Linkedin: / utsav-agg. .
    Website: www.skilltoai.com
  • Science & Technology

Comments • 20

  • @prachinainawa3055 3 years ago +8

    You really deserve many more likes and views. You explained why we make heatmaps, how to analyze them, and how to make meaningful decisions from them. It was practical teaching, and I love it.

    • @utsavaggarwal_ds 3 years ago +2

      Thank you, Prachi. I hope to bring more such informative content to you 😊

  • @mehradmortazavi5433 1 year ago +1

    I had hoped we could have access to the code, but thank you for all the videos. They are great!

  • @navdhaagarwal5862 4 years ago +1

    Good one...👍

  • @amiraaliIsInLove 2 years ago +3

    Wow!! Thank you so much, sir!
    You made it sound as if it were a game! For the first time in my life I've had fun while learning something.
    Please keep posting, because we'll keep watching ^_^

  • @sadhana0002 1 year ago +1

    Hi, is there a video where you explained how to predict the price?

  • @abhijeet7256 2 years ago +2

    Hi, I am new to this. Can you please tell me where I can learn this type of coding?
    Thanks

  • @dotaswot5870 3 months ago

    What about the negative highly correlated columns?
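On the question above: `nlargest` on the raw correlation values ranks a strong negative correlation last, so strongly negative features drop out of a top-k selection. One way to keep them, sketched here under assumptions, is to rank by absolute value. The data is synthetic and the column names (`HouseAge`, `GrLivArea`, `Noise`) are illustrative, not taken from the video.

```python
import numpy as np
import pandas as pd

# Synthetic data in which price falls as the (assumed) HouseAge column rises.
rng = np.random.default_rng(1)
n = 300
age = rng.uniform(0, 50, n)
area = rng.normal(1500, 300, n)
price = 100 * area - 3000 * age + rng.normal(0, 10000, n)

data = pd.DataFrame({
    "HouseAge": age,
    "GrLivArea": area,
    "Noise": rng.normal(size=n),
    "SalePrice": price,
})

# Correlation of every feature with the target, excluding the target itself.
corr_with_target = data.corr()["SalePrice"].drop("SalePrice")

# Ranking on raw values puts the strongest negative correlation at the bottom.
top_by_raw_value = corr_with_target.nlargest(1).index[0]

# Ranking on absolute values keeps the sign but orders by strength.
order = corr_with_target.abs().sort_values(ascending=False).index
ranked = corr_with_target.loc[order]
print(ranked)
```

Here `ranked` lists `HouseAge` first with a negative sign, while `top_by_raw_value` picks `GrLivArea`.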

  • @diakhadiop2128 9 months ago

    thank u fam

  • @hammadshahzad8994 2 years ago +1

    Love you

  • @hkemal2743 4 years ago +1

    Could you please share the notebooks?

  • @user-zl2pj6rx1z 10 months ago

    Hello Utsav, I have one doubt:
    numerical_feature=[feature for feature in raw_data.columns if raw_data[feature].dtypes != 'O']
    print('number of numerical variable',len(numerical_feature))
    raw_data[numerical_feature]
    discrete_feature=[feature for feature in numerical_feature if len(raw_data[feature].unique())

  • @timetraveller7513 2 years ago +1

    Could you please tell me above what correlation value in a heatmap two features are considered multicollinear?
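There is no single agreed cutoff for this. As a rough sketch, the feature pairs whose absolute correlation exceeds a chosen threshold can be listed directly instead of read off the heatmap. The `threshold = 0.8` below is an assumption (it mirrors the `vmax=0.8` used in the plotting code in this thread), and the data and column names are synthetic placeholders.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Synthetic features: GarageArea is built to track GarageCars closely.
rng = np.random.default_rng(2)
n = 300
garage_cars = rng.integers(0, 4, n).astype(float)
features = pd.DataFrame({
    "GarageCars": garage_cars,
    "GarageArea": 250 * garage_cars + rng.normal(0, 40, n),
    "YearBuilt": rng.integers(1900, 2010, n).astype(float),
})

threshold = 0.8  # assumed cutoff, adjust to the problem at hand

# Absolute correlations, so strong negative pairs are flagged as well.
corr = features.corr().abs()

# Visit each unordered pair once and keep those above the threshold.
high_pairs = [
    (a, b, corr.loc[a, b])
    for a, b in combinations(corr.columns, 2)
    if corr.loc[a, b] > threshold
]

for a, b, value in high_pairs:
    print(f"{a} vs {b}: {value:.2f}")
```

A flagged pair is only a candidate: which of the two features to drop, if either, still depends on the model and on which one relates more strongly to the target.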

  • @hkemal2743 4 years ago +2

    When I write the same code, I get an error instead of a heatmap. Any idea why?
    f , ax = plt.subplots(figsize = (14,12))
    plt.title('Correlation of Numeric Features with Sale Price',y=1,size=16)
    sns.heatmap(correlation,square = True, vmax=0.8)
    ---------------------------
    AttributeError Traceback (most recent call last)
    in
    ----> 1 f , ax = plt.subplots(figsize = (14,12))
    2
    3 plt.title('Correlation of Numeric Features with Sale Price',y=1,size=16)
    4
    5 sns.heatmap(correlation,square = True, vmax=0.8)
    AttributeError: module 'matplotlib' has no attribute 'subplots'

    • @utsavaggarwal_ds 4 years ago

      Can you send the complete notebook to my email: baskyutsav@gmail.com

    • @rajesh5201 11 months ago

      Try this:
      plt.figure(figsize=(10,10))
      plt.title("Correlation of Numeric Features with Sale Price")
      sns.heatmap(correlation, square=True, vmax=0.8)
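The message `module 'matplotlib' has no attribute 'subplots'` says that `plt` is bound to the top-level `matplotlib` package, which has no `subplots` function. The likely cause, which is an assumption since the notebook's import cell is not shown in the thread, is `import matplotlib as plt` instead of `import matplotlib.pyplot as plt`. A minimal sketch with the corrected import follows. The DataFrame is random placeholder data and its feature names are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so this runs without a display; drop this line in a notebook

import matplotlib.pyplot as plt  # pyplot provides subplots(); the top-level package does not
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder data standing in for the numeric columns of the dataset.
rng = np.random.default_rng(3)
data = pd.DataFrame(
    rng.normal(size=(100, 4)),
    columns=["OverallQual", "GrLivArea", "GarageCars", "SalePrice"],
)
correlation = data.corr()

# Same calls as in the question, drawn on an explicit Axes object.
f, ax = plt.subplots(figsize=(14, 12))
ax.set_title("Correlation of Numeric Features with Sale Price", y=1, size=16)
sns.heatmap(correlation, square=True, vmax=0.8, ax=ax)
```

After changing the import, the notebook kernel has to be restarted (or the import cell re-run) so that `plt` is rebound to `matplotlib.pyplot`.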

  • @hkemal2743 4 years ago +1

    I get the same error again:
    k=11
    cols = correlation.nlargest(k,'SalePrice')['SalePrice'].index
    print(cols)
    cm = np.corrcoef(data[cols].values.T)
    f , ax = plt.subplots(figsize = (14,12))
    sns.heatmap(cm, vmax=.8, linewidths=0.01,square=True,annot=True,cmap='viridis',linecolor="white", xticklabels = cols.values ,annot_kws = {'size':12},yticklabels = cols.values)
    ----------------------------------
    Index(['SalePrice', 'OverallQual', 'GrLivArea', 'GarageCars', 'GarageArea',
    'TotalBsmtSF', '1stFlrSF', 'FullBath', 'TotRmsAbvGrd', 'YearBuilt',
    'YearRemodAdd'],
    dtype='object')
    ---------------------------------------------------------------------------
    AttributeError Traceback (most recent call last)
    in
    3 print(cols)
    4 cm = np.corrcoef(data[cols].values.T)
    ----> 5 f , ax = plt.subplots(figsize = (14,12))
    6 sns.heatmap(cm, vmax=.8, linewidths=0.01,square=True,annot=True,cmap='viridis',linecolor="white", xticklabels = cols.values ,annot_kws = {'size':12},yticklabels = cols.values)
    AttributeError: module 'matplotlib' has no attribute 'subplots'