How do I handle missing values in pandas?

  • Published on 14 Oct 2024

Comments • 376

  • @dataschool
    @dataschool  6 years ago +20

    In pandas version 0.21 (released October 2017), they added 'isna' and 'notna' as aliases for 'isnull' and 'notnull'. Learn more in my latest video, "5 new changes in pandas you need to know about": th-cam.com/video/te5JrSCW-LY/w-d-xo.html

    • @bragattemas
      @bragattemas 4 years ago +2

      Even at the end of 2019, your material from 2016 still gives incredible help.
      I am certain that Data School will keep being a success and helping people.
      Excellent job, Kevin Markham. Thanks.

    • @Taranggpt6
      @Taranggpt6 4 years ago

      Why is the count different after replacing NA with *various*?
      The count of VARIOUS should equal the earlier number of NA values, which was 2644.

    • @EdgardThreat
      @EdgardThreat 4 years ago

      @@Taranggpt6 Hi, that's because there is already a category named "VARIOUS" in the dataset, so the newly filled-in values get added to the existing count of "VARIOUS".

    • @vigneshpadmanabhan
      @vigneshpadmanabhan 3 years ago

      Can we get a video on how to handle missing values for date/time-related datasets, maybe sensor values or other sensitive values? Multiple ways of handling missing values would be very useful.

    • @nadyamoscow2461
      @nadyamoscow2461 3 years ago +1

      @@bragattemas I must say even in 2021 it is still completely up to date
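
The `isna`/`notna` aliases mentioned at the top of this thread can be checked on a small frame. This is only a sketch: the data below is a made-up stand-in for the video's UFO dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the UFO data used in the video; the values are invented.
ufo = pd.DataFrame({
    "City": ["Ithaca", np.nan, "Holyoke"],
    "Shape Reported": ["TRIANGLE", "OTHER", np.nan],
})

# isna/notna (pandas 0.21 and later) produce the same boolean masks
# as the older isnull/notnull spellings.
same_missing = ufo.isna().equals(ufo.isnull())
same_present = ufo.notna().equals(ufo.notnull())

# Summing a boolean mask counts the missing values in each column.
missing_per_column = ufo.isna().sum()

print(same_missing, same_present)
print(missing_per_column)
```

Either spelling should work in current pandas; which one to use is mostly a matter of taste.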

  • @339059331
    @339059331 3 years ago +33

    I like his way of teaching; he doesn't assume that the audience knows things by default. He breaks down the explanation piece by piece. It is a great learning experience, with concise and clearly stated lectures as always! Thanks!

    • @dataschool
      @dataschool  3 years ago +1

      You're very welcome! Thanks for your kind words!

    • @depokboy
      @depokboy 3 years ago +2

      @@dataschool First time watching. Those positive comments are true. Thanks a lot!

  • @codesandroads
    @codesandroads 4 years ago +14

    I never leave this place unsatisfied or without answers, total treasure.

    • @dataschool
      @dataschool  4 years ago

      Thank you so much!

  • @nasserabachi9625
    @nasserabachi9625 7 years ago +10

    Now I am in love with pandas just from seeing a couple of your videos. Shukran jazeelan!

    • @dataschool
      @dataschool  7 years ago +2

      That's awesome! Thanks for sharing!

  • @BAIBHAVPATHYBEE
    @BAIBHAVPATHYBEE 1 year ago

    Six years have gone by since this video was released, and I am watching it now, and it still made me fall in love with the series... every concept is beautifully explained in detail.

    • @dataschool
      @dataschool  1 year ago

      Thank you so much! 🙏

  • @luqikong283
    @luqikong283 4 years ago +3

    The most amazing python tutorial I've watched so far. Fell in love with python.

  • @kuldipchauhan524
    @kuldipchauhan524 6 years ago +20

    Awesome, you are gifted. Your explanation and content are clean and effective.

    • @dataschool
      @dataschool  6 years ago

      Thanks very much for your kind words!

  • @rahuldeepdraws8699
    @rahuldeepdraws8699 3 years ago +1

    This is actually the most clearly explained video on DataFrames that I have ever come across. Glad I found you. Thank you so much.

    • @dataschool
      @dataschool  3 years ago +1

      Glad it was helpful!

  • @gabrielreilly7010
    @gabrielreilly7010 3 years ago +1

    Great videos covering the basics. I enjoy how the additional values within the functions are covered, i.e. axis, etc.

    • @dataschool
      @dataschool  3 years ago +1

      Glad it was helpful!

  • @kiranachanta9741
    @kiranachanta9741 5 years ago +4

    I have been watching Kevin's videos; needless to say, he is an awesome instructor. The explanations in all of his videos are conceptual and in-depth, and break down any complex topic in the easiest way.
    Thanks, Kevin, for your great work!!!
    It would be great if you could make videos on visualization using Matplotlib & Seaborn.

    • @dataschool
      @dataschool  5 years ago

      Thanks for your kind words, and for your suggestions! :)

  • @saachishivhare4836
    @saachishivhare4836 4 years ago

    I am really loving your videos. Explored your channel just 2 days back!! Earlier I had no idea about pandas but after watching your video, I feel that I will be able to work on my assignment. Great Work! Thank you!

  • @FULLCOUNSEL
    @FULLCOUNSEL 7 years ago +23

    You are doing an excellent job. You are called to do this for sure. Cheers

    • @dataschool
      @dataschool  7 years ago

      Wow, thank you so much for your comment! I really appreciate it.

  • @stevechops3226
    @stevechops3226 3 years ago

    I cannot tell you how much you have helped me, with all sorts of problems! You have the clearest way of explaining things, thank you so much!

    • @dataschool
      @dataschool  3 years ago

      You're so very welcome, thanks for your kind words! 🙏

  • @astroinceptor
    @astroinceptor 2 years ago +1

    You saved my life twice today, your videos are great and the way you explain is really good. Thank you!

  • @fredcalo
    @fredcalo 7 years ago

    This was awesome. Clear, concise, incredibly easy to follow. Your explanations (and bonus) were exactly what I was looking for.

    • @dataschool
      @dataschool  7 years ago

      Excellent! I'm glad the video was helpful to you!

  • @sinabaghaei3504
    @sinabaghaei3504 3 years ago

    Your way of teaching makes learning Data Analysis very interesting to me. I really appreciate and wish you success.

  • @mingqian813
    @mingqian813 3 years ago +2

    Thanks for all your well-made videos! I got to know you and your classes from Datacamp. As a beginner in the ML field, please allow me to ask a silly question. So if we have categorical features with missing values, do we need to handle missing values first then do categorical feature transformation using encoders? Or the order doesn't matter? Thanks!

    • @dataschool
      @dataschool  3 years ago

      Great question! Previous to scikit-learn 0.24, missing values need to be handled first if you are going to one-hot encode them. Starting in 0.24, OneHotEncoder can handle missing values itself. Hope that helps!
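
The "handle missing values first, then encode" order from the reply above could look roughly like this. It is a sketch under assumptions: the `Embarked` column and its values are invented, and most-frequent imputation is just one possible way to handle the gap.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical feature with one missing value.
X = pd.DataFrame({"Embarked": ["S", "C", np.nan, "S"]})

# Step 1: fill the missing entry with the most frequent category ("S" here).
imputer = SimpleImputer(strategy="most_frequent")
X_filled = imputer.fit_transform(X)

# Step 2: one-hot encode the now-complete column.
encoder = OneHotEncoder()
X_encoded = encoder.fit_transform(X_filled).toarray()

categories = list(encoder.categories_[0])
print(categories)
print(X_encoded)
```

With scikit-learn 0.24 or later, the reply suggests the encoder can also be applied to the unfilled column directly; the two-step version is the one that works on older releases too.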

  • @Person_Not_Known
    @Person_Not_Known 6 years ago +2

    Thanks for your videos. Most of the online Python courses I took, I just couldn't get into. Something about your cadence, data sets, and/or approach just clicks with me. Thanks for the content.

    • @dataschool
      @dataschool  6 years ago

      That's awesome! Thanks so much for sharing!

  • @TR3NDSETR
    @TR3NDSETR 4 years ago

    Thanks so much for making this video. You spoke slowly, clearly, and very concisely. With other videos I have to rewind and watch again, but I don't have to do that here. Looking forward to watching others.

    • @dataschool
      @dataschool  4 years ago

      That's awesome to hear! Thanks for watching my videos 👍

  • @TrevorHigbee
    @TrevorHigbee 4 years ago +1

    Great videos. I love how all the CSVs are available online.

  • @indreshkumar2002
    @indreshkumar2002 7 years ago +1

    You are superb. I took a paid course, but they were not able to explain these things to me the way you did, in such an easy way. Thanks a lot.

    • @dataschool
      @dataschool  7 years ago

      You are very welcome! Thanks so much for your kind comment!

  • @ekehopenkiruka
    @ekehopenkiruka 6 years ago

    Awesome, simple, and straight to the point with code, which is what I have been looking for for weeks. Thank you so much. Do you have any video where you have used the NSL-KDD or KDD 99 data set to demonstrate data pre-processing? This is driving me nuts.

    • @dataschool
      @dataschool  6 years ago

      I'm sorry, I don't have a video like that... good luck!

  • @sammy0722
    @sammy0722 4 years ago +1

    Good video. Learned a lot in a short and crisp way.

  • @SachinGairola
    @SachinGairola 6 years ago

    Great video series. I always fall back here whenever I'm stuck. Thanks for making them so informative. Cheers!

    • @dataschool
      @dataschool  6 years ago

      Thanks very much for your kind words!

  • @jiaxufan7050
    @jiaxufan7050 7 years ago

    The best pandas tutorial ever. Hands down.

    • @dataschool
      @dataschool  7 years ago

      Wow! Thank you so much for your kind comment!

  • @rahulbhusari1478
    @rahulbhusari1478 2 years ago +1

    Really clear and amazing tutorial

    • @dataschool
      @dataschool  2 years ago

      Glad it was helpful!

  • @elilavi7514
    @elilavi7514 8 years ago +14

    Awesome as usual !

  • @bagushari1886
    @bagushari1886 1 year ago +1

    Could you please make a video on how to handle missing values in multiple sheets in pandas? Or is there any source you would recommend that I can read about it?
    Thanks in advance.

    • @dataschool
      @dataschool  1 year ago

      Thanks for your suggestion!

  • @grumpyae86
    @grumpyae86 3 years ago +1

    Exceptional would be a single word to describe your tutorial. Looking forward to binging on your videos lol. Thank you for such clear explanation.

  • @sapnasinha804
    @sapnasinha804 5 years ago

    Fantastic explanation. However, at the end it would be good to mention that there are more ways to fill missing values, e.g. with the mean of the other values, and not just by merging the nulls into another existing category. Cheers!

  • @nadyamoscow2461
    @nadyamoscow2461 3 years ago +1

    Thanks a lot, your course is really helpful and very detailed. You are a great teacher!

  • @msctube45
    @msctube45 4 years ago

    Excellent video Data School, very helpful, your explanations are clear and objective. Thank you !

  • @firdharamadhani5162
    @firdharamadhani5162 4 years ago

    I rarely leave YouTube comments, but thank you!! If it weren't for your video, I wouldn't understand how to do my assignment at all. You did a great job explaining!

  • @rishimusicprods
    @rishimusicprods 3 years ago

    This video is quite helpful and easy to understand. Thanks a lot!

  • @mustafabohra2070
    @mustafabohra2070 5 years ago +1

    The content you shared is Gold!!

  • @konradpyrz8559
    @konradpyrz8559 3 years ago +1

    This young gentleman is simply amazing.

    • @dataschool
      @dataschool  3 years ago

      Thank you! I'm actually 40 years old now 😊

  • @tommonks2490
    @tommonks2490 4 years ago +1

    Great explanation. This was a huge help. Thanks so much!

    • @dataschool
      @dataschool  4 years ago

      You're very welcome!

  • @vishwanathg8083
    @vishwanathg8083 7 years ago

    Thank you. You made learning pandas a cakewalk.

    • @dataschool
      @dataschool  7 years ago

      Awesome, that's great to hear!

  • @ashishsahu2925
    @ashishsahu2925 3 years ago

    Really helpful. This means if one needs to figure out number of rows with 1 or more Null values, the code should look like dataframe[dataframe.isnull().sum(axis=1) > 0].
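
The expression in the comment above selects the rows themselves; counting them takes one more step. A minimal sketch on invented data:

```python
import numpy as np
import pandas as pd

# Made-up frame: row 0 is complete, rows 1-3 each have at least one null.
df = pd.DataFrame({
    "a": [1, np.nan, 3, np.nan],
    "b": ["x", "y", np.nan, np.nan],
    "c": [1, 2, 3, 4],
})

# The commenter's expression: keep rows whose per-row null count is positive.
rows_with_null = df[df.isnull().sum(axis=1) > 0]
n_rows_with_null = len(rows_with_null)

# An equivalent count that skips building the filtered frame.
n_rows_alt = int(df.isnull().any(axis=1).sum())

print(n_rows_with_null, n_rows_alt)
```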

  • @-MinhazulFerdous
    @-MinhazulFerdous 6 years ago

    You are a lifesaver, man... I was stuck on errors caused by only 2 missing values in 1000 rows of data.

    • @dataschool
      @dataschool  6 years ago

      Glad to hear I could be of help!

  • @azbas
    @azbas 7 years ago

    Thank you for simple and detailed explanation including the use of features.

    • @dataschool
      @dataschool  7 years ago

      You're very welcome!

  • @MrsRimouch
    @MrsRimouch 2 years ago

    Thank you so much. It is always clearer to listen to you!!

    • @dataschool
      @dataschool  2 years ago +1

      You are so welcome!

  • @wesleypgurira7142
    @wesleypgurira7142 2 years ago +1

    And by the way, I love the way you teach. It's just perfect.

  • @BrokenLightPole
    @BrokenLightPole 5 years ago +1

    Great video and explanation as always!

  • @dishydez
    @dishydez 4 years ago

    Great video, by the way. Just a quick question: I am trying to build a benchmark. Would it be okay to standardize the data before creating it?

  • @nimesharya909
    @nimesharya909 8 years ago +3

    Amazing, clear, precise and I got it working as well :)

    • @dataschool
      @dataschool  8 years ago +2

      Great to hear!

  • @RahulKumar-bh9hb
    @RahulKumar-bh9hb 4 years ago

    Your explanation technique is great. I want to thank you for sharing your knowledge. Great videos!

  • @rajpaul1501
    @rajpaul1501 3 years ago

    Truly amazing videos. Can you do a series on Matplotlib and Seaborn?

    • @dataschool
      @dataschool  3 years ago

      Thanks for your kind words and suggestion!

  • @ExcelTutorials1
    @ExcelTutorials1 2 years ago +1

    This is super helpful, thank you!!!!!

    • @dataschool
      @dataschool  2 years ago

      Glad it was helpful!

  • @TheBeltranito
    @TheBeltranito 3 years ago

    Hey, first of all, thanks a lot for your videos!
    One question regarding the fillna() method you use. At the end of the video, when you check the NAs in Shape Reported, it says that there are 2644 NaN. However, after you use the fillna() method, there are 2977 VARIOUS. I don't understand why there are more VARIOUS than NaN.
    Thanks in advance.

    • @TheBeltranito
      @TheBeltranito 3 years ago +1

      Okay, never mind, there was already a group called VARIOUS with 333 observations.

  • @adanpalma9160
    @adanpalma9160 5 years ago

    Kevin, thank you very much for taking so much time to teach us about pandas and much more. I have a little question. When you ran this statement,
    ufo['Shape Reported'].value_counts(dropna = False), the output showed 2644 rows for NaN.
    But when you ran the fillna, ufo['Shape Reported'].fillna(value='VARIOUS',inplace=True), VARIOUS showed 2977. Shouldn't it show 2644?
    Thanks for any comments.

    • @dataschool
      @dataschool  5 years ago

      Great question! I think that the Shape Reported column already had the value "VARIOUS" for some rows, and then I set that value for an additional 2644 rows.
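
The arithmetic behind this answer (2644 filled values plus the 333 existing "VARIOUS" rows reported elsewhere in these comments gives 2977) can be reproduced in miniature. The series below is invented and much smaller than the real column.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for ufo['Shape Reported']: 2 existing "VARIOUS", 3 missing.
shape = pd.Series(["VARIOUS", "LIGHT", np.nan, np.nan, "VARIOUS", np.nan])

n_missing = int(shape.isna().sum())
n_various_before = int((shape == "VARIOUS").sum())

# Filling NaN with a label that already exists adds to that label's count.
filled = shape.fillna("VARIOUS")
n_various_after = int((filled == "VARIOUS").sum())

print(n_missing, n_various_before, n_various_after)
```

Checking `value_counts()` before filling is one way to notice that the chosen fill label is already in use.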

  • @carolinasantoslages5604
    @carolinasantoslages5604 4 years ago

    Excellent! I still have one question: how do I create a third column (dummy variable) based on two other columns (dummy variables), considering that they have missing values? I don't want to lose information; in other words, I want to treat the pair (NaN, 1) or (0, NaN) as 1 or 0.

  • @saranyan4123
    @saranyan4123 7 years ago

    Thank you for the clear and quick explanation. Very helpful !!

  • @LonglongFeng
    @LonglongFeng 7 years ago

    very nice tutorial, your style of teaching is awesome like an amazing opera singer

    • @dataschool
      @dataschool  7 years ago

      What a compliment, thanks! :)

  • @PointlessVanessa
    @PointlessVanessa 6 years ago

    Great video! I learned a lot! I just wish you had talked about non-discrete values as well. I'm having some trouble replacing missing numerical data, and I don't want to replace it with zero because that would bias my dataset. My goal is to replace the missing data with the mean of the data that I have.

    • @PointlessVanessa
      @PointlessVanessa 6 years ago

      The only problem is that I don't know how to do that (yet).

    • @dataschool
      @dataschool  6 years ago

      Glad you liked the videos! You can do something like this: df.fillna(df.mean())
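
A slightly expanded version of the `df.fillna(df.mean())` suggestion, as a sketch on invented data. The `numeric_only=True` argument is an addition here: recent pandas versions raise an error when `mean()` meets a text column, so it is passed explicitly.

```python
import numpy as np
import pandas as pd

# Made-up frame with two numeric columns and one text column.
df = pd.DataFrame({
    "height": [150.0, np.nan, 170.0],
    "weight": [50.0, 60.0, np.nan],
    "name": ["a", None, "c"],
})

# Column means, computed over the numeric columns only.
numeric_means = df.mean(numeric_only=True)

# fillna with a Series fills each column using the value under its own label;
# columns without an entry (here "name") are left untouched.
filled = df.fillna(numeric_means)

print(filled)
```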

  • @rohitsinghal2972
    @rohitsinghal2972 4 years ago +1

    Sir what you told in the video is applicable only for the numbers and what should be done for the string values?

    • @dataschool
      @dataschool  4 years ago +1

      If you like, you can impute missing string values with the most common values using scikit-learn's SimpleImputer.

  • @HasanSuper
    @HasanSuper 3 years ago

    What a beautiful video and such great explanation. Beautiful. Keep it up

  • @gouravkushwaha4488
    @gouravkushwaha4488 5 years ago

    You are good. Your explanation really made it simple.

  • @mdfaiz4583
    @mdfaiz4583 5 years ago

    great tutor...great way of making us understand.... so easy and intuitive

  • @rehmanullahkhan7389
    @rehmanullahkhan7389 7 years ago

    Thank you so much. Very useful. I have no words to express my appreciation. I liked your way of teaching very much. You have become my ideal in teaching.

    • @dataschool
      @dataschool  7 years ago

      You're very welcome! Thanks so much for your kind words!

  • @HJ-uy6ez
    @HJ-uy6ez 3 years ago +1

    Great video. How do I get the code? Does anyone know, or can anyone help? Thanks.

    • @dataschool
      @dataschool  3 years ago +1

      See here: nbviewer.jupyter.org/github/justmarkham/pandas-videos/blob/master/pandas.ipynb Hope that helps!

  • @oysteijo
    @oysteijo 8 years ago +3

    Hi Kevin!
    How can I fill na based on a condition? Say I want to fill NA for all missing cities, but only if the color is red.

    • @dataschool
      @dataschool  8 years ago +7

      Great question! ufo.loc[(ufo.City.isnull()) & (ufo['Colors Reported']=='RED'), 'City'] = 'New value'

    • @ashishkhuraishy
      @ashishkhuraishy 6 years ago

      Man thx btw😁
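
The conditional fill from the reply in this thread, run on a small made-up frame. The column names follow the UFO dataset from the video; the rows are invented.

```python
import numpy as np
import pandas as pd

# Invented rows: only row 0 has both a missing City and the color RED.
ufo = pd.DataFrame({
    "City": [np.nan, np.nan, "Ithaca"],
    "Colors Reported": ["RED", "GREEN", "RED"],
})

# Build the condition once, then assign through .loc so the original
# frame is modified rather than a temporary copy.
mask = ufo["City"].isnull() & (ufo["Colors Reported"] == "RED")
ufo.loc[mask, "City"] = "New value"

print(ufo)
```

The parentheses around each comparison matter, because `&` binds more tightly than `==` in Python.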

  • @inasbadr
    @inasbadr 4 years ago

    Thanks a lot for your valuable video. I have a question: how can I drop all the rows containing NaNs, except the rows that have values in one or two particular columns?

  • @康朵朵-d1r
    @康朵朵-d1r 5 years ago

    All the content in the video is presented clearly!!! Thanks very much!! We love you!!

    • @dataschool
      @dataschool  5 years ago

      And we love you! 😉

  • @akashjoshi6826
    @akashjoshi6826 7 years ago

    Fantastic video, Sir. Your work is really commendable. It would be great if you could make a video about imputing missing values in Python.

    • @dataschool
      @dataschool  7 years ago

      Thanks for your suggestion, and your kind comments!

  • @MrJioYoung
    @MrJioYoung 4 years ago +1

    Thank you! Great instructions!

  • @vijaysinghchauhan5118
    @vijaysinghchauhan5118 4 years ago

    Have you posted any video on how to replace NaN values in a column by deriving them from other columns, e.g. using KNN or another imputation technique?

  • @tyl9680
    @tyl9680 6 months ago

    In the last part of the video, why doesn't the number of "VARIOUS" produced by fillna match the earlier NA count?

  • @jieqi6341
    @jieqi6341 5 years ago

    Thank you so much! you are amazing as always ! I really appreciate it ! Please don't stop making these videos !

  • @nataliyakunderevych1211
    @nataliyakunderevych1211 6 years ago +1

    Super. I understood everything. Nice explanation

  • @alexhenning7086
    @alexhenning7086 4 years ago +1

    Superb video! Thanks a lot, it helps a lot!

  • @yashugarg1815
    @yashugarg1815 5 years ago +2

    Question: Sir, suppose I want to convert a value such as 5 to NA, meaning that wherever 5 is present in a DataFrame it is replaced by NA. How should I proceed?
    Thanks

    • @mountainscott5274
      @mountainscott5274 5 years ago +2

      df.column_name.replace(5, np.nan, inplace = True)
      check to make sure values are replaced with df.info()

    • @dataschool
      @dataschool  4 years ago

      Nice!
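
The `replace(5, np.nan)` idea from this thread, sketched on an invented frame. Assigning the result back is used here instead of `inplace=True` on a single column, which is an editorial choice to avoid modifying a temporary object; the columns `x` and `y` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up frame in which the value 5 should be treated as missing.
df = pd.DataFrame({"x": [5, 1, 5], "y": [2, 5, 3]})

# Replace 5 everywhere in the frame (returns a new frame).
df_all = df.replace(5, np.nan)

# Replace 5 in one column only, assigning the result back to that column.
df["x"] = df["x"].replace(5, np.nan)

print(df_all.isna().sum())
print(df.isna().sum())
```

`isna().sum()` reports the per-column counts directly, which is a bit more targeted than reading them off `df.info()`.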

  • @usmanshaikh1115
    @usmanshaikh1115 6 years ago

    Very useful and easily explained.

  • @ariramkilowan8051
    @ariramkilowan8051 7 years ago

    Great video, would like it twice if I could. Is there perhaps a way to fillna with a user defined function, or a nearest neighbour (that is not NaN)?

    • @dataschool
      @dataschool  7 years ago

      Thanks! Regarding fillna, it is limited to filling missing values with defined values or back-fill/forward-fill methods, but it doesn't offer any nearest neighbor approaches. You can read more here: pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.fillna.html
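
The forward-fill and back-fill options mentioned in the reply above, as a sketch on an invented series. The `ffill()` and `bfill()` methods are used directly because recent pandas releases deprecate the `method` argument of `fillna`.

```python
import numpy as np
import pandas as pd

# Invented series with gaps in the middle and at the end.
s = pd.Series(["DISK", np.nan, np.nan, "LIGHT", np.nan])

# Forward fill: each gap takes the last non-missing value before it.
forward = s.ffill()

# Back fill: each gap takes the next non-missing value after it;
# the trailing gap stays missing because nothing follows it.
backward = s.bfill()

print(forward.tolist())
print(backward.tolist())
```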

  • @abdozzahra
    @abdozzahra 5 years ago

    Hello, and thanks for your great tutorials. I have a question:
    is there any way to handle exceptions in pandas?

    • @dataschool
      @dataschool  5 years ago

      It's no different than exception handling in every other area of Python, as far as I know. Hope that helps!

  • @nyashagracenhandara7757
    @nyashagracenhandara7757 2 years ago +1

    thank you so much the explanation is very clear

    • @dataschool
      @dataschool  2 years ago

      Glad it was helpful!

  • @paula805
    @paula805 6 years ago +1

    What inspires a down vote on any of these videos?? Always great content!

  • @nataliaagudelo8635
    @nataliaagudelo8635 5 years ago

    As always, your videos are very helpful!

    • @dataschool
      @dataschool  5 years ago

      Thanks very much for your kind words!

  • @bobmli
    @bobmli 6 years ago

    Hi there! Congrats on your videos, you're super clear!
    I was wondering: what if I want to substitute all the NaN values of the Shape Reported series with values drawn from the normalized distribution obtained from ufo['Shape Reported'].value_counts(normalize=True)?
    How can I ask pandas to draw values from the frequency distribution of a series? Is it possible to do it directly?
    Thanks a lot! You're doing super good!

    • @dataschool
      @dataschool  6 years ago

      Great question! I'm not sure if there's a simple way to do that in pandas.
      Thanks for your kind words!

  • @madhusaikiranml6601
    @madhusaikiranml6601 7 years ago

    Hi Kevin ,
    What do you do for missing values if all your columns/features are categorical. I understand for numerical we can take mean or most frequently occurring value etc but for categorical features where we just have 'words' as values( may be converted using LE/OHE or get_dummies) what are our options and what in your opinion works best?

    • @dataschool
      @dataschool  7 years ago

      For a categorical feature, you can fill missing values with the most frequently occurring category, or you can treat missing values as their own category. There's no universally "best" strategy - it is a judgment call depending on the nature of the problem and your knowledge of the dataset. Hope that helps!
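
Both strategies from the reply above can be tried side by side. This is a sketch: the series is invented, and the label "MISSING" is an arbitrary placeholder.

```python
import numpy as np
import pandas as pd

# Invented categorical feature with two missing entries.
colors = pd.Series(["RED", "BLUE", np.nan, "RED", np.nan])

# Strategy 1: fill with the most frequently occurring category.
most_common = colors.mode()[0]
filled_mode = colors.fillna(most_common)

# Strategy 2: treat missing values as their own category.
filled_label = colors.fillna("MISSING")

print(filled_mode.value_counts())
print(filled_label.value_counts())
```

Comparing the two `value_counts()` outputs shows how much each choice shifts the category frequencies, which may help with the judgment call described above.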

  • @uttamkumarpatra7616
    @uttamkumarpatra7616 5 years ago +1

    You are simply awesome :) .thank you for making such wonderful videos

    • @dataschool
      @dataschool  5 years ago

      That's so nice of you to say - thank you!

  • @amishbhat3560
    @amishbhat3560 4 years ago +1

    You told how to handle NaN values but if there are some other values such as "Not Provided" then what to do?
    How to ignore them?

    • @dataschool
      @dataschool  4 years ago

      Excellent answer! 👏

  • @saiftazir
    @saiftazir 7 years ago

    Dear Sir, a small question: I want to replace "..." in a specific column named "Energy Supply". What I am doing is
    en1['Energy Supply']=en1['Energy Supply'].str.replace("...", "NaN")
    What this does is turn all the other, correct values into NaN as well.
    My objective here is to replace "..." with NaN.

    • @dataschool
      @dataschool  7 years ago

      I would not advise using the text "NaN" to denote missing values. Rather, you should be setting those values to "nan" from the NumPy library. An example is shown in this video: th-cam.com/video/4R4WsDJ-KVc/w-d-xo.html
      Does that help?
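
One likely cause of the problem in this thread: when `.str.replace` treats its pattern as a regular expression, `"..."` matches any three characters, so valid values get rewritten too. A sketch of the NumPy-based alternative on invented data follows; `Series.replace` matches whole values exactly, and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

# Invented column where "..." marks a missing measurement.
energy = pd.Series(["321", "...", "102", "..."])

# Series.replace compares whole cell values, so only exact "..." entries change,
# and they become real missing values rather than the text "NaN".
cleaned = energy.replace("...", np.nan)

# With the placeholders gone, the column can be converted to numbers.
numeric = pd.to_numeric(cleaned)

print(numeric)
```

If the data comes from a file, passing `na_values="..."` to `pd.read_csv` may achieve the same thing at load time.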

  • @joehopewell
    @joehopewell 7 years ago

    Thank you!!!! All the good stuff, all in the same place...love it!

  • @nasser_omar
    @nasser_omar 3 years ago

    What about displaying the rows where columns 'A' and 'B' both of them have any missing values?

  • @pegasoos
    @pegasoos 5 years ago

    I watched your first video. You are a legend!

  • @tresortshimbombo3133
    @tresortshimbombo3133 5 years ago +1

    That's exactly what I was looking for!

  • @adelabdallah3833
    @adelabdallah3833 1 year ago

    I actually have a question. I have a dataframe grouped by month and country. Some of those countries don't have a value for a certain month, which is causing anomalies in the visualization. I want to generate a record with zero for the month and the country if no record is found. How can I achieve that?
    Thanks in advance

  • @NR_Tutorials
    @NR_Tutorials 5 years ago +3

    Thanks for the nice lecture. We love you, sir.

  • @eniisy
    @eniisy 2 years ago

    Dude, it's just an awesome video. Forgive me for saying this, but a playback speed of 1.25 feels more normal, haha. Love ya, and I appreciate your effort in teaching piece by piece!

  • @FE12343
    @FE12343 4 years ago

    Amazing video, thanks!

  • @jonathanfriz4410
    @jonathanfriz4410 3 years ago

    Hi, how can you handle the "ValueError: arrays must all be same length" when df.transpose() is not an option?

  • @salamatburj9502
    @salamatburj9502 6 years ago

    I think it would be great if you could make a lecture about handling missing values for machine learning.

    • @dataschool
      @dataschool  6 years ago

      Thanks for your suggestion!

  • @joescanlon7502
    @joescanlon7502 8 years ago

    Hi, is there an acceptable cut off point for dropping an entire column if it has missing values for a large % of rows?
    For example if 90% of rows displayed NAN for City, should you consider dropping this col?

    • @dataschool
      @dataschool  8 years ago

      There's no rule of thumb... rather, it depends entirely on what you are planning to do with the data. Maybe having 10% of City values present is plenty. Or maybe that column is only useful if it's 100% complete. Or maybe you only want to keep the rows in which City is present. This is where having a deep understanding of your data and the problem you are trying to solve is critical!
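
Whatever cut-off is chosen, the share of missing values per column is easy to inspect first. In this sketch the 90% missing `City` column mirrors the question, while the 0.5 threshold is purely hypothetical and not a recommendation.

```python
import numpy as np
import pandas as pd

# Invented frame: City is missing in 9 of 10 rows, State is complete.
df = pd.DataFrame({
    "City": [np.nan] * 9 + ["Ithaca"],
    "State": ["NY"] * 10,
})

# The mean of a boolean mask is the fraction of missing values per column.
missing_fraction = df.isna().mean()

# Hypothetical cut-off: keep columns with at most 50% missing values.
max_missing = 0.5
trimmed = df.loc[:, missing_fraction <= max_missing]

print(missing_fraction)
print(list(trimmed.columns))
```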

  • @rishusaini7874
    @rishusaini7874 6 years ago

    Hi Data School,
    Hope you are doing very well.
    My question: can we highlight NaN values in the result data?
    Looking forward to your reply.
    Thanks

    • @dataschool
      @dataschool  6 years ago

      There's no simple way that I'm aware of.

  • @delmaregals
    @delmaregals 1 year ago +1

    Hi, let's say I accidentally changed a value, like in line 19 where NaN is changed to VARIOUS. Can I reverse the change?

    • @dataschool
      @dataschool  1 year ago

      No, changes made through assignment (or inplace operations) are permanent!
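
Since such changes cannot be undone, one common precaution is to keep a copy before modifying anything (re-reading the source file is another). A minimal sketch on invented data; the name `backup` is hypothetical.

```python
import numpy as np
import pandas as pd

# Invented stand-in for the UFO frame.
ufo = pd.DataFrame({"Shape Reported": ["DISK", np.nan]})

# copy() makes an independent frame, so later edits to ufo do not touch it.
backup = ufo.copy()

# Overwrite the missing values in the working frame.
ufo["Shape Reported"] = ufo["Shape Reported"].fillna("VARIOUS")

print(int(ufo["Shape Reported"].isna().sum()))
print(int(backup["Shape Reported"].isna().sum()))
```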

  • @preetammishra6468
    @preetammishra6468 5 years ago

    This is a great video. I just have one question: how do I fill the 25 missing City values with the city that has the most UFO sightings in the same state?

    • @dataschool
      @dataschool  5 years ago

      I'm not sure how to do this off-hand. Good luck!

  • @rohitk9954
    @rohitk9954 5 years ago

    Sir, you are simply awesome!! Could you please tell me where I could get a complete data science course taught by you?

    • @dataschool
      @dataschool  4 years ago +1

      Thanks! I don't have quite what you are looking for, but all of my tutorials and courses are here: www.dataschool.io/start/

    • @rohitk9954
      @rohitk9954 4 years ago +1

      @@dataschool thank you so much ☺️☺️☺️☺️

  • @wesleypgurira7142
    @wesleypgurira7142 2 years ago

    Hey, how can we replace a NaN value with the previous value in a dataset, like in the ufo shapes column? Instead of VARIOUS, you would fill in, say, RECTANGLE if that was the value just before the NaN.

  • @Andrew6James
    @Andrew6James 4 years ago

    I thought we said that no values were missing from the City or Shape reported columns. Why do we see rows dropped at @11:07?

    • @dataschool
      @dataschool  4 years ago

      Values are missing from both the City and Shape Reported columns.
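
How many rows get dropped depends on whether a row must be missing either column or both. A sketch of the two `dropna` variants on invented rows, using the column names from the video:

```python
import numpy as np
import pandas as pd

# Invented rows covering every combination of missing City and Shape Reported.
ufo = pd.DataFrame({
    "City": ["Ithaca", np.nan, "Holyoke", np.nan],
    "Shape Reported": ["DISK", "LIGHT", np.nan, np.nan],
    "State": ["NY", "NJ", "CO", "KS"],
})

cols = ["City", "Shape Reported"]

# how="any": drop a row if either of the two columns is missing.
either_missing_dropped = ufo.dropna(subset=cols, how="any")

# how="all": drop a row only if both columns are missing.
both_missing_dropped = ufo.dropna(subset=cols, how="all")

print(len(either_missing_dropped), len(both_missing_dropped))
```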