Tutorial 7 - Pandas - Reading JSON, Reading HTML, Read PICKLE, Read EXCEL Files - Part 3

  • Published on 2 Oct 2024
  • Hello All,
    Welcome to the Python Crash Course. In this video we will learn about the Pandas library and how to read JSON, HTML, pickle and Excel files.
    github url : github.com/kri...
    Support me in Patreon: / 2340909
    Connect with me here:
    Twitter: / krishnaik06
    Facebook: / krishnaik06
    instagram: / krishnaik06
    If you like music support my brother's channel
    / @ultralifeproject
    Buy the Best book of Machine Learning, Deep Learning with python sklearn and tensorflow from below
    amazon url:
    www.amazon.in/...
    You can buy my book on Finance with Machine Learning and Deep Learning from the below url
    amazon url: www.amazon.in/...
    Subscribe to my unboxing channel
    / @krishnaikhindi
    Below are the various playlists created on ML, Data Science and Deep Learning. Please subscribe and support the channel. Happy Learning!
    Deep Learning Playlist: • Tutorial 1- Introducti...
    Data Science Projects playlist: • Generative Adversarial...
    NLP playlist: • Natural Language Proce...
    Statistics Playlist: • Population vs Sample i...
    Feature Engineering playlist: • Feature Engineering in...
    Computer Vision playlist: • OpenCV Installation | ...
    Data Science Interview Question playlist: • Complete Life Cycle of...
    🙏🙏🙏🙏🙏🙏🙏🙏
    YOU JUST NEED TO DO 3 THINGS to support my channel:
    LIKE, SHARE & SUBSCRIBE TO MY YouTube CHANNEL

Comments • 112

  • @aryamahima3 5 years ago +78

    I took a Udemy course (1000 INR) on Python for data science. Your videos are far better and more intense than that course.
    Thanks a lot.

    • @nandhakishore8950 4 years ago +1

      Exactly

    • @vijayjb1704 4 years ago

      Who was the instructor of your course? I also took one.

    • @sunitapatil381 4 years ago +2

      Thank you for telling us this. I was thinking of joining that course, but now I feel this one is so much better.

    • @Virus-ke8xj 4 years ago

      True

    • @shrirangsapate 3 years ago

      Very True.

  • @robyshah6879 4 years ago +20

    The video says that the wine.data file is in JSON format, but it is not; it is in CSV format. Please take note of this error.

    • @PriteshsRhymes 2 years ago +2

      I was thinking the same thing; JSON was never used :(

    • @fariqjamil5484 1 year ago +1

      That's why he used pd.read_csv("path"). But what is header?
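On the `header` question in this thread: the following is a minimal sketch, not the code from the video. It assumes a tiny in-memory sample in the same comma-separated layout as wine.data (the sample values are for illustration only). It shows what `header=None` changes when a CSV file has no row of column names.

```python
import io

import pandas as pd

# A few rows in the same comma-separated layout as wine.data.
# Illustrative sample only; the real file has 14 columns.
raw = "1,14.23,1.71\n1,13.20,1.78\n2,12.37,0.94\n"

# header=None tells pandas that the first line is data, not column names,
# so the columns are labelled 0, 1, 2, ...
df = pd.read_csv(io.StringIO(raw), header=None)

# Without header=None the first data row is consumed as the header row.
df_wrong = pd.read_csv(io.StringIO(raw))

print(df.shape)
print(df_wrong.shape)
```

With `header=None` all three rows are kept as data; without it, one row is lost to the header.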

  • @veltechunivalumnidept2171 11 months ago

    Learning day by day from your videos. Thank you so much. Learning from basics

  • @suchitanaik6728 3 years ago +11

    As a new learner of Python, your videos have been a blessing. Once I finish all the videos, it will be easy to get through a proper certification course.

  • @mohdzain1741 4 years ago +6

    It would be really helpful if you could provide the links of the dataframes you are using 😃

  • @md.omankhan8648 3 years ago

    I am so glad that I did not skip this video; I learned a lot.

  • @gauravmarathe3730 4 years ago +21

    Everything is pretty much simple, pretty much easy 😆🤘🤣😁

  • @tahabimuhammad4524 5 years ago +8

    When I download wine.data from the given link, I see that it is already in CSV format, not JSON format as stated in the video. After applying df.to_csv, the newly generated wine.csv file also contains a row index from 0 to 177 and column labels from 0 to 13.

    • @vishwajitbhagat9515 3 years ago

      But does it add column names too?

    • @AnjaliGupta-cm1zo 3 years ago +2

      Also, in the video we read this file using read_csv, but he talks about a JSON file. I'm not able to understand why.
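On the extra index and column labels mentioned in this thread: a small sketch, using a made-up two-row frame rather than the real wine data. By default `to_csv` writes both the row index and the column labels, and `index=False` / `header=False` switch them off.

```python
import io

import pandas as pd

# Two illustrative rows read without a header, so the columns are labelled 0 and 1.
df = pd.read_csv(io.StringIO("1,14.23\n2,12.37\n"), header=None)

# Default behaviour: the row index and the column labels are both written.
with_index = df.to_csv()

# Switch both off to get back only the data values.
without_index = df.to_csv(index=False, header=False)

print(with_index)
print(without_index)
```

So yes, the column labels are written as a header row as well, unless `header=False` is passed.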

  • @khushboochhabra2136 8 months ago +1

    You skipped the part that you couldn't answer! 1. On the second HTML page there were other tables with "Country" as a column name, but you moved away from the explanation by quickly switching tabs. 2. You didn't mention installing lxml for reading the xml file. I believe such small things are important to tell a newbie.

  • @sachinbairi6353 3 years ago +2

    Sir, a doubt: you said [2:25] that wine.data is in JSON format, so why are you reading JSON data using read_csv?

  • @ashamaheshk7306 5 years ago +1

    I like your teaching 🤘

  • @aryanrana5658 3 years ago +2

    When I read an Excel file using pd.read_excel('file name.xlsx'), I get the error "there is no such file or directory" even though the file is on my laptop.

    • @kojorichardson4283 2 years ago

      Check that your Excel file is in your current working directory.

  • @nareshjanjirala472 5 years ago +1

    Hi Krish, please make a video on how to import data directly from a database into Python.

  • @ranaasad6132 3 years ago

    Thank you, Sir
    Love from Pakistan

  • @rishabhtewari4357 3 years ago +2

    The way you explain everything makes it look easy and interesting. Thanks for providing all the material. I am following the path you suggested. THANKS

  • @chiragkapoor32 2 years ago +4

    At 11:11 I am getting the error "No tables found".

    • @hrshtmlng 6 months ago +1

      Before that, I was getting an error related to the import of html5lib.

  • @roushanraj2654 4 years ago +4

    11:16 In case you get an import error, run this: "pip install lxml"

    • @cartiktechnomechnobro9061 4 years ago

      It is still showing that the module lxml was not found.
      I ran that command in the Windows command prompt, and when that didn't work I ran it in Anaconda Prompt as well, but there it says the requirement is already satisfied.

    • @pqs403 3 years ago +1

      @cartiktechnomechnobro9061 Try "conda install lxml" in Anaconda Prompt.

  • @dilippradhan94 5 years ago +1

    Bro please upload deep learning videos...

  • @QaAutomationAlchemist 5 years ago +2

    Great content and especially reading html pages...thanks a lot!

  • @saumyagupta2606 2 years ago +1

    Hello Krish, after scraping the table from the web, how do I save the list to CSV?
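One possible answer, as a sketch under assumptions: `pd.read_html` returns a list of DataFrames, so each table can be written with `to_csv` in a loop. The helper name `save_tables`, the file prefix and the two small frames are hypothetical stand-ins; hand-built frames are used instead of a live page so that the example runs without a network connection or an HTML parser installed.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_tables(dfs, out_dir, prefix="table"):
    """Write each DataFrame in dfs to its own CSV file and return the paths."""
    out_dir = Path(out_dir)
    paths = []
    for i, table in enumerate(dfs):
        path = out_dir / f"{prefix}_{i}.csv"
        table.to_csv(path, index=False)  # index=False keeps the row numbers out of the file
        paths.append(path)
    return paths


# Stand-in for the list that pd.read_html returns (one DataFrame per table).
dfs = [
    pd.DataFrame({"Country": ["India", "Japan"], "MCC": [404, 440]}),
    pd.DataFrame({"Bank": ["Example Bank"], "City": ["Springfield"]}),
]

with tempfile.TemporaryDirectory() as tmp:
    paths = save_tables(dfs, tmp)
    restored = pd.read_csv(paths[0])  # read one file back to check the round trip

print([p.name for p in paths])
print(restored)
```

For a single table, `dfs[0].to_csv("some_name.csv", index=False)` does the same thing without the helper.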

  • @omkarkabade79 4 years ago +3

    2:36 [Is it possible to read a JSON-format file using read_csv?]

  • @ashutoshsharma6883 3 years ago

    Sir, in the video you said that the wine data is in JSON, but you are reading it with read_csv.

  • @lngwnd1 3 years ago +3

    The URL from the video didn't work for me. This is the corrected one @11:11: 'www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/'

    • @parveenparveen9384 3 years ago

      Thank you. But it throws an error saying "No tables found". How do I solve this?

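    • @parveenparveen9384 3 years ago

      @payaldhekwar2717, I resolved this error by using the requests module:
      from io import StringIO
      import requests
      import pandas as pd
      # requests needs the scheme (https://) in front of the address
      url1 = 'https://www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/'
      response = requests.get(url1)  # download the page first
      dfs = pd.read_html(StringIO(response.text))  # then parse the tables from the HTML text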

    • @dreamday4810 2 years ago

      @parveenparveen9384

    • @dreamday4810 2 years ago

      This is not working even after applying your method. I think the site has put some restrictions on scraping.

  • @sapnilpatel1645 2 years ago +1

    When I run this line -> df=pd.read_csv('archive.ics.uci.edu/ml/machine-learning-database/wine/wine.data',header=None) I get the error "404: not found". Does anyone have a new link for the same data?

    • @soofishafiya2632 2 years ago +1

      I am also getting this error... did you find any solution to it?

    • @sapnilpatel1645 2 years ago

      Not yet.

    • @jn9281 1 year ago

      In the link you typed "database"; it should be "databases".

  • @kritikaverma3762 4 years ago +1

    Is anyone else getting this error when trying to read a JSON file?
    File "", line 1
    jsonData = pd.read_json('C:\Users\kritika\ML\example_1.json')
    ^
    SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

    • @sripadhavallabhagoud9590 4 years ago +1

      This error occurs because you are using a normal string as the path, so \U is read as the start of a Unicode escape.
      You can add r before the string to turn it into a raw string: jsonData = pd.read_json(r"C:\Users\kritika\ML\example_1.json")
      or
      jsonData = pd.read_json("C:\\Users\\kritika\\ML\\example_1.json")
      or
      jsonData = pd.read_json("C:/Users/kritika/ML/example_1.json")
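A short check that the three spellings above describe the same Windows path. This is only a sketch; the path itself is the commenter's example and is not opened here, so it does not need to exist.

```python
from pathlib import PureWindowsPath

# Three ways to write the same Windows path without triggering the \U escape.
raw = r"C:\Users\kritika\ML\example_1.json"           # raw string: backslashes are literal
escaped = "C:\\Users\\kritika\\ML\\example_1.json"    # each backslash escaped by hand
forward = "C:/Users/kritika/ML/example_1.json"        # forward slashes also work on Windows

# The raw and escaped spellings are the very same string.
print(raw == escaped)

# The forward-slash spelling is a different string but the same path.
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```

In a normal string literal, `\U` must be followed by eight hexadecimal digits, which is why the original line fails before pandas is even called.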

  • @21Gannu 5 years ago +2

    When is your full tutorial going live?

  • @hardikchoudhary3596 5 years ago +1

    Pickle and to_csv are basically the same except for the extension, right? Is there any benefit to using pickle? Thanks.

    • @krishnaik06  5 years ago +4

      Pickle can be used for any data structure, e.g. a model or files. It usually requires less space as well.
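One concrete difference, sketched with a made-up two-row frame (the column names and file names are illustrative): a pickle round trip restores the DataFrame with its dtypes intact, while CSV is plain text, so a datetime column comes back as strings unless it is parsed again.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Illustrative frame with a datetime column and a float column.
df = pd.DataFrame({
    "when": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "score": [1.5, 2.5],
})

with tempfile.TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "frame.pkl"
    csv_path = Path(tmp) / "frame.csv"

    df.to_pickle(pkl_path)            # binary snapshot of the Python object
    df.to_csv(csv_path, index=False)  # plain text, one line per row

    from_pickle = pd.read_pickle(pkl_path)
    from_csv = pd.read_csv(csv_path)

print(from_pickle.dtypes["when"])  # datetime dtype is preserved
print(from_csv.dtypes["when"])     # read back as text unless parse_dates is used
```

Pickle files can execute code when loaded, so `read_pickle` should only be used on files from a trusted source.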

  • @adityanaik1800 1 year ago

    I get this type of error after I type:
    df = pd.read_html(url)

  • @curious_bird 2 years ago

    Hi, you said that the next tutorial would be on MongoDB, but there is no MongoDB tutorial.

  • @w.g.ogaming2210 1 year ago

    HTML is not installing... Help someone 😔

  • @ashishbrahmankar2143 3 years ago

    I am getting the error below while converting Data to JSON. Could anyone help, please?
    df1=pd.read_json(Data)
    ValueError: Invalid file path or buffer object type:

  • @roshankumargupta46 5 years ago +1

    (url_mcc, match='Country', header=0)
    What if two tables have the same column name? Which one will it select?

    • @krishnaik06  5 years ago +3

      Probably the first one

  • @2167shrihari 7 months ago

    How does your output look so nicely arranged with shading? My output looks like a list of numbers when reading from HTML.

  • @satyacenation5874 5 years ago

    Hi Krish,
    I'm getting the error below while trying to read the table from the URL:
    405 # this version of raise is a syntax error in Python 3
    URLError:

  • @JacklinSibiyal 7 months ago

    Day 3 - 18/02/2024

  • @srikanthchandana4485 3 years ago

    Hi Krish,
    If there are multiple tables with the same column headers (e.g. in the mobile country code data there are other tables with the same column headers), how do we extract one specific table? Kindly let us know! Thanks in advance.
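One way to approach this, sketched under assumptions: `pd.read_html` returns a list containing every table that matches `match`, so several tables can come back and the wanted one still has to be picked from the list. The helper `tables_with_columns` and the three small frames below are hypothetical stand-ins for that list, not data from the real page.

```python
import pandas as pd


def tables_with_columns(dfs, required):
    """Return (position, DataFrame) pairs whose columns include all of `required`."""
    required = set(required)
    return [(i, t) for i, t in enumerate(dfs) if required.issubset(t.columns)]


# Stand-in for the list returned by pd.read_html(url, match="Country").
dfs = [
    pd.DataFrame({"Country": ["India"], "MCC": [404]}),
    pd.DataFrame({"Country": ["India"], "Operator": ["Example Telecom"]}),
    pd.DataFrame({"Bank": ["Example Bank"]}),
]

# Two tables share the "Country" column; a second column narrows the choice.
hits = tables_with_columns(dfs, ["Country", "Operator"])
for position, table in hits:
    print(position)
    print(table)
```

If the column names are truly identical across tables, printing `len(dfs)` and inspecting `dfs[i].head()` for each position is a simple way to find the right index by eye.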

  • @vineetkrpandey7641 5 months ago

    tut-7//13/04/2024

  • @praveenshenoy8064 3 years ago

    I am getting an "invalid syntax" error on the first JSON example statement. What may be the reason?

  • @suraushareddy4454 3 years ago

    Hi sir, I have a small doubt: is the Python covered in these videos enough?

  • @sunnysolanki2460 1 year ago

    At 2:30 you are using pd.read_csv to read a JSON file.

  • @ankitgupta8797 3 years ago

    If the wine data is in JSON format, how are you reading it as CSV using read_csv?

  • @karansaini7855 3 years ago

    When I use any website address for data, I get an error. How do I fix this?

  • @datascienceexpert6524 4 years ago

    Please show how to read multiple CSV or Excel files.

  • @sudeeprajput1830 3 years ago

    Thanks brother for this informative video

  • @MohammedImteyazVlogs 3 years ago

    Hi sir, when I practise, my Jupyter notebook shows:
    NameError: name 'pd' is not defined
    NameError: name 'df' is not defined
    What should I do, sir?

  • @ashwini4683 3 years ago

    df1 = pd.read_json(Data) is showing an error:
    ValueError: Expected object or value
    please help

    • @suchitanaik6728 3 years ago

      Check the name you used; it is possible that you used a different name. I ran the same code with a different variable name:
      df=pd.read_json(Jdata)

  • @jagajayaraman5200 4 years ago

    Hi sir,
    How do I convert an HTML table to CSV?

  • @shrirangsapate 3 years ago

    Sir, can you elaborate more about pickle?

  • @rahul4upandey 5 years ago

    Can you please make a video on MongoDB?

  • @krishanpalsingh973 5 years ago +2

    I can't find the playlist.

    • @krishanpalsingh973 5 years ago +1

      Please share the playlist name.

    • @shwetaredkar734 5 years ago +1

      @krishanpalsingh973 th-cam.com/video/7S865QCGL74/w-d-xo.html

  • @hometvfirestick 3 years ago

    thanks

  • @arrooow9019 3 years ago

    What an amazing video🤩🤩🤩

  • @robinfelix3879 3 years ago

    vera level content bro

  • @AvinashKumarMAD 4 years ago

    Hello Krish, I am at the very start of learning Python, and your channel is helping a lot; I am learning from it continuously.
    I am trying to execute the code below:
    import pandas as pd
    Data = '{"employee_name":"James","email":"james@gmail.com"}'
    pd.read_json(Data)
    but it gives the error "If using all scalar values, you must pass an index".
    With Data = '{"employee_name": "James", "email": "james@gmail.com", "job_profile": [{"title1":"Team Lead", "title2":"Sr. Developer"}]}' it works fine.
    I am unable to identify the difference. Can you please help?

    • @gulshanarya1714 4 years ago

      Since all your values are strings, i.e. scalars, at least one value must be a list or dict, like '{"employee_name":"James","email":["james@gmail.com"]}'

    • @atifiqbalm 2 years ago

      the difference is [ and ]

  • @indirajithkv7793 2 years ago

  • @samikshabharne1251 3 years ago

    I want to convert a JSON file into a DataFrame but got this error: (Invalid file path or buffer object type: )

    • @samikshabharne1251 3 years ago

      Got the answer; it's just a syntax error.

    • @samikshabharne1251 3 years ago

      j_file='{"emp_name":"samiksha","email":"bharnesm@gmail.com","emp_address":[{"title":"mr.","name":"suhas"}]}' I had just forgotten to put the whole JSON inside a single pair of quotation marks.

  • @louerleseigneur4532 3 years ago

    Thanks Krish

  • @srinukondaveeti9558 4 years ago

    Sir, while working with JSON, I ran the following:
    data={"a" : "name", "b" : "num"}
    pd.read_json(data)
    ERROR: "Invalid file path or buffer object type". I don't understand this.

    • @pakhigupta2869 4 years ago +6

      First of all, read_json() requires a string value, not a dict. So data should be:
      data='{"a" : "name", "b" : "num"}'
      With this value you will then get a ValueError:
      ValueError: If using all scalar values, you must pass an index
      This is because read_json() has a parameter 'typ' which is 'frame' (DataFrame) by default, while this data has the shape of a Series.
      So we either change the data so that it describes a DataFrame, or change the typ parameter:
      1. Convert the data from a Series shape to a DataFrame shape:
      data='[{"a" : "name", "b" : "num"}]'
      pd.read_json(data)
      2. Change the 'typ' parameter:
      data='{"a" : "name", "b" : "num"}'
      pd.read_json(data, typ='series')
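The two options above can be put together in a runnable sketch. Two things here are assumptions beyond the comment: `json.dumps` is used to turn the dict into JSON text, and the text is wrapped in `io.StringIO`, because recent pandas versions warn when a literal JSON string is passed directly to `read_json`.

```python
import io
import json

import pandas as pd

record = {"a": "name", "b": "num"}

# read_json expects JSON text (or a path / file-like object), not a Python dict,
# so serialise the dict first.
text = json.dumps(record)

# Option 1: wrap the record in a list so that it becomes one row of a DataFrame.
df = pd.read_json(io.StringIO(json.dumps([record])))

# Option 2: keep the flat object and read it as a Series instead.
s = pd.read_json(io.StringIO(text), typ="series")

print(df)
print(s)
```

Option 1 scales naturally to many records (one dict per row), while option 2 suits a single flat object.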

    • @debayanmazumdar3056 3 years ago

      @pakhigupta2869 I was getting the same error. Thank you so much for your reply; it was very helpful!

    • @ruchitpatel107 3 years ago

      @pakhigupta2869 thanks mannn

  • @sharathkumar1387 4 years ago

    11:32 Sir, please explain why we are using the [0] index in dfs[0].

    • @manishpahuja8127 3 years ago +3

      Hey, it seems that read_html looks up all the tables available on the HTML page and returns a list containing the different datasets (tables) on that page. Using dfs[0] returns the first dataset in the list, which is what appears in Krish's code. Please let me know if this helps!

  • @nidhijakhad128 4 years ago

    Hello Sir, when I use pd.read_json() it gives a ValueError
    saying: "If using all scalar values, you must pass an index".
    Please help me out with this!
    Thanks

    • @pakhigupta2869 4 years ago +2

      This is because read_json() has a parameter 'typ' which is 'frame' (DataFrame) by default, while this data has the shape of a Series.
      So we either change the data so that it describes a DataFrame, or change the typ parameter:
      1. Convert the data from a Series shape to a DataFrame shape, i.e. put the JSON object inside [], so that each dict in this list is treated as one row of the DataFrame:
      data='[{"a" : "name", "b" : "num"}]'
      pd.read_json(data)
      2. Change the 'typ' parameter:
      data='{"a" : "name", "b" : "num"}'
      pd.read_json(data, typ='series')

    • @nidhijakhad128 4 years ago

      @pakhigupta2869 Thank you

  • @EcExplorer 2 years ago

    read_html does not work. Do I need to install something else as well?

    • @EcExplorer 2 years ago

      ---------------------------------------------------------------------------
      ImportError Traceback (most recent call last)
      Input In [50], in ()
      1 url_mcc = 'en.wikipedia.org/wiki/Mobile_country_code'
      ----> 2 dfs = pd.read_html(url_mcc, match='Country', header=0)
      File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments..decorate..wrapper(*args, **kwargs)
      305 if len(args) > num_allow_args:
      306 warnings.warn(
      307 msg.format(arguments=arguments),
      308 FutureWarning,
      309 stacklevel=stacklevel,
      310 )
      --> 311 return func(*args, **kwargs)

  • @betsythomas5971 4 years ago

    Can anyone tell me why we take dfs[0] in line 166?

    • @betsythomas5971 4 years ago

      @Light Bringer Understood. Thank you.

  • @manishpahuja8127 3 years ago +1

    Hey Krish! Great video!
    You have mentioned that using pickle, we can avoid running the entire code every time while doing pre processing and model training which takes a lot of time especially for large datasets and multiple attempts at model building. Is there any video where this concept is explained in more detail? Thanks a lot!

  • @sarikamodi2986 20 days ago

    "It is hard. I am not able to understand."

  • @palashmoon3808 5 years ago

    Why did we use dfs[0] instead of dfs? The result of both looks the same; only the format of dfs is different from that of dfs[0].

    • @alexanderryzhkov7421 4 years ago

      If you have several tables on the webpage, the result will be different. By using dfs[0] you choose the first table.

    • @manishpahuja8127 3 years ago

      Hey, it seems that read_html looks up all the tables available on the HTML page and returns a list containing the different datasets (tables) on that page. Using dfs[0] returns the first dataset in the list, which is what appears in Krish's code. Please let me know if this helps!