Tutorial 7- Pandas-Reading JSON,Reading HTML, Read PICKLE, Read EXCEL Files- Part 3
- Published on 2 Oct 2024
- Hello All,
Welcome to the Python Crash Course. In this video we will look at the Pandas library and how to read JSON, HTML, pickle and Excel files.
github url : github.com/kri...
Support me in Patreon: / 2340909
Connect with me here:
Twitter: / krishnaik06
Facebook: / krishnaik06
instagram: / krishnaik06
If you like music support my brother's channel
/ @ultralifeproject
Buy the Best book of Machine Learning, Deep Learning with python sklearn and tensorflow from below
amazon url:
www.amazon.in/...
You can buy my book on Finance with Machine Learning and Deep Learning from the below url
amazon url: www.amazon.in/...
Subscribe to my unboxing channel
/ @krishnaikhindi
Below are the various playlist created on ML,Data Science and Deep Learning. Please subscribe and support the channel. Happy Learning!
Deep Learning Playlist: • Tutorial 1- Introducti...
Data Science Projects playlist: • Generative Adversarial...
NLP playlist: • Natural Language Proce...
Statistics Playlist: • Population vs Sample i...
Feature Engineering playlist: • Feature Engineering in...
Computer Vision playlist: • OpenCV Installation | ...
Data Science Interview Question playlist: • Complete Life Cycle of...
🙏🙏🙏🙏🙏🙏🙏🙏
YOU JUST NEED TO DO
3 THINGS to support my channel
LIKE
SHARE
&
SUBSCRIBE
TO MY YouTube CHANNEL
I took a Udemy course (1000 INR) on Python for data science. Your videos are far better and more intense than that course.
Thanks a lot.
Exactly
Who was the instructor of your course? I also took one.
Thank you for telling us this. I was thinking of joining, but now I feel this is better. Thank you so much.
True
Very True.
The video says that the wine.data file is in JSON format, but it is not; it is in CSV format. Guys, please take note of this error.
Even I was thinking the same thing, Json was never used :(
That's why he used pd.read_csv("path"). But what is header?
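On the header question: a minimal sketch of what `header=None` changes in `pd.read_csv`. The three rows below are made-up sample values in the style of wine.data, and the column names passed to `names` are illustrative labels, not something shown in the video.

```python
from io import StringIO

import pandas as pd

# Made-up sample rows; the real wine.data file also has no header row.
raw = "1,14.23,1.71\n1,13.20,1.78\n2,12.37,0.94\n"

# Default behaviour: the first row is consumed as column names.
with_header = pd.read_csv(StringIO(raw))

# header=None: every row is data, and columns are numbered 0, 1, 2, ...
without_header = pd.read_csv(StringIO(raw), header=None)

# Optionally supply your own column names (illustrative labels).
named = pd.read_csv(StringIO(raw), header=None,
                    names=["class", "alcohol", "malic_acid"])

print(with_header.shape, without_header.shape)
print(list(named.columns))
```

Without `header=None`, the first data row is silently lost as a header, which is why the video passes it for this file.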
Learning day by day from your videos. Thank you so much. Learning from basics
As a fresh learner in Python, your videos have been a blessing. Once I finish all the videos, it will be easy to get through a proper certification course.
It would be really helpful if you could provide the links of the dataframes you are using 😃
I am soo glad that I did not skip this video, learned a lot
Everything is pretty much simple, pretty much easy 😆🤘🤣😁
hahahaha was searching for this comment
@@YourGirlPratiksha 😅😅
If I download wine.data from the given link, I see that it is already in CSV format, not JSON as the video says. After applying df.to_csv, the newly generated wine.csv file had a row index from 0 to 177 and a column index from 0 to 13 added to it.
But does it add column names too?
Also, in the video we are reading this file using read_csv, but he is talking about a JSON file. I'm not able to understand.
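On the extra index in the saved file: a small sketch of how `DataFrame.to_csv` treats the row index and the column labels. The two-row frame is a made-up stand-in for the wine data.

```python
import pandas as pd

# Made-up stand-in for a frame read with header=None (integer column labels).
df = pd.DataFrame([[1, 14.23], [2, 13.20]])

# With no path, to_csv returns the CSV text instead of writing a file.
# By default both the row index and the column labels are written.
with_index = df.to_csv()

# index=False drops the row index; header=False drops the column labels.
without_index = df.to_csv(index=False, header=False)

print(with_index)
print(without_index)
```

So yes, a header row is written by default; when the frame was read with `header=None`, that row contains the integer labels 0, 1, 2, and so on.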
You skipped the part that you couldn't answer! 1. On the second HTML page there were other tables with "Country" as a column name as well, but you moved away from the explanation by quickly switching tabs. 2. You didn't mention that lxml needs to be installed for reading HTML tables. I believe such small things are important to tell a newbie.
Sir, doubt !!! you said [ 2:25 ] that wine.data is in JSON format so why are you reading a JSON data using read_csv ???
I like ur teaching 🤘
When I read an Excel file using pd.read_excel("file name.xlsx"), I still get the error "there is no such file or directory", even though the file is on my laptop.
Check that your Excel file is in your current working directory, or pass the full path to the file.
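A quick way to check this before calling `pd.read_excel`: the sketch below resolves a relative file name against the current working directory and reports whether it exists. The helper name `resolve_data_file` and the file name are hypothetical.

```python
from pathlib import Path


def resolve_data_file(name):
    """Return (absolute path, exists?) for a file name.

    Hypothetical helper: relative names are resolved against the
    current working directory, which is where pandas looks too.
    """
    path = Path(name)
    resolved = path if path.is_absolute() else Path.cwd() / path
    return resolved, resolved.exists()


# "definitely_missing_file.xlsx" is a placeholder name for illustration.
resolved, exists = resolve_data_file("definitely_missing_file.xlsx")
print("Working directory:", Path.cwd())
print("Looking for:", resolved, "| found:", exists)
```

If `found` is False, either move the file into the printed working directory or pass the full path to `pd.read_excel`.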
Hi Krish, please make a video on how to import data directly from a database into Python.
Thank you,Sir
Love from Pakistan
The way you explain things makes everything look easy and interesting. Thanks for providing all the material. I am following the same path you suggested. THANKS
at 11:11 i am getting error "No tables found"
before that i was getting error related to import of html5lib
11:16 in case u get an import error, perform this: "pip install lxml"
It's still showing that module lxml is not found.
I ran that command in the Windows command prompt, and when it didn't work I ran it in the Anaconda Prompt as well, but there it says the requirement is already satisfied.
@@cartiktechnomechnobro9061 try "conda install lxml" on anaconda prompt
Bro please upload deep learning videos...
Great content and especially reading html pages...thanks a lot!
Hello Krish after scraping the table from web, how do I save the list to csv ?
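One possible answer, as a sketch: `pd.read_html` returns a list of DataFrames, so each one can be written with `to_csv`. To keep the example offline, the list below is a hand-built stand-in for the scraped result, and `save_tables` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the list that pd.read_html(url) would return (made-up rows).
tables = [
    pd.DataFrame({"Bank": ["Alpha Bank", "Beta Bank"], "Fund": [101, 102]}),
    pd.DataFrame({"Country": ["India"], "Code": [404]}),
]


def save_tables(tables, out_dir, prefix="table"):
    """Write each DataFrame to <out_dir>/<prefix>_<i>.csv and return the paths."""
    paths = []
    for i, table in enumerate(tables):
        path = Path(out_dir) / f"{prefix}_{i}.csv"
        table.to_csv(path, index=False)  # index=False avoids an extra column
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    saved = save_tables(tables, tmp)
    # Read the files back to confirm the round trip.
    restored = [pd.read_csv(path) for path in saved]

print([path.name for path in saved])
```

For a single table, `dfs[0].to_csv("tables.csv", index=False)` is enough.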
2:36 [ is it possible to read json format file using read_csv?]
Sir in the video you said that the wine data is in json but you are reading it with read_csv.
The url from the video didn't work for me... This is the correction @11:11 'www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/'
Thank you. But it throws error saying No tables found. How to solve this?
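@@payaldhekwar2717, I resolved this error by using the requests module:
import requests
import pandas as pd
from io import StringIO
# requests needs the scheme (https://) in front of the address
url1 = 'https://www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/'
crypto_url = requests.get(url1)
# Parse the downloaded HTML; read_html returns a list of DataFrames
dfs = pd.read_html(StringIO(crypto_url.text))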
@@parveenparveen9384
This is not working even after applying your method. I think the site has put some restriction on scraping.
When I run this line -> df=pd.read_csv('archive.ics.uci.edu/ml/machine-learning-database/wine/wine.data',header=None) I get the error "404: Not Found". Does anyone have a new link for the same data?
I am also getting this error...did you find any solution to it ??
Not yet.
In the link you have typed "database"; it should be "databases".
anyone else getting this error when try to read json file?
File "", line 1
jsonData = pd.read_json('C:\Users\kritika\ML\example_1.json')
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
This error occurs because you are using a normal string as a path, so Python reads \U as the start of a Unicode escape.
You can add r before the string to make it a raw string: jsonData = pd.read_json(r"C:\Users\kritika\ML\example_1.json")
or
jsonData = pd.read_json("C:\\Users\\kritika\\ML\\example_1.json")
or
jsonData = pd.read_json("C:/Users/kritika/ML/example_1.json")
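To see that the three spellings above name the same file, here is a small check using only the standard library. The path is the one from the comment; nothing is read from disk.

```python
from pathlib import PureWindowsPath

raw = r"C:\Users\kritika\ML\example_1.json"            # raw string
escaped = "C:\\Users\\kritika\\ML\\example_1.json"     # doubled backslashes
forward = "C:/Users/kritika/ML/example_1.json"         # forward slashes

# The raw and escaped literals are the very same string.
same_text = raw == escaped

# Windows accepts both separators, so all three refer to one path.
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)

print(same_text, same_path)
```

Any of the three forms avoids the `unicodeescape` SyntaxError, because none of them leaves a bare `\U` in a normal string literal.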
When is your full tutorial going live?
Pickle and to_csv are basically the same except for the extension, right? Is there any benefit to using pickle? Thanks.
Pickle can be used for almost any Python object, e.g. a trained model, not just tabular data. It often requires less space as well.
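One concrete difference, as a sketch: a CSV file stores only text, so column types such as datetimes are lost unless they are parsed again, while a pickle restores the object as it was. The frame below is made-up data. Note that pickle files should only be loaded from sources you trust, since unpickling can run arbitrary code.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up frame with a datetime column.
df = pd.DataFrame({
    "when": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "value": [1.5, 2.5],
})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "frame.csv"
    pkl_path = Path(tmp) / "frame.pkl"

    df.to_csv(csv_path, index=False)
    df.to_pickle(pkl_path)

    from_csv = pd.read_csv(csv_path)        # dates come back as plain text
    from_pickle = pd.read_pickle(pkl_path)  # dtypes are preserved

print("original:", df["when"].dtype)
print("from CSV:", from_csv["when"].dtype)
print("from pickle:", from_pickle["when"].dtype)
```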
This type of error coming after i type..
df = pd.read_html(url)
Hi, You told that the next tutorial will be of mongodb but there is no mongodb tutorial.
HTML is not installing... Help someone 😔
I am getting below error while converting Data to json, could anyone help please ?
df1=pd.read_json(Data)
ValueError: Invalid file path or buffer object type:
(url_mcc, match='Country', header=0)
What if two tables have the same column name? Which one will it select?
Probably the first one
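A sketch of what happens with several matching tables: `pd.read_html` returns a list with every table whose text matches `match`, in page order, so `dfs[0]` is simply the first of them. The HTML below is a made-up page, and the `try/except` covers the case discussed in other comments where no HTML parser (lxml, or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Made-up page: two tables mention "Country", one does not.
html = """
<table><tr><th>Country</th><th>MCC</th></tr><tr><td>India</td><td>404</td></tr></table>
<table><tr><th>Country</th><th>MCC</th></tr><tr><td>Japan</td><td>440</td></tr></table>
<table><tr><th>City</th></tr><tr><td>Pune</td></tr></table>
"""

try:
    tables = pd.read_html(StringIO(html), match="Country", header=0)
except ImportError:
    # No parser available; installing lxml ("pip install lxml") fixes this.
    tables = None

if tables is not None:
    print(len(tables), "tables matched")
    print(tables[0])  # the first matching table, i.e. what dfs[0] selects
```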
How does your output look so nicely arranged, with shading? My output looks like a list of numbers when reading from HTML.
Hi Krish,
Im getting the below error...while trying to read the table from url....
405 # this version of raise is a syntax error in Python 3
URLError:
Day 3 - 18/02/2024
Hi Krish,
If there are multiple tables with the same column headers (e.g. in the mobile country code data there are other tables with the same column headers), how do we extract one specific table? Kindly let us know! Thanks in advance.
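One possible approach, sketched with a hand-built list standing in for the result of `pd.read_html`: pick the table by something only it contains, such as a known cell value. The helper name `find_table` and the sample rows are hypothetical. Passing a more specific `match` text or an `attrs` filter to `read_html` can also narrow the list before this step.

```python
import pandas as pd

# Stand-in for dfs = pd.read_html(...): same headers, different rows.
tables = [
    pd.DataFrame({"Country": ["India", "Japan"], "MCC": [404, 440]}),
    pd.DataFrame({"Country": ["France", "Spain"], "MCC": [208, 214]}),
]


def find_table(tables, column, value):
    """Return the first table whose `column` contains `value`, else None."""
    for table in tables:
        if column in table.columns and (table[column] == value).any():
            return table
    return None


wanted = find_table(tables, "Country", "France")
missing = find_table(tables, "Country", "Atlantis")
print(wanted)
```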
tut-7//13/04/2024
I am getting an "invalid syntax" error while working on the first JSON example statement. What may be the reason?
Hi sir, I have a small doubt: is the Python covered in these videos enough?
at 2:30 you are using pd.read_csv to read a json file
if wine data is in json format, how are you reading at as csv using read_csv ?
when i write any website address for data,i get .How to fix this
please show how to read multiple csv or excel files
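A minimal sketch for the multiple-files question: collect the paths with `glob`, read each file, and stack the frames with `pd.concat`. The two CSV files are created in a temporary folder only so the example runs on its own; their names and contents are made up.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Made-up input files with the same columns.
    (folder / "part_1.csv").write_text("id,score\n1,10\n2,20\n")
    (folder / "part_2.csv").write_text("id,score\n3,30\n")

    # sorted() keeps the file order predictable.
    frames = [pd.read_csv(path) for path in sorted(folder.glob("*.csv"))]

    # ignore_index=True renumbers the rows 0..n-1 across all files.
    combined = pd.concat(frames, ignore_index=True)

print(combined)
```

The same pattern should work for Excel files by globbing `*.xlsx` and calling `pd.read_excel`, which needs an engine such as openpyxl installed.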
Thanks brother for this informative video
Hi sir, when I practice, my Jupyter notebook shows:
NameError: name 'pd' is not defined
NameError: name 'df' is not defined
What should I do, sir?
Restart the kernel and run all cells, so that the import and the earlier cells run first.
df1 = pd.read_json(Data) is showing error
ValueError: Expected object or value
please help
Check the variable name you passed; you may have used a different name. I ran the same code with a different assignment:
df=pd.read_json(Jdata)
Hi sir
How to convert html table to csv sir ?
Sir, can you elaborate more about pickle?
Can you please make a video on MongoDB?
I'm not getting playlist
Please playlist name
@@krishanpalsingh973 th-cam.com/video/7S865QCGL74/w-d-xo.html
thanks
What an amazing video🤩🤩🤩
vera level content bro
Hello Krish, I am at the very start of learning Python, and your channel is helping a lot; I am learning from it continuously.
I am just trying to execute the code below:
import pandas as pd
Data = '{"employee_name":"James","email":"james@gmail.com"}'
pd.read_json(Data)
but it gives the error "If using all scalar values, you must pass an index",
while with Data = '{"employee_name": "James", "email": "james@gmail.com", "job_profile": [{"title1":"Team Lead", "title2":"Sr. Developer"}]}' it works fine.
I am unable to identify the difference. Can you please help?
Since all the values are strings, i.e. scalars, pandas has nothing to build the row index from. At least one value must be a list, like '{"employee_name":"James","email":["james@gmail.com"]}'
The difference is the [ and ].
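The same error can be reproduced without `read_json`, which may make the cause easier to see. This sketch parses the JSON text with the standard `json` module and builds the frame directly; wrapping the record in a list is one alternative fix, and the variable names are illustrative.

```python
import json

import pandas as pd

data = '{"employee_name":"James","email":"james@gmail.com"}'
record = json.loads(data)  # a plain dict of scalar values

# A dict of scalars has no row index, so pandas refuses to build a frame.
try:
    pd.DataFrame(record)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Wrapping the dict in a list says "this is one row".
one_row = pd.DataFrame([record])

print(error_message)
print(one_row)
```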
❤
I wanted to convert a JSON file into a DataFrame but got this error: (Invalid file path or buffer object type: )
Got the answer; it was just a syntax error.
j_file='{"emp_name":"samiksha","email":"bharnesm@gmail.com","emp_address":[{"title":"mr.","name":"suhas"}]}' , I had just forgotten to wrap the whole JSON text in a single pair of quotation marks.
Thanks Krish
Sir, while dealing with a JSON file, I ran the following:
data={"a" : "name", "b" : "num"}
pd.read_json(data)
ERROR: "Invalid file path or buffer object type". I don't understand this.
First of all, read_json() expects a JSON string (or a path or file-like object), not a dict. So data should be:
data='{"a" : "name", "b" : "num"}'
With this value, you will then get a ValueError:
ValueError: If using all scalar values, you must pass an index
This is because read_json() has a parameter 'typ' that defaults to 'frame' (a DataFrame), while a flat JSON object of scalar values describes a single Series.
So we either reshape the data so that it describes a DataFrame, or change the typ parameter:
1. Wrap the JSON object in [] so that it becomes a list with one record (one row):
data='[{"a" : "name", "b" : "num"}]'
pd.read_json(data)
2. Change the 'typ' parameter:
data='{"a" : "name", "b" : "num"}'
pd.read_json(data, typ='series')
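Both options can be sketched as follows. One assumption worth flagging: recent pandas versions warn about, or no longer accept, literal JSON text passed straight to `read_json`, so the text is wrapped in `StringIO` here.

```python
from io import StringIO

import pandas as pd

# Option 1: a list of records becomes a DataFrame, one row per record.
records = '[{"a" : "name", "b" : "num"}]'
frame = pd.read_json(StringIO(records))

# Option 2: a flat object of scalars is read as a Series with typ="series".
flat = '{"a" : "name", "b" : "num"}'
series = pd.read_json(StringIO(flat), typ="series")

print(frame)
print(series)
```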
@@pakhigupta2869 I was getting the same error, thank you so much for your reply it was very helpful!!
@@pakhigupta2869 thanks mannn
11:32 Sir please explain why we are using [0] index in dfs[0]??
Hey it seems that the read_html looks up all the tables available on the html page and gives out a list containing the different datasets(tables) on the said page. Using dfs[0] returns the first dataset in the list, which is what appears in Krish's code! Please let me know if this helps!
Hello Sir , when I am using pd.read_json() it is giving a value error .
Saying that : if using all scalar values , you must pass an index .
Please help me out with this !
Thanks
This is because read_json() has a parameter 'typ' that defaults to 'frame' (a DataFrame), while a flat JSON object of scalar values describes a single Series.
So we either reshape the data so that it describes a DataFrame, or change the typ parameter:
1. Wrap the JSON object in [], so that each dict inside the list is treated as one row of the DataFrame:
data='[{"a" : "name", "b" : "num"}]'
pd.read_json(data)
2. Change the 'typ' parameter:
data='{"a" : "name", "b" : "num"}'
pd.read_json(data, typ='series')
@@pakhigupta2869 Thank you
read_html does not work. Do I need something else to install as well?
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Input In [50], in ()
1 url_mcc = 'en.wikipedia.org/wiki/Mobile_country_code'
----> 2 dfs = pd.read_html(url_mcc, match='Country', header=0)
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments..decorate..wrapper(*args, **kwargs)
305 if len(args) > num_allow_args:
306 warnings.warn(
307 msg.format(arguments=arguments),
308 FutureWarning,
309 stacklevel=stacklevel,
310 )
--> 311 return func(*args, **kwargs)
Can anyone tell me why do we take dfs[0] in line 166
@Light Bringer understood.Thank you
Hey Krish! Great video!
You have mentioned that using pickle, we can avoid running the entire code every time while doing pre processing and model training which takes a lot of time especially for large datasets and multiple attempts at model building. Is there any video where this concept is explained in more detail? Thanks a lot!
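The idea can be sketched as a small cache: compute the expensive result once, pickle it, and load the pickle on later runs. The helper `load_or_compute`, the function `expensive_preprocessing` and its return values are hypothetical stand-ins, not code from the video. Only unpickle files you created yourself or otherwise trust.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(cache_path, compute):
    """Load a pickled result if it exists; otherwise compute and save it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    result = compute()
    with cache_path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


calls = []  # records how often the expensive step really runs


def expensive_preprocessing():
    # Placeholder for slow preprocessing or model training (made-up values).
    calls.append(1)
    return {"mean": 3.5, "rows": 178}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "prep.pkl"
    first = load_or_compute(cache, expensive_preprocessing)   # computes
    second = load_or_compute(cache, expensive_preprocessing)  # loads pickle

print(first == second, "computed", len(calls), "time(s)")
```

In a real project the cache file would live in a permanent folder, and it would need to be deleted whenever the input data or the preprocessing code changes.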
"It is hard. I am not able to understand."
Why did we use dfs[0] instead of dfs? The result for both looks the same; only the format of dfs is different from that of dfs[0].
If you have several tables on the webpage, the result will be different. By using dfs[0] you choose the first table.
Hey it seems that the read_html looks up all the tables available on the html page and gives out a list containing the different datasets(tables) on the said page. Using dfs[0] returns the first dataset in the list, which is what appears in Krish's code! Please let me know if this helps!