Automate Excel Work with Python and Pandas
- Published on 6 Jul 2024
- Excel tasks are repetitive and boring! Automate them and make your life easier using Python and Pandas. Opening CSV and XLSX files into a Pandas Dataframe is super easy, and setting up to do some basic editing and manipulation with that data can save you hours off your day job.
I will show you how automating tasks such as combining CSV files into one, moving columns around, creating pivot tables and vlookups is easy in Pandas, as well as exporting into all the popular file formats.
Learning how to use Pandas is also highly recommended for anyone interested in data, as it is Python's go-to for Data Science, so starting small with some basic dataframe manipulation will set you off down the right path.
Support Me:
DISCORD (NEW): / discord
Amazon US: amzn.to/2OzqL1M
Amazon UK: amzn.to/2OYuMwo
Hosting: Digital Ocean: m.do.co/c/c7c90f161ff6
Gear Used: jhnwr.com/gear/ (NEW)
Patreon: / johnwatsonrooney (NEW)
Scraper API: www.scrapingbee.com/?fpr=jhnwr
-------------------------------------
Disclaimer: These are affiliate links and as an Amazon Associate I earn from qualifying purchases
-------------------------------------
timestamps
00:00 Intro
00:38 Data
01:45 read_csv
03:46 Change Columns
08:23 Edit Column Data
09:56 Add Multiple Files Together
12:56 Pivot Tables
17:00 vlookups (merging)
20:00 Exporting
21:10 Outro - Science & Technology
Yesterday, before this video, I was testing this subject and I got an error and it took me a while to figure it out.
The problem was that my Excel conversion to CSV gave a ; as delimiter instead of a comma (,)
The solution was df = pd.read_csv('data.csv', sep=';')
Thanks for sharing your wisdom ;-)
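A minimal sketch of the fix described in this comment, assuming a semicolon-delimited export of the kind Excel writes under some regional settings. The sample rows and column names are made up for illustration; sep is the real read_csv parameter.

```python
import io
import pandas as pd

# Hypothetical semicolon-delimited export (made-up data)
raw = "name;price;qty\nWidget;2.50;4\nGadget;10.00;1\n"

# With the default comma separator, each whole line lands in a single column
wrong = pd.read_csv(io.StringIO(raw))

# Telling pandas the real delimiter splits the columns properly
df = pd.read_csv(io.StringIO(raw), sep=";")

print(wrong.shape)
print(df.columns.tolist())
```

A one-column DataFrame right after read_csv is usually the sign that the delimiter is not what pandas assumed.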
Thank you very much for the gentle introduction to the Pandas library. It was very useful for me
Easy to follow tutorial, appreciate it john!!!!
Really useful from start till the end. Highly appreciated.
Masterly content and presentation, thanks. The ideal pace for making the viewer understand what is being done and how.
Very useful common operations for data science projects. 👍💖
Hi John Watson! I'm watching your channel regularly and updating my skills. You are my real teacher. Thanks!
Absolutely one of the most underrated YouTubers out there. I guess Google's ML algo probably identifies John as a conservative and that's why his channel hasn't exploded in subscriber count yet.
This video is really helpful. Thank you so much.
Hi John, Thank you for the informative video! Top job
That's perfect, how you explained it.
Thanks a lot for your videos. These are really helpful and easy to understand.
Thank you very kind
I learned a lot about web scraping and data handling methods from you in a very short time. Thank you
That’s great glad I could help!
I am getting started with using Python for my Excel files and glad I found this video.
I can easily follow this through
Looking forward to more :)
Thank you glad it helped!
Just wanted to say a huge THANKYOU as you have taught me so much!
That’s great!
Excellent video, thank you
great material, thanks
Great walkthrough
Excellent. Thanks!
thanks rooney, a true ace!
Great.awesome video...great learning time
you are amazing man! keep going
Nice to get tutorials from an actual work flow rather than a reinterpretation of the manual.
It's a very useful tip to get a list of df column names
Thanks John for this helpful content.
Thank you very kind
Great Video Sir.
whew! that was a good one
John, Can you target a specific sheet within an excel workbook to create a new db?
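One possible answer to this question, assuming "db" here means a DataFrame: read_excel takes a sheet_name argument that accepts either the sheet's name or its zero-based position. The helper name, the file name and the sheet name below are placeholders, and reading .xlsx files needs an Excel engine such as openpyxl installed.

```python
import pandas as pd

def load_sheet(path, sheet):
    """Read one worksheet from a workbook into a DataFrame.

    `sheet` may be the sheet's name (str) or its zero-based position (int).
    """
    return pd.read_excel(path, sheet_name=sheet)

# Hypothetical usage; "report.xlsx" and "Sales" are placeholder names:
# sales = load_sheet("report.xlsx", "Sales")
# first = load_sheet("report.xlsx", 0)
```

Passing sheet_name=None instead returns a dict of DataFrames keyed by sheet name, which helps when the sheet names are not known up front.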
Awesome bro, it saves me time
Good video thanks
Thanks again, really good video.
Thank you!
Awesome brother
Thanks❤
great video, going to check if you do an in-depth video using pandas
Hello John, can you make a video about the VS Code debugger: how to set it up, how to use it, its settings and all that stuff? Thanks in advance
Hi - what do you do in the case that you want to merge the data but some of the data in the references tab doesn't find a corresponding match in the original merged spreadsheet?
As someone more used to VBA, I'm starting to learn Python as my employer is going to deprecate VBA due to security concerns, so this is really useful.
However, I have yet to find any tutorials at all, anywhere on youtube, that show how to actually deploy for end users within an Excel environment. My users aren't going to have an IDE, they need to be able to click a button and set code running.
Microsoft are in the process of launching python for excel - directly use python in excel. It’s currently in developer preview but it sounds like what you are looking for
Awesome video! Can we get the real python link. I couldn't find it in your description.
You just make it so easy, nice tutorial. Python seems to be the best alternative to VBA.
I never got into VBA but I heard it’s very powerful for this… but Python is so good for most things :)
@@JohnWatsonRooney Yes it is. And not only for Excel/Office things, but the lack of support has made VBA a zombie: dead, but it will exist as long as Excel does. The support and new libraries are making me curious about the Python world.
Thanks bro ,...need help .......in the realtime I am getting data from the broker terminal ......I want one condition like previous data is in percentage I want that last previous percentage is less than current percentage data and vice versa ....and I want every 5 min
Do you have a book you'd recommend to learn Pandas for this type of work? Most I'm seeing is heavy on the math. I mostly need to be able to find duplicates and in a new column assign the duplicates a new id. So Company A may have 10 records. I want to automate the process of finding them and assigning all of them a CompanyID.
maybe export into csv file and use gawk, fast lightweight utility to process text, easy to learn and use... courtesy chatgpt: write a gawk script to find duplicates and in a new column assign the duplicates a new id.
Here's a simple example of a gawk script to find duplicates and assign each duplicate a new ID:
BEGIN {
    FS = ","      # Set the input field separator to comma
    OFS = ","     # Set the output field separator to comma
    next_id = 1   # ID to hand out to the next line not seen before
}
{
    # First occurrence of this line: store the ID assigned to it
    if (!($0 in seen)) {
        seen[$0] = next_id++
    }
    # First occurrences and duplicates are both printed with the stored ID
    print $0, seen[$0]
}
This script sets the input field separator (FS) and output field separator (OFS) to a comma and initializes one variable, next_id. The BEGIN block runs before any input is processed.
The main body of the script uses an associative array (seen) that maps each line to the ID assigned on its first occurrence. If the current line ($0) is not yet in the seen array, the script stores next_id for it and then increments next_id. Every line is then printed with its stored ID, so duplicates share the ID of their first occurrence and IDs start at 1. To match on a single column rather than the whole line, key the array on that field (for example $1) instead of $0.
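For the pandas route the question asks about, here is a minimal sketch under stated assumptions: the records and the column names (company, amount, CompanyID, is_duplicate) are made up, and exact string matches are treated as the same company. pd.factorize and DataFrame.duplicated are the real pandas calls.

```python
import pandas as pd

# Hypothetical records; the column names are placeholders
df = pd.DataFrame({
    "company": ["Acme", "Globex", "Acme", "Initech", "Globex", "Acme"],
    "amount": [100, 250, 75, 40, 300, 20],
})

# factorize numbers each distinct value in order of first appearance,
# starting from 0, so adding 1 gives IDs that start at 1
codes, _ = pd.factorize(df["company"])
df["CompanyID"] = codes + 1

# Flag every row whose company appears more than once
df["is_duplicate"] = df.duplicated(subset="company", keep=False)

print(df)
```

Real company names often differ in case or stray whitespace, so normalising first (for example with .str.strip().str.lower()) may be needed before the values count as duplicates.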
So when you do something (like when you changed the price from having $ to not having it), does the memory save it? Like, that part is stored so it remembers you did it? The reason I'm asking is because you deleted the lines of code when you made that change. So it must remember that you originally made that change to get rid of the $, right?
Cool stuff bro
Thanks
Is there a way to save or create a function of the lines of code for repeated tasks with new data sets that come in lets say weekly? So essentially have like a .exe or .bat file, or even a GUI with a run button that when clicked on it automates the process and gives results fast
Hey what I do for weekly reports is tailor my script to get them from a specific folder, put the new files in there and run the script from the terminal when ready
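The folder-based workflow described in this reply could look something like the sketch below. The function name, the folder name and the output file name are assumptions for illustration, and it assumes every file in the folder shares the same columns.

```python
from pathlib import Path
import pandas as pd

def combine_folder(folder, pattern="*.csv"):
    """Concatenate every CSV in `folder` into one DataFrame."""
    files = sorted(Path(folder).glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern} in {folder}")
    frames = [pd.read_csv(f) for f in files]
    # ignore_index gives the combined frame a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)

# Hypothetical usage in a script run from the terminal each week:
# combined = combine_folder("incoming")
# combined.to_csv("combined.csv", index=False)
```

Dropping the week's new files into the folder and re-running the script is then the whole routine; the same script can be wrapped in a .bat file or a scheduled task if a double-click is preferred.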
Hi, thanks for the tutorial, where can i get the csv files you are using ?
Hi I never put them up online but I generated them from a free service called mockaroo
Hi John, very well explained, and it covered most topics that are used in Excel. Superb. Could you make one on using Python to create a pivot table and paste it into Excel? And on how to dissect the original data into smaller data which can then be used to create a chart in Excel, so I don't have to rely on formulas or excel pythons to make charts. Python would simply process the larger dataset and format the data in a way that puts it in Excel, where it can then be used for charts
did you figure it out?
@@bearingoutward1302 no
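Since this thread was left unanswered, here is one way the request might be approached: build the summary with pivot_table and write the small result to a workbook for charting. The data and column names are made up, and the to_excel line is left commented out because it needs an Excel engine such as openpyxl installed.

```python
import pandas as pd

# Hypothetical sales data (made-up values)
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "product": ["A", "B", "A", "A", "B"],
    "revenue": [100, 150, 200, 50, 300],
})

# One row per region, one column per product, summed revenue in the cells
summary = pd.pivot_table(
    sales,
    index="region",
    columns="product",
    values="revenue",
    aggfunc="sum",
    fill_value=0,
)
print(summary)

# Writing the small summary table to a workbook for charting in Excel:
# summary.to_excel("summary.xlsx", sheet_name="pivot")
```

The exported sheet holds plain values rather than a live Excel PivotTable, which is usually enough as the source range for a chart.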
Hey John. Long time watcher. First time student( spending my Sunday time programming instead of watching ). . So, John, Why doesn’t the code from selenium IDE for chrome work; when it’s generated for an ?
I want to add the code to a Python script.
The gets recorded, but doesn’t work when I replay it.
Hello sir .
I watched your videos on how to read Google Sheets data using pandas. I got it, but after getting the data I want to clear my data from the database automatically. What can I do for that?
Your tutorials are sublime :) :)
However, where can I get access to the excel test files, as I want to reproduce your demo.
Thanks! Ahh I’ll try to find them, but all my fake data comes from mockaroo
Hi John, glad to see a pandas video here. Very good one, the Timestamps are really appreciated. John could you make a video only about the vlookup, merge, iloc, drop duplicated? Even when I manage to used them, I couldn't say I really understand it.
Thanks. Sure I can do a more deep dive on those
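As a small illustration of the operations named in this request, the sketch below uses two made-up tables; the table and column names are placeholders, not the data from the video.

```python
import pandas as pd

# Hypothetical tables (made-up values)
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 3],
    "sku": ["A1", "B2", "A1", "A1"],
})
prices = pd.DataFrame({"sku": ["A1", "B2"], "price": [9.99, 4.50]})

# drop_duplicates removes rows that repeat an earlier row exactly
orders = orders.drop_duplicates()

# A left merge behaves like VLOOKUP: every order keeps its row and
# picks up the price whose sku matches
merged = orders.merge(prices, on="sku", how="left")

# iloc selects by position: the first two rows of the last column
first_prices = merged.iloc[:2, -1]

print(merged)
```

The main difference from VLOOKUP is that merge returns every match, so a lookup table with repeated keys produces extra rows rather than just the first hit.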
Can we connect ?had something to discuss
great job . how about Heroku ? is it in your plans for future contents ?
Yes 100%
Where can we get this sample data
You could have used the jupyter extension of vscode for easy interaction
Yeah absolutely. I don’t know why I’ve just never been a fan of notebooks. I should probably revisit that idea though
Very new to python. How do I get the same user interface as you? When I downloaded it, it just gave me the shell/IDLE
Download VS Code from Microsoft
subscribed 🤩
Thanks!
Not sure if you'll see this... but I just noticed you're running Ubuntu in WSL?
That would be an interesting series to do - especially when production level scrapers almost always need to use a rotating IP and that's usually only possible in Linux.
I'm still doing the good old VM way of using Linux haha
Sure, I’ve used WSL or dual booted into Linux ever since I’ve been coding properly. I just got used to the commands. I could look at doing a video on the benefits
Good video
Thanks
what theme are you using? it looks really nice
this is gruvbox material i believe!
@@JohnWatsonRooney thank you!
how do you output data that's not able to merge with the reference data?
Got it, df[reference].isna()
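The isna() check in this reply works after a left merge, because rows without a match get NaN in the columns that come from the reference table. A sketch of that, alongside merge's indicator option as an alternative, is below; the tables and column names are made up.

```python
import pandas as pd

# Hypothetical tables (made-up values); C3 has no reference entry
data = pd.DataFrame({"sku": ["A1", "B2", "C3"], "qty": [5, 2, 7]})
reference = pd.DataFrame({"sku": ["A1", "B2"], "price": [9.99, 4.50]})

# indicator=True adds a _merge column saying where each row came from
merged = data.merge(reference, on="sku", how="left", indicator=True)

# Option 1: rows where the looked-up column is missing
unmatched_isna = merged[merged["price"].isna()]

# Option 2: rows the indicator marks as present only in the left table
unmatched_flag = merged[merged["_merge"] == "left_only"]

print(unmatched_flag)
```

The indicator route is a little safer when the reference column can legitimately contain blanks, since isna() cannot tell a missing match from a missing value.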
Can you create a Telegram group to discuss more about web scraping & Python?
21:28 : how do you get the output colorized ?
Rainbow CSV addon
@@bhavik15 Thank you kindly
Installed and it works ;-)
Hello John
We read in supplier invoices for customers, but with the invoices from a number of suppliers we have problems reading them in. We have tried sep='\t' but with no result.
We now first open the file in Excel and save it to CSV, which changes it to sep=';', and then we read that in. ??
What are we doing wrong when reading this CSV format?
greetings alex
Hi Alex. Hard to say without seeing the file and how it reads in, what file type is the first invoice? Csv, xlsx or something else?
@@JohnWatsonRooney Hello John
can i send you the file?
Sure, my email is on my main YouTube page
May be the below given code should work,
dataframe_name = pd.read_csv('filename.extension', delimiter='\t')
@@Arvinth14 Thanks for your suggestion, unfortunately that doesn't work either. what I find strange if I first divide it into columns in excel and then write it to csv it works fine. this only happens with the dutch supplier at csv from the web i have no problems
please provide the sample file?
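Without the actual file this can only be a guess, but one thing worth trying is letting pandas detect the delimiter instead of naming it. The sketch below assumes a semicolon-delimited file with decimal commas, a common layout for Dutch-locale exports; the invoice data is made up. sep=None with engine="python" and the decimal argument are real read_csv options.

```python
import io
import pandas as pd

# Hypothetical invoice export: semicolon-delimited with decimal commas
raw = "invoice;supplier;amount\n1001;Jansen BV;1234,50\n1002;De Vries;99,95\n"

# sep=None with the python engine asks pandas to detect the delimiter;
# decimal="," makes values like 1234,50 parse as numbers
df = pd.read_csv(io.StringIO(raw), sep=None, engine="python", decimal=",")

print(df.dtypes)
print(df)
```

If detection still gives a single column, opening the raw file in a text editor to see the real separator and encoding is the next step, since a file that only reads cleanly after a round trip through Excel may not be plain delimited text at all.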
How can i contact you.
Your content is amazing, you need to work on your thumbnail to get more impression & click.
Thanks, I’ve tried a few different types of thumbnail, if you have some good examples you can link them to me?
@@JohnWatsonRooney I just found out YouTube deleted my comment because I linked to a thumbnail photo 😑
21:12 you don't say John😊
Hi John,
Thank you for all your videos.. It really helps.
I have one request ,Can you help us on how to scrape dynamic website ,i mean the website which changes its query string parameters on pagination..
If you want i can share the link with you..
Just drop me your email id
Thanks.. i am stuck and not able to understand how to scrape that website
Hey, glad you enjoy the videos. My email is on my main YouTube page if you want to drop me a message I will have a look when I get a chance
Thank you for taking time and responding to my comment.i have dropped you an email regarding the website, kindly have a look whenever it is convenient and Please keep making amazing knowledgeable content :D
Yo...
why my comment is being deleted?
Was it a link? I haven’t deleted any comments from this video it must be YouTube
@@JohnWatsonRooney Ohh there was a link indeed! I didn't know youtube doesn't allow it. I had asked if you could help me to solve a scrape issue. I'm trying to scrape a supermarket webpage (carrefouruae) to get the name and the price of a product but because the data is rendered through javascript and my script is running inside a container where it doesn't have a browser, I don't know which library I can use to scrape. Thank you in advance! And thank you for your awesome videos!
You can render the page with requests-html or have a look in the page source code for “next_data” you might be able to get something useful from the script tag there
@@JohnWatsonRooney thank for the reply! I've tried and i got "Access Denied" :(
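The "next_data" suggestion in this thread can be sketched with the standard library alone, assuming the site is built with Next.js, which embeds its page data as JSON in a script tag with the id __NEXT_DATA__. The class name, the function name and the sample page source below are made up for illustration, and none of this gets around an "Access Denied" response, which has to be solved at the request level first.

```python
import json
from html.parser import HTMLParser

class NextDataParser(HTMLParser):
    """Collect the text inside <script id="__NEXT_DATA__">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)

def extract_next_data(html):
    """Return the parsed JSON payload, or None if the tag is absent."""
    parser = NextDataParser()
    parser.feed(html)
    raw = "".join(parser.chunks)
    return json.loads(raw) if raw else None

# Made-up page source for illustration
sample = """<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"product": {"name": "Milk 1L", "price": 6.5}}}
</script></body></html>"""

data = extract_next_data(sample)
print(data["props"]["product"])
```

Because this only parses HTML that has already been downloaded, it runs inside a container without a browser.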
The video title says "Automate Excel Work..." but its content is all about csv data. This manipulation is disappointing.
No wonder it's called “data manipulation” ;)
Hi johny😊.
I got a job recently as a data analyst but the work is quite boring, copy-pasting in Excel....
So I want to know, is there any tool like ChatGPT which can do this work for me....