Automate Excel Work with Python and Pandas

  • Published on 6 Jul 2024
  • Excel tasks are repetitive and boring! Automate them and make your life easier using Python and Pandas. Opening CSV and XLSX files into a Pandas DataFrame is super easy, and setting up some basic editing and manipulation of that data can save you hours in your day job.
    I will show you how easy it is in Pandas to automate tasks such as combining CSV files into one, moving columns around, creating pivot tables and doing vlookups, as well as exporting into all the popular file formats.
    Learning how to use Pandas is also highly recommended for anyone interested in data, as it is Python's go-to for data science, so starting small with some basic DataFrame manipulation will set you off down the right path.
    Support Me:
    DISCORD (NEW): / discord
    Amazon US: amzn.to/2OzqL1M
    Amazon UK: amzn.to/2OYuMwo
    Hosting: Digital Ocean: m.do.co/c/c7c90f161ff6
    Gear Used: jhnwr.com/gear/ (NEW)
    Patreon: / johnwatsonrooney (NEW)
    Scraper API: www.scrapingbee.com/?fpr=jhnwr
    -------------------------------------
    Disclaimer: These are affiliate links and as an Amazon Associate I earn from qualifying purchases
    -------------------------------------
    timestamps
    00:00 Intro
    00:38 Data
    01:45 read_csv
    03:46 Change Columns
    08:23 Edit Column Data
    09:56 Add Multiple Files Together
    12:56 Pivot Tables
    17:00 vlookups (merging)
    20:00 Exporting
    21:10 Outro
  • Science & Technology

Comments • 117

  • @RS-Amsterdam
    @RS-Amsterdam 3 years ago +22

    Yesterday, before this video, I was testing this subject and I got an error and it took me a while to figure it out.
    The problem was that my Excel conversion to CSV gave a ; as delimiter instead of a comma (,).
    The solution was df = pd.read_csv('data.csv', sep=';')
    Thanks for sharing your wisdom ;-)
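For readers hitting the same error, here is a minimal sketch of the fix described in the comment above. The sample data is made up and held in memory so the snippet runs on its own; with a real file, pass its path to `pd.read_csv` instead of the `StringIO` object.

```python
import io

import pandas as pd

# Made-up sample standing in for a CSV that Excel exported with ';' as the delimiter.
raw = "name;price;qty\nwidget;2.50;4\ngadget;10.00;1\n"

# With the default sep=',' each row would end up in a single column,
# so the delimiter is passed explicitly.
df = pd.read_csv(io.StringIO(raw), sep=";")

print(df.columns.tolist())
print(df)
```

If the columns come back as one long name containing semicolons, that is usually the sign that the wrong delimiter was used.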

  • @maryamzarabian4617
    @maryamzarabian4617 2 years ago +3

    Thank you very much for the gentle introduction to the Pandas library. It was very useful for me.

  • @ruffhouse9760
    @ruffhouse9760 2 years ago +1

    Easy to follow tutorial, appreciate it john!!!!

  • @user-kp7re7nb6q
    @user-kp7re7nb6q 1 year ago +1

    Really useful from start till the end. Highly appreciated.

  • @training7574
    @training7574 3 months ago +1

    Masterly content and presentation, thanks. The ideal pace for making the viewer understand what is being done and how.

  • @tubelessHuma
    @tubelessHuma 3 years ago +2

    Very useful common operations for data science projects. 👍💖

  • @jisuresh
    @jisuresh 3 years ago +10

    Hi John Watson! I'm watching your channel regularly and updating my skills. You are my real teacher. Thanks!

    • @pr0skis
      @pr0skis 3 years ago

      Absolutely one of the most underrated YouTubers out there. I guess Google's ML algo probably identifies John as a conservative and that's why his channel hasn't exploded in subscriber count yet.

  • @priya-ok9ur
    @priya-ok9ur 11 months ago +1

    This video is really helpful. Thank you so much.

  • @TheeMatrixUno
    @TheeMatrixUno 1 year ago +1

    Hi John, Thank you for the informative video! Top job

  • @ammadkhan4687
    @ammadkhan4687 1 year ago +1

    That's perfect, the way you explained it.

  • @wladcapiekla
    @wladcapiekla 2 years ago +2

    Thanks a lot for your videos. These are really helpful and easy to understand.

  • @senthilsds
    @senthilsds 3 years ago +1

    I learned a lot about web scraping and data handling methods from you in a very short time. Thank you.

  • @AlvinRyellPrada
    @AlvinRyellPrada 2 years ago +1

    I am getting started with using Python for my Excel files and glad I found this video.
    I can easily follow this through.
    Looking forward to more :)

  • @OBPagan
    @OBPagan 3 years ago +1

    Just wanted to say a huge THANK YOU as you have taught me so much!

  • @user-qe5st3wz2f
    @user-qe5st3wz2f 9 months ago +1

    Excellent video, thank you

  • @bartdziubek327
    @bartdziubek327 2 years ago +1

    great material, thanks

  • @Greygon313
    @Greygon313 2 years ago +1

    Great walkthrough

  • @noone370
    @noone370 1 year ago +1

    Excellent. Thanks!

  • @lifted1785
    @lifted1785 1 year ago

    thanks rooney, a true ace!

  • @ceknowledge9342
    @ceknowledge9342 2 years ago +1

    Great, awesome video... great learning time.

  • @patocarlo69
    @patocarlo69 1 year ago +1

    you are amazing man! keep going

  • @Destide
    @Destide 2 years ago +3

    Nice to get tutorials from an actual work flow rather than a reinterpretation of the manual.

  • @yannhk
    @yannhk 2 years ago +1

    It's a very useful tip to get a list of df column names
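The comment does not quote the exact call used in the video, so the following is an assumption: one common way to get the column names of a DataFrame as a plain Python list. The sample DataFrame is made up.

```python
import pandas as pd

# Made-up data, only here so there are some columns to list.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "price": [1.5, 2.0]})

# df.columns is an Index object; tolist() turns it into a regular list
# that can be copied, reordered and passed back as df[new_order].
column_names = df.columns.tolist()

print(column_names)
```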

  • @ajayrawat5046
    @ajayrawat5046 1 year ago +1

    Thanks John for this helpful content.

  • @tnssajivasudevan1601
    @tnssajivasudevan1601 3 years ago +1

    Great Video Sir.

  • @Robert-hb5tc
    @Robert-hb5tc 5 months ago +1

    whew! that was a good one

  • @TheeMatrixUno
    @TheeMatrixUno 1 year ago

    John, can you target a specific sheet within an Excel workbook to create a new db?
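The question goes unanswered in the thread. As a hedged sketch: `pd.read_excel` takes a `sheet_name` argument, so one worksheet can be loaded into its own DataFrame. The helper name `load_sheet`, the file name and the sheet name below are all hypothetical, and reading `.xlsx` files needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def load_sheet(path, sheet_name):
    """Read one worksheet of an Excel workbook into a DataFrame."""
    # sheet_name accepts the sheet's name (str) or its zero-based position (int).
    return pd.read_excel(path, sheet_name=sheet_name)


# Hypothetical usage; "report.xlsx" and "Sales" are placeholder names:
# sales = load_sheet("report.xlsx", "Sales")
```

Passing `sheet_name=None` instead returns a dictionary with one DataFrame per sheet.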

  • @ceknowledge9342
    @ceknowledge9342 2 years ago +1

    Awesome bro, it saves me time.

  • @Chrisfagan8881
    @Chrisfagan8881 2 years ago +1

    Good video thanks

  • @bisratgetachew8373
    @bisratgetachew8373 3 years ago +1

    Thanks again, really good video.

  • @NotBeHaris
    @NotBeHaris 3 years ago +1

    Awesome brother

  • @leandrov07013
    @leandrov07013 1 year ago +1

    Thanks❤

  • @PeterFletcherDNADeliverer
    @PeterFletcherDNADeliverer 1 year ago

    great video, going to check if you do an in-depth video using pandas

  • @absoluteRandom69
    @absoluteRandom69 3 years ago

    Hello John, can you make a video about the VS Code debugger: how to set it up, how to use it, its settings and all that stuff? Thanks in advance.

  • @roybuchler7502
    @roybuchler7502 4 months ago

    Hi - what do you do in the case that you want to merge the data but some of the data in the references tab does not find a corresponding match in the original merged spreadsheet?

  • @mariuscheek
    @mariuscheek 10 months ago +3

    As someone more used to VBA, I'm starting to learn Python as my employer is going to deprecate VBA due to security concerns, so this is really useful.
    However, I have yet to find any tutorials at all, anywhere on youtube, that show how to actually deploy for end users within an Excel environment. My users aren't going to have an IDE, they need to be able to click a button and set code running.

    • @JohnWatsonRooney
      @JohnWatsonRooney 10 months ago +1

      Microsoft are in the process of launching Python in Excel, which lets you use Python directly in Excel. It's currently in developer preview but it sounds like what you are looking for.

  • @TheBIMCoordinator
    @TheBIMCoordinator 3 years ago

    Awesome video! Can we get the Real Python link? I couldn't find it in your description.

  • @SimpleExcelVBA
    @SimpleExcelVBA 2 years ago +2

    You just make it so easy, nice tutorial. Python seems to be the best alternative to VBA.

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +1

      I never got into VBA but I heard it’s very powerful for this… but Python is so good for most things :)

    • @SimpleExcelVBA
      @SimpleExcelVBA 2 years ago

      @JohnWatsonRooney Yes it is, and not only for Excel/Office things, but the lack of support has made VBA a zombie: dead, but it will exist as long as Excel does. The support and new libraries are making me curious about the Python world.

  • @mangeshw9766
    @mangeshw9766 3 years ago

    Thanks bro... need help. I am getting real-time data from the broker terminal. The data is in percentages, and I want a condition that checks whether the previous percentage is less than the current percentage, and vice versa, and I want it checked every 5 min.

  • @HadarsGrasp
    @HadarsGrasp 1 year ago

    Do you have a book you'd recommend to learn Pandas for this type of work? Most I'm seeing is heavy on the math. I mostly need to be able to find duplicates and in a new column assign the duplicates a new id. So Company A may have 10 records. I want to automate the process of finding them and assigning all of them a CompanyID.

    • @markgreen2170
      @markgreen2170 1 year ago

      Maybe export into a CSV file and use gawk, a fast, lightweight utility for processing text that is easy to learn and use. Courtesy of ChatGPT, prompted with: write a gawk script to find duplicates and in a new column assign the duplicates a new id.
      Here's a simple example of a gawk script that finds duplicates and gives every copy of a line the same ID:
      BEGIN {
          FS = ","     # Set the input field separator to comma
          OFS = ","    # Set the output field separator to comma
          next_id = 1  # Next unused ID
      }
      {
          # First time this line appears: assign it a fresh ID
          if (!($0 in seen)) {
              seen[$0] = next_id++
          }
          # Every occurrence of the same line prints with the same ID
          print $0, seen[$0]
      }
      This script sets the input field separator (FS) and output field separator (OFS) to a comma and initializes one variable, next_id. The BEGIN block runs before any input is processed.
      The main body of the script uses an associative array (seen) to map each line to its ID. If the current line ($0) is not yet in the seen array, the script stores the next unused ID for it and increments next_id. It then prints the line followed by the ID stored for that line, so duplicates share one ID. To group on a single column such as the company name, use that field (for example $1) in place of $0.
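Since the original question asked about Pandas, here is a minimal sketch of the same idea in Pandas rather than gawk. The column names `Company` and `Contact` and the records are made up for illustration; the calls used are `DataFrame.duplicated` and `GroupBy.ngroup`.

```python
import pandas as pd

# Made-up records; "Company" is an assumed column name.
df = pd.DataFrame(
    {
        "Company": ["Acme", "Globex", "Acme", "Initech", "Globex", "Acme"],
        "Contact": ["a@x", "b@x", "c@x", "d@x", "e@x", "f@x"],
    }
)

# Flag every row whose company appears more than once.
df["IsDuplicate"] = df.duplicated(subset="Company", keep=False)

# Give each distinct company one ID. sort=False numbers the groups in
# order of first appearance; + 1 makes the IDs start at 1 instead of 0.
df["CompanyID"] = df.groupby("Company", sort=False).ngroup() + 1

print(df)
```

This only matches names that are spelled identically. Records such as "Acme" and "ACME Ltd" would need to be normalised first, which is outside this sketch.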

  • @nishant1998
    @nishant1998 1 year ago

    So when you do something (like when you changed the price from having $ to not having it), is it saved in memory? Like that part is stored so it remembers you did it? The reason I'm asking is because you deleted the lines of code when you made that change. So it must remember that you originally made that change to get rid of the $, right?
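A small sketch of the kind of edit this question refers to, assuming a text column named `price` with a leading `$` (the video's actual column names and code are not quoted here). Once the result is assigned back to the column, the DataFrame in memory holds the cleaned values for the rest of that session; the CSV file on disk stays unchanged until the DataFrame is written out again.

```python
import pandas as pd

# Made-up data with prices stored as text.
df = pd.DataFrame({"item": ["widget", "gadget"], "price": ["$2.50", "$10.00"]})

# Remove the literal "$" (regex=False treats it as plain text, not a pattern)
# and convert the remaining text to numbers.
# Assigning back to df["price"] is what makes the change stick in memory.
df["price"] = df["price"].str.replace("$", "", regex=False).astype(float)

print(df["price"].tolist())
```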

  • @mushinart
    @mushinart 3 years ago +1

    Cool stuff bro

  • @creyes879
    @creyes879 2 years ago +1

    Is there a way to save the lines of code, or turn them into a function, for repeated tasks with new data sets that come in, let's say, weekly? So essentially have a .exe or .bat file, or even a GUI with a run button that, when clicked, automates the process and gives results fast.

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      Hey, what I do for weekly reports is tailor my script to get them from a specific folder, put the new files in there and run the script from the terminal when ready.
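The reply above describes a workflow rather than code. A minimal sketch of it might look like the following, assuming all files in the folder are CSVs with the same columns. The function name `combine_folder` and the folder and output names in the usage comment are hypothetical.

```python
from pathlib import Path

import pandas as pd


def combine_folder(folder):
    """Read every CSV in `folder` and stack the rows into one DataFrame."""
    files = sorted(Path(folder).glob("*.csv"))
    if not files:
        raise FileNotFoundError(f"no CSV files found in {folder}")
    frames = [pd.read_csv(f) for f in files]
    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pd.concat(frames, ignore_index=True)


# Hypothetical usage; "weekly_reports" and "combined.csv" are placeholder names:
# combined = combine_folder("weekly_reports")
# combined.to_csv("combined.csv", index=False)
```

Saved as a script, this can be run from the terminal each week after the new files have been dropped into the folder.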

  • @ismahenelarbi5403
    @ismahenelarbi5403 2 years ago +1

    Hi, thanks for the tutorial. Where can I get the CSV files you are using?

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      Hi, I never put them up online but I generated them from a free service called Mockaroo.

  • @rajesh9sn
    @rajesh9sn 2 years ago +4

    Hi John, very well explained, and it covered most topics that are used in Excel. Superb. Could you make one on using Python to create a pivot table and paste it into Excel? That is, how to dissect the original data into smaller data which can then be used to create a chart in Excel, so I don't have to rely on formulas or Excel pythons to make charts. Python would simply process the larger dataset and format the data in a way that can be put into Excel and then used for charts.
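A hedged sketch of what this request describes: summarise a larger table with `pd.pivot_table`, then write the small result to a workbook that Excel can chart from. The data and the names `region`, `product`, `amount` and `save_for_excel` are made up, and writing `.xlsx` files requires an Excel engine such as openpyxl.

```python
import pandas as pd

# Made-up sales data standing in for the larger dataset.
sales = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South", "South"],
        "product": ["A", "B", "A", "A", "B"],
        "amount": [100, 150, 200, 50, 75],
    }
)

# One row per region, one column per product, summed amounts in the cells.
summary = pd.pivot_table(
    sales,
    index="region",
    columns="product",
    values="amount",
    aggfunc="sum",
    fill_value=0,
)


def save_for_excel(table, path):
    """Write the summary to a workbook; needs an engine such as openpyxl."""
    table.to_excel(path, sheet_name="summary")


print(summary)
# Hypothetical usage: save_for_excel(summary, "summary.xlsx")
```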

  • @RedMaster-mw6ti
    @RedMaster-mw6ti 3 years ago

    Hey John. Long time watcher. First time student( spending my Sunday time programming instead of watching ). . So, John, Why doesn’t the code from selenium IDE for chrome work; when it’s generated for an ?
    I want to add the code to a Python script.
    The gets recorded, but doesn’t work when I replay it.

  • @ajinkyapatil1642
    @ajinkyapatil1642 3 years ago

    Hello sir.
    I watched your videos on how to read Google Sheets data using Pandas. I got it, but after getting the data I want to clear my data from the database automatically. What can I do for that?

  • @edbull4891
    @edbull4891 2 years ago +1

    Your tutorials are sublime :) :)
    However, where can I get access to the Excel test files? I want to reproduce your demo.

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +1

      Thanks! Ahh, I'll try to find them, but all my fake data comes from Mockaroo.

  • @jonathanfriz4410
    @jonathanfriz4410 3 years ago +3

    Hi John, glad to see a Pandas video here. Very good one, the timestamps are really appreciated. John, could you make a video only about vlookup, merge, iloc and drop duplicates? Even when I manage to use them, I couldn't say I really understand them.

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago +1

      Thanks. Sure, I can do a deeper dive on those.

  • @rohitmethare1986
    @rohitmethare1986 3 years ago

    Thanks a lot for your videos. These are really helpful and easy to understand.
    Can we connect? I had something to discuss.

  • @maciekpaciarski9343
    @maciekpaciarski9343 3 years ago +1

    Great job. How about Heroku? Is it in your plans for future content?

  • @shaikhzishan5342
    @shaikhzishan5342 2 years ago

    Where can we get this sample data?

  • @ugwuanyiarinze5626
    @ugwuanyiarinze5626 3 years ago +3

    You could have used the Jupyter extension for VS Code for easy interaction.

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago

      Yeah, absolutely. I don't know why, I've just never been a fan of notebooks. I should probably revisit that idea though.

  • @josephbrown8968
    @josephbrown8968 1 month ago +1

    Very new to Python. How do I get the same user interface as you? When I downloaded it, it just gave me the shell/IDLE.

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 month ago

      Download VS Code from Microsoft

  • @zohebdholakia3782
    @zohebdholakia3782 10 months ago +1

    subscribed 🤩

  • @pr0skis
    @pr0skis 2 years ago +1

    Not sure if you'll see this... but I just noticed you're running Ubuntu in WSL?
    That would be an interesting series to do - especially when production level scrapers almost always need to use a rotating IP and that's usually only possible in Linux.
    I'm still doing the good old VM way of using Linux haha

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      Sure, I’ve used WSL or dual booted into Linux ever since I’ve been coding properly. I just got used to the commands. I could look at doing a video on the benefits

  • @sheikhshah2593
    @sheikhshah2593 3 years ago +1

    Good video

  • @SDILUYNTsiu39fnd
    @SDILUYNTsiu39fnd 1 year ago +1

    What theme are you using? It looks really nice.

  • @MasterofPlay7
    @MasterofPlay7 2 years ago +1

    How do you output the data that could not be merged with the reference data?

    • @MasterofPlay7
      @MasterofPlay7 2 years ago +1

      Got it, df[reference].isna()
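To make that answer concrete, here is a minimal sketch with made-up tables. The names `orders`, `reference`, `sku` and `description` are assumptions. After a left merge, rows with no match get NaN in the columns that came from the reference table, which is what the `isna()` check picks up; `indicator=True` is an alternative that labels each row's origin in a `_merge` column.

```python
import pandas as pd

# Made-up main table and lookup (reference) table.
orders = pd.DataFrame({"sku": ["A1", "B2", "C3"], "qty": [5, 1, 2]})
reference = pd.DataFrame({"sku": ["A1", "B2"], "description": ["Widget", "Gadget"]})

# how="left" keeps every order, matched or not (like a vlookup that may miss).
merged = orders.merge(reference, on="sku", how="left", indicator=True)

# Option 1: rows where the looked-up column is empty.
missing_by_nan = merged[merged["description"].isna()]

# Option 2: rows the indicator marks as present only in the left table.
unmatched = merged[merged["_merge"] == "left_only"]

print(unmatched)
```

The NaN check can give false positives if the reference table itself contains empty values, which is one reason to prefer the indicator column.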

  • @adnankattekaden7568
    @adnankattekaden7568 3 years ago

    Can you create a Telegram group to discuss more about web scraping & Python?

  • @RS-Amsterdam
    @RS-Amsterdam 3 years ago +1

    21:28: how do you get the output colorized?

    • @bhavik15
      @bhavik15 3 years ago +1

      Rainbow CSV addon

    • @RS-Amsterdam
      @RS-Amsterdam 3 years ago +1

      @bhavik15 Thank you kindly.
      Installed and it works ;-)

  • @alexlijesen6197
    @alexlijesen6197 3 years ago +1

    Hello John,
    We read in supplier invoices for customers. Only with the invoices from a number of suppliers do we have problems reading them in. We have tried sep='\t', but with no result.
    We now first open the file in Excel and save it to CSV, which changes it to sep=';', and then we read that in.
    What are we doing wrong when reading this CSV format?
    Greetings, Alex

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago

      Hi Alex. Hard to say without seeing the file and how it reads in. What file type is the first invoice? CSV, XLSX or something else?

    • @alexlijesen6197
      @alexlijesen6197 3 years ago +1

      @JohnWatsonRooney Hello John,
      can I send you the file?

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago

      Sure, my email is on my main YouTube page.

    • @Arvinth14
      @Arvinth14 3 years ago

      Maybe the code below will work:
      dataframe_name = pd.read_csv('filename.extension', delimiter='\t')

    • @alexlijesen6197
      @alexlijesen6197 3 years ago +1

      @Arvinth14 Thanks for your suggestion, unfortunately that doesn't work either. What I find strange is that if I first split it into columns in Excel and then write it to CSV, it works fine. This only happens with the Dutch supplier; with CSVs from the web I have no problems.
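The thread never establishes what was wrong with these files, so the following is only one thing worth checking, not a diagnosis: which delimiter the file really uses, and whether numbers use a comma as the decimal mark, as is common in Dutch-locale exports. The helper `guess_delimiter` is hypothetical and simply counts candidate characters in the header line; the sample data is made up.

```python
import io

import pandas as pd


def guess_delimiter(first_line, candidates=(",", ";", "\t", "|")):
    """Return the candidate character that occurs most often in the header line."""
    return max(candidates, key=first_line.count)


# Made-up sample: ';' between fields and ',' as the decimal mark.
raw = "factuur;leverancier;bedrag\n1001;Jansen BV;12,50\n1002;De Vries;7,25\n"

sep = guess_delimiter(raw.splitlines()[0])

# decimal="," tells pandas to read 12,50 as the number 12.5.
df = pd.read_csv(io.StringIO(raw), sep=sep, decimal=",")

print(repr(sep))
print(df.dtypes)
```

If no delimiter gives sensible columns, the file's text encoding is another thing to look at, via the `encoding` argument of `pd.read_csv`.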

  • @aperxmim
    @aperxmim 1 month ago

    Could you please provide the sample file?

  • @LaoshiBhel
    @LaoshiBhel 1 year ago

    How can I contact you?

  • @mrnargil
    @mrnargil 3 years ago +1

    Your content is amazing, you need to work on your thumbnails to get more impressions & clicks.

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago

      Thanks, I've tried a few different types of thumbnail. If you have some good examples, can you link them to me?

    • @mrnargil
      @mrnargil 2 years ago

      @JohnWatsonRooney I just found out YouTube deleted my comment because I linked to a thumbnail photo 😑

  • @irfanshaikh262
    @irfanshaikh262 1 year ago

    21:12 you don't say John😊

  • @mandarraut9565
    @mandarraut9565 3 years ago +1

    Hi John,
    Thank you for all your videos. They really help.
    I have one request: can you help us with how to scrape a dynamic website? I mean a website which changes its query string parameters on pagination.
    If you want, I can share the link with you.
    Just drop me your email ID.
    Thanks. I am stuck and not able to understand how to scrape that website.

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago +1

      Hey, glad you enjoy the videos. My email is on my main YouTube page. If you want to drop me a message, I will have a look when I get a chance.

    • @mandarraut9565
      @mandarraut9565 3 years ago

      Thank you for taking the time to respond to my comment. I have dropped you an email regarding the website; kindly have a look whenever it is convenient, and please keep making amazing, knowledgeable content :D

  • @SkySesshomaru
    @SkySesshomaru 2 years ago +1

    Yo...

  • @gutv
    @gutv 3 years ago +1

    Why is my comment being deleted?

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago +1

      Was it a link? I haven't deleted any comments from this video, it must be YouTube.

    • @gutv
      @gutv 3 years ago +1

      @JohnWatsonRooney Ohh, there was a link indeed! I didn't know YouTube doesn't allow it. I had asked if you could help me solve a scraping issue. I'm trying to scrape a supermarket webpage (carrefouruae) to get the name and the price of a product, but because the data is rendered through JavaScript and my script is running inside a container where it doesn't have a browser, I don't know which library I can use to scrape. Thank you in advance, and thank you for your awesome videos!

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago +1

      You can render the page with requests-html or have a look in the page source code for “next_data” you might be able to get something useful from the script tag there
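A sketch of the second suggestion in the reply above, under some assumptions. Sites built with Next.js embed their page data as JSON in a script tag whose id is `__NEXT_DATA__`, which is presumably what "next_data" refers to; whether the site in question does this is not confirmed in the thread. The HTML and the keys under `props` below are made up, and a regular expression stands in for a proper HTML parser to keep the example dependency-free.

```python
import json
import re

# Made-up page source; real pages are much larger.
html = """
<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"product": {"name": "Milk 1L", "price": 5.25}}}}
</script>
</body></html>
"""

# Grab the text between the opening and closing tags of that one script element.
match = re.search(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', html, re.DOTALL
)

# json.loads turns the embedded text into ordinary dicts and lists.
data = json.loads(match.group(1)) if match else None
product = data["props"]["pageProps"]["product"] if data else None

print(product)
```

No browser is needed for this step, since the JSON is already in the page source. It does not help if the server refuses the request outright.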

    • @gutv
      @gutv 3 years ago

      @JohnWatsonRooney Thanks for the reply! I've tried and I got "Access Denied" :(

  • @rayearth9760
    @rayearth9760 1 year ago +1

    The video title says "Automate Excel Work..." but its content is all about CSV data. This manipulation is disappointing.

    • @DevvratSingh007
      @DevvratSingh007 1 month ago

      No wonder it's called “data manipulation” ;)

  • @rohitguleria100
    @rohitguleria100 1 month ago

    Hi Johnny 😊.
    I recently got a job as a data analyst, but the work is quite boring: copy-pasting in Excel...
    So I want to know, is there any tool like ChatGPT which can do this work for me?