Combine Data from Multiple PDF Files into a Single Excel File
- Published on 3 Jul 2024
- Are you struggling to combine data from multiple PDF files into a single file while maintaining formatting? In this video, I will guide you through a step-by-step tutorial on how to combine PDF files in bulk using Excel. The manual copying and pasting process often results in messy data formatting. Learn how to efficiently extract and combine data from multiple PDFs into a clean Excel sheet using Power Query.
===== ONLINE COURSES =====
✔️ M Language Power Query -
goodly.co.in/learn-m-powerquery/
✔️ Mastering DAX in Power BI -
goodly.co.in/learn-dax-powerbi/
✔️ Power Query Course-
goodly.co.in/learn-power-query/
✔️ Master Excel Step by Step-
goodly.co.in/learn-excel/
===== LINKS 🔗 =====
Blog 📰 - www.goodly.co.in/blog/
Corporate Training 👨🏫 - www.goodly.co.in/training/
Need my help on a Project 💻- www.goodly.co.in/consulting/
Download File ⬇️ - goodly.co.in/combine-data-fro...
===== CONTACT 🌐 =====
Twitter - / chandeep2786
LinkedIn - / chandeepchhabra
Email - goodly.wordpress@gmail.com
===== WHO AM I? =====
A lot of people think that my name is Goodly, it's NOT ;)
My name is Chandeep. Goodly is my full-time venture where I share what I learn about Excel and Power BI.
Please browse around, you'd find a ton of interesting videos that I have created :) Cheers! - Science & Technology
Download the file ⬇ - goodly.co.in/combine-data-from-multiple-pdf-files-excel
Tackle even the most challenging data-cleaning problems. Check out the M Language course and push beyond the user interface ↗ - rb.gy/a2zsnn
Thank you so much for such a great video! I searched Google for a solution to this problem but didn't find a good one. I am happy I watched your video, which solved my problem.
awesome! and with good practices
Great video. Thanks !
This is super helpful! Especially for those working in an auditing background! great content!
Thanks! Very useful and perfect presentation 👏👏👏
Amazing Chandeep!!
Power Query Master. Deep explanations, going into detail of the matter. Outstanding explanation. Thanks Chandeep.
This is so useful thanks
Thank you very much for the useful techniques.
Thank you brother. This was the only tutorial that worked for me
Excellent 🎉...
Thanks for the awesome video.
As in your example, I just have to transpose each table before merging them.
Bravo 👍👍
Excellent, Sir. Please do more videos on List functions.
wonderful 🌹🌹
Great job! A next step could be to clean up the column names, in case they are written a little differently or sometimes contain stray white space.
You can also do it in the last step without adding new Rename steps. As you can see in the hard-coded version, there are two lists: one that must match the column names in the data source, and a second with the new names you want to rename them to.
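The two-list rename described in this reply can be sketched outside Power Query. The Python below is only an illustration of the idea, not the video's M code: the function name `rename_columns` and the sample headers are hypothetical. In M, the equivalent step would pass pairs of old and new names to Table.RenameColumns.

```python
def rename_columns(rows, old_names, new_names):
    """Rename the keys of each row using two parallel lists of names."""
    if len(old_names) != len(new_names):
        raise ValueError("old_names and new_names must have the same length")
    mapping = dict(zip(old_names, new_names))

    def clean(name):
        # Collapse stray white space so " Emp  Name " matches "Emp Name".
        return " ".join(name.split())

    # Columns that are not in the mapping keep their cleaned-up name.
    return [
        {mapping.get(clean(key), clean(key)): value for key, value in row.items()}
        for row in rows
    ]


# Hypothetical sample data with untidy headers.
rows = [{" Emp  Name ": "Asha", "Amt": 100}]
renamed = rename_columns(rows, ["Emp Name", "Amt"], ["Employee", "Amount"])
print(renamed)
```

Cleaning the header before the lookup also covers the white-space case raised in the comment above.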
Please make a video on how we can bring in multiple bank statements with different format to a single power bi report.
Thank you so much! I get PO copies in PDF format ... There are different sections on the PO like Supplier address/buyer address, PO # section / PO issued date / Section for LE name of the Business unit from where the PO was issued, and then the Milestone description with the amount that needs to be billed once milestone is completed.
How can I put them on a table from different sections?
Wow, very informative content, you explained them very well.
Just curious, is this applicable if your pdf file is a scanned doc / form?
Thanks.
Sandeep, your videos are really deep, simple and practical, detailing all the steps from zero to the end. Can you show how to combine PDF files that are password protected, where the password is known to the user? One way is to open those files, print them as PDFs and then store them in the folder, which is cumbersome. Is there a shortcut?
Can you create a video on Copilot, like ChatGPT (e.g. get sales for year X), in Power BI?
Thanks Chandeep! Pls advise how to do the import such that each row of PDF becomes one cell in power query.
Sandeep, can you make a video on combining password-protected JSON files into Excel? It would be very helpful. Regards
How do I import the data cleanly if each PDF has a different number of pages and an additional table? Thank you.
Thanks as usual. But can you give us an example where we need to remove some rows of data from the PDF? Also, every page has the name and ID of each employee, and we need to add both of them as columns.
Great video... Question though... What if Power Query does not read the PDF in a workable format? I have an invoice PDF where, when I import it into PQ, the columns get jumbled up and I am not able to clean the data for reconciliation. I have not used the method from this video yet, but I will. Any thoughts other than using third-party apps? Thank you!
Hard to say. Look for some kind of pattern that you can use to split text, replace values, etc.
I've used PQ to import PDF data a lot. One thing I found is that the "Print to PDF" printer built into Chrome-based browsers produces the PDFs that are easiest to work with. I have a full license for Acrobat, but the Adobe PDF printer produces some of the most difficult PDFs to work with. The Microsoft Print to PDF printer isn't much better.
If anyone knows of settings to adjust in these printers to make the PDFs easier to work with, please reply!
Hello, Chandeep!
I have a question: doesn't the use of the Table.Combine function (at 6:07) have the same effect as using the Table.ColumnNames + Table.ExpandTableColumn (that you showed next)? It seems like the same result and it would be simpler, but I don't know if I am missing something here.
Thanks! Your videos are great!
I have a PDF challan for a TDS deposit. I am trying to combine multiple challans but I am not able to do it.
I have a pdf, which has tables side by side instead of one below other, how can I combine this?
Eg: table 1, table 2, table 3
Table 4, table 5,
I have a PDF file with a total of 50 pages (page001 to page050). Each page has the same column structure with different records, but every page also contains a header and a footer. I have to remove all those header and footer rows, plus some unwanted columns, before I can combine all 50 pages into one table.
I created a function in Power Query to repeat the cleaning process for all 50 pages, but some pages are detected with fewer or more columns than the rest, even though they all have the same structure when viewed in a PDF reader. How do I deal with that?
Ahh....been facing a similar problem here, different number of columns detected even though they looked the same in the PDF reader app. Any solution would be appreciated
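One defensive pattern for pages that come back with differing column counts is to combine by column name rather than by position, and to list the pages whose columns differ so they can be inspected. The Python below is a minimal sketch of that idea under the assumption that headers have already been promoted. The names `combine_pages`, `pages_with_odd_columns` and the sample data are hypothetical, and it does not repair a column that the PDF connector has split or merged.

```python
def combine_pages(pages):
    """Append pages by column name, filling missing cells with None."""
    columns = []
    for page in pages:
        for row in page:
            for name in row:
                if name not in columns:
                    columns.append(name)  # keep first-seen column order
    combined = [
        {name: row.get(name) for name in columns}
        for page in pages
        for row in page
    ]
    return columns, combined


def pages_with_odd_columns(pages):
    """Return indexes of pages whose columns differ from the first page."""
    expected = set(pages[0][0]) if pages and pages[0] else set()
    return [
        index
        for index, page in enumerate(pages)
        if any(set(row) != expected for row in page)
    ]


# Hypothetical pages: the second one was detected with an extra column.
page1 = [{"Date": "01-Jan", "Amount": 10}]
page2 = [{"Date": "02-Jan", "Amount": 20, "Column3": ""}]

columns, combined = combine_pages([page1, page2])
odd = pages_with_odd_columns([page1, page2])
print(columns, odd)
```

Knowing which pages deviate makes it easier to decide whether an extra column is empty noise that can be dropped or real data that was split.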
How to convert bank statement pdf to excel
I have invoice pdfs, 30 of them each month, with multiple tables and scattered data i tried a lot of manipulation but wasn't able to get the desired output
@goodly I have a PDF file in which the data sits right below the column headers, and there are 300 pages in that file. How am I going to get the data? Please make a video or suggest an alternative.
I had nested tables after grouping data. All the tables had the same number of rows (4). I needed one of the columns to be replaced with a fixed list of names (departments, for example). I just could not get it to work. As a workaround, I added an Index column (1, 2, 3, 4) to the tables, expanded them and then did a Merge with the external list of department names in another query. So the process was:
1. Create the departments in an Excel sheet and make it a table.
2. Import this into a query.
3. Import the other data from a table in Excel, group the data into nested tables, add an Index column to these tables and expand.
4. Create a third query to merge 2 and 3.
5. I am sure I should be able to get that list into the nested tables the same way I inserted the Index column there.
Please help.
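The idea in point 5, placing a fixed list directly into each nested table by row position instead of going through an Index column and a Merge, can be sketched as follows. This Python is an illustration only, not a tested Power Query solution: `add_fixed_column`, the group keys and the department names are hypothetical. In M, one possible route is to rebuild each nested table from its columns plus the list (Table.ToColumns and Table.FromColumns), which is likewise an untested suggestion.

```python
def add_fixed_column(groups, column_name, values):
    """Set column_name in every nested table from values, matched by row position."""
    result = {}
    for key, rows in groups.items():
        if len(rows) != len(values):
            raise ValueError(f"group {key!r} has {len(rows)} rows, expected {len(values)}")
        # An existing column with the same name is overwritten; otherwise it is added.
        result[key] = [
            {**row, column_name: value} for row, value in zip(rows, values)
        ]
    return result


# Hypothetical fixed list and grouped data (4 rows per nested table).
departments = ["Sales", "Finance", "HR", "IT"]
groups = {
    "2023": [{"Dept": "?", "Cost": c} for c in (10, 20, 30, 40)],
    "2024": [{"Dept": "?", "Cost": c} for c in (11, 21, 31, 41)],
}

filled = add_fixed_column(groups, "Dept", departments)
print(filled["2024"][1])
```

The length check matters because a positional match silently misaligns names if one nested table has a different number of rows.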
What if only one PDF has column names and the others don't?
Bro, in previous videos, instead of adding new columns and deleting them afterwards, you used Table.TransformColumns, which I liked a lot, and I now use that practice thanks to you. Is there a reason why you added new columns this time? (Is it faster, more effective, etc.?)
Just easier to explain 😉😜
@@GoodlyChandeep :D sometimes we are all lazy :D haha
Is there any possibility to get data from multiple PDFs which are password protected. If yes then could you please make a video on this?
Noice!
Permanent off-world relocation
Please, I need help: when I import data from a PDF, some non-English words in the table show up as "ϱΩϳϣΣϟΩϭόγϣΩϣΣϣϱΩϋ". How can I solve it?
HeLLo Goodly/All,
Would be wonderful if Someone confirms:
I was practicing along with Goodly. In the situation in the video, it seemed the following 2 are producing the same result.
✓ Table.ExpandTableColumn
&
✓ Table.Combine
Is My understanding correct?
Thank You!
No. Table.Combine keeps only the contents of the nested tables, so the other columns of the outer table are dropped. Table.ExpandTableColumn keeps all columns. Just try it and you will see the difference.
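The difference described in this reply can be mimicked with plain Python lists and dictionaries. This is a rough analogy rather than M code: the helper names `combine_nested` and `expand_nested`, the file names and the sample rows are all hypothetical.

```python
# An "outer" table: one row per file, with a nested table in the Data column.
files = [
    {"Name": "jan.pdf", "Data": [{"Item": "A", "Qty": 1}, {"Item": "B", "Qty": 2}]},
    {"Name": "feb.pdf", "Data": [{"Item": "C", "Qty": 3}]},
]


def combine_nested(outer, table_column):
    """Like combining the nested tables: only their rows survive."""
    return [dict(row) for outer_row in outer for row in outer_row[table_column]]


def expand_nested(outer, table_column):
    """Like expanding the table column: outer columns are repeated on every row."""
    expanded = []
    for outer_row in outer:
        kept = {k: v for k, v in outer_row.items() if k != table_column}
        for row in outer_row[table_column]:
            expanded.append({**kept, **row})
    return expanded


combined = combine_nested(files, "Data")
expanded = expand_nested(files, "Data")
print(combined[0])
print(expanded[0])
```

If the outer table has no other column worth keeping, the two results look the same, which may be why they appeared identical while following along.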