Thank you, your explanation helped me a lot!!!🤩
Glad it helped! :)
Thank you, well explained in simple terms. Easy to understand 👍🏻
Glad you liked it :)
Thank you, this was useful to me. I need to extract text from multiple PDFs and write it to an Excel file. Can you please help me with this?
How can I extract data from multiple PDFs and read multiple text files into one Excel file? I'd appreciate your help, thank you.
Dear Ronald,
To read multiple files, keep all your files in the same directory. Then get the list of files in that directory using the Directory.GetFiles("Your Directory Path") function, and loop the whole process over each file.
I would recommend checking Susana's answer at the following URL for more clarity:
forum.uipath.com/t/read-all-pdf-files-from-folder/14799
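For readers who want to see the "list the folder, then loop" idea outside of UiPath, here is a minimal Python sketch of the same logic. It is not the UiPath workflow itself: the function names and the `extract` callback are hypothetical, and the sketch assumes all PDFs sit directly in one folder.

```python
from pathlib import Path


def list_pdf_files(folder):
    """Return the PDF files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def process_folder(folder, extract):
    """Run `extract` (a hypothetical per-file function) on every PDF in `folder`."""
    return [extract(pdf) for pdf in list_pdf_files(folder)]
```

The point of the sketch is only that the per-file extraction logic stays unchanged and is wrapped in a loop over the file list.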
Hello! I get the following error while trying to debug the process: "Add Data Row: Object reference not set to an instance of an object." What should I do? I've put the same thing in "ArrayRow".
But if we have multiple PDF files, can we use this method? Please help me. I have multiple files, and I want to extract data where the PDFs have both structured and unstructured data.
With multiple pages, how do I get the text matches across the whole document?
Thanks for the detail, it's very useful. Now I have a case where I need to run this on multiple PDF files in one folder. What should I do? Thanks.
@myrpa 3: We are glad that you like it. To read all the PDF files, you need the help of a directory listing and a loop in UiPath. We recommend you check the following link:
jd-bots.com/2021/04/30/get-all-files-in-a-directory-or-folder-using-uipath-studio/
@AakarsoftTechnologies thanks for this reference, but I'm still having trouble applying the case in this video to loop over the PDFs in the same folder.
I have many PDFs, but only the first row appears; the other data does not appear in its rows.
How do I go to the next row in Excel if we have more than one invoice?
We recommend first adding all rows to the DataTable by looping the AddDataRow activity. After that, pass the DataTable instance to the WriteRange activity.
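The "collect every row first, write once" pattern can be sketched in Python as below. This is only an illustration of the ordering, not UiPath code: the `name` and `number` keys and both function names are hypothetical, and a CSV writer stands in for the Excel output.

```python
import csv
import io


def build_rows(invoices):
    """Collect one row per invoice before anything is written."""
    rows = []
    for inv in invoices:
        # Each invoice contributes exactly one new row to the table.
        rows.append([inv["name"], inv["number"]])
    return rows


def write_table(rows, header, out):
    """Write the header and all collected rows in a single pass."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
```

Writing inside the loop to the same starting cell would overwrite the previous row each time, which is one possible reason only a single row shows up.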
Hi, thanks for the clear explanation. Can you explain how to extract multiple words for a single field? E.g., the address here contains 3 words (separated by 2 spaces); using \w brings up only the first part.
Hi. Please check the following regex. Hope this helps:
regexstorm.net/tester?p=%28%3f%3c%3dAddress%3a%5cs%29.%2b&i=Address%3a+B-16%2f102+Jaydeep+Apartment%0d%0aMira+Road+East
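Decoded, the pattern in the link above is `(?<=Address:\s).+`, tested against the text "Address: B-16/102 Jaydeep Apartment" followed by "Mira Road East" on the next line. A small Python sketch of the same pattern is shown below; the function name is hypothetical, and Python's regex engine is used here only as a stand-in for the .NET one.

```python
import re

# Lookbehind for "Address:" plus one whitespace character, then the rest of the line.
ADDRESS_PATTERN = re.compile(r"(?<=Address:\s).+")


def extract_address(text):
    """Return everything after "Address: " on that line, or "" if absent."""
    match = ADDRESS_PATTERN.search(text)
    # strip() drops a trailing carriage return left over from Windows line endings.
    return match.group(0).strip() if match else ""
```

Because `.` does not cross a newline, this captures only the first address line; `\w+` in the same position would stop at the first non-word character.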
Instead of the required output, I'm getting (System.Linq.Enumerable+d__97`1[System.Text.RegularExpressions.Match]). How do I rectify it?
Dear Ramshiva,
Without knowing all the details, it is difficult for us to pinpoint the problem. Still, we recommend you check the following post:
forum.uipath.com/t/regex-output-in-matches-box/964/7
@AakarsoftTechnologies I need to extract the data from the PDF to Excel. I have completed the workflow design with no errors, but after execution, when I open the Excel file, the output under the header (Name) shows as (System.Linq.Enumerable+d__97`1[System.Text.RegularExpressions.Match]). How do I rectify this? Please suggest.
Dear Ramshiva,
Please visit the previously shared forum link. People there have tried to answer and offered some solutions. Hope you find one that works.
Hi Ram, use variable(0) to get the text, e.g. invoiceNumber(0). You might be missing the (0). Please check.
Same scenario, but how do I extract specific data if I have 10 PDF files?
Use the For Each File in Folder activity first.
What should I type in DataRow?
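The symptom described here (a type name in the cell instead of the value) is what happens when the whole collection of matches is converted to text rather than one match from it. A rough Python analogue is sketched below; it is not the .NET behaviour itself, and the function names and sample pattern are made up for illustration.

```python
import re


def collection_as_text(pattern, text):
    """What gets written when the match collection itself is turned into text."""
    return str(re.finditer(pattern, text))


def first_match_value(pattern, text):
    """Text of the first match; assumes at least one match exists."""
    return next(re.finditer(pattern, text)).group(0)
```

Taking the first element, as `invoiceNumber(0)` does in the reply above, corresponds to `first_match_value` here.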
In this particular tutorial, you need not add anything to DataRow, as we are handling all the data in a string array and passing that instead. Just for your information, you need to pass the DataRow object only if one is available.
But without filling in that DataRow I can't run the file, so please tell me what I should type in that particular column.
Please check that ArrayRow is passed to the DataTable in the correct format, as DataRow is optional.
I have a scenario where a PDF invoice sometimes does not carry a value for some fields, and in that case I get the error message 'object reference not set to an instance'. I would like a solution from your end to avoid this error, so that the output is blank when the field is empty and the actual value when it exists on the invoice.
Example: If the Purchase Order number field is blank on the invoice, then the output should be blank in Excel.
Hi,
As per our understanding, you are getting this error because your variable holds a NULL value. We suggest checking the variable for NULL after extracting the data with the regular expression. If the variable is NULL, assign it an empty string (e.g. var="";) and then write it to the data table.
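The null check described above can be sketched in Python as follows. This is an illustration of the idea rather than UiPath code: the function names, field labels, and patterns are hypothetical, and Python's `None` plays the role of the NULL value.

```python
import re


def field_or_blank(pattern, text):
    """Return the matched text, or "" when the field is missing."""
    match = re.search(pattern, text)
    # A missing field yields None; replace it with "" before building the row.
    return match.group(0).strip() if match is not None else ""


def build_row(text, patterns):
    """Build one output row with a blank cell for every field that did not match."""
    return [field_or_blank(p, text) for p in patterns]
```

With this shape, an invoice without a Purchase Order number still produces a complete row, just with an empty cell in that column.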