Building & Training Custom ML Models for Document Processing | RPA | UiPath

Lahiru Fernando

6,052 views

Published on 1 Oct 2024

Comments • 53

@petchiparamasivam6859 2 years ago ⁺²
Hello Lahiru,
I'm getting the error "Pipeline failed due to ML Package Issue", but the schema and the split file were both created.
@LahiruFernando 2 years ago ⁺²
Hello,
Sorry for my late reply. I was in Goa for the MVP meet.
Did you also do the data export from Data Manager before running the training pipeline?
@petchiparamasivam6859 2 years ago ⁺¹
@LahiruFernando Thanks for replying. Yes Lahiru, I did the same.
@priyankasingh-ik5ux 2 years ago ⁺¹
@LahiruFernando I am also facing the same issue. Can you please help?
@LahiruFernando 2 years ago
@petchiparamasivam6859 OK, so when you created the training pipeline, which folder did you indicate as the dataset? Did you select the Export folder, or a different folder?
Also, one more thing: in the pipeline logs, scroll down to see whether there are any additional details about the error. They should tell you what caused the failure.
@LahiruFernando 2 years ago
@priyankasingh-ik5ux OK, so when you created the training pipeline, which folder did you indicate as the dataset? Did you select the Export folder, or a different folder?
Also, one more thing: in the pipeline logs, scroll down to see whether there are any additional details about the error. They should tell you what caused the failure.
@padmashreesandeep8264 2 years ago ⁺¹
Hey Lahiru,
I have a horizontal table to capture. Should I capture its values as regular fields? I see the option to create column fields, which I use for vertical tables, but I don't see an option to create row fields (for horizontal tables). Please explain.
@LahiruFernando 2 years ago
Hey, I'm so sorry I missed this. Do you still have the issue?
I did not quite understand the exact requirement.
I understand that you have a horizontal table. Can you show me one sample so I have a better idea?
@shankra1970 2 years ago ⁺¹
Hi Lahiru,
I trained an ML model on some PDF invoices and also created some custom labels in Data Manager/Document Manager.
I then exported it with the "All labels" option. I created the pipeline, and after that I also upgraded the version.
The issue is this: when I use it in UiPath Studio DU, in the Machine Learning Extractor, I cannot find the new labels that I created in Data Manager. The configuration shows only the pre-available labels that I used, not the ones I created.
Any idea what I am doing wrong, please?
@LahiruFernando 2 years ago
Hi Shankra,
It seems the AI Center part completed without any issue, so it looks like you haven't pulled the latest capabilities into the workflow.
What you can do is: in the ML Extractor, click the Configure button. There you will find a Get Capabilities button, which retrieves the latest capabilities of your trained model. Have you tried this option?
You can access the configuration option in Extractor Scope -> Configure Extractors -> and click the Settings button of the extractor (located right below the name).
Let me know if this doesn't work for you.
@shankra1970 2 years ago
@LahiruFernando Thanks, I missed that part. Thanks for your response.
@yashobantadash6670 2 years ago ⁺¹
@LahiruFernando Great, bro! I had the same problem. Thanks a lot!
@sowmyarajan3137 2 years ago ⁺¹
Hello Lahiru, thank you so much for sharing the video.
I followed similar steps, but while running the training pipeline I got the following error:
"ImportError: cannot import name 'BaseResponse' from 'werkzeug.wrappers' (/home/aicenter/.local/lib/python3.9/site-packages/werkzeug/wrappers/__init__.py)"
Can you please help me out?
@LahiruFernando 2 years ago
Hi Sowmya,
At which stage did you get this error? During the training pipeline execution? Along with this issue, do you see any other entries in the AI Center logs that relate to the error?
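A note on the error quoted in this thread: to my knowledge, `BaseResponse` was removed from `werkzeug.wrappers` in Werkzeug 2.1, so this ImportError usually means the environment has a newer Werkzeug than the calling code expects. The sketch below is only a local diagnostic for a Python environment you control. The helper names `has_name` and `installed_version` are made up for illustration, and nothing here is part of AI Center, whose pipeline environment may not be user-configurable.

```python
import importlib
from importlib import metadata


def has_name(module_name: str, attr: str) -> bool:
    """Return True if `attr` exists on the importable module `module_name`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attr)


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Hypothetical check: which Werkzeug is installed, and does it still export BaseResponse?
    print("werkzeug version:", installed_version("werkzeug"))
    print("BaseResponse importable:", has_name("werkzeug.wrappers", "BaseResponse"))
```

If the second line prints `False`, the name is not available in that environment, which matches the error text above. The pipeline logs remain the authoritative place to confirm the cause.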
@nirmalsomasundaram7373 2 years ago ⁺¹
Hi Lahiru,
Thanks for the good session. While creating a new pipeline, it shows the status "Waiting for resources". Why is that?
@LahiruFernando 2 years ago
Hello Nirmal, does it stay in the Waiting for Resources status for a very long time? If so, check whether you have enough AI licenses to run the pipeline. Each pipeline needs at least 1 AI license available (not utilized by models).
Let me know if this doesn't work. We could try to figure it out together.
@thanuthomas7003 2 years ago ⁺¹
Hi Lahiru, for an evaluation run, do we have to import a batch of invoices, mark the fields to extract, and export, the same as we did for the training run?
@LahiruFernando 2 years ago
Hi Thanu,
Yes, you have to do the same thing as you do for training. Once you export, just use the evaluation option when creating the pipeline.
@iamraj9419 3 years ago ⁺¹
Hi Lahiru,
I have multiple invoices in a single PDF file. How do I train the model in this case?
@LahiruFernando 3 years ago ⁺³
Hello Iamraj,
In this case, the best approach is to split the PDF file into multiple PDF files so that each file contains a single document. Then use the split files for the training.
Hope this helps.
@iamraj9419 3 years ago ⁺¹
@LahiruFernando Thanks for the quick response, Lahiru.
Once training is completed, if we feed in a file that contains multiple invoices, can the model extract both invoices?
@LahiruFernando 3 years ago ⁺¹
@iamraj9419 Hi. Yes, in a normal run, if you feed in multiple invoices it will automatically pick them up. Just make sure you have a classification activity in the workflow, and that will do the job :)
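For the splitting step suggested in this thread, the first thing to work out is which pages belong to which invoice. Here is a minimal sketch of that bookkeeping, assuming you already know the page on which each invoice starts. The function name `page_ranges` is hypothetical, and writing the actual PDF files would need a PDF library of your choice, which is not shown.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, first_pages: List[int]) -> List[Tuple[int, int]]:
    """Turn the 1-based first page of each document into inclusive (start, end) ranges."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    starts = sorted(set(first_pages))
    if not starts or starts[0] != 1 or starts[-1] > total_pages:
        raise ValueError("first_pages must start at 1 and stay within the file")
    # Each document ends on the page before the next one starts; the last ends at the final page.
    ends = [s - 1 for s in starts[1:]] + [total_pages]
    return list(zip(starts, ends))


# Example: a 7-page PDF holding three invoices that begin on pages 1, 3 and 6.
print(page_ranges(7, [1, 3, 6]))  # [(1, 2), (3, 5), (6, 7)]
```

Each resulting range would become one training file, so that every file holds a single document as the reply recommends.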
@shivakumarag3226 1 year ago
Hi Lahiru, thanks a lot for such a useful video, you saved me a lot of time! I have PDF files with multiple pages, and each page has different values. In the video I can see the Backwards-compatible export option in Document Understanding. How can we use that for multi-page extraction?
@raymondlee8951 1 year ago
Hi Lahiru,
Can you show us how to extract bank statements from image files?
@Tabvincoholic 2 years ago ⁺¹
Hi Lahiru, thanks a lot for the teachings. Just one doubt: I am training on a single multi-page PDF invoice where the last line on the first page is split, half on that page and half on the next. The model does not keep it on the same line and puts the second half on the next line. Can you share your view on how to deal with this problem?
@LahiruFernando 2 years ago ⁺¹
Hey Vinay, I'm not sure how I missed your comment. Sorry for the late reply.
Yes, I have come across this too. What I did was get more samples of similar invoices so that I could train the model to extract data across the two pages. Another option, if you see these kinds of variations a lot, is to introduce a Form Extractor as a supporting extractor for the ML Extractor, so that if the ML Extractor fails, the other one can still extract the data.
You can of course send this data back for retraining as well.
@Tabvincoholic 2 years ago
@LahiruFernando Thanks
@Tabvincoholic 2 years ago ⁺¹
You've been a huge help.
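The "supporting extractor" idea from this thread can be pictured as a per-field fallback: take the ML result when it has a value, otherwise take the second extractor's value. The sketch below only illustrates that logic in plain Python. It is not UiPath code, and the function and field names (`merge_with_fallback`, `invoice_no`, `total`, `po_number`) are invented for the example. In a real workflow this is configured in the extractor scope rather than written by hand.

```python
from typing import Dict, Optional

Fields = Dict[str, Optional[str]]


def merge_with_fallback(primary: Fields, fallback: Fields) -> Fields:
    """Prefer the primary extractor's value; use the fallback only when it is missing or blank."""
    merged: Fields = {}
    # Keep the primary field order, then append fields only the fallback knows about.
    names = list(primary) + [k for k in fallback if k not in primary]
    for name in names:
        value = primary.get(name)
        if value is None or not value.strip():
            value = fallback.get(name)
        merged[name] = value
    return merged


# Hypothetical results: the ML extractor missed "total", the form-based one found it.
ml_result = {"invoice_no": "INV-001", "total": None}
form_result = {"invoice_no": "INV-001", "total": "120.50", "po_number": "PO-9"}
print(merge_with_fallback(ml_result, form_result))
```

Documents that needed the fallback are good candidates to send back for retraining, as the reply suggests.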
@yashobantadash6670 2 years ago ⁺¹
Best ever videos on ML!! Great job, bro!!
@LahiruFernando 2 years ago ⁺¹
Thanks a lot, bro!!
@hemantbonde6348 1 year ago ⁺¹
Hi Lahiru, I created a custom model in AI Center and everything seemed to be working fine. I then added some more fields to the existing model, did the labelling and export, and created the pipeline, which was also successful. But when I try to update the ML Skill, it fails with the message "MLPackage v#22.10.1 Deployment Failed. Kubernetes operation failed to create deployment Remittance_Skill."
@LahiruFernando 1 year ago
Hey bro,
Yes, this is a problem on the UiPath side. It started appearing yesterday or the day before. We have informed UiPath and they are fixing it. It should be resolved soon. Keep trying from time to time, brother.
@hemantbonde6348 1 year ago ⁺¹
@LahiruFernando Thank you for your prompt reply. Could you please let me know once this issue is resolved or you get any update from UiPath?
@LahiruFernando 1 year ago
@hemantbonde6348 Yes, sure. I can do that.
@hemantbonde6348 1 year ago ⁺¹
@LahiruFernando It seems that this issue is still not resolved. Do you have any update?
@LahiruFernando 1 year ago
@hemantbonde6348 Hi,
I'm actually on vacation, bro, so I didn't get a chance to check it. Can you create a support ticket if it is still not working for you? Ideally it should be fixed by now, because it was affecting all environments, including enterprise.
@yashobantadash6670 2 years ago ⁺¹
Is it a must to add a predefined schema when training a model?
@LahiruFernando 2 years ago ⁺¹
Hey, yep. The schema tells the model which fields it needs to train on and their data types. So the schema is always important, and it is one of the first things we set up during custom model creation.
@yashobantadash6670 2 years ago ⁺¹
@LahiruFernando Thanks a lot, bro! How are you? Heard from you after a long time!
@LahiruFernando 2 years ago ⁺¹
@yashobantadash6670 Yeah bro, I'm doing good. I was a bit unwell and was travelling, so I was away for a couple of days. But now I'm back full time :)
@hemantbonde6348 2 years ago ⁺¹
Hi Lahiru, thanks for sharing the knowledge.
I have a few questions:
1. Is it compulsory to upload a minimum of 10 documents for the initial training?
2. If yes, out of the 10, can we upload several invoices of the same type (with different values), or should all the invoices be of different types?
@LahiruFernando 2 years ago ⁺²
Hi Hemant,
Thank you so much for your thoughts.
Here are my answers to your questions.
1. Yes, it is compulsory to have a minimum of 10 unique documents for the initial training in Data Manager. Otherwise, it will not do the export for the training.
2. Yes. When we say 10 unique documents, they don't need to have entirely different layouts. Basically, what it is looking for is unique values for all the fields.
For example, let's say you have 10 documents and you want to extract one field. It needs 10 unique values for that field to meet the minimum requirement.
When you upload the documents to Data Manager, it checks for duplicates. If you have the same document multiple times, it will not count the repeats. So you can have documents with different values and layouts (mainly, the text on the document has to be different).
Also, one more thing: when you configure the training pipeline, always select 0 as the minor version, as a best practice.
@hemantbonde6348 2 years ago
@LahiruFernando Thanks a lot. It is really helpful.
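Before uploading, it can help to check locally whether a batch is likely to meet the 10-unique-documents minimum described in this thread. The sketch below is a rough pre-check only. How Data Manager actually detects duplicates is not described in the comments, so hashing normalized text is an assumption, and the names `normalize`, `unique_documents`, and `MIN_UNIQUE` are hypothetical. It also assumes you have already extracted each document's text.

```python
import hashlib
from typing import Dict, List

MIN_UNIQUE = 10  # minimum stated in the reply above


def normalize(text: str) -> str:
    """Collapse whitespace and case so trivial differences do not hide a duplicate."""
    return " ".join(text.lower().split())


def unique_documents(texts: Dict[str, str]) -> List[str]:
    """Return names of documents whose normalized text has not been seen before."""
    seen = set()
    unique = []
    for name, text in texts.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(name)
    return unique


# Made-up sample: the third entry repeats the first apart from spacing and case.
docs = {
    "a.pdf": "Invoice 1001 Total 50.00",
    "b.pdf": "Invoice 1002 Total 75.00",
    "a_copy.pdf": "Invoice  1001   total 50.00",
}
kept = unique_documents(docs)
print(kept, "enough for export:", len(kept) >= MIN_UNIQUE)
```

Here only two of the three files count as unique, so this batch would fall short of the minimum.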
@hemantbonde6348 2 years ago ⁺¹
Hi Lahiru,
I'm getting the following error while creating the pipeline:
Exception: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
Please guide me on how to create the pipeline successfully.
@LahiruFernando 2 years ago ⁺¹
Hello @hemantbonde6348
Sure. You get this error when you point to a folder other than the one that you exported from Data Manager.
That's why it says the folder has a different structure than what it expects.
So, when creating the training pipeline, click on the dataset and navigate until you see the export folder with the name that you gave during the export in Data Manager. Select that folder and try again. It should work :)
Also make sure to select 0 as the minor version.
@hemantbonde6348 2 years ago ⁺¹
@LahiruFernando Thank you. It worked!
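If you keep a local copy of the export, a quick check can confirm that you are looking at the export folder itself and not a parent or sibling folder. This is a sketch under assumptions: the file names `schema.json` and `split.csv` are guesses based on the "schema" and "split file" mentioned earlier in the comments, not a confirmed description of the export layout, and `missing_entries` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed file names, inferred from the "schema" and "split file" mentioned in the comments.
ASSUMED_ENTRIES = ("schema.json", "split.csv")


def missing_entries(export_dir: str, expected: Iterable[str] = ASSUMED_ENTRIES) -> List[str]:
    """Return the expected entries that are absent from the given folder."""
    root = Path(export_dir)
    if not root.is_dir():
        raise NotADirectoryError(export_dir)
    return [name for name in expected if not (root / name).exists()]
```

An empty result suggests the folder is the export root. A non-empty one suggests the dataset selection points one level too high or too low, which is the situation the reply describes. Adjust the expected names to whatever your own export actually contains.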