
We're using Google OCR to read PDFs or images that are Loan Estimates. We're defining multiple fields such as loanTerm and loanPurpose,

but we're also labeling multiple checkboxes that can appear on the same page, like loanTypeConventional, loanTypeUSDA, etc.,

and also rateLockNo and rateLockYes.

The problem is that, using the pre-trained foundation models, the AI is not detecting fields correctly. For example, it detects loanTypeUSDA even when it is not present.
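To make the failure mode concrete: the checkboxes form mutually exclusive groups, so at most one label per group should come back for a page. Below is a minimal sketch of a post-processing step that enforces this, assuming the extractor output has already been reduced to `(label, confidence)` pairs. The group mapping, the `resolve_checkboxes` helper, and the 0.5 threshold are hypothetical illustrations, not part of any Google API.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical grouping of the mutually exclusive checkbox labels
# named in the question; extend with the remaining labels as needed.
EXCLUSIVE_GROUPS: Dict[str, List[str]] = {
    "loanType": ["loanTypeConventional", "loanTypeUSDA"],
    "rateLock": ["rateLockNo", "rateLockYes"],
}


def resolve_checkboxes(
    entities: Iterable[Tuple[str, float]],
    min_confidence: float = 0.5,  # assumed cut-off, tune on real data
) -> Dict[str, str]:
    """Keep only the highest-confidence label in each exclusive group."""
    label_to_group = {
        label: group
        for group, labels in EXCLUSIVE_GROUPS.items()
        for label in labels
    }
    best: Dict[str, Tuple[str, float]] = {}
    for label, confidence in entities:
        group = label_to_group.get(label)
        # Ignore non-checkbox fields and low-confidence detections.
        if group is None or confidence < min_confidence:
            continue
        if group not in best or confidence > best[group][1]:
            best[group] = (label, confidence)
    return {group: label for group, (label, _) in best.items()}


# Example input with made-up confidence values.
sample = [
    ("loanTypeConventional", 0.91),
    ("loanTypeUSDA", 0.62),
    ("rateLockNo", 0.40),
    ("loanTerm", 0.99),
]
resolved = resolve_checkboxes(sample)
```

This only masks false positives after extraction; it does not address why the model reports an unchecked box in the first place.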

I already have multiple PDFs in the dataset, with at least 10 documents for each label I defined,

and 41 labels defined in total,

but the OCR still fails to process things as simple as checkboxes. What am I doing wrong?

Previously we were using Eve AI (formerly known as Butler AI), and it works much better even with fewer examples (just 40-50), whereas Google OCR is painful and hard to set up.

Any recommendations? Has anyone gotten Loan Estimates processed with this OCR?

  • In my own testing, I've found recently that using Gemini to extract entities from documents is more accurate than using a DocumentAI extractor. If it suits your use case, you may wish to give it a try. Commented Jul 18, 2024 at 13:48
  • @LeoGlowacki That's odd. You'd think Google would make Gemini's entity extractor model available as an option in Document AI. Commented Sep 6, 2024 at 1:30
  • @TheAddonDepot I agree. I am wondering if we will see this as a release soon. Edit: There have been a few releases since my prior comment. pretrained-ocr-v2.1-2024-08-07 is supposed to have better checkbox detection, though I cannot say whether custom extraction has improved in this regard. Commented Sep 6, 2024 at 17:07
