All Questions

0 votes
0 answers
40 views

Document AI not returning tax_rate

I am trying to use the Invoice Parser Processor of Document AI. The problem is that I am not getting the tax_rate field in the response when I use the API in my code. However, when I upload the same ...
Andrés Acerta
0 votes
0 answers
2k views

Document AI - Processor location issue [duplicate]

I'm using a Mac and I have created a simple Document AI processor on the Google Cloud Platform (PDF splitter). This processor was trained, tested and deployed. I'm now desperately trying to make use ...
AlexCT
  • 35
0 votes
0 answers
105 views

DocumentAI OCR Error: Invalid Document Content

I am calling DocumentAI OCR batch processing from Workflows, generally quite successfully; however, I occasionally get the following error: { "caughtError": { "message": "...
Leo Glowacki
0 votes
1 answer
893 views

How to Batch Process Long Documents Exceeding the Google Document AI Page Limit?

I'm working with Google Document AI to process long documents, where the number of pages exceeds the processor limit (~8k pages). The current documented page limit for Enterprise OCR is 500 pages for ...
Leo Glowacki
0 votes
0 answers
47 views

How can I run more than two threads to parse multiple documents with DocumentProcessorServiceAsyncClient (Python)?

The code works as is, but only with two threads; if I add another one, the process stops and then times out. I don't know if DocumentProcessorServiceAsyncClient has a limit of two ...
Jeison Jose Bolano Pabon
0 votes
1 answer
103 views

GCP API for AI Documents

I'm having issues with the API: there is no response whatsoever. I have created the service account with the corresponding API key and its JSON file; however, I cannot seem to get any response when ...
Keagan Gilmore
1 vote
1 answer
569 views

Document AI "400 No valid schema provided for processing" with Cloud Function

I’ve been experiencing an issue with the Google Cloud Document AI API in my Firebase Cloud Function that handles documents uploaded to Google Cloud Storage. The function triggers correctly upon PDF ...
HaZeust
  • 13
1 vote
1 answer
211 views

Using Document AI batch processing inside a Google Cloud Function

I have a scenario where I am uploading a local file to a Cloud Storage bucket, triggering a Cloud Function (xyz). Within this Cloud Function, I am performing a batch processing task using Google Cloud ...
Manish gupta
0 votes
1 answer
308 views

google.api_core.exceptions.InvalidArgument: 400 The resource projects/{my-proj-id}/locations/eu is not located in us

I am trying to use the Google DocAI Warehouse sample Python code, and it looks like the location parameter is always ignored and the code just assumes the 'us' location. My prototype project has 'eu' as ...
caoimhinmacg
0 votes
1 answer
183 views

How do I iterate through JSON files stored in different folders of a GCP bucket? Example: Bucket/Dict/Folder2/file.json, Bucket/Dict/Folder1/file.json

I have dumped JSON files from DocAI to GCP, but each file is stored in its own folder, although they are all in the same bucket on Cloud Storage. I am not able to iterate through the JSON files stored ...
Vedant Patil
0 votes
1 answer
1k views

How to locally process a batch of files using Document AI with the Python client?

I'm trying to use the Document OCR processor from the Python console to locally process a large number of PDF documents (native and scanned) to extract the text and some metadata. The documents are ...
Vojta Partík
0 votes
1 answer
4k views

Extract tables from PDF with structure maintained for feeding into LLMs

I am trying to feed an LLM, specifically Vertex AI from Google, context from a PDF. Generally, GCP Document AI can do OCR to get text from the PDF, and I pass that text on to the LLM as context ...
Sarthak Pan
2 votes
1 answer
118 views

Is there a solution to select the first and the last character of certain regex patterns?

There is a very long text in xml format like: ><span class='ocrx_word' id='word_1_21_0_1_0' title='bbox 409 912 417 927'><</span><span class='ocrx_word' id='word_1_21_0_1_1' title=...
kang
  • 23
1 vote
1 answer
310 views

Original File Name - GCP - Document AI

I'm using Document AI to perform OCR on several thousand PDF documents with its Python client. I'm uploading them into a bucket, batch processing them and a .json output is generated in another ...
Camillo
  • 11
1 vote
1 answer
482 views

How can I extract information from "google.cloud.documentai_v1.types.evaluation.Evaluation"?

I am new to the world of cloud and I am trying to use Document AI from Google, but I am stuck on how to extract information like precision, accuracy and others from a training evaluation. Here is ...
Atilio
  • 115