All Questions
Tagged with cloud-document-ai python
40 questions
0
votes
0
answers
40
views
Document AI not returning tax_rate
I am trying to use the Invoice Parser Processor of Document AI. The problem is that I am not getting the tax_rate field in the response when I use the API in my code. However, when I upload the same ...
0
votes
0
answers
2k
views
Document AI - Processor location issue [duplicate]
I'm using a Mac and I have created a simple Document AI processor on the Google Cloud Platform (PDF splitter). This processor was trained, tested and deployed.
I'm now desperately trying to make use ...
0
votes
0
answers
105
views
DocumentAI OCR Error: Invalid Document Content
I call DocumentAI OCR batch processing from Workflows, generally quite successfully; however, I occasionally get the following error:
{
"caughtError": {
"message": "...
0
votes
1
answer
893
views
How to Batch Process Long Documents Exceeding the Google Document AI Page Limit?
I'm working with Google Document AI to process long documents whose page count (~8k pages) exceeds the processor limit. The current documented page limit for Enterprise OCR is 500 pages for ...
0
votes
0
answers
47
views
How can I run more than two threads to parse multiple documents with DocumentProcessorServiceAsyncClient in Python?
As it stands, the code works, but only with two threads. If I add another one, the process stops and then times out. I don't know if DocumentProcessorServiceAsyncClient will have a limit of two ...
0
votes
1
answer
103
views
GCP API for AI Documents
I'm having issues with the API: there is no response whatsoever. I have created the service account with the corresponding API key and its JSON file; however, I cannot seem to get any response when ...
1
vote
1
answer
569
views
Document AI "400 No valid schema provided for processing" with Cloud Function
I’ve been experiencing an issue with the Google Cloud Document AI API in my Firebase Cloud Function that handles documents uploaded to Google Cloud Storage. The function triggers correctly upon PDF ...
1
vote
1
answer
211
views
Using Document AI batch processing inside a Google Cloud Function
I have a scenario where I am uploading a local file to a Cloud Storage bucket, triggering a Cloud Function (xyz). Within this Cloud Function, I am performing a batch processing task using Google Cloud ...
0
votes
1
answer
308
views
google.api_core.exceptions.InvalidArgument: 400 The resource projects/{my-proj-id}/locations/eu is not located in us
I am trying to use the Google DocAI Warehouse sample Python code, and it looks like the location parameter is always ignored and the 'us' location is always assumed.
My prototype project has 'eu' as ...
0
votes
1
answer
183
views
How do I iterate through JSON files stored in different folders of a GCP bucket? Example: Bucket/Dict/Folder2/file.json, Bucket/Dict/Folder1/file.json
I have dumped JSON files from DocAI to GCP, but each file is stored in its own folder, although they are all in the same bucket on Cloud Storage. I am not able to iterate through the JSON files stored ...
0
votes
1
answer
1k
views
How to locally process a batch of files using Document AI with the Python client?
I'm trying to use the Document OCR processor from the Python console to locally process a large number of PDF documents (native and scanned) to extract the text and some metadata. The documents are ...
0
votes
1
answer
4k
views
Extract tables from a PDF with structure maintained for feeding into LLMs
I am trying to feed context from a PDF into an LLM, more specifically Vertex AI from Google. Generally, GCP Document AI can do OCR to get text from the PDF, and I pass that text on to the LLM as context ...
2
votes
1
answer
118
views
Is there a solution to select the first and the last character of certain regex patterns?
There is a very long text in xml format like:
><span class='ocrx_word' id='word_1_21_0_1_0' title='bbox 409 912 417 927'><</span><span class='ocrx_word' id='word_1_21_0_1_1' title=...
1
vote
1
answer
310
views
Original File Name - GCP - Document AI
I'm using Document AI to perform OCR on several thousand PDF documents with the Python client.
I'm uploading them into a bucket and batch processing them, and a .json output is generated in another ...
1
vote
1
answer
482
views
How can I extract information from "google.cloud.documentai_v1.types.evaluation.Evaluation"
I am new to the world of cloud and I am trying to use Google's Document AI, but I am stuck on how to extract information such as precision, accuracy, and other metrics from a training evaluation. Here is ...