Newest 'pdf-scraping+text-extraction+python' Questions

0 votes

0 answers

471 views

Extract only the body text of a PDF (not bullet points, headings, or subheadings) using the Python pdfplumber library

Code:
import pdfplumber
ecdata = ""
with pdfplumber.open("XYZ Transcript.pdf") as pdf:
    for i in range(len(pdf.pages)):
        print("Page No.: ", i+1) ...
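One possible way to approach this question, sketched under assumptions rather than taken from the asker's code: treat the most frequent font size on a page as the body size, then drop lines set in a different size or starting with a bullet glyph. The function name `body_text_lines`, the `BULLETS` tuple, and the tolerance value are hypothetical. The input is assumed to be a list of character dicts with `text`, `size`, `top`, and `x0` keys, which is the shape pdfplumber exposes as `page.chars`.

```python
from collections import Counter, defaultdict

# Hypothetical set of glyphs treated as bullet markers; extend as needed.
BULLETS = ("•", "▪", "◦", "‣")


def body_text_lines(chars, tolerance=0.5):
    """Return lines set in the dominant font size, skipping bulleted lines."""
    # Group characters into lines by rounded vertical position (a crude
    # heuristic: real PDFs may need a small tolerance on "top" instead).
    lines = defaultdict(list)
    for ch in chars:
        lines[round(ch["top"])].append(ch)

    # Assumption: the most frequent font size is the body text size.
    sizes = Counter(round(ch["size"], 1) for ch in chars if ch["text"].strip())
    if not sizes:
        return []
    body_size = sizes.most_common(1)[0][0]

    kept = []
    for top in sorted(lines):
        line = sorted(lines[top], key=lambda ch: ch["x0"])
        text = "".join(ch["text"] for ch in line).strip()
        if not text or text.startswith(BULLETS):
            continue
        line_sizes = Counter(
            round(ch["size"], 1) for ch in line if ch["text"].strip()
        )
        line_size = line_sizes.most_common(1)[0][0]
        if abs(line_size - body_size) <= tolerance:
            kept.append(text)
    return kept
```

With pdfplumber this could be called as `body_text_lines(page.chars)` for each page. Headings that share the body font size (for example, bold headings) would slip through, so a check on the `fontname` key may also be needed.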

Kituva Ravindran Praveen

45

asked Aug 12, 2022 at 5:15

2 votes

2 answers

2k views

Python PDFMiner: how to get the orientation of each word or sentence in a PDF?

Target: I want to extract the info on the orientation of each word or sentence from a PDF like the attached one. The reason for this is that I want to keep the text only from the orientation with zero ...
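A minimal sketch of one way to derive orientation, assuming each character or text run comes with a six-element PDF text matrix `(a, b, c, d, e, f)`. In pdfminer.six, `LTChar` objects are understood to carry such a `matrix` attribute, but that should be checked against the installed version. The helper names `rotation_degrees` and `keep_horizontal` are hypothetical.

```python
import math


def rotation_degrees(matrix):
    """Baseline angle in degrees (0 to 360) from a matrix (a, b, c, d, e, f)."""
    a, b = matrix[0], matrix[1]
    # The first column of the matrix is where the x unit vector lands,
    # so atan2(b, a) gives the counter-clockwise rotation of the baseline.
    return math.degrees(math.atan2(b, a)) % 360


def keep_horizontal(items, tolerance=1.0):
    """Keep the text of (text, matrix) pairs rotated by roughly zero degrees."""
    kept = []
    for text, matrix in items:
        angle = rotation_degrees(matrix)
        # Measure distance from 0 on the circle, so 359.9 counts as near 0.
        if min(angle, 360 - angle) <= tolerance:
            kept.append(text)
    return kept
```

For example, `keep_horizontal([("a", (1, 0, 0, 1, 0, 0)), ("b", (0, 1, -1, 0, 0, 0))])` keeps only `"a"`, because the second matrix describes a 90 degree rotation.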

Vagelis

66

asked Sep 24, 2020 at 9:53

1 vote

0 answers

49 views

Trying to extract data from PDFs, make sense of it, and upload it to a database

I've got many PDFs which contain data like name, address, contact info, email IDs and many more details. I am trying to write a program to convert this data into a text file and using different ...
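As a rough illustration of the pipeline described here, the sketch below assumes the PDF text has already been extracted to a string, pulls out email addresses and phone-like numbers with regular expressions, and stores them in SQLite. The patterns are deliberately simple, and the function names, table name, and columns are all hypothetical choices.

```python
import re
import sqlite3

# Simplified patterns; real-world addresses and numbers vary far more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d -]{7,}\d")


def extract_contacts(text):
    """Return the emails and phone-like numbers found in extracted PDF text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


def store_contacts(conn, source, contacts):
    """Insert one row per extracted value, tagged with its source file name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts (source TEXT, kind TEXT, value TEXT)"
    )
    rows = [(source, "email", e) for e in contacts["emails"]]
    rows += [(source, "phone", p) for p in contacts["phones"]]
    # Parameterized inserts avoid quoting problems with arbitrary PDF text.
    conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", rows)
    conn.commit()
```

A typical call would be `store_contacts(sqlite3.connect("contacts.db"), "file.pdf", extract_contacts(text))`, where `text` comes from whichever PDF text extractor is in use. Fields such as names and postal addresses usually need layout-specific rules rather than a generic regex.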

suyash joshi

61

asked Nov 16, 2019 at 7:00

420 votes

13 answers

467k views

Python module for converting PDF to text [closed]

Is there any Python module to convert PDF files into text? I tried one piece of code found on ActiveState which uses pypdf, but the generated text had no spaces between words and was of no use.

cnu

37.3k

asked Aug 25, 2008 at 4:44
