All Questions
4 questions
0 votes, 0 answers, 471 views
Extract only the body text of a PDF, not the bullet points, headings, and subheadings, using the Python pdfplumber library
Code
import pdfplumber
ecdata = ""
with pdfplumber.open("XYZ Transcript.pdf") as pdf:
    for i in range(len(pdf.pages)):
        print("Page No.: ", i+1)
        ...
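One possible heuristic for this kind of filtering, offered as a rough sketch rather than a confirmed solution: treat the most common font size on a page as the body size and drop characters set in any other size. The helper name `body_text` and the `tolerance` parameter are hypothetical. It assumes each character is a dict with `"text"` and `"size"` keys, which is how pdfplumber exposes `page.chars`. Bullet items set in the body font would still pass through and would need an extra rule.

```python
from collections import Counter


def body_text(chars, tolerance=0.5):
    """Keep only characters whose font size matches the most common size.

    `chars` is assumed to be a list of dicts with "text" and "size" keys,
    like the entries of pdfplumber's `page.chars`.
    """
    # Count font sizes over visible (non-whitespace) characters only.
    sizes = Counter(round(c["size"], 1) for c in chars if c["text"].strip())
    if not sizes:
        return ""
    body_size, _count = sizes.most_common(1)[0]
    # Keep characters within `tolerance` points of the dominant size.
    return "".join(
        c["text"] for c in chars if abs(c["size"] - body_size) <= tolerance
    )


# Hypothetical usage with pdfplumber (not executed here):
# with pdfplumber.open("XYZ Transcript.pdf") as pdf:
#     for page in pdf.pages:
#         print(body_text(page.chars))
```

Font size alone will not separate every heading from body text, so a real document may also need checks on font name or position.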
2 votes, 2 answers, 2k views
Python PdfMiner - How to get the info on the orientation of each word/sentence included in a pdf?
Target:
I want to extract the info on the orientation of each word or sentence from a PDF like the attached one. The reason for this is that I want to keep the text only from the orientation with zero ...
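A minimal sketch of how orientation could be derived, assuming each character carries a six-element PDF text matrix `(a, b, c, d, e, f)`, as pdfminer.six `LTChar` objects do through their `matrix` attribute. The function names `rotation_degrees` and `is_horizontal` and the `tol` parameter are hypothetical. For a pure rotation the matrix is `(cos t, sin t, -sin t, cos t, e, f)`, so the angle can be recovered from `a` and `b`.

```python
import math


def rotation_degrees(matrix):
    """Return the counter-clockwise rotation, in [0, 360), of a text matrix."""
    a, b, _c, _d, _e, _f = matrix
    # atan2(b, a) recovers the angle even when the matrix is scaled
    # by the font size, e.g. (12, 0, 0, 12, x, y).
    return math.degrees(math.atan2(b, a)) % 360


def is_horizontal(matrix, tol=1.0):
    """True when the text is within `tol` degrees of zero rotation."""
    angle = rotation_degrees(matrix)
    return min(angle, 360 - angle) <= tol


# Hypothetical usage with pdfminer.six (not executed here):
# for each LTChar `ch` found in a page layout:
#     if is_horizontal(ch.matrix):
#         keep ch.get_text()
```

Skewed or mirrored text would need the `c` and `d` entries as well; this sketch only looks at plain rotation.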
1 vote, 0 answers, 49 views
Trying to extract data from PDFs, make sense of it, and upload it to a database
I've got many PDFs that contain data such as name, address, contact info, email IDs, and many more details.
I am trying to write a program to convert this data into a text file and using different ...
420 votes, 13 answers, 467k views
Python module for converting PDF to text [closed]
Is there any Python module to convert PDF files into text? I tried one piece of code found on ActiveState that uses pypdf, but the generated text had no spaces between words and was of no use.