All Questions
3 questions
0 votes, 1 answer, 411 views
Scrapy script that was supposed to scrape pdf, doc files is not working properly
I am trying to implement a similar script in my project, following this blog post:
https://www.imagescape.com/blog/scraping-pdf-doc-and-docx-scrapy/
The code of the spider class from the source:
...
6 votes, 3 answers, 58k views
How to scrape PDFs using Python; specific content only
I am trying to get data from PDFs available on the site
https://usda.library.cornell.edu/concern/publications/3t945q76s?locale=en
For example, if I look at the November 2019 report
https://downloads....
0 votes, 2 answers, 624 views
How to read a PDF file line by line and create a CSV
Here is my PDF
I found THIS and used it to scrape my PDF.
6 BEDROOMS
NameAddressUnitSizeKeyRentSq FtMove in DateNotesTenant
Prop #
Texan 261009 West 26th3076x3$4,6952,1368/15/14$1,000 Bonus (1) ...