Issue with PDF download using the requests module in Python

Question

import requests

pdf_url = "https://www.alexandrina.sa.gov.au/__data/assets/pdf_file/0028/1619614/Council-Special-Meeting-Agenda-11-June-2024.pdf"
pdf_path = 'Test.pdf'
response = requests.get(pdf_url)
pdf_content = response.content 

with open(pdf_path, 'wb') as pdf_file:
    pdf_file.write(pdf_content)

With this code I cannot download the PDF: the request gets a 403 response. When I open the URL manually in Chrome, the PDF opens and also downloads to my machine. With the requests module the download fails, and when I go through a proxy or a scraping service the file downloads but is corrupted, so I cannot open it. What should I do?

Welcome to Stack Overflow. My guess is the server is expecting some sort of User Agent or the server has some other check that you're not providing information for. — ewokx, Commented Jun 14, 2024 at 7:24
are you attempting to read the contents or to download a file ? — darren, Commented Jun 14, 2024 at 11:15

wenbo - Finding Job · Accepted Answer · 2024-06-14 07:42:49Z

Your code looks fine. I only swapped in a different PDF URL, and it works.

import os
import requests

save_dir = os.getcwd()
file_name = 'test.pdf'

#url = 'https://www.alexandrina.sa.gov.au/__data/assets/pdf_file/0028/1619614/Council-Special-Meeting-Agenda-11-June-2024.pdf'

url2 = 'https://bitcoin.org/bitcoin.pdf'


outfile = os.path.join(save_dir, file_name)
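response = requests.get(url2, stream=True)
response.raise_for_status()  # fail on 403 and other errors instead of saving an error page
with open(outfile, 'wb') as output:
    # stream=True only helps if the body is read in chunks
    for chunk in response.iter_content(chunk_size=8192):
        output.write(chunk)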

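As a commenter mentioned, the server hosting that PDF appears to block downloads made from code, most likely as a bot-prevention measure.

Instead of the PDF, the web server is providing you with a web page intended to prevent bots from downloading data from the site. That would explain why the saved file cannot be opened as a PDF.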
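One way to confirm this is to look at what the server actually sent before writing it to disk. The sketch below is only an illustration: `describe_download` is a hypothetical helper name, and it relies on the assumption that a well-formed PDF starts with the bytes `%PDF-`.

```python
def describe_download(status_code, content_type, body):
    """Return a short diagnosis of what the server sent back.

    status_code:  the HTTP status, e.g. response.status_code
    content_type: the Content-Type header value, or None if absent
    body:         the raw response bytes, e.g. response.content
    """
    if status_code != 200:
        return f"HTTP {status_code}: request was refused, body is not the PDF"
    # Assumption: a well-formed PDF begins with the "%PDF-" marker.
    if not body.startswith(b"%PDF-"):
        return f"HTTP 200 but body is {content_type or 'unknown type'}, not a PDF"
    return "ok"


# Usage with a requests response (needs network access):
#   response = requests.get(pdf_url)
#   print(describe_download(response.status_code,
#                           response.headers.get("Content-Type"),
#                           response.content))
```

If this reports an HTML body or a non-200 status, the "corrupted" file is really an error or challenge page saved under a `.pdf` name.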

Can you try the same URL I gave? I can download any other URL with this code; the problem is only with the URL in my question.
– Krupesh Pandya
Commented Jun 14, 2024 at 8:31
The PDF server may be blocking automated downloads made from code, so the error may be built into the site. I just tried your URL and also got a PDF that cannot be opened, the same as you.
– wenbo - Finding Job
Commented Jun 14, 2024 at 8:34
You could try adding header information to your request so that it looks like it comes from a browser. See this URL for an example: scrapfly.io/blog/python-requests-headers-guide
– jottbe
Commented Mar 29 at 22:41
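
A minimal sketch of the header suggestion follows. The header values and the `download_pdf` helper are illustrative assumptions, not values this site is known to accept; if the server relies on cookies or a JavaScript challenge, browser-like headers alone may not be enough. The function takes a session object (for example a `requests.Session`) as an argument.

```python
# Browser-like headers. These values are illustrative assumptions,
# not something the target site is known to require.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
    ),
    "Accept": "application/pdf,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def download_pdf(session, url, path, timeout=30):
    """Fetch url with browser-like headers and write the body to path.

    session is expected to behave like requests.Session: it needs a
    get(url, headers=..., timeout=...) method returning a response with
    raise_for_status() and content.
    """
    response = session.get(url, headers=BROWSER_HEADERS, timeout=timeout)
    response.raise_for_status()  # a 403 raises here instead of being saved
    with open(path, "wb") as pdf_file:
        pdf_file.write(response.content)
    return path


# Usage (needs network access and the requests package):
#   import requests
#   with requests.Session() as session:
#       download_pdf(session, pdf_url, "Test.pdf")
```

Using a session rather than a bare `requests.get` call means any cookies the server sets on a first response are sent on later requests, which some sites check.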
