I am new to Python, so please don't be angry if my question is too basic or my English is bad. I want to scrape data from the admin page of our company CMS, which is based on ASP. I have read a lot of tutorials on the Internet about BeautifulSoup and the Requests module, but it doesn't work for me. Could you give me some help or hints? Thanks in advance.

The login url is:

http://thuvientulap.org/login.aspx

and my code:

#import libraries
import csv
import requests
from bs4 import BeautifulSoup

URL="http://thuvientulap.org/login.aspx"

username="user"
password="password"

s=requests.Session()
r=s.get(URL)

soup=BeautifulSoup(r.content,'html.parser')

VIEWSTATE=soup.find(id="__VIEWSTATE")['value']
EVENTVALIDATION=soup.find(id="__EVENTVALIDATION")['value']
VIEWSTATEGENERATOR=soup.find(id="__VIEWSTATEGENERATOR")['value']

login_data={"__VIEWSTATE":VIEWSTATE,
"txt_name_login":username,
"txt_password_ogin":password,
"__EVENTVALIDATION":EVENTVALIDATION,
"__VIEWSTATEGENERATOR":VIEWSTATEGENERATOR,
}

r = s.post(URL, data=login_data)

admin_url =("http://thuvientulap.org/admin.aspx")
r = s.get(admin_url)

print (r.url)
print (r.text)

1 Answer

You are not passing any headers:

import requests

s = requests.Session()
url = "http://thuvientulap.org/login.aspx"
r = s.get(url)

# You will get an ASP.NET cookie; pass it in the headers along with other headers
dct = s.cookies.get_dict()
aid = dct["ASP.NET_SessionId"]

# The other headers are elided here (.....); login_data is the dict from the question
head = {"Cookie": "ASP.NET_SessionId=" + aid}
r = s.post(url, data=login_data, headers=head)

To find out which specific headers you have to pass and all the parameters required for the POST:

  • Open the link in Google Chrome.
  • Open the Developer Console (F12, or fn + F12 on some keyboards).
  • In the Network tab, look for the login document request (if you cannot find it, enter wrong details and submit the form).
  • Select it to see the request headers and POST parameters.
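A missing POST parameter can also be caught by collecting every input field from the login form instead of listing them by hand. ASP.NET pages often expect the submit button's name in the POST data as well, so the sketch below keeps inputs that have no value. It is only an illustration under assumptions: `FormFieldCollector`, `collect_form_fields`, and `SAMPLE_HTML` are hypothetical names, the sample markup is invented to resemble a typical ASP.NET login form (with field names borrowed from the question and its comments), and it uses the standard library's `html.parser` so it runs without a network connection.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            # Inputs without a value attribute are stored as an empty string
            self.fields[name] = attr.get("value") or ""


def collect_form_fields(html):
    """Return a dict of all named <input> fields found in the HTML."""
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.fields


# Hypothetical markup; the real login page may use different names and values
SAMPLE_HTML = """
<form method="post" action="login.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz" />
  <input type="text" name="txt_name_login" />
  <input type="password" name="txt_password_ogin" />
  <input type="submit" name="BtnLogin" value="" />
</form>
"""

login_data = collect_form_fields(SAMPLE_HTML)

# Fill in the credentials; everything else is sent back as the page supplied it
login_data["txt_name_login"] = "user"
login_data["txt_password_ogin"] = "password"

print(sorted(login_data))
```

With a live session, the same function could be fed `r.text` from the initial GET, and the resulting dict passed as `data=` to `s.post`.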

2 Comments

Sorry for being such a beginner. I don't understand your answer, but I found the problem: I just added "BtnLogin":"" to the login_data dict, and now it works like a charm. If you can help me with a more detailed answer, I would be very happy. Sorry for asking so much.
No problem. In my answer I suggested passing a parameter called headers. All POST and GET requests have headers, and it is good to pass them because they sometimes contain important information such as cookie values.