1
$\begingroup$

I am new to Python. I extracted some reviews from a website with a web-scraping tool and used the tool's API to import the data into Python in CSV format. I want to convert this CSV into a DataFrame. Can someone guide me on how to do this, please?

Below is the code that fetches the API extraction in CSV format.

import requests

params = {
  "api_key": "abc",
  "format": "csv"
}
r = requests.get('https://www.parsehub.com/api/v2/runs/ttx8PT-EL6Rf/data', params=params)
print(r.text)

The output of the above code is as follows:

"selection1_name","selection1_url","selection1_CommentID_name","selection1_CommentID_Date","selection1_CommentID_comment"
"A","https://www..html","137","February 02, 2020","I enjoy the daily package from the start with the welcoming up to the end.
I recommend this hotel."
"A","https://www.e a lot. Relaxing moments with birds chirping, different swings to chill. Overall, I shall visit again. Thanks Azuri & Marideal."
"A","https://www.html","17","June 12, 2019","Had an amazing stay for 2 nights.
The cleanliness of the room is faultless"
"B","https://www.html","133","April 16, 2019","Had a good time. Food is good."

etc. Can you please help me convert this into a DataFrame in Python?

$\endgroup$
16
  • $\begingroup$ What is in r.json()? $\endgroup$ Commented Feb 9, 2020 at 14:43
  • $\begingroup$ I didn't understand your question. $\endgroup$ Commented Feb 9, 2020 at 17:22
  • $\begingroup$ What is returned when you print r.json()? And is r.text one long list or does it contain multiple lists? $\endgroup$ Commented Feb 9, 2020 at 17:47
  • $\begingroup$ {'selection1': [{'name': 'Radisson Blu Azuri Resort & Spa', 'url': 'marideal.mu/hotel-deals/…', 'CommentID': [{'name': '137', 'Date': 'February 02, 2020', 'comment': 'I enjoy the daily package from the start with the welcoming up to the end.\nI recommend this hotel.'}, {'name': '136', 'Date': 'September 07, 2019', 'comment': 'enjoy a lovely moment'}, {'name': '135', 'Date': 'July 15, 2019', 'comment': 'I was there for my honeymoon. The hotel was simply wooww and wonderful. ALL the hotel staff was extremely friendly and made......}]}]} $\endgroup$ Commented Feb 9, 2020 at 17:52
  • $\begingroup$ The above is what appears when I print r.json(). $\endgroup$ Commented Feb 9, 2020 at 17:53
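
Since r.json() returns a nested structure (one record per hotel, each with a list of comments), the JSON could also be flattened directly. The following is only a sketch: the `data` dictionary is a shortened copy of the output quoted in the comments above (the truncated URL is kept as-is), and the `hotel_` prefix is a name chosen here to avoid a clash between the hotel's `name` and the comment's `name`.

```python
import pandas as pd

# Shortened sample of the r.json() output quoted in the comments above;
# the URL is truncated in the original and is left untouched.
data = {
    "selection1": [
        {
            "name": "Radisson Blu Azuri Resort & Spa",
            "url": "marideal.mu/hotel-deals/…",
            "CommentID": [
                {
                    "name": "137",
                    "Date": "February 02, 2020",
                    "comment": "I enjoy the daily package from the start with the welcoming up to the end.\nI recommend this hotel.",
                },
                {
                    "name": "136",
                    "Date": "September 07, 2019",
                    "comment": "enjoy a lovely moment",
                },
            ],
        }
    ]
}

# One row per comment; the hotel-level fields are repeated on every row.
# meta_prefix is needed because both levels have a key called "name".
df = pd.json_normalize(
    data["selection1"],
    record_path="CommentID",
    meta=["name", "url"],
    meta_prefix="hotel_",
)
print(df.columns.tolist())
```

With the real response, `data` would simply be `r.json()`, assuming the request is made with the JSON format rather than CSV.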

1 Answer

2
$\begingroup$

Try the following code:

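import io

import pandas as pd
import requests

params = {
  "api_key": "abc",
  "format": "csv"
}
r = requests.get('https://www.parsehub.com/api/v2/runs/ttx8PT-EL6Rf/data', params=params)
# Decode the raw response bytes into a string
csv_text = r.content.decode('utf-8')
# read_csv expects a path or a file-like object, so wrap the string in StringIO
rawData = pd.read_csv(io.StringIO(csv_text))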
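
To check the parsing step without calling the API, here is a small sketch that feeds a few of the rows shown in the question straight into read_csv. The `csv_text` string is copied from the question's output (truncated URLs kept as they appear there); the date conversion at the end is an optional extra that was not part of the original question.

```python
import io

import pandas as pd

# Sample rows copied from the question's output. Two comments contain a
# line break inside the quoted field, which read_csv handles correctly.
csv_text = '''"selection1_name","selection1_url","selection1_CommentID_name","selection1_CommentID_Date","selection1_CommentID_comment"
"A","https://www..html","137","February 02, 2020","I enjoy the daily package from the start with the welcoming up to the end.
I recommend this hotel."
"A","https://www.html","17","June 12, 2019","Had an amazing stay for 2 nights.
The cleanliness of the room is faultless"
"B","https://www.html","133","April 16, 2019","Had a good time. Food is good."
'''

# Wrap the string in a file-like object so read_csv can consume it
df = pd.read_csv(io.StringIO(csv_text))

# Optional: turn the text dates into real datetime values
df["selection1_CommentID_Date"] = pd.to_datetime(
    df["selection1_CommentID_Date"], format="%B %d, %Y"
)

print(df.shape)
print(df.dtypes)
```

Each quoted multi-line comment ends up in a single cell, so the number of rows matches the number of reviews rather than the number of text lines.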
$\endgroup$
5
  • $\begingroup$ I got the table, but it is not complete. Here is the screenshot: i.sstatic.net/2ocNn.png. What should I do to get the complete dataframe? $\endgroup$ Commented Feb 9, 2020 at 18:49
  • $\begingroup$ It looks good to me. What are you missing? Looking at the example you showed in your OP, all columns seem to be present. $\endgroup$ Commented Feb 9, 2020 at 18:54
  • $\begingroup$ It is showing dot dot dot... Why am I not getting the whole table? $\endgroup$ Commented Feb 9, 2020 at 18:56
  • $\begingroup$ They are there. If you look at the bottom of your screenshot, it shows the dimensions of the dataframe, which is 1209 rows by 5 columns. The dots just indicate that there is more data that is not being printed because there is not enough space to print it all. $\endgroup$ Commented Feb 9, 2020 at 18:58
  • $\begingroup$ Oh, thank you very much, Sir. I am very grateful to you. $\endgroup$ Commented Feb 9, 2020 at 18:59
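
As a follow-up to the truncated display discussed in the comments above, here is a rough sketch of how to confirm that all rows are present and how to print them without the dots. The frame below is a hypothetical stand-in with 1209 rows (its column names are made up), not the actual review data.

```python
import pandas as pd

# Hypothetical stand-in for the frame in the screenshot (1209 rows)
df = pd.DataFrame({"id": range(1209), "comment": ["text"] * 1209})

# The shape confirms that nothing is missing, whatever the printout shows
print(df.shape)

# Default printing abbreviates long frames with dots
truncated = repr(df)

# Temporarily lift the display limits to render every row and column
with pd.option_context("display.max_rows", None, "display.max_columns", None):
    full = repr(df)

print(len(truncated.splitlines()), len(full.splitlines()))
```

Using `pd.option_context` keeps the change local to the `with` block; `pd.set_option("display.max_rows", None)` would apply it for the rest of the session instead.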
