$\begingroup$

I am working on the Titanic competition to get hands-on experience with data science and machine learning. I tried to load the datasets from GitHub, but pandas threw the following error:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 32, saw 2

I tried to follow the advice of other SO users, so I added the skiprows=1 parameter to my pd.read_csv() calls to skip the first row, but it didn't work.

import pandas as pd

train_dataset = pd.read_csv("https://github.com/oo92/titanic-files/blob/master/train.csv", skiprows=1)
test_dataset = pd.read_csv("https://github.com/oo92/titanic-files/blob/master/test.csv", skiprows=1)
ground_truths = pd.read_csv("https://github.com/oo92/titanic-files/blob/master/gender_submission.csv", skiprows=1)

train_dataset.head()
$\endgroup$

1 Answer

$\begingroup$

The URL you are passing points to a GitHub repository page, which is an HTML webpage, so it does not return CSV. Click the 'Raw' option on GitHub and pass that URL instead, which in your case is:

test = pd.read_csv('https://raw.githubusercontent.com/oo92/Titanic-Kaggle/master/test.csv')
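If you would rather not click through to 'Raw' for each file, the blob-to-raw rewrite can be done in code. This is only a minimal sketch: `to_raw_url` is a hypothetical helper name, not part of pandas or any GitHub library, and it assumes the usual `github.com/<user>/<repo>/blob/<branch>/<path>` page layout. The URL from the question is used purely as sample input.

```python
from urllib.parse import urlsplit


def to_raw_url(blob_url: str) -> str:
    """Turn a GitHub 'blob' page URL into its raw.githubusercontent.com form.

    Hypothetical helper; assumes the layout <user>/<repo>/blob/<branch>/<path>.
    """
    parts = urlsplit(blob_url)
    segments = parts.path.strip("/").split("/")
    # A blob URL needs at least: user, repo, "blob", branch, file name.
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    # Drop the "blob" segment and keep everything else in order.
    user, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


print(to_raw_url("https://github.com/oo92/titanic-files/blob/master/train.csv"))
# prints https://raw.githubusercontent.com/oo92/titanic-files/master/train.csv
```

The returned string can then be passed straight to pd.read_csv(). The skiprows=1 argument should not be needed once the response is actual CSV.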
$\endgroup$
  • $\begingroup$ Thank you. But Spyder doesn't let me print the head with train_dataset.head(). I have to explicitly place it inside a print() call. $\endgroup$ Commented May 10, 2019 at 1:26
