I am starting to play with pandas.

I downloaded a Google Sheet.

When reading some data from Excel on Windows 7:

import pandas as pd

xls = pd.ExcelFile('C:/Users/file.xlsx')
data = xls.parse('Sheet 1', index_col=None, na_values=['NA'])
print "Data", data

I am getting:

Decode error - output not utf-8

The original Excel file has text and numbers.

What is wrong?

Thanks,

2 Answers

Try adding an encoding argument, such as iso-8859-1. The Internet Assigned Numbers Authority (IANA) maintains an exhaustive list of character sets. Even though the data may look like ordinary Latin text and numbers, a single character can require a different character set, depending on where the file originated.
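
To illustrate how a single character can fall outside a character set, here is a minimal sketch that uses only the standard library. The sample string and the helper name `first_unencodable` are made up for illustration; they are not taken from the spreadsheet in the question.

```python
def first_unencodable(text, encoding):
    """Return (index, character) of the first character that `encoding`
    cannot represent, or None if the whole string encodes cleanly."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc.start, text[exc.start]
    return None

# Hypothetical cell value: plain Latin text plus one euro sign (U+20AC).
sample = u"Total 10 \u20ac"

# The euro sign is the only character that ASCII and ISO-8859-1 cannot
# represent; cp1252 does include it, so the last call returns None.
print(first_unencodable(sample, "ascii"))
print(first_unencodable(sample, "iso-8859-1"))
print(first_unencodable(sample, "cp1252"))
```

Running a check like this on a suspicious value can help narrow down which character set a file actually needs.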

Also, you can use either the two-step process with ExcelFile or the one-step process with read_excel:

ExcelFile

xls = pd.ExcelFile('C:/Users/file.xlsx')
data = xls.parse('Sheet 1', index_col=None, na_values=['NA'], encoding='iso-8859-1')
print data.head()

read_excel

data = pd.read_excel('C:/Users/file.xlsx', 'Sheet 1', encoding='iso-8859-1')
print data.head()
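
Since the error message refers to output, it may also be worth checking whether the text survives being written out as UTF-8, independently of how it was read. The following is only a sketch under that assumption: the sample value and the file name `preview.txt` are hypothetical, and it uses the standard library rather than pandas.

```python
import io
import os
import tempfile

# Hypothetical value containing one non-ASCII character (euro sign).
text = u"Total 10 \u20ac"

# Hypothetical output file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "preview.txt")

# Write the text with an explicit UTF-8 encoding instead of relying on
# whatever encoding the console or editor assumes.
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(text)

# Read it back with the same encoding and confirm nothing was lost.
with io.open(path, "r", encoding="utf-8") as handle:
    round_tripped = handle.read()

print(round_tripped == text)
```

For a DataFrame, `DataFrame.to_csv` accepts an `encoding` argument, so the data can be written to a file in a similar way and inspected in an editor that understands UTF-8.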

3 Comments

Thank you for the answer. Unfortunately, none of the encodings has worked so far. I will keep trying.
Try the encodings in this popular encoding list. The character set usually depends on the language of the file's origin.
Thank you for the list. The Google Sheet I am importing is mine. I think there might be a format issue when I download it as an Excel file to my PC. What do you think?
This is because the encoding of your data changes from ASCII to latin1. Try the cp1252 encoding instead.
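
As a rough illustration of why cp1252 and latin1 can give different results, the sketch below decodes the same bytes with both codecs. The byte string is a made-up example, not data from the question.

```python
# Hypothetical bytes as a Windows program might store them:
# 0x93 and 0x94 are curly double quotes in cp1252.
raw = b"\x93quoted\x94"

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("latin1")

# cp1252 maps the two bytes to U+201C and U+201D (curly quotes);
# latin1 maps them to the control characters U+0093 and U+0094.
print(repr(as_cp1252))
print(repr(as_latin1))
print(as_cp1252 == as_latin1)
```

Both decodes succeed without an exception, so a wrong choice between the two shows up as odd characters in the text rather than as an error.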
