I have a uint64 column in my DataFrame, but when I convert the DataFrame to a list of Python dicts using DataFrame.to_dict('record'), the values that were uint64 are silently converted to float:

In [24]: mid['bd_id'].head()
Out[24]:
0                0
1    6957860914294
2    7219009614965
3    7602051814214
4    7916807114255
Name: bd_id, dtype: uint64

In [25]: mid.to_dict('record')[2]['bd_id']
Out[25]: 7219009614965.0

In [26]: bd = mid['bd_id']

In [27]: bd.head().to_dict()
Out[27]: {0: 0, 1: 6957860914294, 2: 7219009614965, 3: 7602051814214, 4: 7916807114255}

How can I avoid this strange behavior?

Update

Strangely enough, if I use to_dict() instead of to_dict('records'), the bd_id values stay of type int:

In [43]: mid.to_dict()['bd_id']
Out[43]:
{0: 0,
 1: 6957860914294,
 2: 7219009614965,
...

2 Answers

It's because another column contains floats. More specifically, to_dict('records') is implemented using the values attribute of the DataFrame rather than the columns themselves, and values performs "implicit upcasting": it returns a single NumPy array with one common dtype for all columns, which in your case converts uint64 to float.
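The upcasting can be seen directly on the values attribute. This is a minimal sketch using a made-up two-column frame (the score column and its contents are assumptions for illustration, not part of the question's data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one uint64 column next to one float column.
df = pd.DataFrame({
    "bd_id": np.array([0, 6957860914294, 7219009614965], dtype="uint64"),
    "score": [0.5, 1.5, 2.5],
})

# The column itself keeps its dtype.
column_dtype = df["bd_id"].dtype

# .values must fit both columns into one array, so it picks a common dtype.
values_dtype = df.values.dtype

# Every element of a row taken from .values is therefore a float.
row = df.values[2]
print(column_dtype, values_dtype, row[0])
```

Besides the surprising type, the conversion is lossy for large ids: a float64 cannot represent every integer above 2**53 exactly.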

If you want to get around this bug, you could explicitly cast your dataframe to the object datatype:

df.astype(object).to_dict('record')[2]['bd_id']
Out[96]: 7602051814214
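As a self-contained sketch of the same workaround, assuming a small hypothetical frame with a float column next to the uint64 one (the score column is made up). Note that recent pandas versions expect the full spelling 'records' rather than 'record':

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the float column is what would trigger the upcast.
df = pd.DataFrame({
    "bd_id": np.array([0, 6957860914294, 7219009614965], dtype="uint64"),
    "score": [0.5, 1.5, 2.5],
})

# With object dtype each cell keeps its own type, so no common numeric
# dtype is forced onto the integer column.
records = df.astype(object).to_dict("records")
bd_id = records[2]["bd_id"]
print(bd_id, type(bd_id))
```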

By the way, if you are using IPython and want to see how a library function is implemented, you can bring up its source by appending ?? to its name. For pd.DataFrame.to_dict?? we see:

    ...
    elif orient.lower().startswith('r'):
        return [dict((k, v) for k, v in zip(self.columns, row))
                for row in self.values]
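Since the snippet above iterates over self.values, another way to avoid the common-dtype array is to build the records from row tuples instead. This is only a sketch of that idea, not how pandas itself does it, and the frame below is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing a uint64 column with a float column.
df = pd.DataFrame({
    "bd_id": np.array([0, 6957860914294, 7219009614965], dtype="uint64"),
    "score": [0.5, 1.5, 2.5],
})

# itertuples walks the columns side by side, so each column's values are
# taken from that column alone and the integers are never cast to float.
records = [
    dict(zip(df.columns, row))
    for row in df.itertuples(index=False, name=None)
]
print(records[2])
```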

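You can also round-trip through pandas' JSON serializer with double_precision=0 and parse the result back into Python objects. Note that double_precision applies to every float in the frame, so genuine float columns lose their decimals too:

from pandas.io.json import dumps
import json
output = json.loads(dumps(mid, double_precision=0))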
