I have a string in this format:

"A1","B1","C1","D1","E1","F1","G1","H1"\n"A2","B2","C2","D2","E2","F2" etc

where A to H are columns and the numbers refer to the rows.

I'm looking for the quickest way to create a pandas dataframe.

A slow approach I tried is:

df = pd.DataFrame()
for row in data:
    reader = csv.reader(row)
    mylist = []
    for element in reader:
        if element!=['','']:
            mylist.append(element[0])
    df2 = pd.DataFrame([mylist])
    df = df.append(df2)

I'm looking for a quicker way.

1 Answer

I believe you need StringIO with read_csv:

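from io import StringIO

import pandas as pd

data = '"A1","B1","C1","D1","E1","F1","G1","H1"\n"A2","B2","C2","D2","E2","F2"'
# io.StringIO wraps the string in a file-like object for read_csv
# (older pandas releases exposed the same class as pd.compat.StringIO)
df = pd.read_csv(StringIO(data), header=None)

print(df)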


    0   1   2   3   4   5    6    7
0  A1  B1  C1  D1  E1  F1   G1   H1
1  A2  B2  C2  D2  E2  F2  NaN  NaN

2 Comments

pd.compat.StringIO() was removed from pandas some time ago (github.com/pandas-dev/pandas/pull/25954). Newer pandas versions on Python 3 need this import instead: from io import StringIO
So it would look like: df = pd.read_csv(StringIO(data), header=None)
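
Since the question describes the columns as A to H, here is a small sketch of the same read_csv call that also labels the columns. The column labels and the variable name columns are assumptions for illustration, not something the original answer used; the rest follows the io.StringIO approach mentioned above.

```python
from io import StringIO

import pandas as pd

# Sample data from the question; the second row has no values for the last two columns.
data = '"A1","B1","C1","D1","E1","F1","G1","H1"\n"A2","B2","C2","D2","E2","F2"'

# Assumed column labels, taken from the question's "A to H" description.
columns = list("ABCDEFGH")

# header=None treats every line as data; names= supplies the labels.
# Fields missing from the short second row are filled with NaN.
df = pd.read_csv(StringIO(data), header=None, names=columns)

print(df)
```

Parsing the whole string in one read_csv call avoids building and appending a new DataFrame per row, which is likely where the loop in the question spends most of its time.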
