My program writes data to a CSV file using pandas' to_csv function. On the first run, the CSV file is empty and my code writes data to it, as expected. On the second run (still using the same CSV file), my code writes data to it again, which is also fine. The problem is that there is a large number of empty rows between the data from the first run and the data from the second run.

Below is my code:

# place into a file
csvFile = open(file, 'a', newline='', encoding='utf-8')

if file_empty:
    # file has no data yet: write the header
    df.to_csv(csvFile, sep=',', columns=COLS, index=False, mode='ab', encoding='utf-8')
else:
    # file already has data: skip the header
    df.to_csv(csvFile, sep=',', columns=COLS, header=False, index=False, mode='ab', encoding='utf-8')

I use the variable file_empty so that the program does not write column headers if the CSV file already contains data.

In the resulting CSV file, the last row from the first run is on line 396, and the first row from the second run is on line 1308.

So there are empty rows from line 397 through line 1307. How can I remove them so that there are no empty rows between the runs when the program is run again?

  • Why are you using mode='ab' inside to_csv? Couldn't you just do df.to_csv(file, columns=COLS, index=False, mode='a'), without using csvFile? Commented May 14, 2020 at 8:53
  • Hello @Giorgio! Originally it was mode='a', but my data then had alternating empty rows. I searched for a solution and found that it was related to opening the file in binary mode. Anyway, I removed it and changed it back to mode='a'. Thanks for pointing that out. But I still get a chunk of empty rows when the program is run again using the same CSV file. Commented May 14, 2020 at 10:37
  • I'm not sure I understand everything, but perhaps the problem is due to the fact that you are using open incorrectly: if you want to use open (but it's not needed, because to_csv already takes care of opening the file) you should also call csvFile.close() after each time you write something to it. The new rows added to the file are only visible after closing the file with csvFile.close(). Commented May 14, 2020 at 12:34
  • Use the append option when writing the data. You can also delete blank rows from a DataFrame; search for how to do that. Commented May 14, 2020 at 13:33
  • Hello @Giorgio, this is the guide I used for my program (that's why I used open): dropbox.com/s/5enwrz2ggswns56/Telemedicine_twitter_v3.5.py?dl=0 I added csvFile.close() after the to_csv but I still got the same results. I tried removing the open(file) for the CSV but I still got the same results. Commented May 15, 2020 at 15:23
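
The suggestion in the comments above, letting to_csv open and close the file itself, could look roughly like the sketch below. The helper name append_frame is hypothetical, and treating a missing or zero-byte file as "empty" is an assumption about how file_empty is meant to work. This only shows the simpler append pattern; it does not by itself explain where the gap of empty rows comes from.

```python
import os

import pandas as pd


def append_frame(df, path, columns=None):
    """Append df to the CSV at path, writing the header only for a new or empty file."""
    # Assumption: a missing or zero-byte file counts as "empty".
    file_empty = not os.path.exists(path) or os.path.getsize(path) == 0
    # to_csv opens and closes the file itself, so no separate open() or close() is needed.
    df.to_csv(path, mode="a", columns=columns, header=file_empty,
              index=False, encoding="utf-8")
```

Calling append_frame(df, file, COLS) once per run would then replace both branches of the if/else in the question.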

1 Answer

Here is code to append the data and remove blank lines.

The lines below may help you:

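import pandas

# Load the consolidated file and the new data to append
conso_frame = pandas.read_csv('consofile1.csv')
df_2 = pandas.read_csv('csvfile2.csv')

# Column names must be the same in both files.
# DataFrame.append was removed in pandas 2.0, so pandas.concat is used instead.
conso_frame = pandas.concat([conso_frame, df_2])
print(conso_frame)

# Drop rows where the "Intent" column is empty, then write the result back
conso_frame.dropna(subset=["Intent"], inplace=True)
print(conso_frame)
conso_frame.to_csv('consofile1.csv', index=False)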
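
If the goal is only to strip the empty rows from an existing file, without relying on a particular column such as "Intent", a sketch using the standard library's csv module may be enough. The function name drop_blank_rows and the choice to write to a separate output file are assumptions made for illustration; it treats a row as empty when every field is blank or whitespace.

```python
import csv


def drop_blank_rows(src_path, dst_path):
    """Copy src_path to dst_path, skipping rows whose fields are all empty.

    Returns the number of rows kept (including the header row).
    """
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # csv.reader yields [] for a blank line and ['', ''] for a line of bare commas.
            if any(field.strip() for field in row):
                writer.writerow(row)
                kept += 1
    return kept
```

Writing to a second file and checking it before replacing the original avoids losing data if the result is not what was expected.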