I have a set of CSV files logged from a Teensy ADC to an SD card, and I am trying to load them so that I can compute some basic statistics over each row.
I have tried everything I can think of, but I cannot get my CSV to be read correctly: the column names do not line up with the data. Here's the code I'm using:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats
# Manually set the CSV file name
filename = "data.csv"
# Read the data into a DataFrame. index_col=0 did not work for all data files tested.
data = pd.read_csv(filename, skiprows=1, header=1, index_col=None)
print(data.head())  # Check that the columns line up with the data
For some reason the header is not read correctly: pandas keeps treating the header as one column longer than the data, which leaves an entire column of NaNs. The same thing happens with index_col=0 and with index_col="SampleNumber".
I've tried several variations of the read_csv line (changing header=, index_col=, etc.) but haven't been able to correct this. The only workaround I have is to manually delete the first column from every CSV file, which is not efficient. Ideally the "SampleNumber" column would become the index (since not all data.csv files number SampleNumber consistently), but if that doesn't work it is fine to drop it altogether.
How do I get the SampleNumber column to be read in correctly? I suspect this is mostly an issue with how my CSV files are being created, but I couldn't figure out a way to upload one of them for someone else to try.
What is currently being output:
SampleNumber C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15
0 3472 3030 2813 2695 2649 2636 2634 2632 2635 2635 2626 2624 2625 2623 2633 2597 NaN
1 2582 2581 2576 2561 2538 2511 2498 2490 2487 2484 2481 2481 2475 2475 2469 2475 NaN
2 2472 2474 2472 2474 2474 2474 2478 2474 2476 2484 2485 2490 2484 2485 2478 2486 NaN
3 2485 2483 2488 2488 2485 2486 2485 2484 2485 2483 2485 2483 2485 2483 2490 2473 NaN
4 2475 2472 2474 2477 2479 2482 2482 2482 2483 2487 2483 2482 2484 2483 2477 2483 NaN
What I want the output to be:
C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15
SampleNumber
0 3472 3030 2813 2695 2649 2636 2634 2632 2635 2635 2626 2624 2625 2623 2633 2597
1 2582 2581 2576 2561 2538 2511 2498 2490 2487 2484 2481 2481 2475 2475 2469 2475
2 2472 2474 2472 2474 2474 2474 2478 2474 2476 2484 2485 2490 2484 2485 2478 2486
3 2485 2483 2488 2488 2485 2486 2485 2484 2485 2483 2485 2483 2485 2483 2490 2473
4 2475 2472 2474 2477 2479 2482 2482 2482 2483 2487 2483 2482 2484 2483 2477 2483
Raw CSV:
Start of new file:,,,,,,,,,,,,,,,,
MISCOUNT: 0,,,,,,,,,,,,,,,,
SampleNumber,C0,C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12,C13,C14,C15
0,3472,3030,2813,2695,2649,2636,2634,2632,2635,2635,2626,2624,2625,2623,2633,2597
1,2582,2581,2576,2561,2538,2511,2498,2490,2487,2484,2481,2481,2475,2475,2469,2475
2,2472,2474,2472,2474,2474,2474,2478,2474,2476,2484,2485,2490,2484,2485,2478,2486
3,2485,2483,2488,2488,2485,2486,2485,2484,2485,2483,2485,2483,2485,2483,2490,2473
4,2475,2472,2474,2477,2479,2482,2482,2482,2483,2487,2483,2482,2484,2483,2477,2483
5,2481,2482,2482,2465,2455,2450,2442,2443,2441,2448,2444,2465,2470,2467,2440,2467
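For anyone trying to reproduce this: the output above is what pandas produces when each data row has one more field than the header, because read_csv then uses the first column as the index and the last named column ends up all NaN. The paste above does not show this, so it is only a guess that the real files end each data row with a trailing comma. Under that assumption, here is a minimal sketch that passes index_col=False and then sets the index explicitly. The helper name read_adc_csv and the in-memory SAMPLE text are made up for illustration; the header and values are copied from the first two data rows above.

```python
import io

import pandas as pd

# Hypothetical sample: header and values copied from the raw CSV above.
# ASSUMPTION: each data row ends with a trailing comma, which the paste
# does not show but which would explain the shifted columns.
SAMPLE = (
    "Start of new file:,,,,,,,,,,,,,,,,\n"
    "MISCOUNT: 0,,,,,,,,,,,,,,,,\n"
    "SampleNumber,C0,C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12,C13,C14,C15\n"
    "0,3472,3030,2813,2695,2649,2636,2634,2632,2635,2635,2626,2624,2625,2623,2633,2597,\n"
    "1,2582,2581,2576,2561,2538,2511,2498,2490,2487,2484,2481,2481,2475,2475,2469,2475,\n"
)


def read_adc_csv(source):
    """Hypothetical helper: load one logger file with SampleNumber as the index."""
    # skiprows=2 drops the two preamble lines, so the header is the first line read.
    # index_col=False stops pandas from using the first column as the index
    # when the data rows carry an extra trailing delimiter.
    data = pd.read_csv(source, skiprows=2, index_col=False)
    # Promote SampleNumber to the index explicitly.
    return data.set_index("SampleNumber")


data = read_adc_csv(io.StringIO(SAMPLE))
print(data.head())
```

The same helper should also accept a path such as read_adc_csv("data.csv"), and it gives the same frame when the rows have no trailing comma. If the real files turn out to be malformed in some other way, this sketch will not apply as written.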