
All Questions

2 votes
2 answers
62 views

Pandas: Fill in missing values with an empty numpy array

I have a Pandas Dataframe that I derive from a process like this: df1 = pd.DataFrame({'c1':['A','B','C','D','E'],'c2':[1,2,3,4,5]}) df2 = pd.DataFrame({'c1':['A','B','C'],'c2':[1,2,3],'c3': [np.array((...
cbw • 289
0 votes
1 answer
62 views

Writing complex Pandas DataFrame to HDF5 using h5py

I have a Pandas DataFrame with mixed scalar and array-like data of different raw types (int, float, str). The DataFrame's types look like this: 'col1', dtype('float64') 'col2', dtype('O') <-- array,...
WolfiG • 1,163
1 vote
2 answers
92 views

Pandas indexing

Can someone explain what is meant by: "Both loc and iloc [in Pandas] are row-first, column-second. This is the opposite of what we do in native Python, which is column-first, row-second." Because I ...
Obase Oyeni Ayomobi
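
As a minimal sketch of the quoted claim, the snippet below uses a made-up frame (the row labels `r0`..`r2` and columns `a`, `b` are illustrative assumptions, not from the question). It shows that `loc` and `iloc` both take the row selector first and the column selector second, whereas plain `df[...]` indexing picks a column first.

```python
import pandas as pd

# Hypothetical example data; labels and values are made up for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["r0", "r1", "r2"])

by_label = df.loc["r1", "b"]     # row label first, then column label
by_position = df.iloc[1, 1]      # row position first, then column position
column_first = df["b"]["r1"]     # plain indexing: column first, then row
```

All three expressions address the same cell; only the order of the selectors differs.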
0 votes
0 answers
62 views

Different values in group by columns before and after group by in pandas dataframe [duplicate]

I have the following code: sum_columns = ['p', 'q', 'r', 'Ax', 'Ay', 'Az'] avg_columns = ['Bx', 'By', 'Bz', 'G2 C03'] agg_map = {col: 'sum' for col in sum_columns} agg_map.update({col: 'mean' for col in ...
MsA • 3,017
0 votes
2 answers
47 views

Create Pivot table and add additional columns from another dataframe

Given two identically formatted dataframes:

df1
Counterparty  Product  Deal  Date      Value
foo           bar      Buy   01/01/24  10.00
foo           bar      Buy   01/01/24  10.00
foo ...
iBeMeltin • 1,885
-1 votes
1 answer
212 views

How to filter array in Pandas dataframe? [duplicate]

Is it possible to filter an array without creating new columns? For example, I have this dataframe:

userID  goalsID
25      [1,2,4,5]
188     [3,6]
79      [1,9]

How to filter array by digit &...
Stepan P.
0 votes
0 answers
19 views

Convert Array to Pandas Dataframe Columns [duplicate]

I have a column "ym:s:goalsID" in my dataframe. How can I convert this column into 4 separate columns? Now it is: ym:s:goalsID [26783434,282511740,26783434,282511740] I ...
Stepan P.
1 vote
0 answers
94 views

Python Iterating over Numpy Tile and for-loops

Goal: Here is a sample of a dataset that has "ID", "PHASENAME", "CDAYS", "MULTI_FACTOR", "DAY_COUNTER", and "DAILY_LABOR_PERCENT". I was ...
Ty Kendall
1 vote
1 answer
84 views

Convert String to Array[Int] in a Hive column using Spark or Hive

I have sample data in the string format below in a Hive table:

+----------------------+
| col1                 |
+----------------------+
| 160-80-40 sec        |
| 160-80-40 sec        |
| 10-10-10-...
nagraj036 • 175
0 votes
1 answer
90 views

pandas quantile vs the Excel equivalent calculation difference

There is probably a way to do this; the problem is I don't know it. I have data sets that are often between 80 and 120 values long. I am trying to compute the 90% value for each separate data set. I was ...
Shane S • 2,323
0 votes
4 answers
109 views

Efficient way to iterate rows in two arrays and then copy array back into a dataframe

I am learning numpy and I have a dataframe of asset prices and thought it might be better to do a calculation in numpy and then put the data back into a dataframe when done. I have a working program ...
chirob • 91
0 votes
0 answers
34 views

Python function : return array that was named automatically [duplicate]

My final goal is to recover data from a table, and store it in numpy arrays to work with it later. I had to automatically name each column of my dataframe (I've opened my file using pandas) due to my ...
Brandon Begue
1 vote
1 answer
59 views

How to split an array using its minimum entry

I am trying to split a dataset into two separate ones by finding its minimum point in the first column. I have used idxmin to firstly identify the location of the minimum entry and secondly iloc to ...
John278 • 25
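
A minimal sketch of the `idxmin` plus `iloc` approach the excerpt describes, using made-up data (the column names `x`, `y` and the values are assumptions). Putting the minimum row in the second half is also an assumption, since the excerpt does not say where it should go.

```python
import pandas as pd

# Hypothetical data; the first column has its minimum in the third row.
df = pd.DataFrame({"x": [5.0, 3.0, 1.0, 2.0, 4.0], "y": [50, 30, 10, 20, 40]})

# idxmin returns the index *label* of the minimum. iloc is positional, so the
# label is converted to a position first; this matters for non-default indexes.
min_label = df.iloc[:, 0].idxmin()
min_pos = df.index.get_loc(min_label)

before = df.iloc[:min_pos]   # rows strictly before the minimum
after = df.iloc[min_pos:]    # the minimum row and everything after it
```

With a default `RangeIndex` the label and the position coincide, but converting explicitly keeps the split correct after filtering or re-indexing.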
1 vote
1 answer
46 views

pandas to_csv function changing 2d array to a single string

I am trying to precalculate sentence embeddings and I want to store them in a CSV file, so that I can reuse them later. I create a Pandas dataframe, and I have the embeddings stored correctly as a 2d ...
abhishekkuber
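
The behaviour in the title can be reproduced with a small sketch. The embedding values, the column names `sentence` and `embedding`, and the `e` prefix are all illustrative assumptions. CSV cells are text, so an array stored in an object column comes back as a string; one possible workaround, shown here, is to give each embedding dimension its own numeric column.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical embeddings: 2 sentences x 3 dimensions.
emb = np.arange(6, dtype=float).reshape(2, 3)
df = pd.DataFrame({"sentence": ["a", "b"], "embedding": list(emb)})

# Round-trip through CSV (in memory): the array cell is written as its text form.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
restored = pd.read_csv(buf)
cell = restored.loc[0, "embedding"]   # read back as a str, not an ndarray

# Alternative layout: one numeric column per embedding dimension (e0, e1, e2).
wide = pd.concat([df[["sentence"]], pd.DataFrame(emb).add_prefix("e")], axis=1)
buf2 = io.StringIO()
wide.to_csv(buf2, index=False)
buf2.seek(0)
back = pd.read_csv(buf2).drop(columns="sentence").to_numpy()
```

Binary formats such as `np.save` or `DataFrame.to_pickle` are other options when the file does not need to be human-readable.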
1 vote
1 answer
77 views

Adding numpy arrays to cells of a pandas DataFrame depends on initialisation

I was trying to add a list of numpy arrays as elements of a pandas DataFrame using: df.loc[df['B']==4,'A'] = [np.array([5, 6, 7, 8]),np.array([2,3])] Whether or not this is allowed seems ...
Anita Karsa
