
All Questions

-1 votes
0 answers
35 views

Getting different results from Groupby for different sized Dataframes

I'm running the same functions on these two dfs that are identical except that they have different lengths (same number of columns and data types). When I run the larger one I get exactly as I would ...
5sWithCrackScreen
0 votes
1 answer
62 views

Apply different aggregate functions to different columns of a pandas dataframe, and run a pivot/crosstab?

The issue: in SQL it is very easy to apply different aggregate functions to different columns, e.g.: select item, sum(a) as [sum of a], avg(b) as [avg of b], min(c) as [min of c]. In Python, not so ...
Pythonista anonymous
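For readers landing on this question, one common way to mirror that SQL in pandas is named aggregation on a groupby. The sketch below is only an illustration: the data is made up, the column names `item`, `a`, `b`, `c` are taken from the SQL in the excerpt, and the output names (`sum_of_a` and so on) are hypothetical. It does not cover the pivot/crosstab part of the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the excerpt's SQL.
df = pd.DataFrame({
    "item": ["x", "x", "y", "y"],
    "a": [1, 2, 3, 4],
    "b": [10.0, 20.0, 30.0, 50.0],
    "c": [5, 3, 9, 7],
})

# Named aggregation: each keyword is an output column,
# each value is a (source column, aggregate function) pair.
result = df.groupby("item", as_index=False).agg(
    sum_of_a=("a", "sum"),
    avg_of_b=("b", "mean"),
    min_of_c=("c", "min"),
)
print(result)
```

Named aggregation keeps the output columns flat, which is usually easier to work with than the MultiIndex columns produced by passing a dict of lists to `agg`.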
2 votes
2 answers
116 views

Conditional running total based on date field in Pandas

I have a dataframe with the data below.

DateTime          Tag  Qty
2025-01-01 13:00  1    270
2025-01-03 13:22  1    32
2025-01-10 12:33  2    44
2025-01-22 10:04  2    120
2025-01-29 09:30  3    182
2025-02-02 15:05  1    216

To be ...
Abdul Gaffoor G K
0 votes
0 answers
28 views

dask: looping over groupby groups efficiently

Example DataFrame:

import pandas as pd
import dask.dataframe as dd
data = {
    'A': [1, 2, 1, 3, 2, 1],
    'B': ['x', 'y', 'x', 'y', 'x', 'y'],
    'C': [10, 20, 30, 40, 50, 60]
}
pd_df = pd....
tommy.carstensen
2 votes
2 answers
58 views

Pandas Group by without performing aggregation

I have a pandas dataframe as follows:

Athlete ID  City  No. of Sport Fields
1231        LA    81
4231        NYC   80
2234        NJ    64
1223        SF    75
4531        LA    81
2345        NYC   80
...

I want to print the City and No. of Sport Fields ...
Manish
  • 35
0 votes
2 answers
40 views

Issue in Pandas Dataframe grouping and getting the difference of a column

I'm stuck on a problem. I have a data frame, shown below, with data for distributors who supply items to different locations. Now I want to calculate, for a particular day, whether any item (example: ...
Abhishek K M
0 votes
0 answers
27 views

Transpose several columns into one and groupby several columns

I have a dataset which contains two timestamps and several data columns. My aim is to put everything into three columns: two time columns and one data column, which results in several rows of ...
Swawa
  • 233
2 votes
1 answer
60 views

Subtle mistake in pandas .apply(lambda g: g.shift(1, fill_value=0).cumsum())

I have a dataframe that records the performance of F1 drivers, and it looks like this:

Driver_ID  Date        Place
1          2025-02-13  1
1          2024-12-31  1
1          2024-11-03  2
1          ...
Ishigami
  • 580
2 votes
3 answers
75 views

Pandas groupby with tag-style list

I have a dataset with 'tag-like' groupings:

   Id     tags
0  item1  ['friends','family']
1  item2  ['friends']
2  item3  []
3  item4  ['family','holiday']

So a row can belong to ...
Sanjay Manohar
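A possible starting point for this kind of tag-style grouping, assuming the goal is for each row to be counted in every tag it carries (the excerpt is cut off, so that is a guess), is `DataFrame.explode` followed by an ordinary groupby. The data below is copied from the excerpt; the variable names are hypothetical.

```python
import pandas as pd

# Data as shown in the excerpt: each row carries a list of tags.
df = pd.DataFrame({
    "Id": ["item1", "item2", "item3", "item4"],
    "tags": [["friends", "family"], ["friends"], [], ["family", "holiday"]],
})

# explode() emits one row per list element; an empty list becomes a NaN row.
exploded = df.explode("tags")

# Drop untagged rows, then collect the Ids that belong to each tag.
ids_per_tag = exploded.dropna(subset=["tags"]).groupby("tags")["Id"].agg(list)
print(ids_per_tag)
```

Rows with an empty tag list (item3 here) drop out of the grouped result; keeping them would require filling the NaN with a placeholder tag before grouping.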
2 votes
1 answer
60 views

How to use vectorized calculations in pandas to find out where a value or category is changing with corrected first row?

With a dataset with millions of records, I have items with various categories and measurements, and I'm trying to figure out how many of the records have changed, in particular when the category or ...
hydrodan
2 votes
2 answers
55 views

Why does summing data grouped by df.iloc[:, 0] also sum up the column names?

I have a DataFrame with a species column and four arbitrary data columns. I want to group it by species and sum up the four data columns for each one. I've tried to do this in two ways: once by ...
Ray
  • 55
-1 votes
1 answer
50 views

Grouping Rows of Data to Generate analytical

I am working with a data set of NHS attendance data (a snippet of the columns and rows is included). The data continues all the way until the final hour of Sunday. I have successfully cleaned the ...
HEB
  • 1
2 votes
2 answers
82 views

How to use numpy.where in a pipe function for pandas dataframe groupby?

Here is a script to simulate the issue I am facing:

import pandas as pd
import numpy as np
data = {
    'a':[1,2,1,1,2,1,1],
    'b':[10,40,20,10,40,10,20],
    'c':[0.3, 0.2, 0.6, 0.4, 0....
learner
  • 656
-1 votes
1 answer
48 views

Iterate over multiple dataframes and group them based on mean value

I have a list of 81 different dataframes. I would like to calculate the average value of the same column in each dataframe. Based on the mean values I would like to compare and ...
Lehel Tompos
2 votes
1 answer
158 views

Resampling By Group in Polars

I'm trying to build a Monte Carlo simulator for my data in Polars. I am attempting to group by a column, resample the groups, and then unpack the aggregation lists back in their original sequence. I'...
nybhh
  • 83
