All Questions
7,073 questions
3 votes, 1 answer, 60 views
How to properly extract all duplicated rows with a condition in a Polars DataFrame?
Given a polars dataframe, I want to extract all duplicated rows while also applying an additional filter condition, for example:
import polars as pl
df = pl.DataFrame({
"name": ["...
3 votes, 1 answer, 55 views
How can I sort order of index based on my preference in multi-index pandas dataframes
I have a pandas dataframe df. It has a MultiIndex with the levels Gx.Region and Scenario_Model.
The Scenario_Model index is ordered in alphabetical order des, pes, tes. When I plot it, it comes in the same order. ...
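One way to impose a preferred order on a single index level is `DataFrame.reindex` with the `level` argument. The sketch below is only an illustration: the region names, the `value` column, and the target order `pes, tes, des` are all made up, since the excerpt does not say which order is wanted.

```python
import pandas as pd

# Hypothetical data: region names and the "value" column are invented here.
index = pd.MultiIndex.from_product(
    [["North", "South"], ["des", "pes", "tes"]],
    names=["Gx.Region", "Scenario_Model"],
)
df = pd.DataFrame({"value": range(6)}, index=index)

# Assumed preference; replace with whatever order the plot should follow.
preferred_order = ["pes", "tes", "des"]

# Reorder only the Scenario_Model level; rows stay grouped by Gx.Region.
reordered = df.reindex(preferred_order, level="Scenario_Model")
print(reordered)
```

Plotting from `reordered` should then follow the new row order, because pandas plots rows in index order.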
0 votes, 1 answer, 54 views
Matplotlib Plotting repeating steps in x axis
I have a dataframe like the following:
ID Parm_1 Parm_2 Result
0 100 100 0.2
1 100 200 0.4
3 100 300 0.9
4 100 400 0.45
5 ...
-1 votes, 2 answers, 81 views
Copying certain cells of an excel file to fix the report in Python
In the table below, how can we copy the column tempx cells for each test from the partition column long cell to the corresponding test cell?
For example when we filter Scenario Column cell A1.results.0....
1 vote, 1 answer, 68 views
Choose rows from pandas dataframe based on a condition for many columns
I have a pandas DataFrame df that has very many columns, including some named "S-xx" with xx ranging from 1 to 20. All these 20 columns contain labels; let's say they're A,B,C and N. What I ...
0 votes, 0 answers, 46 views
Convert Struct to Map in Python for Writing to S3 in Parquet Format
I am using Apache Airflow to backfill historical data from BigQuery to S3 in Parquet format. The existing data in S3 is written by an Apache Flink job and follows a specific schema.
I have transformed ...
0 votes, 1 answer, 53 views
Pandas Dataframe Prints starting at Fourth Row in Excel
I am using iloc to print every third row in a Pandas Dataframe, but now when it prints to Excel, it prints starting in the fourth row (third excluding header). I want it to print starting at the first ...
-3 votes, 2 answers, 100 views
Pandas DataFrame Cannot use assign function - Why?
I am encountering some odd behavior in pandas, and I am hoping someone could shed some light on specifics from the df.assign(...) function in a pandas dataframe. I am getting a ValueError when trying ...
0 votes, 3 answers, 229 views
How to separate multiple tickers into individual dataframes with yfinance downloaded data
I'm trying to download stock data information using yfinance. Currently, I can successfully download a single ticker using yf.download which returns a dataframe with information I can use. This API ...
0 votes, 0 answers, 24 views
Reassigning pandas columns in chained .assign() gives incorrect values [duplicate]
I often follow the convention (for better or worse) of loading data and preprocessing manipulations in a single line of chained pandas commands. In one such manipulation, I need to multiply a set of ...
0 votes, 0 answers, 117 views
How to Specify return_dtype for Aggregation and Sorting in Polars LazyFrame?
Let's assume I have a Polars LazyFrame or DataFrame.
In the first step, I execute a with_columns / struct / map_elements / lambda function combination to create dict objects from the columns specified ...
0 votes, 2 answers, 52 views
Adding values from 2 cells in the previous row in to current row in dataframe
I have a dataframe like below
Name Value
====================
A 2400
B -400
C 400
D 600
And I need the df to be in the below format
Name ...
-2 votes, 1 answer, 87 views
Cannot convert dataframe column to a int64 data type
I have a problem.
In my Pandas DataFrame, I have a column called 'job'. I've created a simple custom transformer that maps the values in that column to the corresponding type of job. The ...
0 votes, 0 answers, 55 views
Remove O/P without complete session and not the duplicates
def prepare_data(df):
    entry_df = df[df['Direction'] == 'enter'][['vrm', 'Direction']]
    entry_df.columns = ['vrm', 'Entry_Direction']
    exit_df = df[df['Direction'] == 'leave'][['vrm', '...
1 vote, 1 answer, 188 views
ValueError: Incompatible indexer with Series in pandas DataFrame [duplicate]
python: 3.11
pandas: 2.2.2
I need to assign a dict value to 4-th row in df:
df = pd.DataFrame({'agg': [None] * 5})
df['agg'] = df['agg'].astype(object)
df.loc[3, 'agg'] = {'mm': 4}
It gives an error:...
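A common workaround for this kind of assignment is the scalar accessor `.at`, which targets exactly one cell and so stores the dict as a single object instead of trying to align it like a Series. The sketch below reuses the frame from the excerpt. It assumes the elided error is the "Incompatible indexer with Series" `ValueError` named in the title, and it is not necessarily the fix given in the linked duplicate.

```python
import pandas as pd

# Same setup as in the question excerpt.
df = pd.DataFrame({'agg': [None] * 5})
df['agg'] = df['agg'].astype(object)

# .at sets a single cell, so the dict is kept as one Python object.
df.at[3, 'agg'] = {'mm': 4}

stored = df.at[3, 'agg']
print(df)
```

This relies on the column already having `object` dtype, which the excerpt's `astype(object)` call takes care of.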