Questions tagged [pandas]

pandas is a Python library for panel data manipulation and analysis, e.g. multidimensional time series and cross-sectional data sets commonly found in statistics, experimental science results, econometrics, or finance.

3 votes
0 answers
249 views

I have data in pandas as below: 123-543-2345 876|678|3469 304-762-2467 I am trying to change all of them to this format: 123-543-2345 I ...
Alfred's user avatar
  • 39
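
One possible approach to the phone-number question above, as a minimal sketch: strip every non-digit character, then re-insert hyphens. The Series name `phones` is hypothetical, and the sketch assumes every value contains exactly ten digits, which the truncated excerpt does not confirm.

```python
import pandas as pd

# Sample values taken from the question excerpt; the variable name is an assumption.
phones = pd.Series(["123-543-2345", "876|678|3469", "304-762-2467"])

# Drop anything that is not a digit, whatever separator was used.
digits = phones.str.replace(r"\D", "", regex=True)

# Rebuild as NNN-NNN-NNNN; values without exactly ten digits are left unchanged.
normalized = digits.str.replace(r"^(\d{3})(\d{3})(\d{4})$", r"\1-\2-\3", regex=True)

print(normalized.tolist())
```

Normalizing to digits first means the second pattern does not need to know which separators appear in the raw data.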
5 votes
1 answer
104 views

I have a dataframe with two columns: 'HAD_DISEASE', which indicates whether the subject has had the disease and takes either 1 (yes) or 2 (no) as a value, and 'VNR', also an integer (...
celepharn's user avatar
3 votes
3 answers
265 views

I am currently working on a dataset where I am supposed to predict which rides might be cancelled. If a ride is predicted to be cancelled (because of drivers), then the ...
RushHour's user avatar
  • 187
4 votes
1 answer
77 views

I am trying to model the arc of a basketball free-throw trajectory. The dataset usually has 6 points per person, each giving the height of the basketball at various seconds after the player ...
ChairmanMeow's user avatar
4 votes
2 answers
145 views

Editing to add one key piece of information (df and dailyRet); I only noticed how important it is after solving this issue. ...
Vineet Tripathi's user avatar
4 votes
1 answer
102 views

I've been using pandas.NamedAgg all over my Python script, but I'm still a newbie to both Python and pandas. Today, I went to the documentation to see if I can streamline my code by leaving out the <...
user2153235's user avatar
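
For readers unfamiliar with `pandas.NamedAgg`, here is a small sketch of its usual keyword form. The excerpt is cut off before saying what the asker wanted to leave out, so this only illustrates the baseline usage; the DataFrame and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data, not from the question.
df = pd.DataFrame({"group": ["a", "a", "b"], "value": [1, 3, 5]})

# Each keyword names an output column; NamedAgg pairs an input column with an aggregation.
summary = df.groupby("group").agg(
    total=pd.NamedAgg(column="value", aggfunc="sum"),
    largest=pd.NamedAgg(column="value", aggfunc="max"),
)

print(summary)
```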
7 votes
2 answers
159 views

I'm building a machine learning model to predict loan approval rate. My dataset includes features like: Credit_History ...
Muhammed Erbay's user avatar
4 votes
0 answers
41 views

For a college project for my data science course I am trying to fit a model based on the U.S. DOT's 2015 Kaggle Flight Cancellations dataset, but am not having great luck with model performance (MSE ...
Jake Malis's user avatar
2 votes
0 answers
71 views

After merging DataFrames, my model gives worse performance even when using the same original features. Minimal Example: ...
Ayoub Mokeddem's user avatar
1 vote
0 answers
39 views

I'm training a Deep Q-Network (DQN) to trade crypto using historical data. My model keeps outputting NaN values for the Q-values during prediction. I'm using a custom function getState2() to generate ...
user29255210's user avatar
2 votes
1 answer
65 views

Description: Input is a CSV file. The CSV file contains columns of different data types: Ordinal Values, Nominal Values, Numerical Values and Multi Value. For the multivalue columns: Minimum is 1, ...
DILF Unboxing's user avatar
7 votes
1 answer
97 views

I am currently working on a dataset that has two columns: customerID and date. I want to find the minimum date for each customerID. Initially, I used the following code: ...
Guna's user avatar
  • 897
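
The task described above (minimum date per customerID) is commonly written as a groupby followed by `min`. A minimal sketch follows, with made-up sample rows; the asker's own code is truncated in the excerpt and may differ.

```python
import pandas as pd

# Hypothetical rows; only the column names customerID and date come from the question.
df = pd.DataFrame({
    "customerID": [1, 1, 2, 2],
    "date": pd.to_datetime(["2024-03-01", "2024-01-15", "2024-02-10", "2024-02-20"]),
})

# Earliest date for each customer, returned as a Series indexed by customerID.
first_dates = df.groupby("customerID")["date"].min()

print(first_dates)
```

If the date column is stored as strings, converting it with `pd.to_datetime` first avoids a lexicographic comparison.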
6 votes
1 answer
207 views

As part of my internship, I am working on a project where I need to process two Excel files: File 1 contains names and numbers. File 2 contains names and an empty column for amounts. The goal is to ...
Etienne Reverchon's user avatar
1 vote
1 answer
122 views

So, I downloaded this Ecommerce dataset from Kaggle here: https://www.kaggle.com/datasets/kolawale/focusing-on-mobile-app-or-website After converting it to a CSV file, there seems to be an issue. The ...
Majoka's user avatar
  • 11
2 votes
2 answers
87 views

I'm running a model to do binary classification; 75% of the data is FALSE and 25% of the data is TRUE. I get 100% training accuracy and 96.5% validation accuracy, but only 40% accuracy on the test set. ...
Noah101's user avatar
  • 21
