All Questions
Tagged with sklearn-pandas python-3.x
222 questions
0
votes
1
answer
71
views
Timestamp issue while creating the model using pipeline in Vertex AI
I am currently utilizing the XGBoost classifier within a pipeline that includes normalization and the XGBoost model itself. The model has been successfully developed in the Notebook environment.
The ...
0
votes
1
answer
34
views
How to count number of occurrences of each element using groupby in pandas column [duplicate]
Following is the df
from io import StringIO
import pandas as pd
df = pd.read_csv(StringIO("""
Group Date Rank
A 01-01-2023 1
A 01-02-2023 2
A 01-03-2023 3
A 01-04-2023 2
A 01-05-2023 1
...
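A minimal sketch of one way to count occurrences per group, assuming the goal is the number of times each Rank value appears within each Group. The frame below copies only the rows visible in the excerpt; the variable name `counts` is illustrative.

```python
import pandas as pd

# Frame rebuilt from the rows visible in the excerpt (the rest is elided).
df = pd.DataFrame({
    "Group": ["A"] * 5,
    "Date": ["01-01-2023", "01-02-2023", "01-03-2023", "01-04-2023", "01-05-2023"],
    "Rank": [1, 2, 3, 2, 1],
})

# Group by both keys, count rows per (Group, Rank) pair,
# then pivot Rank into columns; missing pairs become 0.
counts = df.groupby(["Group", "Rank"]).size().unstack(fill_value=0)
print(counts)
```

`pd.crosstab(df["Group"], df["Rank"])` should give an equivalent table if a groupby is not strictly required.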
0
votes
1
answer
456
views
I got the following error: 'DataFrame' object has no attribute 'year'
I am trying to plot a scatter graph from a CSV file I downloaded online in order to fit a linear regression.
%matplotlib inline
plt.scatter(df.year, df....
-1
votes
1
answer
285
views
Do we need to exclude OneHotEncoded columns while standardizing or normalizing using MinMaxScaler() or StandardScaler()?
This is the final cleaned DataFrame (df2) before standardizing.
My code:
scaler = StandardScaler()
df2[list(df2.columns)] = scaler.fit_transform(df2[list(df2.columns)])
df2
This returns a DataFrame after ...
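One common way to scale only the numeric columns and leave one-hot columns untouched is a ColumnTransformer with remainder="passthrough". This is a sketch under assumptions: the column names and values below are hypothetical, since the real df2 is not shown in the excerpt.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two numeric columns plus two one-hot (0/1) columns.
df2 = pd.DataFrame({
    "age": [20.0, 30.0, 40.0, 50.0],
    "income": [1000.0, 2000.0, 3000.0, 4000.0],
    "city_A": [1, 0, 1, 0],
    "city_B": [0, 1, 0, 1],
})
numeric_cols = ["age", "income"]
onehot_cols = [c for c in df2.columns if c not in numeric_cols]

# Scale only the numeric columns; pass the remaining columns through unchanged.
ct = ColumnTransformer(
    [("scale", StandardScaler(), numeric_cols)],
    remainder="passthrough",
)
scaled = ct.fit_transform(df2)

# Output order is: transformed columns first, then the passthrough columns.
result = pd.DataFrame(scaled, columns=numeric_cols + onehot_cols, index=df2.index)
print(result)
```

Whether the one-hot columns should be scaled at all depends on the downstream model; the sketch only shows how to exclude them.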
2
votes
1
answer
88
views
Complicated double sum using groupby in Pandas dataframe
I have a dataframe that looks like
Race_ID Date Student_ID a b
1 1/1/2023 1 3 1
1 1/1/2023 2 2 2
1 1/1/...
0
votes
1
answer
68
views
Python sklearn confusion matrix
I am trying to create a confusion matrix with probabilities.
y_pred_train = logistic.predict_proba(X_train)
confusion_matrix(y_train, y_pred_train)
ValueError: Classification metrics can't handle a ...
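confusion_matrix expects class labels, while predict_proba returns one probability per class. A minimal sketch of converting probabilities to labels first, assuming a fitted classifier named logistic as in the excerpt; the toy data is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Hypothetical toy data standing in for the asker's X_train / y_train.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
logistic = LogisticRegression().fit(X_train, y_train)

# predict_proba returns an array of shape (n_samples, n_classes).
proba = logistic.predict_proba(X_train)

# Pick the most probable class for each sample and map it back to its label.
y_pred_train = logistic.classes_[np.argmax(proba, axis=1)]

cm = confusion_matrix(y_train, y_pred_train)
print(cm)
```

Calling logistic.predict(X_train) directly gives the same labels; the argmax step is only needed when a custom threshold or the probabilities themselves are also required.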
0
votes
0
answers
849
views
Annotating clustering from DBSCAN to original Pandas DataFrame
I have working code that uses DBSCAN to find tight groups of sparse spatial data imported with pd.read_csv.
I am maintaining the original spatial data locations and would like to annotate the ...
-1
votes
1
answer
600
views
RuntimeWarning: invalid value encountered in divide in ML By Sklearn in Python
After I run my project, this error is shown and I don't know what I am doing wrong.
:\Users\Alir\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\utils\extmath.py:1047: RuntimeWarning: invalid ...
1
vote
4
answers
172
views
I get the same output for a classifier algorithm with sklearn and pandas
Problem
I get the same output every time, regardless of the input.
Context
I have a .csv with IDs that represent a team of 5 persons (previously formed teams) like this:
0, 1, 2, 3, 4
5, 6, 7, 3, 8
2, 5,...
1
vote
1
answer
842
views
Sklearn KNN Imputer is missing some values
I was trying to impute a column with some NaNs using the KNN imputer from scikit-learn. Things seemed to be working properly, but I realized that I still have some of the NaNs in the imputed column. What ...
0
votes
1
answer
348
views
How to install sklearn2pmml correctly in python?
I have been trying to install sklearn2pmml using pip
pip install --upgrade sklearn2pmml==0.83.0
pip install --upgrade sklearn2pmml
pip install sklearn2pmml
python3 -m pip install sklearn2pmml
I have ...
0
votes
1
answer
83
views
How do I confirm the output from the sklearn transform model (sc.transform)?
I am learning about StandardScaler in sklearn. I understand that sc.fit obtains the mean of the data and uses it to transform the train and test data, but I do not understand ...
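One way to confirm what sc.transform returns is to redo the arithmetic by hand. StandardScaler.fit stores the per-feature mean in mean_ and the standard deviation in scale_, and transform computes (x - mean_) / scale_. The arrays below are hypothetical; sc matches the name used in the question.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical train/test data with a single feature.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[5.0], [6.0]])

# fit learns the mean and standard deviation from the training data only.
sc = StandardScaler().fit(X_train)

# transform applies the training statistics to any data passed in.
X_test_scaled = sc.transform(X_test)

# Manual check using the fitted attributes.
manual = (X_test - sc.mean_) / sc.scale_
print(np.allclose(X_test_scaled, manual))
```

Because the test set is scaled with the training statistics, its transformed values will generally not have zero mean or unit variance.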
0
votes
1
answer
54
views
Users' trip time over a particular period of time
The Geolife dataset contains GPS trajectories of users, logged as they move. Thanks to Sina Dabiri for providing a repository of the preprocessed data. I work with his preprocessed data and created a ...
0
votes
3
answers
533
views
Splitting strings of tuples of different lengths to columns in Pandas DF
I have a dataframe that looks like this
id    human_id
1     ('apples', '2022-12-04', 'a5ted')
2     ('bananas', '2012-2-14')
3     ('2012-2-14', 'reda21', 'ss')
..    ..
I would like a "pythonic" way to have ...
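A minimal sketch of one possible approach, assuming the human_id values are strings holding Python tuple literals and that the goal is one column per tuple element. The desired output is cut off in the excerpt, so the part_ column names are hypothetical.

```python
import ast
import pandas as pd

# Frame rebuilt from the rows visible in the excerpt.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "human_id": [
        "('apples', '2022-12-04', 'a5ted')",
        "('bananas', '2012-2-14')",
        "('2012-2-14', 'reda21', 'ss')",
    ],
})

# Safely parse each string into a real tuple (no eval of arbitrary code).
parsed = df["human_id"].apply(ast.literal_eval)

# Build one column per position; shorter tuples are padded with missing values.
parts = pd.DataFrame(parsed.tolist(), index=df.index).add_prefix("part_")

out = df[["id"]].join(parts)
print(out)
```

If the column already holds tuples rather than strings, the ast.literal_eval step can be dropped.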
0
votes
0
answers
58
views
Getting "Building wheel for Scorer (setup.py) ... error"
I am installing "Scorer". Commands used:
pip install sklearn metrics Scorer
pip install sklearn metrics Scorer --no-cache-dir
Any idea about the error below? I already tried referring to options ...