1,316 questions
-4
votes
0
answers
22
views
Handling test files with no matching training data in fairness evaluation of subgroups [closed]
Workflow Summary
1. Training the Model
I load the cleaned adult_cleaned.data as my training data.
I preprocess the data (e.g., converting income into a binary label, handling missing values).
I train a ...
0
votes
0
answers
36
views
Create a new line for comma-separated values in a pandas column - I don't want to add new rows, I want to keep the same rows in the output [duplicate]
I have a dataframe like this,
df
col1 col2
1 'abc,pqr'
2 'ghv'
3 'mrr, jig'
Now I want to create a new line for each comma-separated value in col2, so the output would look ...
0
votes
1
answer
71
views
Timestamp issue while creating the model using pipeline in Vertex AI
I am currently utilizing the XGBoost classifier within a pipeline that includes normalization and the XGBoost model itself. The model has been successfully developed in the Notebook environment.
The ...
0
votes
1
answer
35
views
Cross-Validation Function returns "Unknown label type: (array([0.0, 1.0], dtype=object),)"
Here is the full error:
`---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[33], line 2
...
0
votes
0
answers
35
views
Issues with Converting Sklearn Logistic Regression Predicted Probabilities into Scores
I'm trying to convert a logistic regression model into user-level scores, based on this article.
y_pred_df['sub_primary'] = logreg.predict_proba(y_pred_df.loc[:, [col for col in y_pred_df.columns if ...
11
votes
2
answers
122k
views
How to use DataFrameMapper to delete rows with a null value in a specific column?
I am using sklearn-pandas.DataFrameMapper to preprocess my data. I don't want to impute for a specific column. I just want to drop the row if this column is Null. Is there a way to do that?
1
vote
2
answers
69
views
ElasticNetCV in Python: Get full grid of hyperparameters with corresponding MSE?
I have fitted an ElasticNetCV in Python with three splits:
import numpy as np
from sklearn.linear_model import LinearRegression
#Sample data:
num_samples = 100 # Number of samples
num_features = 1000 ...
2
votes
3
answers
96
views
Pandas takes all columns of a dataframe even when some columns are specified
I am trying to train a KMeans model using Scikit-Learn.
I have been stuck on this issue for 2 days.
Pandas is selecting all columns of a dataframe even though I specified 2 columns.
Here is the dataframe in ...
0
votes
0
answers
25
views
_fit_method for KNN gives KD-tree even though I'm working in a high dimensional space
Since the KNeighborsClassifier class in sklearn picks the best algorithm depending on the values passed to the fit method when using auto (which is the default), when accessing the algorithm using ._fit_method I ...
1
vote
2
answers
60
views
Using SKLearn KMeans With Externally Generated Correlation Matrix
I receive a correlation file from an external source. It is a fairly straightforward file and looks like the following.
A sample csv can be found here
https://www.dropbox.com/scl/fi/...
0
votes
2
answers
86
views
Using a Mask to Insert Values from sklearn Iterative Imputer
I created a set of random missing values to practice with a tree imputer. However, I'm stuck on how to write the imputed values back into my dataframe. My missing values look like this:
from ...
0
votes
1
answer
206
views
model.fit() class weights do not work when training the model
When calculating class_weights with
from sklearn.utils import class_weight
class_weights = class_weight.compute_class_weight(class_weight="balanced",
classes=np.unique(...
0
votes
1
answer
36
views
Data cardinality is ambiguous sklearn.train
model.fit(x_train, y_train, epochs=1000)
I'm trying to make an AI, but my code gives an error and I don't know how to fix it.
This is the error:
ValueError: Data cardinality is ambiguous:
x sizes: 455
y ...
0
votes
1
answer
186
views
Mlflow log_figure deletes artifact
I am running mlflow with autologging to track an xgboost model. By default, under artifacts it saves the model, requirements, and feature importances. Cool stuff I want to keep.
But, if I try to add ...
1
vote
1
answer
69
views
multiple linear regression house price r2 score problem
I have sample house price data and simple code:
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn....