All Questions

-2 votes
1 answer
80 views

Dummy Variable as Boolean rather than Integer [closed]

I'm working on a machine learning project in Python. I'm using pandas' pd.get_dummies to create dummy variables for a categorical column in my data, but the variables are being converted to ...
Victor Olusegun
-1 votes
1 answer
79 views

How can I achieve accurate imputation of missing values in a dataset?

I'm working with a dataset containing details about used cars, and I've encountered several missing values in the Fuel_Type column. The possible values include 'Gasoline', 'E85 Flex Fuel', 'Hybrid', '...
user27500319
-1 votes
1 answer
56 views

KeyError when using array as feature in language detection

I am following this tutorial for language detection using machine learning. In the dataset I am using, however, there are multiple variables as features. I tried, in the place of X = data["Text&...
harry
  • 111
0 votes
1 answer
76 views

How to make Isolation Forest detect anomaly at the peak of the difference, instead of the first value seen

I am using Isolation Forest to identify anomalies in a very large data frame. The data is noisy, so I have conducted many filtering operations to smooth out the noise so that the true anomalies ...
Zach Tynes
1 vote
1 answer
1k views

Write data directly to blob storage from an Azure Machine Learning Studio notebook

I'm working on some interactive development in an Azure Machine Learning notebook and I'd like to save some data directly from a pandas DataFrame to a CSV file in my default connected blob storage ...
Matt_Haythornthwaite
0 votes
0 answers
63 views

Low Validation and Test Accuracy with Random Forest on ECG Data

I'm working on a project involving ECG data classification using a Random Forest model. Unfortunately, my model's performance is significantly lower than expected, and I'm struggling to understand why....
MEJRI Rawaa
0 votes
0 answers
54 views

How to convert Xarray Dataset to a Darts Time Series

I have an Xarray Dataset object with lat/lon/time coordinates. This is a map of climate data. I want to convert this to a Darts TimeSeries object in order to train models on it. There is a function to ...
David Flasterstein
1 vote
0 answers
83 views

How to save and load a TensorFlow Decision Forests model for incremental learning?

I am developing a TensorFlow Decision Forests regression model for incremental learning. I have trained and saved the model, but when I retrain it with new data I get an error like &...
Swasthik Shivananda
0 votes
0 answers
85 views

Problems with custom dataset using Minirocket classification

I'm working on a larger school project, trying to classify time series measurements with Minirocket/Rocket. My training data consists of a 1D matrix containing the measurements, and a separate 1D matrix ...
Michael
0 votes
0 answers
45 views

How to use StandardScaler() with the predict() function for a single row having multiple columns?

I am trying to build a house price prediction system. The data has outliers and is non-Gaussian, and a log transform is applied to the target y. I have used StandardScaler() to fit for my model before ...
vizzy bhagat
1 vote
1 answer
328 views

How does prediction for Ordered logit regression work?

I am learning about ordered logit regression and was wondering how prediction works mathematically and how I can do it in Python myself. I know that in Python I can simply use predict ...
KriSt0f
  • 13
0 votes
0 answers
46 views

How to label a multi-column, multi-category dataset, save it to CSV or Parquet, and train an SVM on it

I am working on audio classification using the openSMILE library. After preprocessing the audio data, I get data of shape 800x25 for just one file (each file is around 15 seconds long) ...
Nugget
  • 60
-1 votes
2 answers
72 views

How can I improve the RMSE in my car price estimate?

How can I improve the RMSE in my car price estimate? First, I will fill in the missing condition values by estimating them based on the number of kilometers driven. ` new_condition_df = df[df['...
Aaron7
  • 277
0 votes
0 answers
83 views

Machine learning: NumPy issue dealing with NaN values

Issue: dealing with NaN values. I have tried replacing the NaN values with 0 to test whether it would output anything. However, even with zeros filling the NaN slots, the MSE returns NaN, which makes me ...
angryhorse
0 votes
1 answer
165 views

LogisticRegression model producing 100 percent accuracy

I have fetched Amazon reviews for a product and am now trying to train a logistic regression model on them to categorize customer reviews. It gives 100 percent accuracy. I am unable to understand the issue....
JAMSHAID
  • 1,357
