All Questions
Tagged with machine-learning data-science
1,694 questions
-7
votes
0
answers
44
views
Is there model overfitting or not? Also how good is the model performance based on these graphs? Is it good enough? [closed]
These are the graphs:
Model Accuracy , Model Loss , Model Precision
, Model Recall , Confusion Matrix , Receiver Operating Characteristics (ROC) , Precision-Recall Curve
Problem: I am doing ...
-2
votes
0
answers
27
views
Streamlit Crashes on Image Upload [closed]
"OSError: cannot identify image file <_io.BytesIO object at 0x7fce02c76b80>"
I'm trying to deploy a ML model on Streamlit and I encountered this error as displayed above while trying ...
0
votes
0
answers
43
views
MiniBatchKMeans BERTopic not returning topics for half of data
I am trying to topic a dataset of tweets. I have around 50 million tweets. Unfortunately, such a large dataset will not fit in ram (even 128GB) due to the embeddings. Therefore, I have been working on ...
-1
votes
1
answer
51
views
How to encode item features with high number of categories for recommendation
For the recommendation problem I am working on, there are around 50000 unique brands and 3 level product categories, level_1_cat (50 categories), level_2_cat (100 categories) and level_3_cat (1000 ...
1
vote
0
answers
67
views
How to use Python to replicate Random Forest Regression prediction using decision paths?
I'm trying to test whether I've understood the way RandomForestRegressor produces forecast after a model's fitted. I used the California housing example to train a simple model and predict the first ...
0
votes
0
answers
132
views
Create Kedro PartitionedDataset of PartitionedDatasets
I'm working in a kedro project where I want to automatically label thousands of audio files, apply transformations to them and then store them in a folder of folders, each subfolder corresponding to ...
0
votes
0
answers
35
views
Techniques for adaptive prediction with feedback in an evolving feature space
I am working on a prediction problem where the target variable 𝑦 is drawn from a normal distribution, and the relationship between the continuous feature space 𝑋 and 𝑦 remains stable over time. ...
-1
votes
1
answer
73
views
Getting ValueError: All arrays must be of the same length
I have been trying to convert a dictionary into a dataframe but everytime i keep getting ValueError: All arrays must be of the same length. i Have checkde the length of each array and confirmed them ...
0
votes
1
answer
93
views
Stuck in handling incorrect input data on web app for model training
I am trying to add an exception feature in an ML project I am working on, I create a web app which accepts student performance data as a CSV file and then performs different machine learning ...
0
votes
1
answer
57
views
How can we map catagorical codes in a dataframe back to the original data points in the original dataframe?
I have a simple dataframe that looks like this.
import pandas as pd
# Intitialise data of lists
data = [{'Year': 2020, 'Airport':2000, 'Casino':5000, 'Stadium':9000, 'Size':'Small'},
{'Year':...
0
votes
1
answer
955
views
I'm getting an import error with ydata-profiling-4.4.0: `BaseSettings` has been moved to the `pydantic-settings` package
I know that Pydantic V2 introduced new things which make it incompatible with V1, so I switched from pandas_profiling to ydata_profiling. Because of that, I had to switch versions of the dependencies, ...
0
votes
1
answer
82
views
How to perform Time series forecasting in short Interval of time data (only 8 years are given) for multiple locations?
I want to find the production of field forecasting in year 2018 and 2019 at multiple location for a short interval of data?
At each Harvesting Site(donated by index in the image and by its Latitute ...
1
vote
1
answer
833
views
My test and train data has the same number of columns but OneHotEncoder creates different size of matrixes
I am trying to create a model with train and test datasets which are seperate. They have same number of columns. When I try to encode categorical features the created matrix by OneHotEncoder is comes ...
-1
votes
1
answer
66
views
Is there anything to increase the accuracy of this predictive model?
I want to improve the accuracy of my trained model. I tried to create an ML model to predict whether a test sample belongs to someone with or without disease, based on the gene expression profiling, ...
-1
votes
1
answer
33
views
There is a problem in the encoding the string variable to float or integer in sklearn during pipeline building
I was building a pipeline in sklearn using the column transformer ,I was using the column transform for the encoding and the code runs well but during training it is showing error that "could not ...