Questions tagged [machine-learning]
Machine Learning is a subfield of computer science that draws on elements from algorithmic analysis, computational statistics, mathematics, optimization, etc. It is mainly concerned with the use of data to construct models that have high predictive/forecasting ability. Topics include model building, applications, theory, etc.
11,331 questions
4 votes, 1 answer, 172 views
Different classifiers are yielding the same metric results; is this normal?
I am trying to implement the strategy of hierarchical classification with chained classifiers and a Bayesian network, as in the paper by Serrano-Pérez & Sucar.
The data in my case are ontologies, ...
1 vote, 0 answers, 13 views
Kolmogorov-Arnold Network fitting software that allows exchangeability assumption?
I have a problem in which I'd like to try to approximate an unknown function using a specific version of the Kolmogorov representation theorem. I will have somewhere between at minimum dozens of ...
2 votes, 0 answers, 20 views
Kaggle competition differentiation among competitors
Given there are so many Kaggle competitions, how does the winner win technically? Does he/she invent a completely new algorithm to solve the problem? By now there are 8 million sincere students who ...
6 votes, 2 answers, 303 views
How does Validation work for Time-Series Forecasting?
What is the standard method for splitting time series into train/validation/test sets in time-series forecasting?
Example: 1000 time series with 300 total time steps. Forecast horizon is at time step 200 ...
5 votes, 1 answer, 30 views
How should I approach feature selection when working with a very large scraped dataset for regression?
I recently scraped a large dataset from several websites and ended up with around 25–30 potential features that might influence the target variable. The dataset is fairly large (hundreds of thousands ...
2 votes, 0 answers, 15 views
Trying to understand if I'm implementing GluonTS sliding window splitting and validation set correctly
The documentation is a little bit confusing, so I thought I would ask here to make sure. I'm using:
...
7 votes, 4 answers, 336 views
Calculating next row in binary matrix
If I have a binary matrix that looks something like this (this is only 10 rows of the binary matrix; I have a dataset of a million rows, so you can see what the binary matrix looks like):
...
1 vote, 0 answers, 81 views
Has anyone tried to concatenate feature embeddings with Topological Data Analysis vectorized persistence diagrams?
In the context of binary graph classification, we were thinking about taking the last-layer features (before the outputs) of a GIN (Graph Isomorphism Network) and concatenating them with the ...
5 votes, 0 answers, 30 views
Unable to predict values for test data
I have built and trained an NMT model using an RNN in Google Colab, and now when I try to predict on my test data, my Google Colab session keeps crashing. The shape of my test data is 47838×55.
...
4 votes, 1 answer, 43 views
Why does my model perform worse after transforming the target?
I have a target with a skewed distribution, so I tried applying TransformedTargetRegressor from scikit-learn using np.log1p as the function and np.expm1 as the inverse function. However, when I evaluate it, ...
6 votes, 1 answer, 129 views
When attempting to maximize F1 score for a decision tree on test data using cost-complexity pruning, why is it yielding the fully grown tree?
I'm learning about classification using decision trees. I'm using the DecisionTreeClassifier class in the scikit-learn library in Python to train the model on training data (which yields a fully grown tree), ...
1 vote, 0 answers, 181 views
How to properly predict goals in soccer matches using match statistics?
This is my first time posting here.
I'm a beginner in Data Science and currently trying to apply what I've learned to a real-world problem.
I built a web scraping script to collect statistics from ...
5 votes, 1 answer, 120 views
Unable to run pandas/modin[ray] code on SageMaker Unified Studio
I am working on a movie recommendation problem where I get multiple files from the source, and the total data size is around 900 MB. I am using the ...
0 votes, 0 answers, 37 views
Guide me with my major project titled Satellite-Based Agricultural Vulnerability Monitoring
I am working on a major project titled Utilizing Satellite Data and Deep Learning to Monitor Agricultural Vulnerabilities to Climate Change. My goal is to develop a system to monitor agricultural ...
2 votes, 0 answers, 48 views
Feature selection for unsupervised learning with a One-Class SVM
I am trying to build a solution to detect a particular sound against all other possible sounds occurring in nature.
My approach is to train a One-Class SVM only on my class of interest, hoping it will ...