Questions tagged [normalization]
Usually "normalization" means re-expressing univariate data to make values lie within a specified range.
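As a rough illustration of that definition, here is a minimal sketch of linear (min-max) rescaling to a chosen range. The function name `rescale` and its parameters are hypothetical, picked for this example only, and other normalizations (e.g. standardizing to zero mean and unit variance) are also commonly filed under this tag.

```python
def rescale(values, new_min=0.0, new_max=1.0):
    """Linearly map values so that min(values) -> new_min and max(values) -> new_max.

    Hypothetical helper for illustration; assumes a non-empty sequence of
    finite numbers that are not all equal.
    """
    values = list(values)
    lo, hi = min(values), max(values)
    if hi == lo:
        # With a zero-width input range the linear map is undefined.
        raise ValueError("all values are identical; cannot rescale")
    span = new_max - new_min
    return [new_min + (v - lo) * span / (hi - lo) for v in values]


# Usage: map a small sample onto [0, 1], then onto [1, 10].
unit = rescale([0, 5, 10])
one_to_ten = rescale([-22.28, 0.0, 32.65], new_min=1, new_max=10)
```

The minimum and maximum are taken from the data passed in, so new observations outside that range would map outside the target range unless the original bounds are reused.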
1,240 questions
3
votes
3
answers
113
views
Which normalization strategy is preferable when clustering territorial units according to hazard exposure?
I am working on a project to cluster provinces according to their exposure to river floods. I am considering the following indicators:
Total number of flood events / total provincial area
Total ...
2
votes
1
answer
109
views
Normalizing Tables for GPA Evaluation
I am working on a PowerBI report to display student data. The primary objectives are to filter based upon the academic level (undergraduate, graduate, doctorate), School in the university (Arts & ...
2
votes
1
answer
28
views
How should distribution shift in docking-derived energy features be handled when ligand size changes?
I’m using docking-derived binding energy values as input features in a machine-learning model.
All of the original data was generated from molecules of similar size, but our new dataset contains much ...
0
votes
0
answers
24
views
How to normalize values to a new range [duplicate]
I have a set of values with the range -22.28 to 32.65
I need to normalize them to the range 1 to 10.
The expected result of the min and max values after the normalization is -22.28 --> 1 and 32.65 -...
4
votes
1
answer
534
views
Should I normalize both train and validation sets or only the train set?
I have a question about normalization when merging training and validation sets for cross-validation.
Normally, I normalize using re-scaling (Min-Max Normalization) calculated from the training set ...
1
vote
0
answers
49
views
Running statistics standardization in reinforcement learning
So I'm training a DDPG agent on a 6-axis robot arm to move an object from A to B. The inputs are the coordinates of the joints along with the coordinates of the object that needs to be moved.
So, I'm kinda ...
0
votes
1
answer
98
views
Scale difference between predictions and real values?
I am comparing time series of volumes of products in different orders of magnitude (some have volumes of ~1k, others 10k, 50k etc).
I have real values and predictions and want to compare the ...
4
votes
1
answer
89
views
Compare Hb values measured by two different hemoglobin meters
I have two sets of hemoglobin measurements from two different machines, each measuring different individuals. I don't have calibration information, but I do know the minimum and maximum values for ...
416
votes
6
answers
1.8m
views
How to normalize data to 0-1 range?
I am lost in normalizing, could anyone guide me please.
I have minimum and maximum values, say -23.89 and 7.54990767, respectively.
If I get a value of 5.6878 how can I scale this value on a scale ...
1
vote
0
answers
100
views
XGBoost is NOT invariant to feature scaling [closed]
I am using the Python library xgboost to predict a continuous variable Y from a continuous variable X and some other (class-) features. I suspect that my X has low predictive power, if any.
If I scale ...
2
votes
0
answers
89
views
Expected value of the outer product of normalized, non-centered Student t vector
I was studying the expected value of the outer product of a normalized non-centered Gaussian vector and I found this very interesting and solved question and I am looking to generalize to a Student t ...
4
votes
1
answer
257
views
Transformation of skewed independent variables for GLMMs
I am building generalized linear mixed models with random effects using the glmmTMB R package. I have a set of continuous fixed variables, many of which exhibit high skewness and kurtosis. I have read ...
0
votes
2
answers
186
views
CLT and Distribution of Sums: Reconciling Standardization vs. Direct Application
I'm working through an exercise involving the Central Limit Theorem (CLT) and am running into a conceptual conflict.
Suppose I have $X_i \sim U[0,1]$ for $i = 1, ..., n$, with $n = 100$. Let $Y = \...
1
vote
1
answer
142
views
Potential Sign Issues in a Composite Performance Metric for Model Selection
I am analyzing the results of various machine learning models for a regression task, using four metrics: RMSE, MAE, MAPE, and $R^2$. My approach involves two types of analyses:
Individual Metric ...
1
vote
1
answer
374
views
Way too large MSE for random forest model
I'm currently working on building Random Forest Models in Python. My topic is to investigate the importance of specific variables for the accuracy of machine learning to explain the market ...