Questions tagged [standardization]

Usually refers to "z-standardization", which shifts and rescales data so that they have zero mean and unit variance. Other "standardizations" are possible, too.
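
As a minimal sketch of the z-standardization described above (the function name `z_standardize` and the sample values are illustrative assumptions, not part of the tag description), each value has the mean subtracted and is then divided by the standard deviation:

```python
import numpy as np

def z_standardize(x, ddof=1):
    """Shift and rescale a 1-D array to zero mean and unit variance.

    ddof=1 uses the sample standard deviation; ddof=0 would use the
    population version. Which one is appropriate is a modelling choice.
    """
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=ddof)

# Hypothetical example data.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
z = z_standardize(values)
```

After this transformation `z` has mean zero and sample variance one, up to floating-point error.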

422 votes
7 answers
434k views

In some literature, I have read that when a regression has multiple explanatory variables in different units, the variables need to be standardized. (Standardizing consists of subtracting the mean and dividing ...
mathieu_r
  • 4,631
23 votes
2 answers
12k views

Say I have a data set with 5000 rows and about 150 columns (5000 samples, 150 predictors/features) and I'm interested in applying a ridge or lasso regression. (Let us assume using a logit link ...
Marcel
  • 1,430
70 votes
3 answers
36k views

In what circumstances would you want to, or not want to, scale or standardize a variable prior to model fitting? And what are the advantages and disadvantages of scaling a variable?
Andrew
  • 6,378
177 votes
5 answers
301k views

At work we were discussing this, as my boss has never heard of normalization. In linear algebra, normalization seems to refer to dividing a vector by its length. And in statistics, ...
Chris
  • 1,779
61 votes
1 answer
33k views

Do I transform all my data or folds (if CV is applied) at the same time? e.g. (allData - mean(allData)) / sd(allData) Do I transform trainset and testset ...
DerTom
  • 807
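
One commonly recommended way to handle the train/test situation raised in the excerpt above is to estimate the mean and standard deviation on the training data only and reuse those values for the test data. The sketch below illustrates that idea under stated assumptions: the simulated data, the 80/20 split, and all variable names are hypothetical, and it is not taken from the answers to the question.

```python
import numpy as np

# Simulated data purely for illustration: 100 rows, 3 features.
rng = np.random.default_rng(0)
data = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
train, test = data[:80], data[80:]

# Estimate the scaling parameters from the training rows only.
train_mean = train.mean(axis=0)
train_sd = train.std(axis=0, ddof=1)

# Apply the same training-derived parameters to both sets, so no
# information from the test rows leaks into the transformation.
train_z = (train - train_mean) / train_sd
test_z = (test - train_mean) / train_sd
```

With cross-validation, the same steps would be repeated inside each fold, using that fold's training portion to compute the parameters. The standardized test set will generally not have exactly zero mean or unit variance, which is expected.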
74 votes
7 answers
156k views

I am trying to predict the outcome of a complex system using artificial neural networks (ANNs). The outcome (dependent) values range between 0 and 10,000. The different input variables have different ranges. ...
Boris Gorelik
48 votes
4 answers
24k views

For the LASSO (and other model selection procedures) it is crucial to rescale the predictors. The general recommendation I follow is simply to use a 0 mean, 1 standard deviation normalization for ...
László
  • 997
17 votes
6 answers
14k views

I've come across a very good text on Bayes/MCMC. It suggests that a standardisation of your independent variables will make an MCMC (Metropolis) algorithm more efficient, but also that it may reduce (...
52 votes
3 answers
71k views

I have read three main reasons for standardising variables before something such as Lasso regression: 1) Interpretability of coefficients. 2) Ability to rank the ...
Jase
  • 2,306
66 votes
4 answers
41k views

A common good practice in Machine Learning is to do feature normalization or data standardization of the predictor variables, that is, center the data by subtracting the mean and normalize it by dividing ...
SkyWalker
  • 925
46 votes
1 answer
49k views

I have 2 simple questions about linear regression: When is it advised to standardize the explanatory variables? Once estimation is carried out with standardized values, how can one predict with new ...
teucer
  • 2,071
20 votes
2 answers
25k views

I realise this is probably a very simple question, but after searching I can't find the answer I am looking for. I have a problem where I need to standardize the variables, run the (ridge regression) ...
Baz
  • 1,793
82 votes
4 answers
169k views

My question is: do we need to standardize the data set to make sure all variables have the same scale, between [0,1], before fitting logistic regression? The formula is: $$\frac{x_i-\min(x_i)}{\max(...
user1946504
  • 1,397
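
The formula in the excerpt above is cut off, so the sketch below uses the usual min-max definition for rescaling to [0,1], which may differ from the asker's exact expression. The function name `min_max_scale` and the handling of constant input are assumptions made for illustration.

```python
import numpy as np

def min_max_scale(x):
    """Rescale a 1-D array linearly so its values lie in [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        # Assumption: map a constant variable to all zeros rather
        # than dividing by zero.
        return np.zeros_like(x)
    return (x - x.min()) / span

# Hypothetical example values.
scaled = min_max_scale([3.0, 7.0, 11.0])
```

Here the smallest value maps to 0, the largest to 1, and values in between are placed proportionally.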
35 votes
4 answers
107k views

I have a question that asks me to verify whether the uniform distribution (${\rm Uniform}(a,b)$) is normalized. For one, what does it mean for any distribution to be normalized? And two, how ...
Ada
  • 519
22 votes
4 answers
35k views

Is it that in standardization the variance is known, while in studentization it is unknown and therefore estimated? Thank you.
58485362
  • 221
