$\begingroup$

I can't seem to wrap my head around this: what is the glm() equivalent of lm(log(y) ~ x1 + x2, data=data)? Is it:

  • a. glm(y ~ x1 + x2, data=data, family=gaussian(link="log"))
  • b. glm(log(y) ~ x1 + x2, data=data, family=gaussian(link="identity"))
  • c. other
$\endgroup$
  • $\begingroup$ When you try those three models on your data, what results do you get? P.S. gaussian only has 2 s's. $\endgroup$ Commented Nov 2, 2025 at 10:47
  • $\begingroup$ This has been thoroughly discussed here. Note that they are not the same. $\endgroup$ Commented Nov 2, 2025 at 13:43
  • $\begingroup$ Also somewhat relevant stats.stackexchange.com/questions/77579/… $\endgroup$ Commented Nov 2, 2025 at 18:30

2 Answers

$\begingroup$

Model b matches the lm() model. Both assume that log(y) has a Gaussian distribution with mean a0 + a1*x1 + a2*x2.

Model a assumes that y has a Gaussian distribution, with mean exp(a0 + a1*x1 + a2*x2).

In both cases a0, a1, a2 are the coefficients you are estimating, and the Gaussian variance is constant.
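A quick way to see this is to fit all three models on simulated data (the data-generating process below is an assumption for illustration) and compare coefficients:

```r
# Sketch: simulated log-normal data; compare lm(log(y) ~ ...) with the two glm() candidates.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- exp(1 + 0.5 * x1 - 0.3 * x2 + rnorm(n, sd = 0.2))  # positive response

fit_lm <- lm(log(y) ~ x1 + x2)
fit_b  <- glm(log(y) ~ x1 + x2, family = gaussian(link = "identity"))
fit_a  <- glm(y ~ x1 + x2, family = gaussian(link = "log"))

# Model b reproduces the lm() coefficients exactly (same likelihood, same fit);
# model a estimates a different model, so its coefficients differ in general.
coef(fit_lm)
coef(fit_b)
coef(fit_a)
```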

$\endgroup$
$\begingroup$

B is equivalent; A is not.

Let’s write out the math.

A $$ \log(\mathbb E[y]) = \beta_0+\beta_1x_1+\beta_2x_2\\\iff \mathbb E[y]=e^{\beta_0+\beta_1x_1+\beta_2x_2} $$

B $$ \mathbb E[\log(y)]=\beta_0+\beta_1x_1+\beta_2x_2 $$

B is exactly the lm specification: with the Gaussian family and identity link, glm() maximizes the same Gaussian likelihood that OLS does, i.e. it minimizes squared error on log(y).

By Jensen's inequality for the strictly concave log, $\mathbb E[\log(y)] < \log(\mathbb E[y])$ with strict inequality whenever $y$ is non-degenerate. The two models therefore target different quantities, ruling out the possibility that A is also equivalent to the lm specification.

Overall, B is equivalent to the lm specification while A is not.
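The Jensen gap is easy to check by simulation (log-normal draws here are an assumed example, not from the question):

```r
# Sketch: Jensen's inequality for the concave log on strictly positive draws.
set.seed(2)
y <- exp(rnorm(1e5))  # positive and non-degenerate

log(mean(y))   # estimates log(E[y]);  for this distribution the true value is 0.5
mean(log(y))   # estimates E[log(y)]; for this distribution the true value is 0

# log(E[y]) > E[log(y)]: model A models the former, model B the latter.
```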

$\endgroup$
